PythonPandas.com

Adding New Column to Existing DataFrame in Pandas



Adding new columns is one of the most frequent operations in data manipulation, whether you are creating derived metrics, filling default values, or merging additional data.
In this article, we will learn how to add a new column to an existing DataFrame in Pandas.

# Example starting DataFrame
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Goal: add new columns such as "BirthYear" and "IsAdult"

Method 1: Simple Assignment via Bracket Notation (Most Common)

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Add a new column with a scalar (same value for all rows)
df['Country'] = 'USA'

# Add a new column computed from an existing column
df['BirthYear'] = 2025 - df['Age']

print(df)

Output:

      Name  Age Country  BirthYear
0    Alice   25     USA       2000
1      Bob   30     USA       1995
2  Charlie   35     USA       1990

This method modifies the DataFrame in place.

Method 2: Using DataFrame.insert() (Insert at Specific Position)


import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

# Insert a column 'C' at index position 1 (between A and B)
df.insert(1, 'C', [100, 200, 300])

print(df)

Why use this?
> You control where the new column appears (not just appended at the end)
> Raises a ValueError if the column name already exists (unless allow_duplicates=True)
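
Hard-coding the position works for small frames, but it is often more convenient to place the new column relative to an existing one. The sketch below looks up the position of column 'A' with df.columns.get_loc() and inserts right after it; it assumes the column names are unique, so that get_loc() returns a single integer.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

# Find where 'A' sits and insert the new column immediately after it
pos = df.columns.get_loc('A') + 1
df.insert(pos, 'C', [100, 200, 300])

print(df.columns.tolist())  # ['A', 'C', 'B']
```

Note that insert() modifies df in place and returns None, so its result should not be assigned back to df.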

Method 3: Using DataFrame.assign() (Chaining-friendly, Returns New DF)


import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# assign() returns a new DataFrame and leaves df unchanged
new_df = df.assign(
    BirthYear=2025 - df['Age'],
    IsAdult=df['Age'] >= 18
)

print(new_df)

Output:

 
      Name  Age  BirthYear  IsAdult
0    Alice   25       2000     True
1      Bob   30       1995     True
2  Charlie   35       1990     True

Why use this?
> Good in method chaining pipelines
> No risk of modifying original unexpectedly
> You can define multiple new columns in one call using lambdas or expressions

Note: assign() creates a new DataFrame. If you want to modify the original, reassign it: df = df.assign(…).
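
To illustrate the chaining point, here is a minimal sketch that filters rows and then adds two columns in a single assign() call. Lambdas receive the intermediate DataFrame, and a later keyword can refer to a column created by an earlier one in the same call. The column name IsSenior and the cutoff year are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

result = (
    df[df['Age'] >= 30]
    .assign(
        # d is the filtered DataFrame at this point in the chain
        BirthYear=lambda d: 2025 - d['Age'],
        # BirthYear already exists here because keywords are applied in order
        IsSenior=lambda d: d['BirthYear'] < 1995,
    )
)

print(result)
print(df.columns.tolist())  # ['Name', 'Age'] (original is unchanged)
```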

Method 4: Using loc for Controlled Assignment

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Use loc to assign a new column (all rows)
df.loc[:, 'IsSenior'] = df['Age'] > 30

print(df)

Output:

 
      Name  Age  IsSenior
0    Alice   25     False
1      Bob   30     False
2  Charlie   35      True

Why use this?
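> Uses the explicit indexer that pandas recommends over chained indexing; note that if df is itself a slice of another DataFrame, you still need to call .copy() on the slice to avoid SettingWithCopyWarning
> More explicit about rows and columns being modified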
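
If df is itself a filtered subset of another DataFrame, one common way to keep the assignment unambiguous is to take an explicit copy first. A minimal sketch, with the variable name over_28 chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# .copy() makes the subset an independent DataFrame
over_28 = df[df['Age'] > 28].copy()

# Adding a column to the copy does not touch the original
over_28.loc[:, 'IsSenior'] = over_28['Age'] > 30

print(over_28)
print('IsSenior' in df.columns)  # False
```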

Method 5: From a Dictionary / Mapping / Another Series

You can use an external mapping or Series to populate values based on your DataFrame’s index or keys.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Suppose we have a mapping from name to city
city_map = {
    'Alice': 'NYC',
    'Bob': 'LA',
    'Charlie': 'Chicago'
}

df['City'] = df['Name'].map(city_map)

print(df)

Output:

      Name  Age     City
0    Alice   25      NYC
1      Bob   30       LA
2  Charlie   35  Chicago

Why use this?
> Great when adding a column from a lookup table or external data
> Looks up each value by key (e.g. names) rather than by row position
> If a key is missing from the mapping, the result is NaN for that row
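
The missing-key behavior can be seen in a small sketch. The name 'Dana' and the placeholder value 'Unknown' are made up for illustration; fillna() is one way to replace the resulting NaN values.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Dana']})

# The mapping has no entry for 'Dana'
city_map = {'Alice': 'NYC', 'Bob': 'LA'}

df['City'] = df['Name'].map(city_map)              # 'Dana' becomes NaN
df['CityFilled'] = df['City'].fillna('Unknown')    # replace NaN with a default

print(df)
```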

Method 6: Adding Multiple Columns via pd.concat

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3]
})

new_cols = pd.DataFrame({
    'B': [10, 20, 30],
    'C': [100, 200, 300]
}, index=df.index)

df = pd.concat([df, new_cols], axis=1)

print(df)

Output:

   A   B    C
0  1  10  100
1  2  20  200
2  3  30  300

Why use this?
> Useful when you already have a separate DataFrame or Series of new columns
> Works well for batch addition of multiple columns
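
The same call also accepts Series objects. In the sketch below, each Series is given a name, and that name becomes the column label; the Series share the default 0..2 index with df, so the rows line up.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Named Series: the name is used as the new column label
b = pd.Series([10, 20, 30], name='B')
c = pd.Series([100, 200, 300], name='C')

df = pd.concat([df, b, c], axis=1)

print(df.columns.tolist())  # ['A', 'B', 'C']
```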

Edge Cases & Tips

Mismatched length – If you assign a list / array whose length doesn’t match the number of rows, Pandas will raise a ValueError.
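
A short sketch of the length check, catching the error so the script keeps running:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

error_message = None
try:
    # Two values for a three-row DataFrame
    df['B'] = [10, 20]
except ValueError as exc:
    error_message = str(exc)
    print(f"ValueError: {error_message}")

print('B' in df.columns)  # False (the failed assignment added nothing)
```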

Index alignment for Series –
When you assign a Series, Pandas aligns by index. If your Series index doesn’t match the DataFrame’s, some rows may become NaN.
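
A minimal sketch of this alignment: the Series below is indexed 1..3 while the DataFrame is indexed 0..2, so only labels 1 and 2 match. Passing the underlying array with to_numpy() is one way to assign by position instead, assuming the lengths are equal.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})                 # index 0, 1, 2
s = pd.Series([10, 20, 30], index=[1, 2, 3])        # index 1, 2, 3

# Aligned by label: row 0 has no match (NaN), label 3 is dropped
df['B'] = s

# Assigned by position: the Series index is ignored
df['B_by_position'] = s.to_numpy()

print(df)
```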

Does insert() overwrite an existing column?
No. By default, insert() raises a ValueError if the column name already exists; with allow_duplicates=True it adds a second column with the same name instead of overwriting.
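
A small sketch of both behaviors, using made-up values:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

duplicate_rejected = False
try:
    # 'B' already exists, so this raises by default
    df.insert(0, 'B', [7, 8, 9])
except ValueError as exc:
    duplicate_rejected = True
    print(f"ValueError: {exc}")

# Explicitly allowing duplicates adds a second 'B' column
df.insert(0, 'B', [7, 8, 9], allow_duplicates=True)

print(df.columns.tolist())  # ['B', 'A', 'B']
```

Duplicate column names make later selections such as df['B'] return a DataFrame rather than a Series, so this option is usually best avoided unless you need it.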

Performance with many columns – If you need to add many new columns (especially in loops), it can be more efficient to build a separate DataFrame or Series and then merge/concat (rather than repeatedly modifying df) to avoid repeated reallocation.
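
One way to apply this tip is to collect the new columns in a dictionary and attach them with a single concat call. The column names A_x2 through A_x5 are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Build every new column first (no changes to df inside the loop)
new_cols = {f'A_x{k}': df['A'] * k for k in range(2, 6)}

# Attach all of them in one step
df = pd.concat([df, pd.DataFrame(new_cols)], axis=1)

print(df.columns.tolist())  # ['A', 'A_x2', 'A_x3', 'A_x4', 'A_x5']
```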

FAQs — Adding New Column to Existing DataFrame in Pandas

How to add a new column to an existing DataFrame in Pandas?

You can assign a list or Series directly to a new column name:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
df['B'] = [4, 5, 6]
print(df)

This creates a new column B in the existing DataFrame.

How to add a new column to a Pandas DataFrame with a default value?

Assign a scalar value to the new column name:

df['new_col'] = 0

All rows in new_col will be initialized with 0.

How to add a new column in Pandas based on another column?

Perform operations on existing columns to create a new one:

df['C'] = df['A'] + df['B']

This adds a column C which is the sum of columns A and B.

How to insert a column at a specific position in a Pandas DataFrame?

Use DataFrame.insert() to place the column at a chosen index:

df.insert(1, 'new_col', [10, 20, 30])

This inserts new_col as the second column in the DataFrame.

How to add a new column to a DataFrame using assign() in Pandas?

assign() returns a new DataFrame with the added column:

df = df.assign(D=lambda x: x['A'] * 2)

This creates column D with double the values of A.

How to add a column to a Pandas DataFrame from a list?

Simply assign the list directly to a new column:

df['E'] = ['x', 'y', 'z']

The list length must match the number of rows in the DataFrame.

How to add a new column from another DataFrame in Pandas?

You can assign a column from another DataFrame using index alignment:

df['new_col'] = other_df['col_name']

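Values are matched by index label, not by row position, so rows without a matching label become NaN. If the indices differ but the rows are in the same order, call reset_index(drop=True) on both DataFrames first, or assign other_df['col_name'].to_numpy() to ignore the index (the lengths must match).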
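
When the two DataFrames share a key column rather than an index, a merge on that key is another option. The sketch below assumes a hypothetical shared column named id and a hypothetical score column in other_df; neither comes from the examples above.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'A': [10, 20, 30]})

# Different row order and a different index from df
other_df = pd.DataFrame(
    {'id': [3, 1, 2], 'score': [0.3, 0.1, 0.2]},
    index=['x', 'y', 'z']
)

# A left merge keeps every row of df, in order, and matches on 'id'
merged = df.merge(other_df, on='id', how='left')

print(merged)
```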

How to add a new column conditionally in Pandas?

Use numpy.where() for conditional column creation:

import numpy as np
df['flag'] = np.where(df['A'] > 1, 'High', 'Low')

This adds a new column based on a condition.

How to add a new column to a DataFrame inside a loop?

Use loc or simple assignment within your loop:

for i in range(len(df)):
    df.loc[i, 'LoopVal'] = df.loc[i, 'A'] * 2

However, vectorized operations are preferred for performance.
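
For comparison, the vectorized equivalent of the loop above is a single expression that operates on the whole column:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# One operation over the entire column, no Python-level loop
df['LoopVal'] = df['A'] * 2

print(df['LoopVal'].tolist())  # [2, 4, 6]
```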

