Pandas KeyError When Using iterrows()
In Pandas, the iterrows() method is often used to iterate over rows of a DataFrame. However, you might encounter a KeyError while using this method, which typically happens when you’re trying to access a column that does not exist in the DataFrame or is incorrectly referenced. This article will explain the causes of this error and how to avoid it.
What Causes KeyError with iterrows()?
The KeyError when using iterrows() typically occurs due to one of the following reasons:
- Incorrect column name: You may be trying to access a column using a name that is misspelled or doesn’t exist.
- Label mismatch: The column exists, but its label differs from the key you use, for example because of leading or trailing whitespace, different capitalization, or an integer label accessed with a string key.
- Treating the index as a column: iterrows() yields (index, row) pairs, and the row Series contains only the column values. The index label is returned separately, so looking it up inside the row raises a KeyError.
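The index case is the least obvious of the three, so here is a minimal sketch of it. The DataFrame and the index name 'id' are made up for illustration; the point is that the index value arrives as the first element of each pair, or can be turned into a regular column with reset_index().

```python
import pandas as pd

# Hypothetical data: 'id' is the index name, not a column
df = pd.DataFrame(
    {"Name": ["John", "Alice"], "Age": [25, 30]},
    index=pd.Index([101, 102], name="id"),
)

ids_from_loop = []
for index, row in df.iterrows():
    # row holds only 'Name' and 'Age'; row['id'] would raise KeyError here
    ids_from_loop.append(index)

# Alternative: move the index into a regular column before iterating
flat = df.reset_index()
ids_from_column = [row["id"] for _, row in flat.iterrows()]

print(ids_from_loop, ids_from_column)
```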
Example of KeyError with iterrows()
Here’s an example of how this error might occur:
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})
# Attempt to iterate using iterrows() and access a non-existent column
for index, row in df.iterrows():
    print(row['Salary'])
Output:
KeyError: 'Salary'
In this example, the column ‘Salary’ does not exist in the DataFrame, which leads to a KeyError when attempting to access it.
How to Fix KeyError with iterrows()
To fix this error, you need to ensure that the column you are trying to access exists in the DataFrame. Here are a few approaches:
1. Check Column Names Before Accessing
Before attempting to access a column, ensure that the column exists in the DataFrame:
# Check if 'Salary' exists in the DataFrame before accessing it
for index, row in df.iterrows():
    if 'Salary' in row:
        print(row['Salary'])
    else:
        print("Column 'Salary' not found.")
Output:
Column 'Salary' not found.
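Because every row shares the same columns, the membership check can also be done once against df.columns before the loop starts. The sketch below assumes a hypothetical list of required column names; it is one way to fail early with a clear message, not the only one.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Alice", "Bob"], "Age": [25, 30, 22]})

# Hypothetical list of columns the loop expects to find
required = ["Name", "Salary"]
missing = [col for col in required if col not in df.columns]

if missing:
    message = f"Missing columns: {missing}"
else:
    message = "All columns present"
    for _, row in df.iterrows():
        print(row["Name"], row["Salary"])

print(message)
```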
2. Access Columns Correctly Using iterrows()
Ensure you’re accessing the correct columns. If the column exists, access it by its actual name:
# Correct column access
for index, row in df.iterrows():
    print(row['Name'])  # Access 'Name' column correctly
    print(row['Age'])   # Access 'Age' column correctly
Output:
John
25
Alice
30
Bob
22
3. Using .loc for Row and Column Access
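Instead of unpacking a row Series, you can use loc[] to read a specific cell by its row and column labels. Keep in mind that loc[] also raises a KeyError when a label is missing, and calling it inside a loop is generally no faster than iterrows(), so it is most useful when you need to read or update individual cells by label:
# Access rows and columns with loc[]
for index in df.index:
    print(df.loc[index, 'Name'])  # Access 'Name' for the given row index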
Output:
John
Alice
Bob
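When the goal is only to read the values of one column, selecting the column directly avoids a row loop altogether. The sketch below uses the same sample data; DataFrame.get() is shown as one way to look up a column that may be absent, since it returns a default (None unless specified) instead of raising.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob"],
    "Age": [25, 30, 22],
})

# Select the whole column at once instead of looping over rows
names = df["Name"].tolist()

# DataFrame.get returns the default instead of raising for a missing column
salaries = df.get("Salary")

print(names)
print(salaries)
```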
Summary
The KeyError in Pandas when using iterrows() generally occurs because the loop references a column that doesn’t exist, is misspelled, or is actually the index. To resolve this error, confirm the exact column names, check that the column exists before accessing it, and consider alternatives such as loc[], itertuples(), or vectorized operations where they fit.
Frequently Asked Questions — How to Fix iterrows() KeyError in Pandas
Why do I get a KeyError when using iterrows() in Pandas?
KeyError usually occurs when the column name you’re trying to access doesn’t exist, or when it differs from the real label through hidden spaces, capitalization, or key type (for example, 0 versus '0').
print(df.columns.tolist()) # Check exact column names
How to fix KeyError inside iterrows() when column name is correct?
Strip hidden whitespace and normalize column names before iterating:
df.columns = df.columns.str.strip()
Then verify again using print(df.columns).
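A minimal sketch of the whitespace problem, using made-up column names. Lowercasing the labels is an extra, optional step assumed here to also remove capitalization differences; if you apply it, the loop must use the lowercase names.

```python
import pandas as pd

# Hypothetical headers with stray spaces and mixed case, as often seen in CSV files
df = pd.DataFrame({" Price ": [10, 20], "Item": ["a", "b"]})
print(df.columns.tolist())  # repr-style output makes the spaces visible

# Strip whitespace; lowercasing is an optional normalization step
df.columns = df.columns.str.strip().str.lower()

prices = [row["price"] for _, row in df.iterrows()]
print(prices)
```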
Why does row['Column'] cause KeyError even if the column exists?
Each row from iterrows() is a Series that matches column labels exactly — even small typos, spaces, or case changes will trigger KeyError. Use exact column names.
How to handle KeyError caused by numeric column names in iterrows()?
Make sure you use the right key type — '0' (string) and 0 (integer) are different:
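# Check the column labels and their types
print(df.columns.tolist())
for i, row in df.iterrows():
    print(row[0])      # use this if the label is the integer 0
    # print(row['0'])  # use this instead if the label is the string '0'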
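The difference between the two key types can be checked directly, as in this sketch. Converting all labels to strings is one possible convention, assumed here only to make access uniform; keeping integer labels and using integer keys works equally well.

```python
import pandas as pd

# Without explicit column names, the labels are the integers 0 and 1
df = pd.DataFrame([[1, 2], [3, 4]])

int_key_found = 0 in df.columns    # integer label exists
str_key_found = "0" in df.columns  # string label does not

# One option: convert every label to a string so all keys are strings
df.columns = df.columns.astype(str)
first_values = [row["0"] for _, row in df.iterrows()]

print(int_key_found, str_key_found, first_values)
```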
How to avoid KeyError while iterating through rows?
Check once, before the loop, that the column exists:
if 'price' in df.columns:
    for i, row in df.iterrows():
        print(row['price'])
or use row.get('price', default_value) for safer access.
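The Series.get() approach might look like the sketch below. The column names and the fallback of 0.0 are placeholders chosen for illustration; pick a default that cannot be mistaken for real data in your case.

```python
import pandas as pd

df = pd.DataFrame({"item": ["pen", "book"], "cost": [1.5, 12.0]})

default_value = 0.0  # hypothetical fallback for a missing column

# 'price' is not a column, so get() returns the fallback for every row
prices = [row.get("price", default_value) for _, row in df.iterrows()]

# 'cost' exists, so get() returns the stored values
costs = [row.get("cost", default_value) for _, row in df.iterrows()]

print(prices, costs)
```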
Why do I get KeyError when using iterrows() after renaming columns?
If columns are renamed but the old names are still used in the loop, you’ll get KeyError. Confirm with:
print(df.columns)
and ensure you’re referencing the updated names.
Can NaN or missing data cause KeyError in iterrows()?
No. Missing or NaN values do not trigger KeyError — only missing keys (columns or labels) do. Verify column labels, not data.
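A short sketch with made-up data shows the distinction: the lookup succeeds for a row whose value is missing, and pd.isna() can then be used to detect the gap.

```python
import pandas as pd

# The second row has a missing price, but the 'price' label still exists
df = pd.DataFrame({"price": [9.99, None]})

# No KeyError is raised; the missing entry comes back as NaN
values = [row["price"] for _, row in df.iterrows()]
missing_flags = [bool(pd.isna(v)) for v in values]

print(missing_flags)
```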
How to fix KeyError when DataFrame has duplicate columns?
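Duplicate labels do not raise a KeyError on their own, but row['col'] then returns a Series of all matching values instead of a single value, which tends to break later code. Rename the columns so they are unique before looping. Note that the line below replaces every original name with col_0, col_1, and so on, so the loop must use the new names: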
df.columns = [f'col_{i}' for i in range(df.shape[1])]
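If the original names should be kept, duplicates can be detected with Index.duplicated() and only the repeated labels renamed. The suffix scheme below (a_1, a_2, ...) is a hypothetical convention for this sketch, not something pandas applies for you here.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "a"])

# True if any label appears more than once
has_duplicates = bool(df.columns.duplicated().any())

# Hypothetical renaming scheme: append a counter to repeated labels only
seen = {}
unique_labels = []
for label in df.columns:
    count = seen.get(label, 0)
    unique_labels.append(label if count == 0 else f"{label}_{count}")
    seen[label] = count + 1

df.columns = unique_labels
print(df.columns.tolist())
```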
How to prevent KeyError permanently when using iterrows()?
Use defensive coding: strip headers, check membership, and handle exceptions gracefully:
for i, row in df.iterrows():
    try:
        val = row['price']
    except KeyError:
        val = 0
What’s the best alternative to iterrows() for reliable column access?
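Use itertuples() for faster, attribute-based access, or vectorized operations for efficiency. With itertuples(), a missing column raises AttributeError rather than KeyError, and column names that are not valid Python identifiers are replaced with positional names:
for row in df.itertuples():
    print(row.price)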
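As a rough sketch of both alternatives, the example below reads a column through itertuples(), uses getattr() with a fallback for a field that may not exist, and computes a total with a vectorized call. The column names and the 'discount' field are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"item": ["pen", "book"], "price": [1.5, 12.0]})

# Attribute access on each named tuple
prices = [row.price for row in df.itertuples(index=False)]

# getattr with a default avoids an AttributeError for a missing field
discounts = [getattr(row, "discount", 0.0) for row in df.itertuples(index=False)]

# Vectorized alternative: no explicit row loop
total = df["price"].sum()

print(prices, discounts, total)
```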