Pandas iterrows KeyError – How to Fix



Pandas KeyError When Using iterrows()

In pandas, the iterrows() method iterates over the rows of a DataFrame, yielding each row as an (index, Series) pair. A KeyError raised inside this loop typically means you are asking the row for a label it does not have: a column that does not exist in the DataFrame or is referenced incorrectly. This article explains the causes of this error and how to avoid it.

What Causes KeyError with iterrows()?

The KeyError when using iterrows() typically occurs due to one of the following reasons:

  • Incorrect column name: The name is misspelled or differs in capitalization from the actual column. Column labels must match exactly.
  • Indexing with a non-existent column: The column was never created, or is no longer present under that name, so it is not part of the row.
  • Trying to access the index as a column: The row’s index label is returned as the first element of each (index, row) pair, not as a value inside the row, so looking it up by name in the row raises a KeyError.
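The index-related cause above can be sketched as follows. This is a minimal illustration, assuming a hypothetical DataFrame (df_indexed, not part of the examples in this article) whose 'Name' column has been moved into the index with set_index(). The label then arrives in the first element of each iterrows() pair rather than inside the row.

```python
import pandas as pd

# Hypothetical data: 'Name' becomes the index, so it is no longer a column
df_indexed = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22]
}).set_index('Name')

names = []
errors = []
for index, row in df_indexed.iterrows():
    names.append(index)          # the index label is the first tuple element
    try:
        row['Name']              # 'Name' is not among the row's labels
    except KeyError as exc:
        errors.append(str(exc))

print(names)        # labels collected from the index
print(len(errors))  # one KeyError per row
```

In a case like this, reading the label from the loop variable (index here) avoids the lookup that fails.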

Example of KeyError with iterrows()

Here’s an example of how this error might occur:

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# Attempt to iterate using iterrows() and access a non-existent column
for index, row in df.iterrows():
    print(row['Salary'])

Output:

KeyError: 'Salary'

In this example, the column ‘Salary’ does not exist in the DataFrame, which leads to a KeyError when attempting to access it.

How to Fix KeyError with iterrows()

To fix this error, you need to ensure that the column you are trying to access exists in the DataFrame. Here are a few approaches:

1. Check Column Names Before Accessing

Before attempting to access a column, ensure that the column exists in the DataFrame:

# Check if 'Salary' exists in the DataFrame before accessing it
for index, row in df.iterrows():
    if 'Salary' in row:
        print(row['Salary'])
    else:
        print("Column 'Salary' not found.")

Output:

Column 'Salary' not found.
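Because every row shares the same columns, the check can also be done once against df.columns instead of once per row. The sketch below shows that variant, along with Series.get(), which returns a default value instead of raising. The variable names has_salary and salaries are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# Check once against df.columns rather than once per row
has_salary = 'Salary' in df.columns

# Series.get returns the given default instead of raising KeyError
salaries = [row.get('Salary', None) for _, row in df.iterrows()]

print(has_salary)
print(salaries)
```

Checking up front makes a missing column visible before the loop starts, while get() suits cases where a missing value is acceptable and a placeholder is enough.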

2. Access Columns Correctly Using iterrows()

Ensure you’re accessing the correct columns. If the column exists, access it by its actual name:

# Correct column access
for index, row in df.iterrows():
    print(row['Name'])  # Access 'Name' column correctly
    print(row['Age'])   # Access 'Age' column correctly

Output:

John
25
Alice
30
Bob
22
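If a name looks correct but still raises a KeyError, printing the column labels as a list can reveal differences that are hard to see, such as stray whitespace. The sketch below assumes a hypothetical DataFrame (df_raw) with a trailing space in one column name and strips it with rename(). It illustrates one possible cleanup, not the only one.

```python
import pandas as pd

# Hypothetical data with a trailing space in one column name
df_raw = pd.DataFrame({'Name ': ['John', 'Alice'], 'Age': [25, 30]})

# Printing the labels as a list makes the stray space visible
print(df_raw.columns.tolist())

# Strip surrounding whitespace from every column name
df_clean = df_raw.rename(columns=str.strip)

clean_names = [row['Name'] for _, row in df_clean.iterrows()]
print(clean_names)
```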

3. Using .loc for Row and Column Access

You can also use loc[] to read a single value by row label and column name. Note that loc[] raises a KeyError for a missing column just as row['...'] does, and calling it once per row is not faster than iterrows(), so the column name still has to be valid:

# Access each value with loc[], iterating over the index labels
for index in df.index:
    print(df.loc[index, 'Name'])  # Access 'Name' for the given row index

Output:

John
Alice
Bob
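When a loop only reads one column, it may be simpler to select that column once with loc[] and iterate over the resulting Series. This is a sketch under that assumption, reusing the article's example data. The variable name name_column is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# Select the column once; a missing name would raise KeyError here,
# before any looping starts
name_column = df.loc[:, 'Name']

# Series.items() yields (index label, value) pairs
for index, name in name_column.items():
    print(index, name)
```

With this layout a wrong column name fails at a single, obvious line rather than partway through the iteration.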

Summary

The KeyError in pandas when using iterrows() generally occurs because the code references a column that doesn’t exist or is misspelled, or treats the index as if it were a column. To resolve this error, ensure that the column names are correct and check that the column exists before accessing it. loc[] can also be used for row and column access, but it requires valid labels as well.
