Pandas: How to Access a Column Using iterrows()
In Pandas, iterrows() is commonly used to iterate over the rows of a DataFrame as (index, Series) pairs. During iteration, you can access specific columns of the DataFrame by referencing them within the loop. In this article, we’ll show how to access a column in Pandas using iterrows() and provide examples for better understanding.
Accessing Columns with iterrows()
To access a specific column in each row during iteration, you can reference the column name from the row object, which is a Pandas Series. Here’s an example of how to access a column using iterrows().
Example: Accessing a Column Using iterrows()
Consider a DataFrame where we want to access the “Name” and “Age” columns using iterrows().
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# Accessing columns using iterrows
for index, row in df.iterrows():
    # row is a Series indexed by column name
    print(f"Name: {row['Name']}, Age: {row['Age']}")
Output:
Name: John, Age: 25
Name: Alice, Age: 30
Name: Bob, Age: 22
In this example, the loop iterates over each row, and within each iteration, we access the “Name” and “Age” columns using the column names inside the row object. The column values are printed for each row.
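To make the role of the row object more concrete, here is a minimal sketch that collects values instead of printing them. It reuses the sample DataFrame from the example above; the `labels` list is a name introduced here purely for illustration.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

labels = []  # hypothetical container for the values we read
for index, row in df.iterrows():
    # Each row is a pandas Series whose index holds the column names
    assert isinstance(row, pd.Series)
    # Bracket access with the column name returns that row's value
    labels.append(f"{row['Name']} ({row['Age']})")

print(labels)
```

Collecting into a list like this is often handier than printing when the values are needed later in the program.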
Why Use iterrows() to Access Columns?
iterrows() is helpful when you need to iterate row by row and perform specific operations or access multiple columns for each row. Note, however, that iterrows() is slow on large DataFrames because it builds a new Series for every row, so vectorized operations are preferred where performance matters. Because each row is returned as a Series, values in a row may also be converted to a common dtype, so column dtypes are not guaranteed to be preserved. For small to medium-sized datasets, or for row-wise logic that is hard to vectorize, it can still be quite useful.
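For comparison, the following is a rough sketch of what a vectorized version of the same column access might look like. It assumes the sample DataFrame from the earlier example; the names `age_next_year` and `summary` are illustrative and not part of the original article.

```python
import pandas as pd

# Same sample data as in the earlier example
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# Arithmetic on a whole column at once, with no explicit Python loop
age_next_year = df['Age'] + 1

# Build one string per row by combining whole columns
summary = "Name: " + df['Name'] + ", Age: " + df['Age'].astype(str)

print(age_next_year.tolist())
print(summary.tolist())
```

Operating on entire columns lets Pandas do the work internally rather than constructing a Series for each row, which is usually where the performance difference comes from.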
Alternative: Using apply() for Column Access
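While iterrows() works fine for small DataFrames, it can be inefficient for larger ones. An alternative for row-wise column access is the apply() method with axis=1. Note that apply() with axis=1 is not truly vectorized: it still calls your function once per row in Python, so any speedup over iterrows() tends to be modest. Here’s how you can achieve the same result using apply():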
# Using apply to access columns
# print() returns None, so the Series returned by apply() is discarded
_ = df.apply(lambda row: print(f"Name: {row['Name']}, Age: {row['Age']}"), axis=1)
Output:
Name: John, Age: 25
Name: Alice, Age: 30
Name: Bob, Age: 22
In this example, apply() calls the function once for each row (set with axis=1), and the function reads the “Name” and “Age” columns from that row.
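In practice, apply() is more often used to return a value per row than to print. The sketch below, again assuming the sample DataFrame from the first example, returns a formatted string for each row; `summary` is an illustrative name, not something from the original article.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [25, 30, 22],
    'Gender': ['Male', 'Female', 'Male']
})

# The function's return value for each row becomes one element of the result
summary = df.apply(
    lambda row: f"Name: {row['Name']}, Age: {row['Age']}",
    axis=1,
)

# summary is a Series aligned with df's index
for line in summary:
    print(line)
```

Returning values keeps the result available as a Series, which could then be assigned to a new column if needed.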
Summary
Using iterrows() in Pandas allows you to iterate over rows and access specific columns easily. You can read column values from the row object by column name, which is useful for row-based operations. For better performance with large DataFrames, prefer vectorized operations where possible, and keep in mind that apply() with axis=1 is still a row-by-row approach.