In Pandas, accessing columns by name is one of the most common operations, and it is simple and effective when you know the exact name of the column you need. This article explores several ways to access columns by name in a Pandas DataFrame.
The most straightforward way to access a column by name is to use the bracket notation. This method allows you to retrieve a single column or a group of columns by their names.
import pandas as pd
# Sample DataFrame
data = {'Name': ['John', 'Alice', 'Bob'],
        'Age': [25, 30, 22],
        'Gender': ['Male', 'Female', 'Male']}
df = pd.DataFrame(data)
# Access the 'Age' column
age_column = df['Age']
print(age_column)
Output:
0 25
1 30
2 22
Name: Age, dtype: int64
In this example, we use bracket notation, df['Age'], to access the 'Age' column.
# Access 'Name' and 'Gender' columns
subset_columns = df[['Name', 'Gender']]
print(subset_columns)
Output:
Name Gender
0 John Male
1 Alice Female
2 Bob Male
In this example, we use bracket notation with a list of column names, df[['Name', 'Gender']], to access multiple columns.
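The two bracket forms return different types, which is easy to miss. The sketch below rebuilds the sample DataFrame and checks the result types; 'Salary' is a hypothetical column name used only to show what happens when a name is missing.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# A single name in brackets returns a Series
single = df['Age']
print(type(single).__name__)    # Series

# A list of names returns a DataFrame, even when the list has one name
as_frame = df[['Age']]
print(type(as_frame).__name__)  # DataFrame

# A name that is not in the DataFrame raises KeyError
# ('Salary' is a hypothetical column that does not exist here)
try:
    df['Salary']
    raised = False
except KeyError:
    raised = True
    print("Column 'Salary' not found")
```

If the rest of your code expects a DataFrame, the list form is the safer choice even for a single column.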
loc[]
loc[] is another way to access columns by name. While iloc[] is used for position-based selection, loc[] is label-based and allows you to select columns by their names. It's useful when you need to select specific rows or columns based on labels.
# Select 'Age' column using loc[]
age_column = df.loc[:, 'Age']
print(age_column)
Output:
0 25
1 30
2 22
Name: Age, dtype: int64
In this example, df.loc[:, 'Age'] selects all rows for the 'Age' column.
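Because loc[] takes a row selector and a column selector together, it can narrow both at once. The following is a minimal sketch on the same sample data; the age threshold of 24 is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# Rows labeled 0 through 1 (loc slices include the end label), two columns
subset = df.loc[0:1, ['Name', 'Age']]
print(subset)

# Rows where Age is above 24 (arbitrary threshold), only the 'Name' column
older_names = df.loc[df['Age'] > 24, 'Name']
print(older_names.tolist())  # ['John', 'Alice']
```

Note that a label slice in loc[] includes its end point, unlike a positional slice in iloc[].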
get() Method
The get() method can be used to access a column by its name, similar to bracket notation, but it has the advantage of returning None if the column doesn't exist, instead of raising an error.
# Use get() to access the 'Gender' column
gender_column = df.get('Gender')
print(gender_column)
Output:
0 Male
1 Female
2 Male
Name: Gender, dtype: object
In this example, df.get('Gender') retrieves the 'Gender' column. If the column does not exist, it will return None instead of raising an error.
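To see the missing-column behavior in practice, the sketch below asks for a column that is not in the sample data. 'Salary' is a hypothetical name, and the fallback Series of zeros is only an illustration of the optional default argument that get() accepts.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# 'Salary' is a hypothetical column that does not exist in df
missing = df.get('Salary')
print(missing)  # None

# get() also accepts a default to return instead of None
fallback = df.get('Salary', default=pd.Series([0, 0, 0], name='Salary'))
print(fallback.tolist())  # [0, 0, 0]
```

This makes get() a reasonable fit for optional columns, where a missing name is expected rather than a bug.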
columns Attribute
Another way to access columns by name is by using the columns attribute. This method allows you to first check the available column names and then access the column using either loc[] or bracket notation.
# Get the column name and access the column using bracket notation
column_name = df.columns[1] # Access column name at index position 1
column = df[column_name]
print(column)
Output:
0 25
1 30
2 22
Name: Age, dtype: int64
In this example, we first access the column name at position 1 using df.columns[1], and then we access that column using df[column_name].
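The columns attribute is also handy for checking that a name exists before using it. The sketch below assumes a requested list that contains one hypothetical missing name, 'Salary', and keeps only the names that are present.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# List the available column names
print(df.columns.tolist())  # ['Name', 'Age', 'Gender']

# Check for a name before accessing it
wanted = 'Age'
if wanted in df.columns:
    column = df[wanted]

# Keep only the requested names that exist ('Salary' is hypothetical)
requested = ['Name', 'Salary']
available = [name for name in requested if name in df.columns]
print(available)  # ['Name']
subset = df[available]
```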
at[] (For Single Value)
If you want to access a single value in a DataFrame, you can use at[], which is more efficient than loc[] for accessing a single cell. It is similar to iat[] but works with labels instead of integer positions.
# Access the value at the first row and 'Age' column
value = df.at[0, 'Age']
print(value)
Output:
25
In this example, we use df.at[0, 'Age'] to access the value in the 'Age' column for the row labeled 0, which is the first row under the default integer index.
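The difference between labels and positions is clearer when the row index is not the default 0, 1, 2. The sketch below assumes a hypothetical string index of lowercase names and compares at[] with iat[] on the same cell.

```python
import pandas as pd

# Hypothetical string index, used only to separate labels from positions
df = pd.DataFrame({'Age': [25, 30, 22]},
                  index=['john', 'alice', 'bob'])

# at[] looks up by row label and column name
by_label = df.at['alice', 'Age']

# iat[] looks up by integer row and column position
by_position = df.iat[1, 0]

print(by_label, by_position)  # 30 30

# at[] can also set a single value in place
df.at['bob', 'Age'] = 23
print(df.at['bob', 'Age'])  # 23
```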
Accessing columns by name in Pandas is a common task when working with data. You can use bracket notation, loc[], or the get() method to retrieve one or more columns by name. If you need to access specific rows, you can combine these methods with row indices or labels. Using these techniques allows for efficient data manipulation and analysis.