How to Access a Column by Name in Pandas

In Pandas, accessing columns by name is one of the most common operations. When you know the exact column label, you can use it directly to retrieve the data. This article explores the different ways to access columns by name in a Pandas DataFrame.

Method 1: Access Column by Name Using Bracket Notation

The most straightforward way to access a column by name is to use the bracket notation. This method allows you to retrieve a single column or a group of columns by their names.

Example: Access Single Column by Name

import pandas as pd
# Sample DataFrame
data = {'Name': ['John', 'Alice', 'Bob'],
        'Age': [25, 30, 22],
        'Gender': ['Male', 'Female', 'Male']}
df = pd.DataFrame(data)
# Access the 'Age' column
age_column = df['Age']
print(age_column)

Output:

0    25
1    30
2    22
Name: Age, dtype: int64

In this example, we use bracket notation df['Age'] to access the ‘Age’ column.

Example: Access Multiple Columns by Name

# Access 'Name' and 'Gender' columns
subset_columns = df[['Name', 'Gender']]
print(subset_columns)

Output:

    Name  Gender
0   John    Male
1  Alice  Female
2    Bob    Male

In this example, we use bracket notation with a list of column names df[['Name', 'Gender']] to access multiple columns.
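
The difference between single and double brackets affects the return type, which is easy to miss. The short sketch below rebuilds the sample DataFrame so it runs on its own; the variable names single and as_frame are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

single = df['Age']       # one label -> Series
as_frame = df[['Age']]   # list containing one label -> one-column DataFrame

print(type(single).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame
print(as_frame.shape)           # (3, 1)
```

Use the list form when later code expects a DataFrame, even if only one column is selected.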

Method 2: Access Column by Name Using loc[]

loc[] is another way to access columns by name. While iloc[] is used for position-based selection, loc[] is label-based and allows you to select columns by their names. It’s useful when you need to select specific rows or columns based on labels.

Example: Access Column Using loc[]

# Select 'Age' column using loc[]
age_column = df.loc[:, 'Age']
print(age_column)

Output:

0    25
1    30
2    22
Name: Age, dtype: int64

In this example, df.loc[:, 'Age'] selects all rows for the ‘Age’ column.
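
Because loc[] takes a row selector and a column selector together, it can filter rows and pick columns by name in one step. The following is a minimal sketch on the same sample data; the threshold of 24 and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# Rows where Age is greater than 24, restricted to two named columns
older = df.loc[df['Age'] > 24, ['Name', 'Age']]
print(older)

# Label slices in loc[] include both endpoints
name_to_age = df.loc[:, 'Name':'Age']
print(name_to_age.columns.tolist())  # ['Name', 'Age']
```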

Method 3: Access Column Using get() Method

The get() method accesses a column by its name, similar to bracket notation, but instead of raising a KeyError when the column doesn’t exist, it returns None (or a default value that you pass as the second argument).

Example: Access Column Using get()

# Use get() to access the 'Gender' column
gender_column = df.get('Gender')
print(gender_column)

Output:

0      Male
1    Female
2      Male
Name: Gender, dtype: object

In this example, df.get('Gender') retrieves the ‘Gender’ column. If the column does not exist, it will return None instead of raising an error.
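
To see the missing-column behavior, the sketch below asks for a 'Salary' column that is not in the DataFrame. The column name 'Salary' and the fallback value 'n/a' are assumptions used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22]})

salary = df.get('Salary')                 # column is absent -> None
salary_or_flag = df.get('Salary', 'n/a')  # second argument is the fallback value

print(salary)          # None
print(salary_or_flag)  # n/a

# Typical pattern: branch on the result instead of catching KeyError
if salary is None:
    print("No 'Salary' column in this DataFrame")
```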

Method 4: Access Column Using columns Attribute

Another way to access columns by name is by using the columns attribute. This method allows you to first check the available column names and then access the column using either loc[] or bracket notation.

Example: Access Column Using columns Attribute

# Look up the column name by position, then select it with bracket notation
column_name = df.columns[1]  # Label at index position 1 ('Age')
column = df[column_name]
print(column)

Output:

0    25
1    30
2    22
Name: Age, dtype: int64

In this example, we first access the column name at position 1 using df.columns[1], and then we access that column using df[column_name].
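
The columns attribute is a Pandas Index, so it can also list every label or translate a name back into its position. This is a small sketch of both directions, assuming the same sample data and unique column names.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

# All column labels as a plain Python list
labels = df.columns.tolist()
print(labels)  # ['Name', 'Age', 'Gender']

# Name -> position (an integer when the labels are unique)
position = df.columns.get_loc('Age')
print(position)  # 1

# Position -> name -> data, here for the last column
last_name = df.columns[-1]
print(df[last_name].tolist())  # ['Male', 'Female', 'Male']
```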

Method 5: Access Column Using at[] (For Single Value)

If you want to access a single value in a DataFrame, you can use at[], which is typically faster than loc[] for reading or writing one cell. It is the label-based counterpart of iat[], which uses integer positions.

Example: Access a Value Using at[]

# Access the value at the first row and 'Age' column
value = df.at[0, 'Age']
print(value)

Output:

25

In this example, df.at[0, 'Age'] returns the value in the ‘Age’ column for the row labeled 0, which is the first row here because the DataFrame uses the default integer index.
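
The label-versus-position distinction is clearer when the row index is not the default 0, 1, 2. The sketch below uses the 'Name' column as the index, which is an assumption made only for this illustration, and then reads the same cell with at[] and iat[].

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob'],
                   'Age': [25, 30, 22],
                   'Gender': ['Male', 'Female', 'Male']})

by_name = df.set_index('Name')  # remaining columns: Age, Gender

by_label = by_name.at['Alice', 'Age']  # row label, column label
by_position = by_name.iat[1, 0]        # second row, first column
print(by_label, by_position)           # 30 30

# at[] can also assign a single cell
by_name.at['Alice', 'Age'] = 31
print(by_name.at['Alice', 'Age'])      # 31
```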

Methods to access columns by name in Pandas

  • Using bracket notation (df['column_name'])
  • Using loc[] for label-based selection
  • Using get() to avoid errors when a column doesn’t exist
  • Using columns to look up a column name by its index position
  • Using at[] for accessing a single value from a specific column

Summary

Accessing columns by name in Pandas is a common task when working with data. You can easily use bracket notation, loc[], or the get() method to retrieve a single or multiple columns by name. If you need to access specific rows, you can combine these methods with row indices or labels. Using these techniques allows for efficient data manipulation and analysis.

Frequently Asked Questions: How to Access a Column by Name in Pandas

How do I access a column by name in Pandas?

You can access a column by name using square brackets [] with the column label:

import pandas as pd
df = pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]})
print(df['Name'])

This returns the Name column as a Pandas Series.

Can I access multiple columns by name?

Yes. Pass a list of column names inside double brackets:

df[['Name', 'Age']]

This returns a new DataFrame with the selected columns.

How to access a column using dot notation in Pandas?

If the column name is a valid Python identifier (no spaces or special characters), you can use dot notation:

df.Name

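However, it’s safer to use df['Name']. Dot access does not work for names that contain spaces, and a name that matches an existing DataFrame attribute or method (such as count) returns that attribute instead of the column.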
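
The following sketch shows where dot notation goes wrong. The column names 'count' and 'First Name' are assumptions picked to trigger the two common problems.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'],
                   'count': [1, 2],
                   'First Name': ['x', 'y']})

print(df.Name.tolist())            # ['A', 'B']  (dot access works here)

# 'count' is also a DataFrame method, so the dot form returns the method
print(callable(df.count))          # True
print(df['count'].tolist())        # [1, 2]  (brackets return the column)

# A name with a space cannot be written with dot notation at all
print(df['First Name'].tolist())   # ['x', 'y']
```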

How to access a column name stored in a variable?

Use a variable instead of a hardcoded string:

col = 'Age'
print(df[col])

How do I check if a column name exists before accessing it?

Use the in keyword:

if 'Salary' in df.columns:
    print(df['Salary'])

This prevents KeyError if the column doesn’t exist.
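
When several columns are needed at once, the same membership test can be applied to each name. This is one possible pattern rather than a built-in Pandas feature; the list name required and the column 'Salary' are assumptions for the example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]})

required = ['Name', 'Salary']

# Collect every required name that is not a column label
missing = [col for col in required if col not in df.columns]
print(missing)  # ['Salary']

# Select only the required columns that are actually present
present = [col for col in required if col in df.columns]
print(df[present].columns.tolist())  # ['Name']
```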

How to access column values as a list?

Convert the column Series to a list using tolist():

names = df['Name'].tolist()

This gives a standard Python list of values.

How to access a column by name using loc?

Use the label-based indexer loc:

df.loc[:, 'Age']

This is equivalent to df['Age'] but supports advanced label selection.

How to access multiple columns by name using loc?

Provide a list of column names:

df.loc[:, ['Name', 'Age']]

How to rename or change column names in Pandas?

Use the rename() method:

df.rename(columns={'Name': 'Full_Name'}, inplace=True)

This updates the column name in place.
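
If you would rather keep the original DataFrame unchanged, rename() without inplace returns a new DataFrame. A minimal sketch, with renamed as an illustrative variable name:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]})

# Without inplace=True, rename() returns a renamed copy
renamed = df.rename(columns={'Name': 'Full_Name'})

print(renamed.columns.tolist())  # ['Full_Name', 'Age']
print(df.columns.tolist())       # ['Name', 'Age']  (original is untouched)
```

After renaming, access the column with the new label, for example renamed['Full_Name'].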

How to access columns dynamically inside a loop?

You can iterate through column names using df.columns and access them inside the loop:

for col in df.columns:
    print(col, df[col].head())
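
As an alternative sketch, DataFrame.items() yields each column name together with its Series, which avoids the extra lookup inside the loop. The summary dictionary below is an assumption added for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'], 'Age': [20, 25]})

# items() yields (column name, column Series) pairs
summary = {}
for name, series in df.items():
    summary[name] = series.nunique()  # number of distinct values per column

print(summary)  # {'Name': 2, 'Age': 2}
```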