Pandas DataFrame – How to Rename the Index

Mastering Index Renaming in Pandas DataFrame

The index in a Pandas DataFrame plays a critical role in structuring and accessing your data. While Pandas assigns a default sequential integer index to rows, customizing or renaming the index can make your DataFrame more descriptive and better aligned with the context of your analysis.
In this tutorial, we’ll explore various methods to rename, reset, and manipulate the index, with examples to demonstrate their usage.

Sample DataFrame

We will use the following DataFrame throughout this article:

import pandas as pd

# Sample DataFrame
data = {
    'Employee ID': ['E001', 'E002', 'E003', 'E004', 'E005'],
    'Name': ['John Doe', 'Alice Smith', 'Bob Johnson', 'Eve Davis', 'Charlie Brown'],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'Operations'],
    'Age': [28, 34, 29, 42, 31],
    'Salary': [50000, 60000, 52000, 75000, 49000]
}

df = pd.DataFrame(data)
print(df)

Output:

  Employee ID           Name  Department  Age  Salary
0        E001       John Doe          HR   28   50000
1        E002    Alice Smith          IT   34   60000
2        E003    Bob Johnson     Finance   29   52000
3        E004      Eve Davis   Marketing   42   75000
4        E005  Charlie Brown  Operations   31   49000
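
Before changing anything, it can help to look at the default index itself. The snippet below is a minimal sketch that rebuilds a trimmed copy of the sample data (two columns only, an assumption made for brevity) and inspects its index.

```python
import pandas as pd

# Trimmed copy of the sample data, two columns only for brevity
df = pd.DataFrame({
    'Employee ID': ['E001', 'E002', 'E003', 'E004', 'E005'],
    'Age': [28, 34, 29, 42, 31],
})

print(df.index)        # RangeIndex(start=0, stop=5, step=1)
print(df.index.name)   # None: the default index has no label yet
```

The default index is a RangeIndex with no name, which is why no label appears above the row numbers in the output.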

1. Renaming the Index Axis Using rename_axis()

The rename_axis() method assigns a name to the index axis without changing the index values.
It returns a new DataFrame and leaves df untouched, which is useful when you want to label what the row index represents.

# Rename the index axis to 'Record ID'
df_renamed_axis = df.rename_axis('Record ID')
print(df_renamed_axis)

Output:

          Employee ID           Name  Department  Age  Salary
Record ID
0                E001       John Doe          HR   28   50000
1                E002    Alice Smith          IT   34   60000
2                E003    Bob Johnson     Finance   29   52000
3                E004      Eve Davis   Marketing   42   75000
4                E005  Charlie Brown  Operations   31   49000
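
As a rough illustration of how rename_axis() behaves, the sketch below uses a small two-row frame. The variable names `labeled` and `both` and the column-axis label 'Field' are hypothetical and chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'Employee ID': ['E001', 'E002'], 'Age': [28, 34]})

# Name the row index only
labeled = df.rename_axis('Record ID')
# Name the row index and the column axis in one call
both = df.rename_axis(index='Record ID', columns='Field')

print(labeled.index.name)   # Record ID
print(both.columns.name)    # Field
print(df.index.name)        # None: the original DataFrame is untouched
```

Because rename_axis() returns a new object, it fits well in method chains where the original DataFrame should stay as it is.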

2. Renaming Index Values Using set_index()

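You can replace the default integer index with the values from a specific column using the set_index() method. The column is moved out of the data and becomes the index, and its name becomes the index name. A new DataFrame is returned, so df itself is unchanged.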

# Set 'Employee ID' as the new index
df_with_new_index = df.set_index('Employee ID')
print(df_with_new_index)

Output:

                      Name  Department  Age  Salary
Employee ID
E001              John Doe          HR   28   50000
E002           Alice Smith          IT   34   60000
E003           Bob Johnson     Finance   29   52000
E004             Eve Davis   Marketing   42   75000
E005         Charlie Brown  Operations   31   49000
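
One practical effect of indexing by 'Employee ID' is that rows can be looked up by that label. Here is a minimal sketch on a three-row subset of the sample data; `by_id` is a hypothetical variable name.

```python
import pandas as pd

df = pd.DataFrame({
    'Employee ID': ['E001', 'E002', 'E003'],
    'Name': ['John Doe', 'Alice Smith', 'Bob Johnson'],
    'Age': [28, 34, 29],
})
by_id = df.set_index('Employee ID')

print(by_id.loc['E003', 'Name'])        # Bob Johnson
print(by_id.index.name)                 # Employee ID
print('Employee ID' in by_id.columns)   # False: the column moved into the index
```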

Example: Multiple Column Index

You can set multiple columns as a multi-level index:

# Set 'Department' and 'Name' as a multi-level index
df_multi_index = df.set_index(['Department', 'Name'])
print(df_multi_index)

Output:

                         Employee ID  Age  Salary
Department Name
HR         John Doe             E001   28   50000
IT         Alice Smith          E002   34   60000
Finance    Bob Johnson          E003   29   52000
Marketing  Eve Davis            E004   42   75000
Operations Charlie Brown        E005   31   49000
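
The level names of a multi-level index can be renamed as well. The following is a sketch on a two-row subset, assuming the new names 'Dept' and 'Employee' purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John Doe', 'Alice Smith'],
    'Department': ['HR', 'IT'],
    'Age': [28, 34],
})
df_multi = df.set_index(['Department', 'Name'])

# Rename one level by its current name; other levels keep theirs
renamed_one = df_multi.rename_axis(index={'Name': 'Employee'})
# Or supply a full list, one entry per level
renamed_all = df_multi.rename_axis(['Dept', 'Employee'])

print(list(renamed_one.index.names))   # ['Department', 'Employee']
print(list(renamed_all.index.names))   # ['Dept', 'Employee']
```

The dictionary form is convenient when only one level needs a new name, while the list form replaces every level name at once.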

3. Renaming the Index In-Place Using index.name

If you want to rename the index axis without creating a new DataFrame, assign to the index.name attribute. This changes df itself, so the label stays attached in any later step that starts from df.

# Rename the index in place
df.index.name = 'Record ID'
print(df)

Output:

          Employee ID           Name  Department  Age  Salary
Record ID
0                E001       John Doe          HR   28   50000
1                E002    Alice Smith          IT   34   60000
2                E003    Bob Johnson     Finance   29   52000
3                E004      Eve Davis   Marketing   42   75000
4                E005  Charlie Brown  Operations   31   49000
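
A related option is Index.rename(), which returns a renamed copy of the index instead of modifying it. The sketch below uses a one-column frame; `named` is a hypothetical variable name.

```python
import pandas as pd

df = pd.DataFrame({'Age': [28, 34, 29]})

# Index.rename returns a new Index and leaves df alone
named = df.index.rename('Record ID')
print(df.index.name)   # None

# Assign the renamed index back to apply the label
df.index = named
print(df.index.name)   # Record ID
```

This form is handy when the renamed index should be built first and attached later, or reused on another object.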

4. Resetting the Index Using reset_index()

The reset_index() method restores the default integer index. By default the previous index is kept as a regular column; pass drop=True to discard it instead.

# Set 'Employee ID' as the index and reset it
df_with_new_index = df.set_index('Employee ID')
df_reset_index = df_with_new_index.reset_index()
print(df_reset_index)

Output:

  Employee ID           Name  Department  Age  Salary
0        E001       John Doe          HR   28   50000
1        E002    Alice Smith          IT   34   60000
2        E003    Bob Johnson     Finance   29   52000
3        E004      Eve Davis   Marketing   42   75000
4        E005  Charlie Brown  Operations   31   49000
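
To show what happens to the old index, here is a small sketch comparing the default behavior with drop=True. The names `by_id`, `kept`, and `dropped` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Employee ID': ['E001', 'E002', 'E003'],
    'Age': [28, 34, 29],
})
by_id = df.set_index('Employee ID')

kept = by_id.reset_index()               # 'Employee ID' returns as a column
dropped = by_id.reset_index(drop=True)   # 'Employee ID' is discarded

print(list(kept.columns))      # ['Employee ID', 'Age']
print(list(dropped.columns))   # ['Age']
print(list(dropped.index))     # [0, 1, 2]
```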

5. Renaming Index Values Directly

You can replace the index values directly by assigning a new list of labels to the index attribute of the DataFrame. The list must contain exactly one label per row.

# Rename index values directly
df_renamed_index_values = df.copy()
df_renamed_index_values.index = ['Row1', 'Row2', 'Row3', 'Row4', 'Row5']
print(df_renamed_index_values)

Output:

     Employee ID           Name  Department  Age  Salary
Row1        E001       John Doe          HR   28   50000
Row2        E002    Alice Smith          IT   34   60000
Row3        E003    Bob Johnson     Finance   29   52000
Row4        E004      Eve Davis   Marketing   42   75000
Row5        E005  Charlie Brown  Operations   31   49000
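
Direct assignment only works when the number of labels matches the number of rows. The sketch below generates the labels from the frame's length and then shows the error pandas raises for a list that is too short; the 'Row' prefix simply mirrors the example above.

```python
import pandas as pd

df = pd.DataFrame({'Age': [28, 34, 29, 42, 31]})

# Build one label per row so the lengths always match
df.index = [f'Row{i}' for i in range(1, len(df) + 1)]
print(list(df.index))   # ['Row1', 'Row2', 'Row3', 'Row4', 'Row5']

# A list of the wrong length is rejected with a ValueError
try:
    df.index = ['Row1', 'Row2']
except ValueError as exc:
    print('ValueError:', exc)
```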

6. Renaming Columns and Index Together

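Using the rename() method, you can rename column labels and index labels in a single call. Each mapping only affects the labels it lists; any label missing from the mapping is left unchanged.

# Rename columns and index together (all five index labels are mapped)
df_renamed = df.rename(
    columns={'Name': 'Employee Name', 'Salary': 'Monthly Salary'},
    index={0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E'}
)
print(df_renamed)

Output:

          Employee ID  Employee Name  Department  Age  Monthly Salary
Record ID
A                E001       John Doe          HR   28           50000
B                E002    Alice Smith          IT   34           60000
C                E003    Bob Johnson     Finance   29           52000
D                E004      Eve Davis   Marketing   42           75000
E                E005  Charlie Brown  Operations   31           49000

The 'Record ID' label appears because section 3 set df.index.name in place, and rename() preserves the index name.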
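
Two details of rename() are easy to miss: a mapping that skips some labels leaves those labels as they are, and a function can be passed instead of a dictionary. This is a minimal sketch on a one-column frame, with `partial` and `shifted` as hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({'Age': [28, 34, 29, 42, 31]})

# Only labels 0, 1 and 2 are mapped; 3 and 4 stay unchanged
partial = df.rename(index={0: 'A', 1: 'B', 2: 'C'})
print(list(partial.index))   # ['A', 'B', 'C', 3, 4]

# A function is applied to every label
shifted = df.rename(index=lambda i: f'Row{i + 1}')
print(list(shifted.index))   # ['Row1', 'Row2', 'Row3', 'Row4', 'Row5']
```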

Summary

Renaming the index and columns of a Pandas DataFrame provides flexibility and clarity in data analysis.
Here’s a quick summary of the methods:

  • rename_axis(): Assign a descriptive label to the index axis.
  • set_index(): Replace the default index with column values or create a multi-level index.
  • index.name: Rename the index axis in place.
  • reset_index(): Revert to the default integer index.
  • Direct Index Renaming: Modify index values directly.
  • rename(): Rename columns and index simultaneously.

With these techniques, you can handle DataFrame indices effectively for any data analysis task.

Reference: https://pandas.pydata.org/docs/reference/api/pandas.Index.rename.html
