Pandas is a popular Python library for data manipulation and analysis. One common task is calculating the percentage of a column, that is, each value's share of the column total expressed as a percentage. In this article, we will explore different ways to calculate the percentage of a column in a Pandas dataframe.
Method 1: Using the apply() Method
The apply() method in Pandas applies a function along the rows or columns of a dataframe or, when called on a single column (a Series), to each value in that column. To calculate the percentage of a column using the apply() method, we first write a function that calculates the percentage for a single value. We then apply this function to every value in the column.
Here is the code to calculate the percentage of a column using the apply() method:
import pandas as pd
# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'],
'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)
# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].apply(lambda x: (x / total) * 100)
print(df)
Output:
name score percentage
0 John 80 24.242424
1 Emma 75 22.727273
2 Kate 90 27.272727
3 Josh 85 25.757576
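As a quick sanity check on the calculation above, the shares of a column should add up to 100, and the apply() result should match plain vectorized arithmetic. The following is a minimal sketch using the same sample data; the variable names via_apply and via_vector are illustrative only.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'name': ['John', 'Emma', 'Kate', 'Josh'],
                   'score': [80, 75, 90, 85]})

total = df['score'].sum()

# Element-by-element version using apply()
via_apply = df['score'].apply(lambda x: x / total * 100)

# Vectorized version: the same arithmetic on the whole column at once
via_vector = df['score'] / total * 100

# The two results should agree, and the shares should add up to 100
print((via_apply - via_vector).abs().max())
print(round(via_vector.sum(), 6))
```

On large columns the vectorized form is usually the faster of the two, because it avoids calling a Python function once per value.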
Method 2: Using the div() Method
The div() method in Pandas divides a Series or dataframe element-wise by a scalar or by another Series or dataframe. To calculate the percentage of a column using the div() method, we divide the column by the sum of all its values and then multiply the result by 100 to get the percentage.
Here is the code to calculate the percentage of a column using the div() method:
import pandas as pd
# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'],
'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)
# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].div(total).mul(100)
print(df)
Output:
name score percentage
0 John 80 24.242424
1 Emma 75 22.727273
2 Kate 90 27.272727
3 Josh 85 25.757576
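Because div() also accepts a Series and an axis argument, the same idea can express each value as a share of its row total rather than its column total. The sketch below assumes a hypothetical dataframe with two score columns, math and art, which are not part of the example above.

```python
import pandas as pd

# Hypothetical data: two score columns per student
df = pd.DataFrame({'math': [80, 60], 'art': [20, 40]},
                  index=['John', 'Emma'])

# One total per row
row_totals = df.sum(axis=1)

# axis=0 aligns row_totals with the dataframe's index,
# so every row is divided by its own total
row_share = df.div(row_totals, axis=0).mul(100)
print(row_share)
```

Each row of row_share should add up to 100, since every value is divided by the total of its own row.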
Method 3: Using the sum() Method
The sum() method in Pandas calculates the sum of a column or row. To calculate the percentage of a column using the sum() method, we first calculate the sum of the column. We then divide each value in the column by that sum, multiply by 100 to get the percentage, and assign the result to a new column:
import pandas as pd
# create example dataframe
data = {'A': [1, 2, 3, 4, 5],
'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
# calculate percentage and create new column
total = df['B'].sum()
df['B_Percentage'] = df['B'] / total * 100
print(df)
Output:
A B B_Percentage
0 1 10 6.666667
1 2 20 13.333333
2 3 30 20.000000
3 4 40 26.666667
4 5 50 33.333333
In this example, we first create a DataFrame with two columns A and B. We then calculate the total of column B using the sum() method. Next, we divide each value in the B column by the total and multiply it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
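The steps described above can be wrapped in a small reusable function. This is only a sketch: the name percent_of_total is hypothetical, and returning NaN when the column total is zero is an assumption about the desired behavior, not something the article prescribes.

```python
import pandas as pd

def percent_of_total(series: pd.Series) -> pd.Series:
    """Return each value's share of the series total, in percent."""
    total = series.sum()
    if total == 0:
        # Assumption: report missing values instead of dividing by zero
        return pd.Series(float('nan'), index=series.index, dtype='float64')
    return series / total * 100

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [10, 20, 30, 40, 50]})
df['B_Percentage'] = percent_of_total(df['B'])
print(df)
```

Keeping the calculation in one place makes it easier to apply the same rule to several columns.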
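Method 4: Using the apply() Method with an Inline sum() Call
Another way to calculate the percentage of a column is to call sum() directly inside the lambda function passed to apply(). This is compact, but it recomputes the column total for every row, so it is slower than Method 1 on large columns. Here’s an example: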
import pandas as pd
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
# calculate percentage using apply() method and lambda function
df['B_Percentage'] = df['B'].apply(lambda x: (x / df['B'].sum()) * 100)
print(df)
Output:
A B B_Percentage
0 1 10 6.666667
1 2 20 13.333333
2 3 30 20.000000
3 4 40 26.666667
4 5 50 33.333333
In this example, we use the apply() method to apply a lambda function to each value in the B column. The lambda function divides each value by the sum of the B column and multiplies it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
Method 5: Using the mul() Method
The mul() method in Pandas allows us to multiply each element in a column by a given value. We can use this method to multiply each value in the column by 100 and then divide by the sum of the column.
import pandas as pd
# create sample dataframe
df = pd.DataFrame({ 'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
'Score': [70, 80, 90, 85, 75] })
# calculate percentage using mul() method
df['Percentage'] = df['Score'].mul(100).div(df['Score'].sum())
# print dataframe
print(df)
Output:
Name Score Percentage
0 Alice 70 17.50
1 Bob 80 20.00
2 Charlie 90 22.50
3 David 85 21.25
4 Emily 75 18.75
Frequently Asked Questions — How to Calculate the Percentage of a Column in Pandas
How do I calculate the percentage of a column in Pandas?
Divide the column by its total and multiply by 100:
df['percentage'] = (df['col'] / df['col'].sum()) * 100
How do I calculate the percentage of each row in a specific column?
Use vectorized division to find each row’s share of the column total:
df['col_percent'] = df['col'] / df['col'].sum() * 100
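How do I calculate the percentage for all numeric columns in a DataFrame?
Select the numeric columns first, then use apply() with a lambda to divide each column by its sum (applying this to text columns would raise an error):
df_percent = df.select_dtypes(include='number').apply(lambda x: (x / x.sum()) * 100)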
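As an illustration of the answer above, the sketch below uses a hypothetical dataframe with one text column and two numeric columns. The text column is filtered out with select_dtypes() before the division, since dividing strings by a sum is not meaningful.

```python
import pandas as pd

# Hypothetical data: 'name' is text, 'x' and 'y' are numeric
df = pd.DataFrame({'name': ['a', 'b'],
                   'x': [1, 3],
                   'y': [10, 30]})

# Keep only the numeric columns
numeric = df.select_dtypes(include='number')

# Divide each numeric column by its own total
df_percent = numeric.apply(lambda col: col / col.sum() * 100)
print(df_percent)
```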
How do I calculate the percentage of a column group by another column?
Use groupby() with transform('sum') for relative percentages:
df['perc'] = df['value'] / df.groupby('category')['value'].transform('sum') * 100
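A small worked example may make the grouped calculation easier to follow. The column names category and value match the snippet above; the data itself is hypothetical.

```python
import pandas as pd

# Hypothetical data with two categories
df = pd.DataFrame({'category': ['a', 'a', 'b', 'b'],
                   'value': [10, 30, 5, 15]})

# transform('sum') returns one total per row, aligned with the original index
group_total = df.groupby('category')['value'].transform('sum')

# Each value as a share of its own category's total
df['perc'] = df['value'] / group_total * 100
print(df)
```

Within each category the perc values add up to 100, because every row is divided by the total of its own group rather than the total of the whole column.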
How to show percentages with two decimal points in Pandas?
Use round() or formatting:
df['perc'] = ((df['col'] / df['col'].sum()) * 100).round(2)
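The difference between rounding and formatting can be sketched as follows: round() keeps the column numeric, while string formatting produces text labels that are meant for display only. The column names perc_rounded and perc_label are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3]})
perc = df['col'] / df['col'].sum() * 100

# Rounded values stay numeric and can still be used in calculations
df['perc_rounded'] = perc.round(2)

# Formatted values become strings such as '16.67%'
df['perc_label'] = perc.map('{:.2f}%'.format)
print(df)
```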
How to calculate cumulative percentage in Pandas?
If the order matters, sort the dataframe first (for example with sort_values()), then divide cumsum() by the total:
df['cum_perc'] = df['col'].cumsum() / df['col'].sum() * 100
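The following sketch shows the cumulative calculation including an explicit sort step. Sorting in descending order is an assumption made here for illustration; the data and the item column are hypothetical.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({'item': ['x', 'y', 'z'],
                   'col': [20, 50, 30]})

# Sort from largest to smallest so the running share starts with the biggest item
df = df.sort_values('col', ascending=False).reset_index(drop=True)

# Running total divided by the overall total
df['cum_perc'] = df['col'].cumsum() / df['col'].sum() * 100
print(df)
```

The last row of cum_perc should always reach 100, whatever the sort order.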
How to calculate percentage change between rows?
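Use the pct_change() method, which returns the change relative to the previous row as a fraction, and multiply by 100. The first row is NaN because it has no previous value: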
df['perc_change'] = df['col'].pct_change() * 100
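A short worked example, using a hypothetical three-row column, shows what pct_change() produces.

```python
import pandas as pd

# Hypothetical values: up by 10 percent, then down by 10 percent
df = pd.DataFrame({'col': [100, 110, 99]})

# pct_change() compares each row with the one before it
df['perc_change'] = df['col'].pct_change() * 100
print(df)
```

The first row has no earlier row to compare against, so its result is missing.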
How to convert a percentage column back to decimal format?
Divide by 100:
df['decimal'] = df['perc'] / 100
Can I calculate percentages ignoring NaN values?
Yes. sum() skips NaN values by default (skipna=True), so the total is computed from the non-missing values only. Rows whose value is NaN still get NaN as their percentage.
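The behavior with missing values can be sketched with a hypothetical column that contains one NaN.

```python
import pandas as pd

# Hypothetical data with one missing value
df = pd.DataFrame({'col': [10.0, float('nan'), 30.0]})

# Default behavior: the NaN is skipped, so the total is 40.0
total = df['col'].sum()

# With skipna=False the missing value propagates and the total is NaN
strict_total = df['col'].sum(skipna=False)

# The missing row stays NaN; the other rows share 100 percent between them
df['perc'] = df['col'] / total * 100
print(df)
```

If missing values should count as zero instead, one option is to call fillna(0) on the column before the division.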
What’s the best way to calculate and visualize percentages in Pandas?
Calculate the percentage column first, then use df.plot(kind='bar', y='perc') to draw a quick bar chart of it. Plotting from Pandas requires matplotlib to be installed.