Normalize columns of a dataframe

How to Normalize Columns of a DataFrame in Pandas
Do you have a DataFrame in Pandas where each column has a different value range? Are you wondering how to normalize these columns so that each value is between 0 and 1? Well, you've come to the right place! In this blog post, we will look at a common pitfall when normalizing columns and walk through easy solutions using scikit-learn and pandas. Let's dive in!
The Problem
Let's consider the following DataFrame:
A B C
1000 10 0.5
765 5 0.35
800 7 0.09
The goal is to normalize the columns of this DataFrame so that each value falls within the range of 0 to 1. The desired output should be:
A B C
1 1 1
0.765 0.5 0.7
0.8 0.7 0.18 (which is 0.09/0.5)
The Solution
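The desired output above is each value divided by its column's maximum (for example, 0.18 = 0.09/0.5). Here is a minimal sketch of that scaling in plain pandas, assuming every column's maximum is positive; the name df_scaled is just an illustrative choice:

```python
import pandas as pd

# Same DataFrame as in the problem statement
df = pd.DataFrame({'A': [1000, 765, 800],
                   'B': [10, 5, 7],
                   'C': [0.5, 0.35, 0.09]})

# df.max() returns one maximum per column; the division is aligned on
# column labels, so every column is divided by its own maximum.
# Assumption: each column maximum is positive (no all-zero columns).
df_scaled = df / df.max()

print(df_scaled)
```

This reproduces the desired table up to floating-point rounding. It is not the same as min-max scaling, which the two options that follow use and which also shifts each column's minimum to 0.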
Option 1: Using MinMaxScaler
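One way to normalize the columns is by using the MinMaxScaler from the sklearn.preprocessing module. By default it rescales each column to the range 0 to 1 by computing (x - min) / (max - min), so the column's minimum maps to 0 and its maximum maps to 1. Note that this differs from the desired output above, which divides each value by the column maximum without shifting the minimum to 0.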
Here's how you can accomplish this:
from sklearn.preprocessing import MinMaxScaler
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'A': [1000, 765, 800],
                   'B': [10, 5, 7],
                   'C': [0.5, 0.35, 0.09]})
# Perform column normalization
scaler = MinMaxScaler()
df_normalized = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
# Print the normalized DataFrame
print(df_normalized)
This will give you the following output:
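          A    B         C
0  1.000000  1.0  1.000000
1  0.000000  0.0  0.634146
2  0.148936  0.4  0.000000
Each column's maximum is now 1 and its minimum is now 0, which is why these values differ from the desired output shown earlier.
Option 2: Using apply with a Lambda Function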
Another approach is to use the apply function with a lambda function to normalize the columns manually. This method gives you more flexibility if you want to customize the normalization logic.
Here's an example:
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'A': [1000, 765, 800],
                   'B': [10, 5, 7],
                   'C': [0.5, 0.35, 0.09]})
# Perform column normalization using apply and lambda function
df_normalized = df.apply(lambda x: (x - x.min()) / (x.max() - x.min()))
# Print the normalized DataFrame
print(df_normalized)
The output will be the same as the previous method:
          A    B         C
0  1.000000  1.0  1.000000
1  0.000000  0.0  0.634146
2  0.148936  0.4  0.000000
Conclusion
Normalizing columns in a DataFrame is a common task in data analysis and machine learning. In this blog post, we explored two easy ways to min-max normalize columns in Pandas: using MinMaxScaler from sklearn.preprocessing and applying a lambda function with apply. Both map each column's minimum to 0 and its maximum to 1. If you instead want each value divided by its column maximum, as in the desired output above, divide the DataFrame by df.max(). Feel free to explore these methods and choose the one that best fits your needs.
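As one illustration of the flexibility mentioned for Option 2, here is a hedged sketch of a custom function that guards against constant columns, where x.max() - x.min() is 0 and the lambda would produce NaN. The helper name min_max_safe, the extra column D, and the choice to map a constant column to zeros are all assumptions made for this example:

```python
import pandas as pd

def min_max_safe(col):
    """Min-max scale one column; a constant column becomes all zeros (assumed convention)."""
    span = col.max() - col.min()
    if span == 0:
        # Every value is identical, so (x - min) / span would be 0/0 -> NaN.
        return col * 0.0
    return (col - col.min()) / span

df = pd.DataFrame({'A': [1000, 765, 800],
                   'D': [3, 3, 3]})  # 'D' is a hypothetical constant column

# apply calls min_max_safe once per column
df_normalized = df.apply(min_max_safe)
print(df_normalized)
```

Column A is scaled exactly as in Option 2, while column D comes back as zeros instead of NaN. Whether zeros are the right choice for a constant column depends on your use case.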
If you found this blog post helpful, don't hesitate to share it with your friends! And if you have any questions or suggestions, we would love to hear from you in the comments below. Happy coding!



