How to Use MinMaxScaler in sklearn with Python
Use MinMaxScaler from sklearn.preprocessing to scale features to a given range, usually 0 to 1. Fit the scaler on your training data with fit() or fit_transform(), then apply it to other data with transform().
Syntax
MinMaxScaler scales each feature independently to a given range, which defaults to 0 to 1. You create an instance, optionally set feature_range, then fit it to your data and transform the data.
- MinMaxScaler(feature_range=(min, max)): creates the scaler with the desired output range.
- fit(X): computes the per-feature min and max values from data X.
- transform(X): scales data X using the fitted min and max.
- fit_transform(X): fits and transforms in one step.
python
from sklearn.preprocessing import MinMaxScaler

# X is a 2D array-like of shape (n_samples, n_features)
scaler = MinMaxScaler(feature_range=(0, 1))  # Create scaler
scaler.fit(X)                                # Learn min and max from data X
X_scaled = scaler.transform(X)               # Scale data

# Or combine fit and transform
X_scaled = scaler.fit_transform(X)
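To make the roles of fit() and feature_range concrete, here is a minimal sketch. The sample array and the (-1, 1) range are assumptions chosen for illustration. data_min_ and data_max_ are the attributes where a fitted MinMaxScaler stores the per-feature minimum and maximum.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative data (assumed for this sketch): 3 samples, 2 features
X = np.array([[10, 200], [15, 300], [20, 400]], dtype=float)

# Request an output range of -1 to 1 instead of the default 0 to 1
scaler = MinMaxScaler(feature_range=(-1, 1))
scaler.fit(X)

# fit() stores the per-feature minimum and maximum it learned
print("Learned minimums:", scaler.data_min_)
print("Learned maximums:", scaler.data_max_)

# transform() maps each column's minimum to -1 and its maximum to 1
X_scaled = scaler.transform(X)
print("Scaled to (-1, 1):\n", X_scaled)
```

With this data, the middle sample sits halfway between the minimum and maximum of each column, so it maps to 0 in both features.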
Example
This example shows how to scale a small dataset with MinMaxScaler. The original data is scaled to the range 0 to 1.
python
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Sample data: 3 samples, 2 features
X = np.array([[10, 200], [15, 300], [20, 400]])

scaler = MinMaxScaler()  # Default range 0 to 1
X_scaled = scaler.fit_transform(X)

print("Original data:\n", X)
print("Scaled data:\n", X_scaled)
Output
Original data:
 [[ 10 200]
 [ 15 300]
 [ 20 400]]
Scaled data:
 [[0.  0. ]
 [0.5 0.5]
 [1.  1. ]]
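The scaled values above can be checked by hand. For each feature (column), the default 0-to-1 scaling computes (x - min) / (max - min). The sketch below reproduces that with plain NumPy and compares it with the scaler's result. The names col_min, col_max, and X_manual are chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10, 200], [15, 300], [20, 400]], dtype=float)

# Per-feature (column-wise) minimum and maximum
col_min = X.min(axis=0)
col_max = X.max(axis=0)

# Min-max formula for the default range (0, 1)
X_manual = (X - col_min) / (col_max - col_min)

# Same data scaled by MinMaxScaler for comparison
X_sklearn = MinMaxScaler().fit_transform(X)
print("Manual result matches sklearn:", np.allclose(X_manual, X_sklearn))
```

This manual version is only a check: it does not guard against a column whose values are all identical, where max - min would be zero.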
Common Pitfalls
Common mistakes include:
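- Transforming test data before the scaler has been fitted on the training data. An unfitted scaler raises an error, and a scaler fitted on other data produces wrong scaling.
- Fitting a separate scaler on the test data, which puts train and test on inconsistent scales. Fitting on the combined train and test data is also a mistake, because it leaks test-set information into training.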
- Forgetting to transform new data with the same scaler used on training data.
Always fit the scaler only on training data, then use transform() on test or new data.
python
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Training data
X_train = np.array([[10, 200], [15, 300], [20, 400]])

# Test data
X_test = np.array([[12, 250], [18, 350]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # Fit only on training data

X_train_scaled = scaler.transform(X_train)  # Transform training data
X_test_scaled = scaler.transform(X_test)    # Transform test data with same scaler

print("Scaled training data:\n", X_train_scaled)
print("Scaled test data:\n", X_test_scaled)
Output
Scaled training data:
 [[0.  0. ]
 [0.5 0.5]
 [1.  1. ]]
Scaled test data:
 [[0.2  0.25]
 [0.8  0.75]]
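To see why a separately fitted scaler is a problem, the following sketch compares the correct approach with fitting a second scaler on the test data. It reuses the arrays from the example above. The names wrong_scaler, X_test_ok, and X_test_wrong are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[10, 200], [15, 300], [20, 400]], dtype=float)
X_test = np.array([[12, 250], [18, 350]], dtype=float)

# Correct: reuse the min and max learned from the training data
scaler = MinMaxScaler().fit(X_train)
X_test_ok = scaler.transform(X_test)

# Pitfall: a second scaler learns a different min and max from the test data
wrong_scaler = MinMaxScaler()
X_test_wrong = wrong_scaler.fit_transform(X_test)

print("Scaled with training scaler:\n", X_test_ok)
print("Scaled with separately fitted scaler:\n", X_test_wrong)
```

With the separately fitted scaler, the value 12 maps to 0 and 18 maps to 1, although they correspond to 0.2 and 0.8 on the training scale. A model trained on the scaled training data would therefore receive test inputs on a different scale.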
Quick Reference
Summary tips for using MinMaxScaler:
- Use fit() on training data only.
- Use transform() on test or new data.
- Default scaling range is 0 to 1, but you can change it with feature_range.
- Use fit_transform() to fit and scale training data in one step.
Key Takeaways
Fit MinMaxScaler only on training data to avoid data leakage.
Transform test and new data using the scaler fitted on training data.
Default scaling range is 0 to 1 but can be customized with feature_range.
Use fit_transform() to fit and scale training data in one step.
MinMaxScaler scales each feature independently to the specified range.