Feature importance in regression in ML Python

Introduction

Feature importance helps us understand which input factors affect the prediction the most. It shows what matters in the data for the model.

You want to know which features influence house prices the most.
You want to simplify a model by keeping only important features.
You want to explain to others why the model makes certain predictions.
You want to detect if some features are not useful or redundant.
You want to improve model performance by focusing on key features.
Syntax
ML Python
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X_train, y_train)
importances = model.feature_importances_

feature_importances_ is an attribute that becomes available after fit(). It is a NumPy array with one importance score per feature, in the same order as the columns of X_train.

This example uses a Random Forest model, which naturally provides feature importance.
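The syntax above assumes X_train and y_train already exist. As a minimal runnable sketch, the snippet below builds them from synthetic data; the dataset from make_regression and every parameter value here are assumptions chosen for illustration, not part of the original example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): 5 features, only 2 of them informative
X, y = make_regression(
    n_samples=300, n_features=5, n_informative=2, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# One non-negative score per column of X_train
importances = model.feature_importances_
print(importances.shape)
print(importances.round(3))
```

The scores come back in column order, so pairing them with feature names is a matter of zipping the two sequences.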

Examples
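Linear regression has no feature_importances_ attribute; its coefficients can serve as a measure of importance instead. Compare coefficients only when the features are on the same scale (for example, after standardizing), because a coefficient's size depends on the units of its feature.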
ML Python
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
coefficients = model.coef_
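Raw coefficients depend on the units of each feature, so comparing them directly can mislead. The sketch below illustrates this with two made-up features (area_sqft and rooms are hypothetical names, and the data is randomly generated for illustration): the raw coefficients suggest rooms matters more, while the coefficients fitted on standardized features suggest the opposite.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical features on very different scales
area_sqft = rng.normal(1500, 400, size=500)
rooms = rng.normal(3, 1, size=500)
X = np.column_stack([area_sqft, rooms])

# Synthetic target: area contributes far more variation than rooms
y = 100 * area_sqft + 10_000 * rooms + rng.normal(0, 1000, size=500)

# Coefficients in original units: rooms has the larger value
raw_coef = LinearRegression().fit(X, y).coef_

# Coefficients after standardizing each feature to mean 0, std 1:
# area_sqft now has the larger value
pipeline = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)
scaled_coef = pipeline.named_steps["linearregression"].coef_

for name, raw, scaled in zip(["area_sqft", "rooms"], raw_coef, scaled_coef):
    print(f"{name}: raw={raw:.1f}, standardized={scaled:.1f}")
```

A standardized coefficient reads as the change in the prediction per one standard deviation of the feature, which puts all features on a comparable footing.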
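Random Forest provides impurity-based feature importance: for each feature, the total reduction in squared error achieved by the splits on that feature, averaged over all trees and normalized.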
ML Python
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(n_estimators=100)
model.fit(X_train, y_train)
importances = model.feature_importances_
Plotting feature importance helps visualize which features matter most.
ML Python
import matplotlib.pyplot as plt
plt.bar(feature_names, importances)
plt.title('Feature Importance')
plt.show()
Sample Program

This code trains a Random Forest regressor on the California housing dataset and prints the importance of each feature. The dataset is downloaded and cached the first time the code runs.

ML Python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load data (downloaded on first use, then cached locally)
housing = fetch_california_housing()
X, y = housing.data, housing.target
feature_names = housing.feature_names

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)

# Get feature importance (one score per feature, in column order)
importances = model.feature_importances_

# Print feature importance
for name, importance in zip(feature_names, importances):
    print(f"{name}: {importance:.3f}")
Important Notes

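Random Forest importance values are relative: scikit-learn normalizes them so they sum to 1. Linear regression coefficients are not normalized this way.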

Different models calculate importance differently. Random Forest uses the reduction in squared error produced by the splits on each feature, measured on the training data.
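Because each model computes importance in its own way, a model-agnostic measure can be a useful cross-check. One option in scikit-learn is permutation_importance, which shuffles one feature at a time and records how much the score drops. The sketch below is an illustration on synthetic data; the dataset and parameter values are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption)
X, y = make_regression(
    n_samples=400, n_features=5, n_informative=2, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature 10 times on held-out data and measure the drop in R^2
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Print features from most to least important
for i in result.importances_mean.argsort()[::-1]:
    mean = result.importances_mean[i]
    std = result.importances_std[i]
    print(f"feature {i}: {mean:.3f} +/- {std:.3f}")
```

Unlike feature_importances_, these values are computed on held-out data, do not sum to 1, and can be slightly negative for features the model does not rely on.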

High importance means the model relies heavily on that feature when making predictions. It does not by itself show that the feature causes the target to change.

Summary

Feature importance shows which inputs affect the model most.

Random Forest models provide easy access to feature importance through the feature_importances_ attribute.

Use feature importance to explain, simplify, or improve models.
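As one possible way to simplify a model with feature importance, scikit-learn's SelectFromModel keeps only the features whose importance meets a threshold. The sketch below uses synthetic data and the "mean" threshold; both are assumptions for illustration, and the right threshold depends on the problem.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): 10 features, 3 informative
X, y = make_regression(
    n_samples=400, n_features=10, n_informative=3, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Keep features whose importance is at least the mean importance
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="mean",
)
selector.fit(X_train, y_train)

kept = selector.get_support()  # boolean mask, one entry per feature
X_train_reduced = selector.transform(X_train)
X_test_reduced = selector.transform(X_test)

print(f"kept {kept.sum()} of {X_train.shape[1]} features")

# Retrain on the reduced feature set and check the score on held-out data
small_model = RandomForestRegressor(n_estimators=100, random_state=0)
small_model.fit(X_train_reduced, y_train)
print(f"R^2 with selected features: {small_model.score(X_test_reduced, y_test):.3f}")
```

Comparing the held-out score before and after selection shows whether dropping the low-importance features cost any accuracy.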