
How to Evaluate a Regression Model in Python with sklearn

To evaluate a regression model in Python, use sklearn.metrics functions like mean_squared_error, mean_absolute_error, and r2_score. These metrics measure how close your model's predictions are to the actual values, so you can judge how well the model performs.

Syntax

Here are the main functions to evaluate regression models in sklearn:

  • mean_squared_error(y_true, y_pred): Calculates the average squared difference between actual and predicted values.
  • mean_absolute_error(y_true, y_pred): Calculates the average absolute difference between actual and predicted values.
  • r2_score(y_true, y_pred): Measures the proportion of variance in the actual values that the model explains (1 is perfect, 0 is no better than always predicting the mean, and negative values are worse than that).
python
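from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# y_true: actual values
# y_pred: predicted values
# (small sample values for illustration only)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)   # 0.375
mae = mean_absolute_error(y_true, y_pred)  # 0.5
r2 = r2_score(y_true, y_pred)              # about 0.949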
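To make the three definitions concrete, here is a minimal sketch that recomputes each metric by hand with NumPy and checks the result against sklearn. The sample arrays are made up for illustration, and the `*_manual` variable names are not part of sklearn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Illustrative values only (not from a real model)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_true - y_pred

# MSE: mean of squared errors
mse_manual = np.mean(errors ** 2)

# MAE: mean of absolute errors
mae_manual = np.mean(np.abs(errors))

# R2: 1 - (residual sum of squares / total sum of squares around the mean)
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# The hand-computed values should agree with sklearn's functions
print(np.isclose(mse_manual, mean_squared_error(y_true, y_pred)))
print(np.isclose(mae_manual, mean_absolute_error(y_true, y_pred)))
print(np.isclose(r2_manual, r2_score(y_true, y_pred)))
```

Writing R2 this way shows why it can go negative: whenever the residual sum of squares exceeds the total sum of squares, the ratio is greater than 1.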

Example

This example shows how to train a simple linear regression model and evaluate it using MSE, MAE, and R2 score.

python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Create sample data
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate predictions
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.2f}")
print(f"Mean Absolute Error: {mae:.2f}")
print(f"R2 Score: {r2:.2f}")
Output
Mean Squared Error: 92.69
Mean Absolute Error: 7.68
R2 Score: 0.89
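One way to put these numbers in context is to compare the model against a trivial baseline that always predicts the training mean. The sketch below is an addition to the original example: it assumes sklearn's DummyRegressor as the baseline and uses a small `rmse` helper (a hypothetical name) that takes the square root of MSE so the error is in the same units as the target.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same synthetic data and split as the example above
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
# Baseline that ignores X and always predicts the mean of y_train
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of MSE, in the target's units."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


rmse_model = rmse(y_test, model.predict(X_test))
rmse_baseline = rmse(y_test, baseline.predict(X_test))
r2_model = r2_score(y_test, model.predict(X_test))
r2_baseline = r2_score(y_test, baseline.predict(X_test))

print(f"Model    RMSE: {rmse_model:.2f}  R2: {r2_model:.2f}")
print(f"Baseline RMSE: {rmse_baseline:.2f}  R2: {r2_baseline:.2f}")
```

A constant prediction can never score above 0 on R2, so the baseline row gives a floor: a useful model should show a clearly lower RMSE and a clearly higher R2 than it.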

Common Pitfalls

Common mistakes when evaluating regression models include:

  • Using classification metrics like accuracy instead of regression metrics.
  • Not splitting data into train and test sets, leading to overly optimistic results.
  • Ignoring the scale of errors: MSE is expressed in squared units of the target, and because it squares each error, a few large errors dominate it far more than they dominate MAE.
  • Misinterpreting R2 score: a negative R2 means the model is worse than predicting the mean.
python
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Assumes y_test and y_pred come from the example above.

# Wrong: using accuracy for regression
# from sklearn.metrics import accuracy_score
# accuracy = accuracy_score(y_test, y_pred)  # Raises ValueError for continuous targets

# Right: use regression metrics
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
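The last two pitfalls are easier to see with numbers. The following sketch uses made-up values (not from any real model) to show how a single large error moves MSE much more than MAE, and how predictions that are worse than the mean produce a negative R2.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Illustrative actual values
y_true = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# Case 1: every prediction is off by exactly 1
y_close = y_true + 1.0
mae_close = mean_absolute_error(y_true, y_close)  # 1.0
mse_close = mean_squared_error(y_true, y_close)   # 1.0

# Case 2: same as above, but the last prediction is off by 10
y_outlier = y_close.copy()
y_outlier[-1] = y_true[-1] + 10.0
mae_outlier = mean_absolute_error(y_true, y_outlier)  # 2.8
mse_outlier = mean_squared_error(y_true, y_outlier)   # 20.8

print(f"MAE: {mae_close:.1f} -> {mae_outlier:.1f}")
print(f"MSE: {mse_close:.1f} -> {mse_outlier:.1f}")

# Case 3: predictions in reverse order are worse than predicting the mean,
# so R2 drops below zero (here exactly -3.0)
y_bad = y_true[::-1]
r2_bad = r2_score(y_true, y_bad)
print(f"R2 for reversed predictions: {r2_bad:.1f}")
```

One outlier raises MAE from 1.0 to 2.8 but raises MSE from 1.0 to 20.8, which is why the two metrics are worth reporting together.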

Quick Reference

| Metric | Description | Ideal Value |
|---|---|---|
| Mean Squared Error (MSE) | Average of squared differences between actual and predicted values | 0 (lower is better) |
| Mean Absolute Error (MAE) | Average of absolute differences between actual and predicted values | 0 (lower is better) |
| R2 Score | Proportion of variance explained by the model | 1 (higher is better) |
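If you report these metrics often, it can help to collect them in one place. The helper below is a sketch, not part of sklearn: `regression_report` is a hypothetical name, and the RMSE entry (square root of MSE) is an addition to the metrics in the table above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score


def regression_report(y_true, y_pred):
    """Return common regression metrics in one dict (hypothetical helper)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),  # same units as the target
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }


# Illustrative values only
report = regression_report([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

You could call the same function on both the training and the test predictions to see how much the scores differ between the two sets.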

Key Takeaways

  • Use sklearn.metrics functions like mean_squared_error, mean_absolute_error, and r2_score to evaluate regression models.
  • Always split your data into training and testing sets to get realistic evaluation results.
  • Mean Squared Error penalizes larger errors more than Mean Absolute Error.
  • R2 score shows how well your model explains the data variance; closer to 1 is better.
  • Avoid using classification metrics like accuracy for regression problems.