Residual analysis helps us check how well a model fits data by looking at the differences between actual and predicted values.
Residual analysis in ML Python
Introduction
Residual analysis is useful in situations like these:
After training a regression model, to see whether its predictions are accurate.
To find patterns in the errors that suggest the model is missing something.
When deciding whether a model is good enough to base decisions on.
To check whether assumptions about the data (such as a constant error spread across all predictions) hold true.
When comparing different models to pick the best one.
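As a rough sketch of the assumption check mentioned above, the snippet below compares how spread out the residuals are for low predictions versus high predictions. The `predicted` and `residuals` values are made up for illustration, and splitting the data into two halves is only one simple way to look at this, not a formal test.

```python
import numpy as np

# Hypothetical predictions and residuals (made up for illustration):
# the errors grow as the predicted value grows.
predicted = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
residuals = np.array([0.1, -0.2, 0.2, -0.1, 1.5, -2.0, 2.5, -3.0])

# Sort by predicted value, then split into a low half and a high half
order = np.argsort(predicted)
low_half, high_half = np.array_split(residuals[order], 2)

# Standard deviation measures how spread out each half is
spread_low = low_half.std()
spread_high = high_half.std()

print(f"Residual spread (low predictions):  {spread_low:.2f}")
print(f"Residual spread (high predictions): {spread_high:.2f}")
```

If the two spreads are very different, as they are in this made-up data, the error size is probably not constant.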
Syntax
ML Python
residuals = actual_values - predicted_values
Residuals are simply the difference between what really happened and what the model guessed.
They help us find if the model is making consistent mistakes.
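One simple way to look for consistent mistakes is to average the residuals. The sketch below uses made-up numbers in which every prediction is too high, so all residuals are negative and the average is clearly below zero.

```python
# Hypothetical values (made up for illustration): every prediction is too high
actual = [10, 12, 15, 18, 20]
predicted = [11, 13.5, 16, 19.5, 21]

# Residual = actual - predicted, so over-prediction gives negative residuals
residuals = [a - p for a, p in zip(actual, predicted)]
mean_residual = sum(residuals) / len(residuals)

print(residuals)
print(f"Mean residual: {mean_residual:.2f}")
```

An average residual far from zero hints that the model leans in one direction, either over-predicting (negative) or under-predicting (positive).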
Examples
This example calculates residuals for three points and prints them.
ML Python
actual = [3, 5, 7]
predicted = [2.5, 5.5, 6.8]
residuals = [a - p for a, p in zip(actual, predicted)]
print(residuals)
This example uses NumPy arrays, which subtract element by element and make residual calculations faster on larger data.
ML Python
import numpy as np

actual = np.array([10, 15, 20])
predicted = np.array([9, 14, 22])
residuals = actual - predicted
print(residuals)
Sample Program
This program trains a simple linear regression model, predicts values, calculates residuals, and shows the mean squared error to measure overall error size.
ML Python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([3, 4, 2, 5, 6])

# Train a simple linear regression model
model = LinearRegression()
model.fit(X, y)

# Predict values
predictions = model.predict(X)

# Calculate residuals
residuals = y - predictions

# Calculate mean squared error
mse = mean_squared_error(y, predictions)

print(f"Predictions: {predictions}")
print(f"Residuals: {residuals}")
print(f"Mean Squared Error: {mse:.3f}")
Important Notes
Residuals close to zero mean the model predicts well for those points.
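Look for patterns in the residuals: a random scatter around zero suggests a good fit, while a visible pattern (such as a curve or a widening spread) suggests the model is missing something.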
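To see what a pattern in residuals can look like, the sketch below fits a straight line to curved data. The data is made up for illustration, and `np.polyfit` is used here as a stand-in for any linear model.

```python
import numpy as np

# Hypothetical curved data (made up for illustration): y grows like x squared
x = np.arange(7)
y = x ** 2

# Fit a straight line, which cannot follow the curve
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Residuals are positive at both ends and negative in the middle:
# a U-shaped pattern rather than a random scatter
for xi, r in zip(x, residuals):
    print(f"x={xi}: residual={r:+.2f}")
```

A U shape like this is a hint that the relationship is curved and a straight line is too simple for the data.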
Residual analysis is mostly used for regression, not classification.
Summary
Residuals show the difference between actual and predicted values.
They help check if a model fits data well or misses patterns.
Mean squared error is the average of the squared residuals, so it summarizes their overall size in a single number.
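The link between residuals and mean squared error can be checked by hand. This sketch reuses the data from the sample program and assumes a least-squares line fit with `np.polyfit` in place of `LinearRegression`; both fit the same straight line, so the result should match the value printed by `mean_squared_error`.

```python
import numpy as np

# Same data as the sample program
x = np.array([1, 2, 3, 4, 5])
y = np.array([3, 4, 2, 5, 6])

# Least-squares line fit (np.polyfit stands in for LinearRegression here)
slope, intercept = np.polyfit(x, y, 1)
predictions = slope * x + intercept

# Square each residual, then take the average
residuals = y - predictions
mse = np.mean(residuals ** 2)

print(f"Mean Squared Error: {mse:.3f}")
```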