
Learning curves in ML Python

Introduction

Learning curves show how a model's training and validation scores change as the amount of training data grows. They reveal whether the model is still improving or has stopped benefiting from more data.

When you want to check if your model is learning well with more data.
When you want to find out if your model is too simple or too complex.
When you want to decide if adding more training data will help your model.
When you want to compare different models or settings to pick the best one.
Syntax
ML Python
from sklearn.model_selection import learning_curve

train_sizes, train_scores, test_scores = learning_curve(
    estimator, X, y, cv=5, train_sizes=[0.1, 0.3, 0.5, 0.7, 1.0])

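estimator is your machine learning model (any scikit-learn estimator).

X and y are your features and labels.

cv is the number of cross-validation folds, and train_sizes lists the fractions of the available training data to try.

The returned train_sizes holds the actual number of samples used at each step. train_scores and test_scores hold one row per training size and one column per fold.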
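To make the return values concrete, here is a minimal sketch that runs the call above and inspects what comes back. The synthetic dataset from make_classification and the shuffle=True, random_state=0 arguments are assumptions added for illustration; they are not part of the original snippet.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic data, used here only for illustration (an assumption)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

train_sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    cv=5, train_sizes=[0.1, 0.3, 0.5, 0.7, 1.0],
    shuffle=True, random_state=0)

# The fractions come back as absolute sample counts
print(train_sizes)

# One row per training size, one column per cross-validation fold
print(train_scores.shape)
print(test_scores.shape)
```

With 5 folds, each training fold holds 160 of the 200 samples, so the largest training size is 160 rather than 200.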

Examples
Basic example using logistic regression with default settings.
ML Python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

model = LogisticRegression()
train_sizes, train_scores, test_scores = learning_curve(model, X, y)
Specifying 3-fold cross-validation and custom training sizes.
ML Python
train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=3, train_sizes=[0.2, 0.4, 0.6, 0.8, 1.0])
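The same call can also be used to compare models, as mentioned in the introduction. The sketch below is one possible way to do it, not a prescribed method: the two candidate estimators, the candidates and results names, and the choice of the iris dataset are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical pair of candidates, chosen only for illustration
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

results = {}
for name, estimator in candidates.items():
    sizes, train_scores, test_scores = learning_curve(
        estimator, X, y, cv=3, train_sizes=[0.2, 0.4, 0.6, 0.8, 1.0],
        shuffle=True, random_state=0)
    # Average over folds, then keep the score at the largest training size
    results[name] = test_scores.mean(axis=1)[-1]
    print(f"{name}: test score at {sizes[-1]} samples = {results[name]:.3f}")
```

Comparing only the last point is a simplification. Looking at the whole curve also shows which model benefits more from extra data.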
Sample Program

This code loads the iris flower data, trains a logistic regression model on increasing amounts of data, and prints the average training and cross-validation (test) accuracy for each training size.

ML Python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
import numpy as np

# Load data
X, y = load_iris(return_X_y=True)

# Create model
model = LogisticRegression(max_iter=200)

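# Get learning curve data. Shuffle first: iris is ordered by class,
# so small unshuffled subsets would contain only one class and fail to fit
train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5),
    shuffle=True, random_state=0)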

# Calculate mean scores
train_mean = np.mean(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)

# Print results
for size, train_score, test_score in zip(train_sizes, train_mean, test_mean):
    print(f"Training size: {size}, Train score: {train_score:.3f}, Test score: {test_score:.3f}")
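The program above reports only the mean over folds. As a possible extension, the spread across folds indicates how stable each score is. This sketch repeats the setup and adds the standard deviation; shuffle=True and random_state=0 are assumptions added because iris is ordered by class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_iris(return_X_y=True)

# shuffle=True is an assumption added here: iris is sorted by class
train_sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(max_iter=200), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5), shuffle=True, random_state=0)

# Mean and standard deviation across the 5 folds, one value per training size
test_mean = test_scores.mean(axis=1)
test_std = test_scores.std(axis=1)

for size, mean, std in zip(train_sizes, test_mean, test_std):
    print(f"Training size: {size}, Test score: {mean:.3f} +/- {std:.3f}")
```

A large spread at small training sizes is common, because each fold sees only a handful of samples.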
Important Notes

Learning curves show training and testing scores as the training size grows.

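If training and testing scores are both high and close together, the model generalizes well.

If the training score is high but the testing score is much lower, the model may be overfitting.

If both scores are low, the model may be too simple (underfitting).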
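These rules of thumb can be written as a small helper. This is a rough sketch, not a standard scikit-learn function: the name diagnose and the thresholds high and max_gap are hypothetical, and the third branch (both scores low, read as underfitting) is the usual interpretation of a model that is too simple. The score arrays are hand-made for illustration.

```python
import numpy as np

def diagnose(train_scores, test_scores, high=0.9, max_gap=0.05):
    """Rough reading of the last point on a learning curve.

    The thresholds `high` and `max_gap` are hypothetical defaults;
    sensible values depend on the task and the metric.
    """
    train_final = np.mean(train_scores[-1])
    test_final = np.mean(test_scores[-1])
    gap = train_final - test_final
    if gap > max_gap:
        return "possible overfitting"
    if test_final >= high:
        return "good fit"
    return "possible underfitting"

# Hand-made scores (rows: training sizes, columns: folds), illustration only
overfit_train = np.array([[1.00, 1.00], [0.99, 0.98]])
overfit_test = np.array([[0.60, 0.62], [0.70, 0.72]])
good_train = np.array([[0.97, 0.98], [0.96, 0.97]])
good_test = np.array([[0.90, 0.92], [0.95, 0.96]])
underfit_train = np.array([[0.65, 0.66], [0.62, 0.63]])
underfit_test = np.array([[0.55, 0.58], [0.60, 0.61]])

print(diagnose(overfit_train, overfit_test))    # possible overfitting
print(diagnose(good_train, good_test))          # good fit
print(diagnose(underfit_train, underfit_test))  # possible underfitting
```

In practice the arrays would come straight from learning_curve, and the whole curve is worth inspecting, not only its last point.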

Summary

Learning curves help check how well a model learns with more data.

They show if a model is too simple, too complex, or just right.

Use them to decide if you need more data or a different model.