What are learning curves in ML Python?


Learning curves in ML Python


Introduction

Learning curves show how a model's training and validation scores change as the amount of training data grows. They reveal whether the model keeps improving with more data or has stopped improving.

Typical situations where learning curves are useful:

When you want to check whether your model keeps improving as it sees more data.

When you want to find out if your model is too simple (underfitting) or too complex (overfitting).

When you want to decide if adding more training data will help your model.

When you want to compare different models or settings to pick the best one.

Syntax

ML Python

from sklearn.model_selection import learning_curve

train_sizes, train_scores, test_scores = learning_curve(
    estimator, X, y, cv=5, train_sizes=[0.1, 0.3, 0.5, 0.7, 1.0])

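estimator is your machine learning model.

X and y are your features and labels.

cv is the number of cross-validation folds.

train_sizes lists the training set sizes to evaluate; floats are read as fractions of the largest training set available in each fold, and integers as absolute sample counts.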
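To make the return values concrete, here is a minimal sketch. The iris dataset, the logistic regression model, and the `shuffle=True, random_state=0` arguments are assumptions chosen only for illustration. It shows that `train_sizes` comes back as absolute sample counts, and that each score array has one row per training size and one column per cross-validation fold.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Illustrative data and model (assumptions); any dataset and estimator would do.
X, y = load_iris(return_X_y=True)
estimator = LogisticRegression(max_iter=1000)

train_sizes, train_scores, test_scores = learning_curve(
    estimator, X, y, cv=5, train_sizes=[0.1, 0.3, 0.5, 0.7, 1.0],
    shuffle=True, random_state=0)

# The fractions have been converted to absolute numbers of training samples.
print(train_sizes)

# Each score array has shape (number of training sizes, number of CV folds).
print(train_scores.shape, test_scores.shape)
```

Averaging each row (for example with `train_scores.mean(axis=1)`) gives one score per training size, which is what is usually plotted or printed.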

Examples

Basic example using logistic regression with default settings.

ML Python

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# X (features) and y (labels) are assumed to be loaded already
model = LogisticRegression()
train_sizes, train_scores, test_scores = learning_curve(model, X, y)

Specifying 3-fold cross-validation and custom training sizes.

ML Python

train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=3, train_sizes=[0.2, 0.4, 0.6, 0.8, 1.0])

Sample Program

This code loads the iris flower dataset, trains a logistic regression model on increasingly large subsets of it, and prints the mean training and cross-validation accuracy for each training size.

ML Python

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
import numpy as np

# Load data
X, y = load_iris(return_X_y=True)

# Create model
model = LogisticRegression(max_iter=200)

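# Get learning curve data
# shuffle=True matters here: iris is sorted by class, so without shuffling
# the smallest training subsets can contain a single class and fail to fit
train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5),
    shuffle=True, random_state=42)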

# Calculate mean scores
train_mean = np.mean(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)

# Print results
for size, train_score, test_score in zip(train_sizes, train_mean, test_mean):
    print(f"Training size: {size}, Train score: {train_score:.3f}, Test score: {test_score:.3f}")

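Learning curves are usually read as a plot rather than as printed numbers. The following is a minimal sketch of one way to draw them, assuming matplotlib is installed. The output file name `learning_curve.png`, the `Agg` backend, and the shaded one-standard-deviation bands are illustrative choices, not part of the program above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200)

# Shuffle so that small training subsets contain more than one class.
train_sizes, train_scores, test_scores = learning_curve(
    model, X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5),
    shuffle=True, random_state=0)

# Mean and spread across the cross-validation folds.
train_mean, train_std = train_scores.mean(axis=1), train_scores.std(axis=1)
test_mean, test_std = test_scores.mean(axis=1), test_scores.std(axis=1)

fig, ax = plt.subplots()
ax.plot(train_sizes, train_mean, marker="o", label="Training score")
ax.plot(train_sizes, test_mean, marker="o", label="Cross-validation score")
# Shaded bands show one standard deviation around each mean.
ax.fill_between(train_sizes, train_mean - train_std, train_mean + train_std, alpha=0.2)
ax.fill_between(train_sizes, test_mean - test_std, test_mean + test_std, alpha=0.2)
ax.set_xlabel("Number of training samples")
ax.set_ylabel("Accuracy")
ax.set_title("Learning curve")
ax.legend()
fig.savefig("learning_curve.png")
```

In an interactive session, `plt.show()` could replace the `savefig` call and the `matplotlib.use("Agg")` line could be dropped.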

Important Notes

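Learning curves show the training score and the cross-validated testing score as the training size grows.

If the training and testing scores are both high and close together, the model generalizes well.

If both scores are low and close together, the model may be underfitting (too simple).

If the training score is high but the testing score is much lower, the model may be overfitting.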
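These rules of thumb can be turned into a small helper. The sketch below is only an illustration: the function name `diagnose` and the thresholds `good_score` and `max_gap` are hypothetical and should be tuned to your problem. It looks only at the last point of each curve, that is, the largest training size.

```python
import numpy as np

def diagnose(train_mean, test_mean, good_score=0.9, max_gap=0.05):
    """Label the final point of a learning curve with a rule of thumb.

    good_score and max_gap are hypothetical cut-offs chosen for illustration.
    """
    final_train = float(train_mean[-1])
    final_test = float(test_mean[-1])
    gap = final_train - final_test
    if gap > max_gap:
        # Training score is well above the testing score.
        return "possible overfitting"
    if final_test < good_score:
        # Scores are close, but both are low.
        return "possible underfitting"
    return "good fit"

# Made-up mean scores, used only to exercise each branch.
print(diagnose(np.array([1.0, 0.99, 0.99]), np.array([0.60, 0.70, 0.75])))
print(diagnose(np.array([0.65, 0.66, 0.66]), np.array([0.60, 0.64, 0.65])))
print(diagnose(np.array([0.99, 0.98, 0.97]), np.array([0.90, 0.94, 0.96])))
```

With real data, pass the `train_mean` and `test_mean` arrays computed in the sample program instead of the made-up values.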

Summary

Learning curves help check how well a model learns with more data.

They show if a model is too simple, too complex, or just right.

Use them to decide if you need more data or a different model.