MLOps · How-To · Beginner · 3 min read

How to Compare ML Models in Python Using sklearn

To compare machine learning models in Python, use sklearn to train each model on the same data, then evaluate them with the same metric, such as accuracy_score for classification or mean_squared_error for regression. Comparing the resulting scores side-by-side lets you choose the best model for your task.
📐

Syntax

To compare ML models, follow these steps:

  • fit(): Train each model on training data.
  • predict(): Get predictions on test data.
  • Use evaluation metrics like accuracy_score for classification or mean_squared_error for regression.
  • Compare metric values to decide which model performs better.
python
from sklearn.metrics import accuracy_score

# Train model
model.fit(X_train, y_train)

# Predict
predictions = model.predict(X_test)

# Evaluate
score = accuracy_score(y_test, predictions)
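The snippet above covers classification; for regression the same pattern applies with mean_squared_error, where lower values are better. A minimal sketch, using synthetic data from make_regression and two illustrative regressors (the model choices here are assumptions, not part of the original example):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Synthetic regression data (illustrative only)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train both regressors on the same split
lin = LinearRegression().fit(X_train, y_train)
tree = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)

# Evaluate with the same metric; lower MSE is better
mse_lin = mean_squared_error(y_test, lin.predict(X_test))
mse_tree = mean_squared_error(y_test, tree.predict(X_test))
print(f"Linear Regression MSE: {mse_lin:.2f}")
print(f"Decision Tree MSE: {mse_tree:.2f}")
```

Note that with MSE the smaller score wins, the opposite of accuracy, so always check which direction your chosen metric points before declaring a winner.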
💻

Example

This example compares two classification models, Logistic Regression and Decision Tree, on the Iris dataset using accuracy score.

python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize models
log_reg = LogisticRegression(max_iter=200)
dec_tree = DecisionTreeClassifier(random_state=42)

# Train models
log_reg.fit(X_train, y_train)
dec_tree.fit(X_train, y_train)

# Predict
pred_log_reg = log_reg.predict(X_test)
pred_dec_tree = dec_tree.predict(X_test)

# Evaluate
acc_log_reg = accuracy_score(y_test, pred_log_reg)
acc_dec_tree = accuracy_score(y_test, pred_dec_tree)

print(f"Logistic Regression Accuracy: {acc_log_reg:.2f}")
print(f"Decision Tree Accuracy: {acc_dec_tree:.2f}")
Output
Logistic Regression Accuracy: 1.00
Decision Tree Accuracy: 0.98
⚠️

Common Pitfalls

Common mistakes when comparing ML models include:

  • Using different train/test splits for each model, which makes comparison unfair.
  • Comparing models with different evaluation metrics that don't fit the task.
  • Ignoring randomness by not setting random_state, causing inconsistent results.
  • Overfitting by evaluating on training data instead of separate test data.
python
from sklearn.model_selection import train_test_split

# Wrong: Different splits for each model
X_train1, X_test1, y_train1, y_test1 = train_test_split(X, y, test_size=0.3, random_state=0)
X_train2, X_test2, y_train2, y_test2 = train_test_split(X, y, test_size=0.3, random_state=1)

# Right: Use same split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
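The overfitting pitfall above can be shown the same way: scoring a model on its own training data is optimistically biased. A short sketch on the Iris data (an unrestricted decision tree, which can memorize the training set):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Wrong: training accuracy is inflated (an unrestricted tree memorizes the data)
train_acc = accuracy_score(y_train, tree.predict(X_train))

# Right: test accuracy estimates performance on unseen data
test_acc = accuracy_score(y_test, tree.predict(X_test))

print(f"Train accuracy: {train_acc:.2f}")  # 1.00 -- memorized
print(f"Test accuracy:  {test_acc:.2f}")
```

Always compare models using the held-out test score; the training score tells you little about how a model will generalize.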
📊

Quick Reference

Step | Description | Example Function
Train model | Fit model on training data | model.fit(X_train, y_train)
Predict | Get predictions on test data | model.predict(X_test)
Evaluate | Calculate performance metric | accuracy_score(y_test, y_pred)
Compare | Check metric values side-by-side | Compare accuracy or error values

Key Takeaways

  • Train all models on the same train/test split for fair comparison.
  • Use appropriate metrics like accuracy for classification or MSE for regression.
  • Set random_state to ensure reproducible splits and results.
  • Evaluate models on unseen test data to avoid overfitting bias.
  • Compare metric scores side-by-side to select the best model.
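If a single split feels too sensitive to the choice of random_state, one common refinement (an extension beyond the example above, not something the walkthrough covers) is to average accuracy over several folds with sklearn's cross_val_score, giving both models the same folds:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: both models are scored on identical folds
for name, model in [
    ("Logistic Regression", LogisticRegression(max_iter=200)),
    ("Decision Tree", DecisionTreeClassifier(random_state=42)),
]:
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean={scores.mean():.2f} std={scores.std():.2f}")
```

Comparing mean and standard deviation across folds gives a steadier picture than any one train/test split.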