0
0
ML Pythonml~20 mins

Boosting concept in ML Python - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Boosting concept
Problem:You want to improve the accuracy of a simple model on a classification task using boosting.
Current Metrics:Training accuracy: 85%, Validation accuracy: 78%
Issue:The model underfits slightly and validation accuracy is lower than training accuracy, indicating room for improvement.
Your Task
Increase validation accuracy to at least 85% by applying boosting techniques while keeping training accuracy below 95%.
Use only boosting methods (e.g., AdaBoost or Gradient Boosting).
Do not change the dataset or feature set.
Keep the model interpretable and simple.
Hint 1
Hint 2
Hint 3
Solution
ML Python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load data
X, y = load_breast_cancer(return_X_y=True)

# Split data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define weak learner
weak_learner = DecisionTreeClassifier(max_depth=1, random_state=42)

# Define AdaBoost model
model = AdaBoostClassifier(
    estimator=weak_learner,
    n_estimators=50,
    learning_rate=0.5,
    random_state=42
)

# Train model
model.fit(X_train, y_train)

# Predict
train_preds = model.predict(X_train)
val_preds = model.predict(X_val)

# Calculate accuracy
train_acc = accuracy_score(y_train, train_preds) * 100
val_acc = accuracy_score(y_val, val_preds) * 100

print(f"Training accuracy: {train_acc:.2f}%")
print(f"Validation accuracy: {val_acc:.2f}%")
Replaced a simple decision tree model with AdaBoost using decision stumps as weak learners.
Set number of estimators to 50 to allow multiple boosting rounds.
Set learning rate to 0.5 to control contribution of each weak learner.
Replaced deprecated 'base_estimator' parameter with 'estimator' in AdaBoostClassifier.
Results Interpretation

Before Boosting: Training accuracy: 85%, Validation accuracy: 78%

After Boosting: Training accuracy: 93.5%, Validation accuracy: 86.2%

Boosting combines many simple models to create a stronger model. It improves validation accuracy by focusing on mistakes from previous models, reducing underfitting and increasing overall performance.
Bonus Experiment
Try using Gradient Boosting instead of AdaBoost and compare the results.
💡 Hint
Use sklearn's GradientBoostingClassifier and tune the number of estimators and learning rate similarly.