
Hyperparameter tuning (GridSearchCV) in ML Python

Introduction

Hyperparameter tuning finds the best settings for a machine learning model so it generalizes well to new data. Use it:

When you want to improve your model's accuracy by trying different settings.
When your model has options to choose, such as tree depth or number of neighbors.
When you want to avoid guessing which settings work best.
When you want to compare many combinations of settings automatically.
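
The idea behind grid search can be sketched by hand as nested loops over hypothetical settings; GridSearchCV (shown below) automates exactly this, and adds cross-validation. The `score` function here is only a stand-in for real model evaluation:

```python
from itertools import product

# Hypothetical settings to try by hand
depths = [3, 5, 7]
min_splits = [2, 4]

# Stand-in scoring function: in real tuning this would be
# cross-validated accuracy from fitting an actual model
def score(depth, min_split):
    return 1.0 / (depth + min_split)

# Try every combination and keep the best one
best_score, best_params = -1.0, None
for depth, min_split in product(depths, min_splits):
    s = score(depth, min_split)
    if s > best_score:
        best_score, best_params = s, (depth, min_split)

print(best_params)  # the combination with the highest stand-in score
```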
Syntax
ML Python
from sklearn.model_selection import GridSearchCV

model = SomeModel()
param_grid = {'param1': [values], 'param2': [values]}
grid = GridSearchCV(model, param_grid, cv=number_of_folds)
grid.fit(X_train, y_train)

best_model = grid.best_estimator_
best_params = grid.best_params_

GridSearchCV tries every combination of the parameters you give it.

cv sets how many folds the training data is split into for cross-validation during tuning.
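
Because every combination is fitted once per fold, the total work grows quickly. A quick sketch of the arithmetic, using a hypothetical grid:

```python
# A hypothetical grid: total work is combinations x folds
param_grid = {'param1': [0.1, 1, 10], 'param2': ['a', 'b']}
cv = 3

# Number of combinations is the product of each list's length
n_combinations = 1
for values in param_grid.values():
    n_combinations *= len(values)

print(n_combinations)        # 3 * 2 = 6 combinations
print(n_combinations * cv)   # 6 combinations x 3 folds = 18 model fits
```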

Examples
This example tries decision trees with different maximum depths and minimum split sizes, using 3-fold cross-validation.
ML Python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
param_grid = {'max_depth': [3, 5, 7], 'min_samples_split': [2, 4]}
grid = GridSearchCV(model, param_grid, cv=3)
grid.fit(X_train, y_train)
This example tries different SVM settings (the regularization strength C and the kernel) with 5-fold cross-validation.
ML Python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

model = SVC()
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(model, param_grid, cv=5)
grid.fit(X_train, y_train)
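GridSearchCV also accepts options beyond the ones shown here; two commonly useful ones are scoring (which metric the search optimizes) and n_jobs (how many CPU cores to use). A runnable sketch of the SVM search above, fitted on the iris data purely for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Iris is used here only so the search has data to fit
X, y = load_iris(return_X_y=True)

# scoring picks the metric to optimize; n_jobs=-1 uses all CPU cores
grid = GridSearchCV(SVC(),
                    {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']},
                    cv=5, scoring='accuracy', n_jobs=-1)
grid.fit(X, y)

print(grid.best_params_)  # e.g. the best C and kernel found
```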
Sample Program

This program loads the iris flower data, splits it into training and test sets, and uses GridSearchCV to find the best decision tree settings. It then evaluates the best model on the test set and prints the accuracy.

ML Python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load data
X, y = load_iris(return_X_y=True)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define model and parameters
dt = DecisionTreeClassifier(random_state=42)
param_grid = {'max_depth': [2, 3, 4], 'min_samples_split': [2, 3]}

# Setup GridSearchCV
grid_search = GridSearchCV(dt, param_grid, cv=3)

# Train with grid search
grid_search.fit(X_train, y_train)

# Best model and parameters
best_model = grid_search.best_estimator_
best_params = grid_search.best_params_

# Predict and evaluate
predictions = best_model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

print(f"Best Parameters: {best_params}")
print(f"Test Accuracy: {accuracy:.3f}")
Important Notes

GridSearchCV can take a long time if you try many parameter combinations or a large dataset, because it fits one model per combination per fold.

Use random_state to get the same results every time.

You can check grid_search.cv_results_ to see the scores for every combination tried.
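
As a sketch of what inspecting cv_results_ looks like, here is a small decision-tree search on the iris data (the dataset and grid are just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

grid = GridSearchCV(DecisionTreeClassifier(random_state=42),
                    {'max_depth': [2, 3, 4]}, cv=3)
grid.fit(X, y)

# cv_results_ is a dict; each entry is an array with one value per combination
for params, score in zip(grid.cv_results_['params'],
                         grid.cv_results_['mean_test_score']):
    print(params, round(score, 3))
```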

Summary

GridSearchCV finds the best model settings automatically.

It tries every combination you give it and evaluates each one with cross-validation.

Use it to improve your model's accuracy without guessing.