
How to Tune SVM Hyperparameters in Python with sklearn

To tune SVM hyperparameters in Python, use GridSearchCV from sklearn.model_selection to search over parameters such as C, kernel, and gamma. It fits the model on every combination in the grid, scores each one with cross-validation, and reports the best-scoring settings for your data.

📐

Syntax

Use GridSearchCV to tune SVM hyperparameters by specifying the model, parameter grid, and cross-validation settings.

estimator: The SVM model, e.g., svm.SVC().
param_grid: Dictionary of hyperparameters to try.
cv: Number of folds for cross-validation.
scoring: Metric to evaluate performance, e.g., 'accuracy'.
fit(): Runs the search on training data.

python

from sklearn import svm
from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto']
}

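svc = svm.SVC()
grid_search = GridSearchCV(estimator=svc, param_grid=param_grid, cv=5, scoring='accuracy')

# X_train, y_train: your training features and labels
grid_search.fit(X_train, y_train)

# Best-scoring parameter combination, as a dict
best_params = grid_search.best_params_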
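After fit() has run, the search object holds more than best_params_. The sketch below uses the iris dataset as stand-in data (an assumption for illustration; any feature matrix and label vector would do) and reads best_score_, best_estimator_, and cv_results_ to see how every combination performed, not just the winner.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Stand-in data, used here only for illustration
X, y = datasets.load_iris(return_X_y=True)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid_search = GridSearchCV(svm.SVC(), param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X, y)

# Mean cross-validated accuracy of the best combination
print('Best CV accuracy:', grid_search.best_score_)

# The best model, refit on all data passed to fit() (refit=True is the default)
best_model = grid_search.best_estimator_

# One entry per parameter combination, with mean and spread across folds
results = grid_search.cv_results_
for params, mean, std in zip(results['params'],
                             results['mean_test_score'],
                             results['std_test_score']):
    print(params, round(mean, 3), round(std, 3))
```

Looking at the spread across folds alongside the mean can help show whether the top combination is clearly better or just marginally ahead of its neighbours.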

💻

Example

This example tunes C, kernel, and gamma for an SVM on the iris dataset using GridSearchCV, then prints the best parameters and the accuracy on a held-out test set.

python

from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define parameter grid
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto']
}

# Create SVM
svc = svm.SVC()

# Setup GridSearchCV
grid_search = GridSearchCV(estimator=svc, param_grid=param_grid, cv=5, scoring='accuracy')

# Fit model
grid_search.fit(X_train, y_train)

# Best parameters
print('Best parameters:', grid_search.best_params_)

# Predict with the best model (refit on the full training split) and evaluate
y_pred = grid_search.predict(X_test)
print('Test accuracy:', accuracy_score(y_test, y_pred))

Output

Best parameters: {'C': 1, 'gamma': 'scale', 'kernel': 'linear'}
Test accuracy: 1.0
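The grid used above pairs gamma with the 'linear' kernel, which ignores gamma, so some combinations are fitted twice with identical results. One possible way to avoid that is to pass param_grid as a list of dictionaries, one per kernel. The sketch below uses ParameterGrid only to count the combinations each layout would produce; the names full_grid and split_grid are illustrative.

```python
from sklearn.model_selection import ParameterGrid

# Single dictionary: every kernel is crossed with every gamma value
full_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto']
}

# List of dictionaries: gamma is searched only where it has an effect
split_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['rbf'], 'C': [0.1, 1, 10], 'gamma': ['scale', 'auto']},
]

n_full = len(ParameterGrid(full_grid))
n_split = len(ParameterGrid(split_grid))
print('Combinations:', n_full, 'vs', n_split)
```

GridSearchCV accepts split_grid directly as its param_grid argument. With a linear kernel as the winner, best_params_ would then contain no gamma entry.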

⚠️

Common Pitfalls

Common mistakes when tuning SVM hyperparameters include:

Not scaling features before training. SVMs are sensitive to feature scale, so unscaled data can hurt performance.
Trying only one or two C values instead of a range that spans several orders of magnitude.
Ignoring the choice of kernel and gamma, which greatly affect results.
Not using cross-validation, so the chosen settings overfit the training data.

Always preprocess data and use cross-validation to get reliable tuning results.

python

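from sklearn import svm
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# X_train, y_train: the training split from the example above

# Wrong way: no scaling
svc = svm.SVC()
grid_search = GridSearchCV(svc, param_grid={'C': [1, 10]}, cv=5)
grid_search.fit(X_train, y_train)

# Right way: scale features inside a pipeline, so the scaler is
# refit on the training folds of each cross-validation split
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC())
])
# Pipeline parameters are addressed as <step name>__<parameter>
param_grid = {'svc__C': [1, 10]}
grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)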
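To address the pitfall of testing too narrow a range, a common approach is to space C and gamma logarithmically. The following is a minimal sketch, assuming the iris data and an 'rbf' kernel; the specific ranges are illustrative choices, not recommendations from the article.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC(kernel='rbf'))
])

# Five values per parameter, one per order of magnitude (assumed ranges)
param_grid = {
    'svc__C': np.logspace(-2, 2, 5),      # 0.01 to 100
    'svc__gamma': np.logspace(-3, 1, 5),  # 0.001 to 10
}

grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

print('Best parameters:', grid_search.best_params_)

# score() applies the fitted scaler and best SVC to the held-out data
test_acc = grid_search.score(X_test, y_test)
print('Test accuracy:', test_acc)
```

If the best value lands on the edge of a range, that is usually a hint to extend the range in that direction and search again.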

📊

Quick Reference

Summary tips for tuning SVM hyperparameters:

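C: Controls the trade-off between a smooth decision boundary and classifying training points correctly. Small C regularizes more strongly and favors a wider margin; large C fits the training points more closely and can overfit.
kernel: Choose 'linear' for linearly separable data, 'rbf' for non-linear.
gamma: Defines how far the influence of a single training example reaches. It is used by 'rbf' and ignored by 'linear'; 'scale', the sklearn default, is a good starting point.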
Always scale features before training SVM.
Use GridSearchCV with cross-validation for reliable tuning.
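To make the C and gamma entries above more concrete, the sketch below computes what the 'scale' and 'auto' presets resolve to, using the formulas given in the scikit-learn documentation, and counts support vectors for a few C values. It assumes the iris dataset and unscaled features purely for illustration.

```python
from sklearn import datasets, svm

X, y = datasets.load_iris(return_X_y=True)
n_features = X.shape[1]

# gamma='scale' resolves to 1 / (n_features * X.var())
gamma_scale = 1.0 / (n_features * X.var())
# gamma='auto' resolves to 1 / n_features
gamma_auto = 1.0 / n_features
print('scale:', gamma_scale, 'auto:', gamma_auto)

# Number of support vectors kept by the model for different C values
support_counts = {}
for C in [0.01, 1, 100]:
    model = svm.SVC(kernel='rbf', C=C).fit(X, y)
    support_counts[C] = int(model.n_support_.sum())
print(support_counts)
```

A larger C typically leaves fewer support vectors because the margin is narrower, though this is a tendency rather than a guarantee.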

✅

Key Takeaways

Use GridSearchCV to systematically test SVM hyperparameters like C, kernel, and gamma.

Always scale your features before training an SVM to improve performance.

Cross-validation helps avoid overfitting when tuning hyperparameters.

Start with a small grid of parameters and expand based on results.

The choice of kernel and gamma significantly impacts SVM accuracy.
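The advice to start with a small grid and expand based on results can be sketched as a two-stage search: a coarse pass over widely spaced C values, then a finer pass around the winner. This is one possible approach, assuming the iris dataset and an 'rbf' kernel; the one-decade window around the coarse winner is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC(kernel='rbf'))
])

# Stage 1: coarse grid, two orders of magnitude between values
coarse = GridSearchCV(pipeline, {'svc__C': [0.01, 1, 100]}, cv=5)
coarse.fit(X, y)
best_C = coarse.best_params_['svc__C']
print('Coarse best C:', best_C, 'score:', coarse.best_score_)

# Stage 2: five log-spaced values within one decade either side of best_C
exponent = np.log10(best_C)
fine_C = np.logspace(exponent - 1, exponent + 1, 5)
fine = GridSearchCV(pipeline, {'svc__C': fine_C}, cv=5)
fine.fit(X, y)
print('Fine best C:', fine.best_params_['svc__C'], 'score:', fine.best_score_)
```

The same pattern extends to gamma by adding a second key to each stage's grid, at the cost of more fits per stage.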