How to Tune SVM Hyperparameters in Python with sklearn
To tune SVM hyperparameters in Python, use GridSearchCV from sklearn.model_selection to search over parameters like C, kernel, and gamma. This automates testing combinations to find the best settings for your data.
Syntax
Use GridSearchCV to tune SVM hyperparameters by specifying the model, parameter grid, and cross-validation settings.
- estimator: The SVM model, e.g., svm.SVC().
- param_grid: Dictionary of hyperparameters to try.
- cv: Number of folds for cross-validation.
- scoring: Metric to evaluate performance, e.g., 'accuracy'.
- fit(): Runs the search on training data.
```python
from sklearn import svm
from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto']
}

svc = svm.SVC()
grid_search = GridSearchCV(estimator=svc, param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
best_params = grid_search.best_params_
```
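Beyond best_params_, a fitted search exposes a few other attributes that are useful when reading results. The following is a minimal sketch, assuming the iris dataset as stand-in training data and a reduced grid chosen only for illustration:

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Assumption: iris stands in for your own training data.
X, y = datasets.load_iris(return_X_y=True)

grid_search = GridSearchCV(
    estimator=svm.SVC(),
    param_grid={'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']},
    cv=5,
    scoring='accuracy',
)
grid_search.fit(X, y)

# Best combination and its mean cross-validated accuracy
print('Best parameters:', grid_search.best_params_)
print('Best CV accuracy:', grid_search.best_score_)

# With the default refit=True, this model is retrained on all of X, y
best_model = grid_search.best_estimator_

# cv_results_ holds one entry per candidate combination
n_candidates = len(grid_search.cv_results_['params'])
print('Candidates evaluated:', n_candidates)
```

Note that best_score_ is an average over the cross-validation folds, so it is usually a more cautious estimate than accuracy measured on the data used for fitting.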
Example
This example shows how to tune C, kernel, and gamma for an SVM on the iris dataset using GridSearchCV. It prints the best parameters and accuracy score.
```python
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define parameter grid
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto']
}

# Create SVM
svc = svm.SVC()

# Setup GridSearchCV
grid_search = GridSearchCV(estimator=svc, param_grid=param_grid, cv=5, scoring='accuracy')

# Fit model
grid_search.fit(X_train, y_train)

# Best parameters
print('Best parameters:', grid_search.best_params_)

# Predict and evaluate
y_pred = grid_search.predict(X_test)
print('Test accuracy:', accuracy_score(y_test, y_pred))
```
Output
Best parameters: {'C': 1, 'gamma': 'scale', 'kernel': 'linear'}
Test accuracy: 1.0
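The grid in this example varies gamma for the linear kernel as well, although the linear kernel does not use gamma, so some fits are redundant. One possible refinement, shown here as a sketch rather than a required change, is to pass param_grid as a list of dictionaries so each kernel only gets the parameters it uses. ParameterGrid is used below just to count the combinations:

```python
from sklearn.model_selection import ParameterGrid

# Single dictionary: every kernel is combined with every gamma value
single_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto'],
}

# List of dictionaries: gamma is searched only for the rbf kernel
split_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['rbf'], 'C': [0.1, 1, 10], 'gamma': ['scale', 'auto']},
]

n_single = len(ParameterGrid(single_grid))
n_split = len(ParameterGrid(split_grid))
print(n_single)  # 12
print(n_split)   # 9
```

GridSearchCV accepts split_grid directly as its param_grid argument, so the search itself does not need any other change.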
Common Pitfalls
Common mistakes when tuning SVM hyperparameters include:
- Not scaling features before training, which can hurt SVM performance.
- Using too large or too small C values without testing a range.
- Ignoring the choice of kernel and gamma, which greatly affect results.
- Not using cross-validation, leading to overfitting on training data.
Always preprocess data and use cross-validation to get reliable tuning results.
```python
from sklearn import svm
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# Wrong way: no scaling
svc = svm.SVC()
grid_search = GridSearchCV(svc, param_grid={'C': [1, 10]}, cv=5)
grid_search.fit(X_train, y_train)

# Right way: scale features inside a pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC())
])

# Pipeline parameters are addressed as <step name>__<parameter>
param_grid = {'svc__C': [1, 10]}
grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)
```
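Putting the scaler inside the pipeline also means it is refit on the training portion of each cross-validation fold, so the validation fold does not influence the scaling. The sketch below runs the pipeline approach end to end and reads the results back. The iris data, the split settings, and the grid values are assumptions carried over from the earlier example:

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: same iris split as in the earlier example
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC()),
])

# Parameter names are prefixed with the pipeline step name
param_grid = {'svc__C': [0.1, 1, 10], 'svc__gamma': ['scale', 'auto']}

grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# The best estimator is the whole pipeline (scaler + SVM)
best_pipeline = grid_search.best_estimator_
best_svc = best_pipeline.named_steps['svc']

# score() applies the fitted scaler to X_test before predicting
test_accuracy = grid_search.score(X_test, y_test)
print('Best parameters:', grid_search.best_params_)
print('Test accuracy:', test_accuracy)
```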
Quick Reference
Summary tips for tuning SVM hyperparameters:
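- C: Controls the trade-off between a smooth decision boundary and classifying training points correctly; larger values penalize misclassified training points more and regularize less.
- kernel: Choose 'linear' for linearly separable data, 'rbf' for non-linear.
- gamma: Defines the influence of a single training example for the 'rbf', 'poly', and 'sigmoid' kernels (it is ignored by 'linear'); 'scale' is a good default.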
- Always scale features before training SVM.
- Use GridSearchCV with cross-validation for reliable tuning.
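To make the two string options for gamma more concrete: per the scikit-learn documentation, 'scale' corresponds to 1 / (n_features * X.var()) and 'auto' to 1 / n_features. The sketch below computes both for the iris data (an illustrative choice) and checks that passing the computed number gives the same model as passing 'scale':

```python
import numpy as np
from sklearn import datasets, svm

# Assumption: iris is used only as an illustrative dataset
X, y = datasets.load_iris(return_X_y=True)
n_features = X.shape[1]

# Numeric values behind the two string options
gamma_scale = 1.0 / (n_features * X.var())
gamma_auto = 1.0 / n_features

# Fitting with the string and with the computed number should agree
by_name = svm.SVC(kernel='rbf', gamma='scale').fit(X, y)
by_value = svm.SVC(kernel='rbf', gamma=gamma_scale).fit(X, y)
same = np.allclose(by_name.decision_function(X), by_value.decision_function(X))

print('gamma for scale:', gamma_scale)
print('gamma for auto:', gamma_auto)
print('Same decision function:', same)
```

Because 'scale' depends on the variance of X, its effective value changes when the features are standardized, which is one more reason to keep scaling and tuning in the same pipeline.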
Key Takeaways
Use GridSearchCV to systematically test SVM hyperparameters like C, kernel, and gamma.
Always scale your features before training an SVM to improve performance.
Cross-validation helps avoid overfitting when tuning hyperparameters.
Start with a small grid of parameters and expand based on results.
The choice of kernel and gamma significantly impacts SVM accuracy.
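One way to start small and then expand is a two-pass search: a coarse, log-spaced grid first, followed by a finer grid centered on the best coarse value. This is a sketch under assumptions (iris data, an rbf kernel, and the specific ranges are all illustrative choices), not a prescribed recipe:

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: iris is used only as an illustrative dataset
X, y = datasets.load_iris(return_X_y=True)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', svm.SVC(kernel='rbf')),
])

# Coarse pass: C = 0.01, 0.1, 1, 10, 100
coarse_C = np.logspace(-2, 2, 5)
coarse = GridSearchCV(pipeline, {'svc__C': coarse_C}, cv=5).fit(X, y)
best_C = coarse.best_params_['svc__C']

# Fine pass: five log-spaced values within one decade either side of best_C
fine_C = np.logspace(np.log10(best_C) - 1, np.log10(best_C) + 1, 5)
fine = GridSearchCV(pipeline, {'svc__C': fine_C}, cv=5).fit(X, y)

print('Coarse best C:', best_C)
print('Fine best C:', fine.best_params_['svc__C'])
print('Fine best CV accuracy:', fine.best_score_)
```

If the best coarse value sits at the edge of the grid, it is usually worth widening the coarse range before refining.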