How to Use SVM Classifier in sklearn with Python
To use an SVM classifier in sklearn, import SVC from sklearn.svm, create an instance with desired parameters, then fit it to your training data using fit(). After training, use predict() to get predictions on new data.
Syntax
The basic syntax to use the SVM classifier in sklearn is:
- from sklearn.svm import SVC: imports the SVM classifier class.
- model = SVC(kernel='linear', C=1.0): creates the SVM model with a linear kernel and regularization parameter C.
- model.fit(X_train, y_train): trains the model on training features X_train and labels y_train.
- predictions = model.predict(X_test): predicts labels for new data X_test.
```python
from sklearn.svm import SVC

# Create SVM classifier with linear kernel
model = SVC(kernel='linear', C=1.0)

# Train the model
model.fit(X_train, y_train)

# Predict new data
predictions = model.predict(X_test)
```
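The syntax snippet above assumes that X_train, y_train, and X_test already exist. As a minimal sketch of the same three steps that runs on its own, the version below substitutes a small hypothetical toy dataset (two well-separated clusters in 2D); the arrays are invented for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated clusters in 2D
X_train = np.array([[0, 0], [1, 1], [0, 1], [4, 4], [5, 5], [4, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# One new point near each cluster
X_test = np.array([[0.5, 0.5], [4.5, 4.5]])

# Create, train, and predict, exactly as in the syntax above
model = SVC(kernel='linear', C=1.0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

print(predictions)
```

Each test point lies inside one of the clusters, so a linear kernel is enough to separate them.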
Example
This example shows how to train an SVM classifier on the Iris dataset and predict the class of test samples. It also prints the accuracy score.
```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVM classifier with RBF kernel
model = SVC(kernel='rbf', C=1.0, gamma='scale')

# Train the model
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
```
Output
Accuracy: 1.00
Common Pitfalls
Common mistakes when using SVM classifier include:
- Not scaling features: SVMs work best when features are scaled (e.g., using StandardScaler).
- Choosing the wrong kernel: A linear kernel is good for linearly separable data; an RBF kernel works for more complex data.
- Ignoring parameter tuning: Parameters like C and gamma affect performance and need tuning.
- Using predict before fit: The model must be trained before predicting.
```python
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Assumes X_train and y_train are already defined, as in the Iris example above

# Wrong way: no scaling
model_wrong = SVC()
# model_wrong.fit(X_train, y_train)  # If data not scaled, performance may suffer

# Right way: scale features before SVM
model_right = make_pipeline(StandardScaler(), SVC())
model_right.fit(X_train, y_train)
```
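The parameter-tuning pitfall can be addressed with a grid search. The sketch below is one possible approach, not the only one: it uses GridSearchCV over a scaling pipeline, and the candidate values for C and gamma are arbitrary assumptions chosen for illustration. It reloads the Iris data so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same dataset and split as the Iris example
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scaling happens inside the pipeline, so it is refit on each CV fold
pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf'))

# make_pipeline names each step after its lowercased class name ('svc'),
# so SVC parameters are addressed as 'svc__<parameter>'.
# The candidate values below are illustrative assumptions.
param_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__gamma': ['scale', 0.01, 0.1],
}

# 5-fold cross-validation over every combination in the grid
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Test accuracy: {search.score(X_test, y_test):.2f}")
```

Keeping the scaler inside the pipeline means the test folds never influence the scaling statistics during the search.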
Quick Reference
Here is a quick reference for key SVM classifier parameters:
| Parameter | Description | Default |
|---|---|---|
| kernel | Specifies the kernel type: 'linear', 'poly', 'rbf', 'sigmoid' | 'rbf' |
| C | Regularization parameter; regularization strength is inversely proportional to C, so smaller values favor a smoother decision boundary and larger values favor classifying training points correctly | 1.0 |
| gamma | Kernel coefficient for 'rbf', 'poly' and 'sigmoid'; controls influence of single training examples | 'scale' |
| degree | Degree of the polynomial kernel function ('poly'); ignored by other kernels | 3 |
| probability | Enable probability estimates (slower) | False |
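To see how the kernel parameter from the table affects results, the following sketch compares the four kernel types on the Iris dataset. Using 5-fold cross-validation and a scaling pipeline here is an assumption made for illustration; the other parameters are left at the defaults listed above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Mean cross-validated accuracy for each kernel type
scores = {}
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    # C, gamma, and degree are written out at their default values
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel=kernel, C=1.0, gamma='scale', degree=3),
    )
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

for kernel, score in scores.items():
    print(f"{kernel}: {score:.3f}")
```

The ranking of kernels depends on the dataset, so a comparison like this is best repeated on your own data.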
Key Takeaways
- Import SVC from sklearn.svm and create a model with desired kernel and parameters.
- Always fit the model with training data before predicting.
- Scale features for better SVM performance using StandardScaler or pipelines.
- Tune parameters like C and gamma to improve accuracy.
- Use predict() to get class predictions after training.