
How to Use SVM Classifier in sklearn with Python

To use an SVM classifier in sklearn, import SVC from sklearn.svm, create an instance with the desired parameters, and train it on your data with fit(). Once the model is trained, call predict() to get predictions on new data.

📐

Syntax

The basic workflow for the SVM classifier in sklearn has four steps:

from sklearn.svm import SVC: imports the SVM classifier class.
model = SVC(kernel='linear', C=1.0): creates the SVM model with a linear kernel and regularization parameter C.
model.fit(X_train, y_train): trains the model on training features X_train (a 2D array of shape n_samples by n_features) and labels y_train.
predictions = model.predict(X_test): predicts labels for new data X_test, which must have the same number of features as X_train.

python

from sklearn.svm import SVC

# X_train, y_train and X_test are placeholders for your own arrays
# Create SVM classifier with linear kernel
model = SVC(kernel='linear', C=1.0)

# Train the model
model.fit(X_train, y_train)

# Predict new data
predictions = model.predict(X_test)
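The snippet above assumes that X_train, y_train and X_test already exist. A minimal, self-contained sketch of the same four steps follows; the synthetic two-class data from make_blobs and the chosen cluster centers are assumptions made only so the code can run on its own.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data: two well-separated clusters (illustrative assumption)
X, y = make_blobs(n_samples=100, centers=[[-5, -5], [5, 5]],
                  cluster_std=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same steps as above: create, fit, predict
model = SVC(kernel='linear', C=1.0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

print(predictions.shape)             # one predicted label per test row
print(model.score(X_test, y_test))   # mean accuracy on the test rows
```

score() is a shortcut that calls predict() internally and compares the result with the true labels.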

💻

Example

This example shows how to train an SVM classifier on the Iris dataset and predict the class of test samples. It also prints the accuracy score.

python

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVM classifier with RBF kernel
model = SVC(kernel='rbf', C=1.0, gamma='scale')

# Train the model
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")

Output

Accuracy: 1.00
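A single 70/30 split leaves only 45 test samples, so one accuracy figure can be optimistic. As a rough sketch (not part of the original example), cross_val_score can average the accuracy over several splits of the same data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = SVC(kernel='rbf', C=1.0, gamma='scale')

# With an integer cv and a classifier, sklearn uses stratified folds
scores = cross_val_score(model, X, y, cv=5)

print(f"Fold accuracies: {scores}")
print(f"Mean accuracy: {scores.mean():.2f} (std {scores.std():.2f})")
```

The spread across folds gives a sense of how much the score depends on which samples end up in the test set.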

⚠️

Common Pitfalls

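Common mistakes when using the SVM classifier include:

Not scaling features: SVM kernels depend on distances or dot products between samples, so features with large ranges dominate the result. Standardize features first (e.g., using StandardScaler).
Choosing the wrong kernel: A linear kernel suits data that is roughly linearly separable; the RBF kernel can model non-linear decision boundaries.
Ignoring parameter tuning: Parameters like C and gamma strongly affect performance and should be tuned, for example with cross-validation.
Using predict before fit: The model must be trained before predicting; calling predict() on an unfitted model raises NotFittedError.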
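The last pitfall is easy to see directly. The sketch below calls predict() on a freshly created model and catches the resulting exception; the sample values are arbitrary and only serve as input.

```python
from sklearn.exceptions import NotFittedError
from sklearn.svm import SVC

model = SVC()  # created but never fitted

try:
    model.predict([[5.1, 3.5, 1.4, 0.2]])
    raised = False
except NotFittedError as exc:
    raised = True
    print(f"NotFittedError: {exc}")
```

Calling fit() on the model before predict() removes the error.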

python

from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# X_train and y_train come from the Iris example above

# Risky: no scaling, so features with large ranges dominate the kernel
model_wrong = SVC()
# model_wrong.fit(X_train, y_train)

# Better: the pipeline fits the scaler on the training data, then trains the SVM
model_right = make_pipeline(StandardScaler(), SVC())
model_right.fit(X_train, y_train)
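The tuning pitfall can be handled with the same kind of pipeline. The following is a sketch using GridSearchCV; the candidate values for C and gamma are arbitrary starting points chosen for illustration, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf'))

# make_pipeline names each step after its lowercased class name,
# so SVC parameters are addressed with the 'svc__' prefix
param_grid = {
    'svc__C': [0.1, 1, 10, 100],
    'svc__gamma': ['scale', 0.01, 0.1, 1],
}

# Each parameter combination is scored with 5-fold cross-validation
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.2f}")
print(f"Held-out test accuracy: {search.score(X_test, y_test):.2f}")
```

Because the scaler sits inside the pipeline, it is refit on the training portion of every fold, so no information from the validation fold leaks into the scaling step.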

📊

Quick Reference

Here is a quick reference for key SVM classifier parameters:

Parameter	Description	Default
kernel	Specifies the kernel type: 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed', or a callable	'rbf'
C	Regularization parameter; regularization strength is inversely proportional to C, so a larger C fits the training points more closely at the cost of a less smooth decision boundary	1.0
gamma	Kernel coefficient for 'rbf', 'poly' and 'sigmoid'; controls influence of single training examples. 'scale' uses 1 / (n_features * X.var())	'scale'
degree	Degree of the polynomial kernel function ('poly'); ignored by other kernels	3
probability	Enable probability estimates via predict_proba(); must be set before calling fit() and slows training	False
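Defaults can change between sklearn releases, so it may be worth checking them against the installed version. A small sketch using get_params(), which every sklearn estimator provides:

```python
from sklearn.svm import SVC

model = SVC()  # no arguments, so every parameter takes its default
params = model.get_params()

# Print the parameters listed in the table above
for name in ('kernel', 'C', 'gamma', 'degree', 'probability'):
    print(f"{name}: {params[name]!r}")
```

The same dictionary also lists parameters not covered in the table, such as class_weight and decision_function_shape.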

✅

Key Takeaways

Import SVC from sklearn.svm and create a model with desired kernel and parameters.

Always fit the model with training data before predicting.

Scale features, for example with StandardScaler inside a pipeline, for better SVM performance.

Tune parameters like C and gamma to improve accuracy.

Use predict() to get class predictions after training.