
How to Use SVM Classifier in sklearn with Python

To use an SVM classifier in sklearn, import SVC from sklearn.svm, create an instance with desired parameters, then fit it to your training data using fit(). After training, use predict() to get predictions on new data.
📐 Syntax

The basic syntax to use the SVM classifier in sklearn is:

  • from sklearn.svm import SVC: imports the SVM classifier class.
  • model = SVC(kernel='linear', C=1.0): creates the SVM model with a linear kernel and regularization parameter C.
  • model.fit(X_train, y_train): trains the model on training features X_train and labels y_train.
  • predictions = model.predict(X_test): predicts labels for new data X_test.
python
from sklearn.svm import SVC

# Create SVM classifier with linear kernel
model = SVC(kernel='linear', C=1.0)

# Train the model
model.fit(X_train, y_train)

# Predict new data
predictions = model.predict(X_test)
💻 Example

This example shows how to train an SVM classifier on the Iris dataset and predict the class of test samples. It also prints the accuracy score.

python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVM classifier with RBF kernel
model = SVC(kernel='rbf', C=1.0, gamma='scale')

# Train the model
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
Output
Accuracy: 1.00
⚠️ Common Pitfalls

Common mistakes when using an SVM classifier include:

  • Not scaling features: SVMs work best when features are scaled (e.g., using StandardScaler).
  • Choosing wrong kernel: Linear kernel is good for linearly separable data; RBF kernel works for more complex data.
  • Ignoring parameter tuning: Parameters like C and gamma affect performance and need tuning.
  • Using predict before fit: The model must be trained before predicting.
python
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Wrong way: no scaling
model_wrong = SVC()
# model_wrong.fit(X_train, y_train)  # If data not scaled, performance may suffer

# Right way: scale features before SVM
model_right = make_pipeline(StandardScaler(), SVC())
model_right.fit(X_train, y_train)
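The parameter-tuning pitfall above can be addressed with a grid search. The sketch below combines scaling and SVC in a pipeline and tunes C and gamma with GridSearchCV; the grid values are illustrative, not recommended defaults.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Pipeline so the scaler is fit only on the training folds during cross-validation
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC(kernel="rbf"))])

# Illustrative grid over C and gamma for the RBF kernel
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}

grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)
print(f"Test accuracy: {grid.score(X_test, y_test):.2f}")
```

Putting the scaler inside the pipeline matters: scaling the whole dataset before cross-validation would leak test-fold statistics into training.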
📊 Quick Reference

Here is a quick reference for key SVM classifier parameters:

  • kernel: Specifies the kernel type ('linear', 'poly', 'rbf', or 'sigmoid'). Default: 'rbf'
  • C: Regularization parameter; controls the trade-off between a smooth decision boundary and classifying training points correctly. Default: 1.0
  • gamma: Kernel coefficient for 'rbf', 'poly', and 'sigmoid'; controls the influence of single training examples. Default: 'scale'
  • degree: Degree of the polynomial kernel function ('poly'); ignored by other kernels. Default: 3
  • probability: Enables probability estimates (slower to fit). Default: False
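As a quick illustration of the probability parameter from the reference above, the sketch below enables predict_proba on the Iris dataset. Note that probability=True makes fitting slower because sklearn runs internal cross-validation to calibrate the estimates.

```python
from sklearn import datasets
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# probability=True enables predict_proba; random_state makes the internal CV reproducible
model = SVC(kernel='rbf', probability=True, random_state=0)
model.fit(X, y)

# One row of class probabilities per sample, one column per class (3 for Iris)
proba = model.predict_proba(X[:1])
print(proba.shape)
```

If you only need a confidence ranking rather than calibrated probabilities, decision_function() is available without the extra fitting cost.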

Key Takeaways

  • Import SVC from sklearn.svm and create a model with the desired kernel and parameters.
  • Always fit the model on training data before predicting.
  • Scale features for better SVM performance using StandardScaler or a pipeline.
  • Tune parameters like C and gamma to improve accuracy.
  • Use predict() to get class predictions after training.