
How to Use SVR in sklearn with Python: Simple Guide

Use SVR from sklearn.svm to build a support vector regression model: create it with parameters such as kernel, train it on your data with fit(), then predict new values with predict().

Syntax

The basic syntax to use SVR in sklearn is:

  • SVR(kernel='rbf', C=1.0, epsilon=0.1): Creates the SVR model.
  • kernel: Specifies the kernel type like 'linear', 'poly', or 'rbf' (default).
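  • C: Regularization parameter. Regularization strength is inversely proportional to C, so larger values fit the training data more closely at the cost of a less smooth model.
  • epsilon: Half-width of the epsilon-tube around the predictions; training errors smaller than epsilon incur no penalty.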
  • fit(X, y): Trains the model on features X and target y.
  • predict(X): Predicts target values for new features X.
python
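from sklearn.svm import SVR

# X_train, y_train, and X_test are placeholders for your own data
model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X_train, y_train)            # learn from the training features and targets
predictions = model.predict(X_test)    # predict targets for unseen features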
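
To make the role of epsilon more concrete, here is a minimal sketch that fits the same model with a narrow and a wide tube and counts the support vectors each one keeps. The noisy sine data and the two epsilon values are assumptions chosen for illustration; they do not come from the guide.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic data (an assumption for illustration): a noisy sine wave
rng = np.random.RandomState(0)
X = np.linspace(0, 6, 50).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=50)

sv_counts = {}
for eps in (0.01, 0.3):
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points kept as support vectors
    sv_counts[eps] = len(model.support_)

print(sv_counts)
```

A wider tube leaves more training points unpenalized, so the model typically keeps fewer support vectors and becomes simpler.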

Example

This example shows how to train an SVR model on sample data and predict new values.

python
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Sample data: X as features, y as target
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([1.5, 1.7, 3.2, 3.8, 5.1, 5.9, 7.3, 7.8, 9.0, 9.5])

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVR model with RBF kernel
model = SVR(kernel='rbf', C=100, epsilon=0.1)

# Train the model
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate mean squared error
mse = mean_squared_error(y_test, predictions)

print('Predictions:', predictions)
print('Mean Squared Error:', mse)
Output
Predictions: [9.022 3.786 7.682]
Mean Squared Error: 0.011243333333333333

Treat these numbers as illustrative; the values you get when running the code may differ.
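
With only three test samples, a single mean squared error is a noisy estimate. One possible way to get a steadier picture is cross-validation. The sketch below reuses the sample data from the example; the choice of five shuffled folds is an assumption, not something the guide prescribes.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR

# Same sample data as the example above
X = np.arange(1, 11).reshape(-1, 1)
y = np.array([1.5, 1.7, 3.2, 3.8, 5.1, 5.9, 7.3, 7.8, 9.0, 9.5])

model = SVR(kernel='rbf', C=100, epsilon=0.1)

# 5 shuffled folds of 2 samples each; the fold count is an arbitrary choice
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# scikit-learn reports negated MSE so that higher scores are always better
neg_mse = cross_val_score(model, X, y, cv=cv, scoring='neg_mean_squared_error')
mse_per_fold = -neg_mse

print('MSE per fold:', mse_per_fold)
print('Mean MSE:', mse_per_fold.mean())
```

Averaging over folds uses every sample for testing exactly once, which matters on a dataset this small.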

Common Pitfalls

Common mistakes when using SVR include:

  • Not scaling features: SVR is sensitive to feature scales, so always scale data (e.g., with StandardScaler).
  • Choosing wrong kernel: The default 'rbf' works well for many cases, but linear or polynomial kernels may be better depending on data.
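  • Ignoring hyperparameters: Parameters such as C, epsilon, and (for the 'rbf' kernel) gamma strongly affect performance and need tuning.
  • Using SVR on large datasets: Training time grows more than quadratically with the number of samples, so kernel SVR becomes slow beyond a few tens of thousands of rows. Consider LinearSVR for large data.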
python
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline

# Wrong way: no scaling
model_wrong = SVR()
model_wrong.fit(X_train, y_train)

# Right way: scale features with pipeline
model_right = make_pipeline(StandardScaler(), SVR())
model_right.fit(X_train, y_train)
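
Hyperparameter tuning can be combined with the scaling pipeline. The following is a minimal sketch using GridSearchCV; the synthetic data and the candidate values in the grid are assumptions picked for illustration, so adapt them to your own problem.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data (an assumption for illustration): a noisy linear trend
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(60, 1))
y = 2.0 * X.ravel() + rng.normal(scale=0.5, size=60)

pipeline = make_pipeline(StandardScaler(), SVR())

# make_pipeline names each step after its lowercased class, hence the "svr__" prefix
param_grid = {
    'svr__kernel': ['linear', 'rbf'],
    'svr__C': [0.1, 1, 10, 100],
    'svr__epsilon': [0.01, 0.1, 0.5],
}

# Scaling happens inside each fold, so the scaler never sees validation data
search = GridSearchCV(pipeline, param_grid, cv=5, scoring='neg_mean_squared_error')
search.fit(X, y)

print('Best parameters:', search.best_params_)
print('Best cross-validated MSE:', -search.best_score_)
```

Searching over the whole pipeline rather than the bare SVR keeps the scaler fitted only on each fold's training portion.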

Quick Reference

Summary tips for using SVR in sklearn:

  • Always scale your input features before training.
  • Start with kernel='rbf' and tune C and epsilon.
  • Use fit() to train and predict() to get predictions.
  • Evaluate model performance with metrics like mean squared error.

Key Takeaways

Use SVR from sklearn.svm by creating a model, fitting it on training data, then predicting.
Always scale features before training SVR to improve accuracy.
Tune kernel, C, and epsilon parameters for best results.
Use mean squared error or similar metrics to evaluate regression performance.