How to Use SVR in sklearn with Python: Simple Guide
Use SVR from sklearn.svm to create a support vector regression model, initializing it with parameters such as the kernel. Fit the model on training data using fit(), then predict new values with predict().
Syntax
The basic syntax to use SVR in sklearn is:
- SVR(kernel='rbf', C=1.0, epsilon=0.1): Creates the SVR model.
  - kernel: Specifies the kernel type, such as 'linear', 'poly', or 'rbf' (the default).
  - C: Regularization parameter controlling the trade-off between smoothness and fitting the training data.
  - epsilon: Defines the epsilon-tube within which no penalty is given for errors.
- fit(X, y): Trains the model on features X and target y.
- predict(X): Predicts target values for new features X.
```python
from sklearn.svm import SVR

model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
Example
This example shows how to train an SVR model on sample data and predict new values.
```python
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Sample data: X as features, y as target
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([1.5, 1.7, 3.2, 3.8, 5.1, 5.9, 7.3, 7.8, 9.0, 9.5])

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVR model with RBF kernel
model = SVR(kernel='rbf', C=100, epsilon=0.1)

# Train the model
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate mean squared error
mse = mean_squared_error(y_test, predictions)
print('Predictions:', predictions)
print('Mean Squared Error:', mse)
```
Output
Predictions: [9.022 3.786 7.682]
Mean Squared Error: 0.011243333333333333
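To build intuition for the epsilon parameter described above, here is a small sketch (not from the original example) that refits the same toy data with different epsilon values and counts the resulting support vectors. Points that fall strictly inside the epsilon-tube do not become support vectors, so a wider tube typically leaves fewer of them:

```python
import numpy as np
from sklearn.svm import SVR

# Same toy data as in the example above
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([1.5, 1.7, 3.2, 3.8, 5.1, 5.9, 7.3, 7.8, 9.0, 9.5])

counts = []
for eps in (0.1, 0.5, 1.0):
    model = SVR(kernel='rbf', C=100, epsilon=eps)
    model.fit(X, y)
    # support_ holds the indices of the support vectors
    counts.append(len(model.support_))
    print(f'epsilon={eps}: {counts[-1]} support vectors')
```

Fewer support vectors means faster prediction but a coarser fit, so epsilon is a direct lever on the accuracy/complexity trade-off.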
Common Pitfalls
Common mistakes when using SVR include:
- Not scaling features: SVR is sensitive to feature scales, so always scale data (e.g., with StandardScaler).
- Choosing the wrong kernel: The default 'rbf' works well in many cases, but linear or polynomial kernels may be better depending on the data.
- Ignoring hyperparameters: Parameters like C and epsilon greatly affect performance and need tuning.
- Using SVR for large datasets without optimization: SVR can be slow on large data.
```python
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline

# Wrong way: no scaling
model_wrong = SVR()
model_wrong.fit(X_train, y_train)

# Right way: scale features with a pipeline
model_right = make_pipeline(StandardScaler(), SVR())
model_right.fit(X_train, y_train)
```
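For the hyperparameter pitfall, a common approach is cross-validated grid search over C, epsilon, and the kernel. Here is a sketch using GridSearchCV with the scaled pipeline; the grid values are illustrative, not tuned recommendations:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy data (same shape as the earlier example)
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([1.5, 1.7, 3.2, 3.8, 5.1, 5.9, 7.3, 7.8, 9.0, 9.5])

pipe = make_pipeline(StandardScaler(), SVR())

# Pipeline step names are lowercased class names, hence the 'svr__' prefix
param_grid = {
    'svr__kernel': ['rbf', 'linear'],
    'svr__C': [1, 10, 100],
    'svr__epsilon': [0.01, 0.1, 0.5],
}

# neg_mean_squared_error: GridSearchCV maximizes the score, so MSE is negated
search = GridSearchCV(pipe, param_grid, scoring='neg_mean_squared_error', cv=3)
search.fit(X, y)
print('Best parameters:', search.best_params_)
```

Searching over the pipeline (rather than a bare SVR) ensures scaling is refit inside each cross-validation fold, avoiding data leakage.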
Quick Reference
Summary tips for using SVR in sklearn:
- Always scale your input features before training.
- Start with kernel='rbf' and tune C and epsilon.
- Use fit() to train and predict() to get predictions.
- Evaluate model performance with metrics like mean squared error.
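For the large-dataset pitfall noted earlier, kernel SVR training scales poorly (roughly quadratic in the number of samples or worse), and sklearn's LinearSVR is a common faster alternative when a linear fit suffices. A sketch on synthetic data (the data and parameters here are illustrative assumptions, not from the article):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic linear data: 10,000 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=10_000)

# LinearSVR uses a dedicated linear solver, so it handles this size quickly
model = make_pipeline(StandardScaler(), LinearSVR(epsilon=0.1, max_iter=10_000))
model.fit(X, y)
print('R^2 on training data:', round(model.score(X, y), 3))
```

If you need a nonlinear fit at scale, kernel approximation (e.g., sklearn's Nystroem transformer feeding a linear model) is another option.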
Key Takeaways
- Use SVR from sklearn.svm by creating a model, fitting it on training data, then predicting.
- Always scale features before training SVR to improve accuracy.
- Tune kernel, C, and epsilon parameters for best results.
- Use mean squared error or similar metrics to evaluate regression performance.