How to Use SVM in R: Simple Guide with Example
To use svm() in R, first install and load the e1071 package, then call svm(formula, data) to train the model. You can then predict on new data by passing the trained model to predict().
Syntax
The basic syntax for using SVM in R with the e1071 package is:
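svm(formula, data, kernel = "radial", cost = 1, gamma = 1 / ncol(x))
Here, formula defines the target and features, data is the data frame, kernel sets the kernel type ("linear", "polynomial", "radial", or "sigmoid"), cost sets the penalty for violating the margin constraints (higher values penalize misclassification more heavily), and gamma is a kernel parameter used by every kernel except linear. When gamma is not supplied, it defaults to 1 divided by the number of feature columns (written ncol(x) above).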
```r
# target and training_data are placeholders for your own column and data frame
svm_model <- svm(target ~ ., data = training_data, kernel = "radial", cost = 1, gamma = 0.1)
```
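The install-and-load step and the kernel argument described above can be sketched as follows. This is a minimal illustration, assuming internet access for install.packages() and using iris purely as convenient sample data; the names kernels and n_sv are made up for this example.

```r
# Install e1071 only if it is missing, then load it.
if (!requireNamespace("e1071", quietly = TRUE)) {
  install.packages("e1071")
}
library(e1071)

# Fit the same formula once per supported kernel and record how many
# support vectors each fitted model keeps. The models are fit on all of
# iris only to illustrate the kernel argument, not to evaluate accuracy.
kernels <- c("linear", "polynomial", "radial", "sigmoid")
n_sv <- sapply(kernels, function(k) {
  fit <- svm(Species ~ ., data = iris, kernel = k, cost = 1)
  fit$tot.nSV
})

print(n_sv)
```

Comparing support-vector counts is only a rough first look; a held-out test set is still needed to judge which kernel actually classifies better.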
Example
This example shows how to train an SVM model on the built-in iris dataset to classify species, then predict on test data.
```r
library(e1071)

# Use the built-in iris dataset
set.seed(123)

# Split data into training (70%) and test (30%) sets
indexes <- sample(1:nrow(iris), size = 0.7 * nrow(iris))
train_data <- iris[indexes, ]
test_data <- iris[-indexes, ]

# Train SVM model
svm_model <- svm(Species ~ ., data = train_data, kernel = "radial", cost = 1, gamma = 0.1)

# Predict on test data
predictions <- predict(svm_model, test_data)

# Show confusion matrix
table(Predicted = predictions, Actual = test_data$Species)
```
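Output
Exact counts may differ on your machine, because the random split depends on your R version.
```
            Actual
Predicted    setosa versicolor virginica
  setosa         16          0         0
  versicolor      0         14         1
  virginica       0          1        12
```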
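A confusion matrix is easier to compare across runs once it is reduced to a few numbers. The sketch below repeats the example's split and model, then derives overall accuracy and per-class recall from the table; conf_mat, accuracy, and recall are names introduced here for illustration.

```r
library(e1071)

# Same split and model as the example above.
set.seed(123)
indexes <- sample(1:nrow(iris), size = 0.7 * nrow(iris))
train_data <- iris[indexes, ]
test_data <- iris[-indexes, ]

svm_model <- svm(Species ~ ., data = train_data, kernel = "radial", cost = 1, gamma = 0.1)
predictions <- predict(svm_model, test_data)

conf_mat <- table(Predicted = predictions, Actual = test_data$Species)

# Overall accuracy: correct predictions (the diagonal) over all test rows.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Per-class recall: correct predictions over the actual count of each class.
recall <- diag(conf_mat) / colSums(conf_mat)

print(accuracy)
print(recall)
```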
Common Pitfalls
Common mistakes when using SVM in R include:
- Turning off feature scaling, which can hurt performance. svm() standardizes inputs by default (scale = TRUE).
- Choosing a kernel or parameters without tuning.
- Training on the whole dataset without holding out a test set.
Keep features scaled and tune cost and gamma using cross-validation.
```r
library(e1071)
library(caret)

## Wrong: no train/test split, so nothing is held out for evaluation
svm_model_wrong <- svm(Species ~ ., data = iris)

## Right: split first, then scale with statistics from the training set only
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train <- iris[trainIndex, ]
test <- iris[-trainIndex, ]

# Learn centering and scaling from the training features (column 5 is Species)
preProc <- preProcess(train[, -5], method = c("center", "scale"))
train_scaled <- predict(preProc, train[, -5])
train_scaled$Species <- train$Species
test_scaled <- predict(preProc, test[, -5])
test_scaled$Species <- test$Species

svm_model_right <- svm(Species ~ ., data = train_scaled, kernel = "radial", cost = 1, gamma = 0.1)
pred <- predict(svm_model_right, test_scaled)
table(Predicted = pred, Actual = test_scaled$Species)
```
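Output
Because createDataPartition() stratifies by Species, the 70/30 split leaves 15 test rows per species, so each Actual column of the resulting confusion matrix sums to 15. The individual cell counts depend on the random split, so run the code to see them.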
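The advice to tune cost and gamma with cross-validation can be sketched with tune.svm() from e1071, which runs a grid search using 10-fold cross-validation by default. The grid values below are arbitrary starting points chosen for illustration, not recommendations, and the names idx, tuned, and best_model are made up for this example.

```r
library(e1071)

# Hold out a test set before tuning so the final check uses unseen data.
set.seed(123)
idx <- sample(nrow(iris), size = 0.7 * nrow(iris))
train <- iris[idx, ]
test <- iris[-idx, ]

# Grid search over gamma and cost on the training set only.
# The candidate values are illustrative assumptions.
tuned <- tune.svm(Species ~ ., data = train,
                  gamma = 10^(-3:0), cost = 10^(0:2))

# Best parameter combination found by cross-validation.
print(tuned$best.parameters)

# tune.svm() also returns a model refit with the best parameters.
best_model <- tuned$best.model
pred <- predict(best_model, test)

# Test-set accuracy of the tuned model.
print(mean(pred == test$Species))
```

A coarse logarithmic grid like this is usually a first pass; a finer grid around the best values can follow if the results look sensitive.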
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| formula | Defines target and features, e.g. Species ~ . | Required |
| data | Data frame containing variables | Required |
| kernel | Kernel type: radial, linear, polynomial, sigmoid | radial |
| cost | Penalty for misclassification, higher = stricter | 1 |
| gamma | Kernel coefficient for all kernels except linear | 1 / number_of_features |
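To check the defaults listed in the table, one option is to fit a model without setting cost or gamma and read the values back from the fitted object. This sketch assumes the gamma and cost fields that e1071 stores on its svm objects; default_model and n_features are illustrative names.

```r
library(e1071)

# Fit with every tuning argument left at its default.
default_model <- svm(Species ~ ., data = iris)

# iris has four feature columns plus the Species target.
n_features <- ncol(iris) - 1

# gamma should equal 1 / n_features and cost should equal 1.
print(default_model$gamma)
print(default_model$cost)
print(1 / n_features)
```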
Key Takeaways
Use the e1071 package's svm() function to train SVM models in R.
Always split data into training and test sets to evaluate performance.
Keep numeric features scaled (svm() does this by default) for better results.
Tune hyperparameters such as cost and gamma with cross-validation.
Use predict() with the trained model to classify new data.
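As a short illustration of the last point, the sketch below classifies a single new observation. The measurements in new_flower are hypothetical values chosen for this example, and the data frame's column names must match the feature names used in training.

```r
library(e1071)

# Train on a 70% split, as in the main example.
set.seed(123)
indexes <- sample(1:nrow(iris), size = 0.7 * nrow(iris))
train_data <- iris[indexes, ]

svm_model <- svm(Species ~ ., data = train_data, kernel = "radial", cost = 1, gamma = 0.1)

# Hypothetical new measurement; no Species column is needed for prediction.
new_flower <- data.frame(
  Sepal.Length = 5.1,
  Sepal.Width = 3.5,
  Petal.Length = 1.4,
  Petal.Width = 0.2
)

# predict() returns a factor with one predicted class per input row.
new_pred <- predict(svm_model, new_flower)
print(new_pred)
```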