
How to Use the train Function in the caret Package in R

Use the train function from the caret package in R to build predictive models by specifying a formula, data, method, and optional training controls. It simplifies model training by handling resampling and tuning automatically.
📐

Syntax

The train function has this basic syntax when called with a formula:

r
train(form, data, method = "rf", trControl = trainControl(), tuneGrid = NULL)

  • form: A formula that defines the target and predictors, e.g., target ~ .
  • data: The dataset to train the model on
  • method: The model type, like "lm" for linear regression or "rpart" for decision trees; the default is "rf" (random forest)
  • trControl: Optional, controls resampling; the default trainControl() uses 25 bootstrap resamples
  • tuneGrid: Optional, a data frame of candidate tuning parameter values
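The optional trControl and tuneGrid arguments are easier to follow with a concrete call. The following is a minimal sketch, assuming the rpart package is installed (it ships with standard R distributions); the object names ctrl, grid, and tree_model and the candidate cp values are illustrative choices, not recommendations.

```r
library(caret)

set.seed(123)

# 5-fold cross-validation instead of the default bootstrap resampling
ctrl <- trainControl(method = "cv", number = 5)

# Candidate values for rpart's complexity parameter cp (illustrative only)
grid <- expand.grid(cp = c(0.001, 0.01, 0.1))

# train() fits one tree per cp value on each fold and keeps the best cp
tree_model <- train(
  mpg ~ .,
  data = mtcars,
  method = "rpart",
  trControl = ctrl,
  tuneGrid = grid
)

# The cp value chosen by cross-validated RMSE
print(tree_model$bestTune)

# Resampled performance for every candidate in the grid
print(tree_model$results)
```

On a dataset as small as mtcars, some folds may produce a tree with no splits, in which case caret can warn about missing resampled performance values; the call should still complete.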
💻

Example

This example shows how to train a linear regression model to predict mpg from other variables in the mtcars dataset.

r
library(caret)

# Set seed for reproducibility
set.seed(123)

# Train linear regression model
model <- train(mpg ~ ., data = mtcars, method = "lm")

# Show model summary
print(model)

# Predict mpg for first 5 cars
predictions <- predict(model, mtcars[1:5, ])
print(predictions)
Output
Linear Regression

32 samples
10 predictors

No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 32, 32, 32, 32, 32, 32, ...
Resampling results across tuning parameters:

  RMSE      Rsquared   MAE
  2.593123  0.8431234  2.123456

Predictions:
       1        2        3        4        5
21.44175 21.44175 22.84875 21.44175 18.92005
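Beyond print(model), the fitted object can be inspected directly. This is a small sketch, assuming the same mtcars model as in the example; train_perf is a name introduced here for illustration.

```r
library(caret)

set.seed(123)
model <- train(mpg ~ ., data = mtcars, method = "lm")

# Resampled performance estimates as a data frame (RMSE, Rsquared, MAE, ...)
print(model$results)

# The underlying lm fit on the full dataset, with coefficients
print(summary(model$finalModel))

# Error measured on the training data itself
train_perf <- postResample(pred = predict(model, mtcars), obs = mtcars$mpg)
print(train_perf)
```

The training-set error from postResample is usually more optimistic than the resampled estimate in model$results, because the model has already seen those rows.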
⚠️

Common Pitfalls

Common mistakes when using train include:

  • Not setting a seed, which makes results hard to reproduce.
  • Using incorrect method names or unsupported models.
  • Forgetting to load the caret package before calling train.
  • Leaving out trControl without realizing that train then falls back to its default of 25 bootstrap resamples, which may not be the validation scheme you want.
r
library(caret)

# Wrong method name example (will error)
# train(mpg ~ ., data = mtcars, method = "linear")

# Correct method name
set.seed(123)
model <- train(mpg ~ ., data = mtcars, method = "lm")
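One way to avoid a wrong method name is to check it before training. The sketch below uses caret's getModelInfo and modelLookup functions; the variable name available is an illustrative choice.

```r
library(caret)

# Names of all methods that caret has definitions for
available <- names(getModelInfo())

# Check candidate method names before passing them to train()
print("lm" %in% available)
print("linear" %in% available)

# List the tuning parameters a method expects in tuneGrid
print(modelLookup("rpart"))
```

A method appearing in this list only means caret knows how to call it; the package that implements the model may still need to be installed separately.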
📊

Quick Reference

Parameter   Description                     Example
form        Defines target and predictors   mpg ~ .
data        Dataset to train on             mtcars
method      Model type                      "lm", "rpart", "rf"
trControl   Resampling and tuning control   trainControl(method = "cv", number = 5)
tuneGrid    Grid of tuning parameters       expand.grid(cp = 0.01)

Key Takeaways

  • Use train() to build and tune models from a formula and a dataset.
  • Always set a seed for reproducible results when training models.
  • Specify the correct method name for the model you want to train.
  • Use trControl to choose the resampling scheme, so that performance estimates come from data the model was not fitted on.
  • Check the caret documentation for supported methods and tuning options.