How to Use a Decision Tree in R: A Simple Guide with an Example
To use a decision tree in R, install and load the rpart package, then create a model using rpart() with a formula and a dataset. You can visualize the tree with rpart.plot() from the rpart.plot package, or with the base plot() and text() functions.
Syntax
The basic syntax to create a decision tree model in R using the rpart package is:
rpart(formula, data, method)
Where:
- formula defines the target and predictors (e.g., target ~ feature1 + feature2).
- data is the dataset used for training.
- method specifies the type of problem: "class" for classification or "anova" for regression.
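To illustrate the method argument for a regression problem, here is a minimal sketch. It assumes the built-in mtcars dataset, which is not part of the original example, and predicts fuel economy (mpg) from weight and horsepower; the names reg_model and reg_pred are only illustrative.

```r
library(rpart)

# Regression tree: the target (mpg) is numeric, so method = "anova"
reg_model <- rpart(mpg ~ wt + hp, data = mtcars, method = "anova")

# Each terminal node stores the mean mpg of the cars that fall into it
print(reg_model)

# Predictions are numeric values, one per row
reg_pred <- predict(reg_model, newdata = mtcars)
head(reg_pred)
```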
```r
library(rpart)

# Template: replace target, feature1, feature2, and dataset with your own names
model <- rpart(target ~ feature1 + feature2, data = dataset, method = "class")
```
Example
This example shows how to build a decision tree to classify the species in the famous Iris dataset. It trains the model and plots the tree.
```r
library(rpart)
library(rpart.plot)

# Load iris dataset
data(iris)

# Build decision tree model to predict Species
model <- rpart(Species ~ ., data = iris, method = "class")

# Print model summary
print(model)

# Plot the decision tree
rpart.plot(model)
```
Output
n= 150
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)
2) Petal.Length< 2.45 50 0 setosa (1.00000000 0.00000000 0.00000000) *
3) Petal.Length>=2.45 100 50 versicolor (0.00000000 0.50000000 0.50000000)
6) Petal.Width< 1.75 54 5 versicolor (0.00000000 0.90740741 0.09259259) *
7) Petal.Width>=1.75 46 1 virginica (0.00000000 0.02173913 0.97826087) *
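Once the tree is fitted, it can be checked against the data it was trained on. The following sketch refits the iris model and compares its predictions with the true species. The names pred, confusion, and accuracy are only illustrative, and accuracy measured on training data tends to be optimistic.

```r
library(rpart)

model <- rpart(Species ~ ., data = iris, method = "class")

# Predicted class label for every row of the training data
pred <- predict(model, newdata = iris, type = "class")

# Confusion matrix: rows are actual species, columns are predictions
confusion <- table(actual = iris$Species, predicted = pred)
print(confusion)

# Share of rows classified correctly
accuracy <- mean(pred == iris$Species)
print(accuracy)
```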
Common Pitfalls
Common mistakes when using decision trees in R include:
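- Not loading the rpart package before using rpart().
- Using incorrect formula syntax or missing the target variable.
- Relying on the default method when the target is stored as a number (for example 0/1). rpart() only infers classification from a factor target; for a numeric target it fits a regression tree unless you specify method = "class" or convert the target to a factor.
- Not installing or loading rpart.plot for visualization.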
Always check your data and formula carefully.
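```r
library(rpart)

## Implicit: no method given, so rpart() guesses from the target's type
## (a factor such as Species gives "class"; a numeric target gives "anova")
model_implicit <- rpart(Species ~ ., data = iris)

## Explicit: state the method so the result does not depend on the guess
model_explicit <- rpart(Species ~ ., data = iris, method = "class")
```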
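One way to see which method rpart() actually used is to inspect the method component of the fitted object. In this sketch, is_setosa is a hypothetical 0/1 column created only for illustration, and the predictors are limited to the two petal measurements to keep the formula short.

```r
library(rpart)

# Factor target: rpart() infers a classification tree
fit_factor <- rpart(Species ~ ., data = iris)
fit_factor$method

# Hypothetical numeric 0/1 target built from Species (illustration only)
iris_num <- transform(iris, is_setosa = as.numeric(Species == "setosa"))

# Numeric target and no method: rpart() falls back to regression
fit_numeric <- rpart(is_setosa ~ Petal.Length + Petal.Width, data = iris_num)
fit_numeric$method

# Converting the target to a factor restores classification
iris_num$is_setosa <- factor(iris_num$is_setosa)
fit_fixed <- rpart(is_setosa ~ Petal.Length + Petal.Width, data = iris_num)
fit_fixed$method
```

If the second call reports a regression method for what is meant to be a classification task, either pass method = "class" or convert the target to a factor, as in the third call.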
Quick Reference
| Function | Purpose | Notes |
|---|---|---|
| rpart() | Create decision tree model | Use method="class" for classification |
| rpart.plot() | Plot decision tree | Requires rpart.plot package |
| print() | Show model summary | Shows splits and node info |
| predict() | Make predictions | Use type="class" for class labels |
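If the rpart.plot package is not available, the plot() and text() methods that ship with rpart can draw a simpler version of the tree. This is a minimal sketch; the values chosen for margin and cex are assumptions made for readability, not recommended settings.

```r
library(rpart)

model <- rpart(Species ~ ., data = iris, method = "class")

# Draw the branches; margin leaves room so labels are not clipped
plot(model, uniform = TRUE, margin = 0.1)

# Add split rules and, with use.n = TRUE, per-class counts at each leaf
text(model, use.n = TRUE, cex = 0.8)
```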
Key Takeaways
Use the rpart package and rpart() function to build decision trees in R.
Specify method="class" for classification problems; it is required when the target is numeric rather than a factor.
Visualize trees easily with rpart.plot() from the rpart.plot package.
Check your formula and data carefully to avoid common errors.
Use predict() with type="class" to get predicted classes from the model.
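As a sketch of the last point, the code below predicts the species of two new observations. The data frame new_flowers and its measurements are invented for illustration; newdata needs a column for every predictor used in the formula.

```r
library(rpart)

model <- rpart(Species ~ ., data = iris, method = "class")

# Two hypothetical flowers (measurements in cm, invented for illustration)
new_flowers <- data.frame(
  Sepal.Length = c(5.0, 6.5),
  Sepal.Width  = c(3.4, 3.0),
  Petal.Length = c(1.4, 5.8),
  Petal.Width  = c(0.2, 2.2)
)

# type = "class" returns one predicted label per row
pred_class <- predict(model, newdata = new_flowers, type = "class")
print(pred_class)

# type = "prob" returns a matrix of class probabilities, one column per species
pred_prob <- predict(model, newdata = new_flowers, type = "prob")
print(pred_prob)
```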