
How to Evaluate Model in R: Syntax, Example, and Tips

To evaluate a model in R, use functions like summary() for model details and predict() to generate predictions, then compute metrics such as confusionMatrix() for classification or RMSE() for regression (both from the caret package). These tools measure how well your model performs on held-out test data.
📐

Syntax

Here are common functions used to evaluate models in R:

  • summary(model): Shows model details and statistics.
  • predict(model, newdata): Generates predictions on new data.
  • confusionMatrix(predictions, actuals): Calculates classification accuracy and other metrics (from caret package).
  • RMSE(predictions, actuals): Computes root mean squared error for regression (from caret package).
```r
summary(model)
predictions <- predict(model, newdata)
confusionMatrix(predictions, actuals)  # classification
RMSE(predictions, actuals)             # regression
```
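As a minimal sketch of the regression case, the snippet below fits lm() on the built-in mtcars dataset and computes RMSE by hand; the predictors and the 70/30 split are illustrative, and the hand-computed value is the same number caret's RMSE(preds, test$mpg) would return:

```r
# Minimal regression sketch (base R only): fit lm() on mtcars,
# predict on a held-out split, and compute RMSE by hand.
set.seed(123)
idx   <- sample(seq_len(nrow(mtcars)), size = floor(0.7 * nrow(mtcars)))
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

fit   <- lm(mpg ~ wt + hp, data = train)
preds <- predict(fit, newdata = test)

# Root mean squared error on the held-out rows
rmse <- sqrt(mean((preds - test$mpg)^2))
print(rmse)
```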
💻

Example

This example shows how to train a logistic regression model on the built-in iris dataset and evaluate it using a confusion matrix.

```r
library(caret)

# Prepare data: binary classification (setosa vs others)
iris$IsSetosa <- ifelse(iris$Species == "setosa", "Yes", "No")
iris$IsSetosa <- as.factor(iris$IsSetosa)

# Split data into training and testing
set.seed(123)
trainIndex <- createDataPartition(iris$IsSetosa, p = 0.7, list = FALSE)
trainData <- iris[trainIndex, ]
testData <- iris[-trainIndex, ]

# Train logistic regression model
model <- train(IsSetosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
               data = trainData, method = "glm", family = "binomial")

# Predict on test data
predictions <- predict(model, testData)

# Evaluate with confusion matrix
confMat <- confusionMatrix(predictions, testData$IsSetosa)
print(confMat)
```
Output

```
Confusion Matrix and Statistics

          Reference
Prediction No Yes
       No  39   0
       Yes  1  31

               Accuracy : 0.9857
                 95% CI : (0.9147, 0.9994)
    No Information Rate : 0.5429
    P-Value [Acc > NIR] : 1.02e-11

                  Kappa : 0.9714

 Mcnemar's Test P-Value : 1

            Sensitivity : 0.9744
            Specificity : 1.0000
         Pos Pred Value : 1.0000
         Neg Pred Value : 0.9750
             Prevalence : 0.4571
         Detection Rate : 0.4457
   Detection Prevalence : 0.4457
      Balanced Accuracy : 0.9872
```
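The printed summary can also be read programmatically: confusionMatrix() returns an object whose components hold the count table and the statistics. This fragment continues the example above (it assumes caret is loaded and the confMat object exists):

```r
# Continuing from the example above (requires caret and confMat)
confMat$table                   # the raw count table
confMat$overall["Accuracy"]     # named vector of overall statistics
confMat$byClass["Sensitivity"]  # named vector of per-class statistics
```

Pulling values out this way is handy when you compare several models in a loop rather than reading each printed summary by eye.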
⚠️

Common Pitfalls

Common mistakes when evaluating models in R include:

  • Using training data for evaluation, which produces overly optimistic results.
  • Not converting predicted probabilities to class labels before building a confusion matrix.
  • Ignoring class imbalance, which can make accuracy misleading.
  • Using the wrong metric for the problem type (e.g., accuracy for regression).

Always split data into training and testing sets and choose metrics that fit your model type.

```r
## Wrong: evaluating on training data
# predictions <- predict(model, trainData)
# confusionMatrix(predictions, trainData$IsSetosa)

## Right: evaluating on held-out test data
# predictions <- predict(model, testData)
# confusionMatrix(predictions, testData$IsSetosa)
```
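The second pitfall, forgetting to convert probabilities to class labels, is easy to hit with a raw glm() fit, where predict(..., type = "response") returns probabilities rather than classes. A minimal base-R sketch (the 0.5 threshold and the single Sepal.Width predictor are illustrative choices, not recommendations):

```r
# Sketch: converting predicted probabilities to class labels.
# predict() on a glm object returns probabilities, not classes.
iris$IsSetosa <- as.factor(ifelse(iris$Species == "setosa", "Yes", "No"))
fit <- glm(IsSetosa ~ Sepal.Width, data = iris, family = "binomial")

probs  <- predict(fit, newdata = iris, type = "response")  # probabilities in [0, 1]
labels <- factor(ifelse(probs > 0.5, "Yes", "No"), levels = c("No", "Yes"))

# The labels (not the probabilities) are what a confusion matrix needs
table(Predicted = labels, Actual = iris$IsSetosa)
```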
📊

Quick Reference

| Function | Purpose | Use Case |
|---|---|---|
| summary(model) | Show model details and statistics | Any model type |
| predict(model, newdata) | Generate predictions on new data | Any model type |
| confusionMatrix(predictions, actuals) | Evaluate classification accuracy and metrics | Classification models |
| RMSE(predictions, actuals) | Calculate root mean squared error | Regression models |
| createDataPartition() | Split data into training and testing sets | Data preparation |

Key Takeaways

  • Always split your data into training and testing sets before evaluation.
  • Use predict() to get model predictions on new data.
  • Choose evaluation metrics that match your model type: classification or regression.
  • Use confusionMatrix() for classification and RMSE() for regression.
  • Avoid evaluating your model on the same data it was trained on to prevent biased results.