
How to Use the cor() Function in R: Syntax and Examples

In R, the cor() function computes the correlation coefficient between two numeric vectors, or between every pair of columns in a numeric matrix or data frame. Each coefficient lies between -1 and 1 and indicates the strength and direction of the association (linear for Pearson, monotonic for Spearman and Kendall). The method argument selects "pearson" (the default), "spearman", or "kendall".

Syntax

The basic syntax of the cor() function is:

  • cor(x, y = NULL, use = "everything", method = "pearson")

Where:

  • x: a numeric vector, matrix, or data frame.
  • y: an optional numeric vector, matrix, or data frame to correlate with x. If omitted, correlations are computed between the columns of x.
  • method: the correlation method; can be "pearson" (default), "spearman", or "kendall".
  • use: how to handle missing values; one of "everything" (default), "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".
r
cor(x, y = NULL, use = "everything", method = "pearson")
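
To make the method argument concrete, here is a minimal sketch that runs all three methods on the same input. The vectors x and y are hypothetical illustration data (y increases with x, but along a curve rather than a straight line), so the rank-based methods and Pearson are expected to disagree.

```r
# Hypothetical data: y grows monotonically with x, but not linearly
x <- 1:10
y <- x^3

pearson  <- cor(x, y)                       # default: method = "pearson"
spearman <- cor(x, y, method = "spearman")  # correlation of the ranks
kendall  <- cor(x, y, method = "kendall")   # based on concordant/discordant pairs

pearson   # below 1, because the points do not fall on a straight line
spearman  # 1, because the ranks of x and y agree exactly
kendall   # 1, because every pair of points is ordered the same way
```

Passing method by name, as above, avoids relying on argument position.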

Example

This example shows how to calculate the Pearson correlation between two numeric vectors and also between columns of a data frame.

r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Correlation between two vectors
correlation_xy <- cor(x, y)

# Correlation matrix of a data frame
df <- data.frame(height = c(150, 160, 170, 180, 190), weight = c(65, 70, 75, 80, 85))
correlation_df <- cor(df)

correlation_xy
correlation_df
Output
[1] 1
       height weight
height      1      1
weight      1      1
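
Real data frames often mix numeric and non-numeric columns. The following is a sketch of one way to handle that, assuming a hypothetical data frame named people with a character column: it keeps only the numeric columns before calling cor(), then reads a single coefficient out of the resulting matrix by name.

```r
# Hypothetical data frame with one non-numeric column
people <- data.frame(
  name   = c("Ann", "Bob", "Cy", "Dee", "Eve"),
  height = c(150, 160, 170, 180, 190),
  weight = c(52, 61, 65, 79, 83)
)

# cor(people) would stop with an error because 'name' is character,
# so keep only the numeric columns first
numeric_cols <- people[sapply(people, is.numeric)]
cor_matrix <- cor(numeric_cols)

# Pull one coefficient out of the matrix by row and column name
height_weight <- cor_matrix["height", "weight"]
height_weight
```

The value extracted from the matrix is the same one that cor(people$height, people$weight) would return.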

Common Pitfalls

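Common mistakes when using cor() include:

  • Passing non-numeric data, such as character or factor columns, which raises an error.
  • Leaving missing values unhandled: with the default use = "everything", any NA in the inputs makes the affected result NA.
  • Choosing an unsuitable method; for example, using Pearson when the relationship is monotonic but not linear, where Spearman or Kendall is usually a better fit.

Always check your data types, and set the use parameter explicitly when your data may contain missing values.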

r
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Wrong: with the default use = "everything", the NA propagates to the result
cor(x, y)

# Right: drop the pair that contains the NA and correlate the rest
cor(x, y, use = "complete.obs")
Output
[1] NA
[1] 1
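
With more than two columns, "complete.obs" and "pairwise.complete.obs" can behave differently. The sketch below uses a hypothetical data frame df_na whose missing values sit in different rows, to show which rows each option uses; the data and variable names are illustrative only.

```r
# Hypothetical data frame with missing values in different rows
df_na <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 3, NA, 9, 8, 12),
  c = c(6, 5, 3, NA, 2, 2)
)

# Default: any pair involving a column with an NA comes back as NA
cor_default <- cor(df_na)

# Drops rows 3 and 4 entirely, so every pair uses the same 4 rows
cor_complete <- cor(df_na, use = "complete.obs")

# Each pair uses every row where both of its columns are present:
# a-b uses 5 rows (all but row 3), a-c uses 5 rows (all but row 4)
cor_pairwise <- cor(df_na, use = "pairwise.complete.obs")

cor_default
cor_complete
cor_pairwise
```

"pairwise.complete.obs" keeps more data per pair, but because each coefficient may come from a different subset of rows, the resulting matrix is not guaranteed to be positive semi-definite.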

Quick Reference

Parameter | Description | Default
x | Numeric vector, matrix, or data frame | Required
y | Optional numeric vector, matrix, or data frame to correlate with x | NULL
method | Correlation method: "pearson", "spearman", or "kendall" | "pearson"
use | How to handle missing values: "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs" | "everything"

Key Takeaways

Use cor() to find the correlation between numeric vectors or data frame columns in R.
Specify the method parameter to choose Pearson, Spearman, or Kendall correlation.
Handle missing values with the use parameter to avoid NA results.
cor() returns values between -1 and 1 indicating strength and direction of correlation.
Always ensure your data is numeric before using cor() to prevent errors.