How to Use cor Function in R: Syntax and Examples
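In R, use the cor() function to calculate the correlation between two numeric vectors or between the columns of a matrix or data frame. For two vectors it returns a single value between -1 and 1 indicating the strength and direction of the relationship (linear for Pearson, monotonic for the rank-based methods); for a matrix or data frame it returns a correlation matrix. You can choose the method with "pearson", "spearman", or "kendall".
Syntax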
The basic syntax of the cor() function is:
r
cor(x, y = NULL, use = "everything", method = "pearson")
Where:
- x: a numeric vector, matrix, or data frame.
- y: an optional numeric vector or matrix to correlate with x. If omitted, the correlation is computed between the columns of x.
- use: how to handle missing values; options include "everything" (default), "complete.obs", and "pairwise.complete.obs".
- method: the correlation method; one of "pearson" (default), "spearman", or "kendall".
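To make the effect of the method argument concrete, here is a minimal sketch on made-up data. The vectors dose and response are hypothetical and were chosen so that the relationship is monotonic but not linear.

```r
# Hypothetical data: response grows with dose, but along a curve (dose^3)
dose <- 1:10
response <- dose^3

pearson_r  <- cor(dose, response, method = "pearson")   # linear association
spearman_r <- cor(dose, response, method = "spearman")  # rank-based
kendall_r  <- cor(dose, response, method = "kendall")   # rank-based (concordant pairs)

print(c(pearson = pearson_r, spearman = spearman_r, kendall = kendall_r))
```

Because the ranks of dose and response agree exactly, both rank-based methods report 1, while Pearson reports a value below 1 since the points do not lie on a straight line.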
Example
This example shows how to calculate the Pearson correlation between two numeric vectors and also between columns of a data frame.
r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Correlation between two vectors
correlation_xy <- cor(x, y)

# Correlation matrix of a data frame
df <- data.frame(
  height = c(150, 160, 170, 180, 190),
  weight = c(65, 70, 75, 80, 85)
)
correlation_df <- cor(df)

correlation_xy
correlation_df
Output
[1] 1
       height weight
height      1      1
weight      1      1
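The data above are perfectly correlated, which real data rarely is. As a rough sketch on invented numbers (the data frame study and its columns hours, score, and sleep are hypothetical), the following shows how to correlate two specific columns, build the full matrix, and correlate several columns against one vector.

```r
# Hypothetical data: study hours, exam score, and hours of sleep
study <- data.frame(
  hours = c(2, 4, 5, 7, 9),
  score = c(55, 60, 70, 72, 90),
  sleep = c(8, 7, 7, 6, 5)
)

# Correlation between two specific columns
r_hours_score <- cor(study$hours, study$score)

# Full correlation matrix: symmetric, with 1s on the diagonal
r_matrix <- cor(study)

# Correlate several columns (x) against a single vector (y)
r_vs_sleep <- cor(study[, c("hours", "score")], study$sleep)

print(r_hours_score)
print(round(r_matrix, 3))
print(round(r_vs_sleep, 3))
```

In this made-up data, sleep falls as hours rises, so the correlations involving sleep are negative, while hours and score are positively correlated.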
Common Pitfalls
Common mistakes when using cor() include:
- Passing non-numeric data (for example, character or factor columns), which raises an error.
- Not handling missing values, which can result in NA output.
- Choosing an unsuitable correlation method; for example, using Pearson when the relationship is not linear.
Always check your data type and consider using the use parameter to handle missing values properly.
r
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Wrong: missing values cause NA result
cor(x, y)

# Right: ignore missing values
cor(x, y, use = "complete.obs")
Output
[1] NA
[1] 1
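With more than two columns, the choice between "complete.obs" and "pairwise.complete.obs" starts to matter. A minimal sketch, assuming a hypothetical data frame readings whose missing values fall in different rows:

```r
# Hypothetical data with NAs in different rows of b and d
readings <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 4, NA, 8, 10, 13),
  d = c(6, 5, 4, NA, 2, 1)
)

# Default ("everything"): a pair of columns involving an NA yields NA
cor_default <- cor(readings)

# "complete.obs": drop every row that contains any NA (rows 3 and 4 here)
cor_complete <- cor(readings, use = "complete.obs")

# "pairwise.complete.obs": for each pair of columns, drop only the rows
# where that particular pair has an NA
cor_pairwise <- cor(readings, use = "pairwise.complete.obs")

print(round(cor_complete, 3))
print(round(cor_pairwise, 3))
```

"complete.obs" bases every entry of the matrix on the same rows, whereas "pairwise.complete.obs" keeps more data but each entry may be computed from a different subset of rows.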
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| x | Numeric vector, matrix, or data frame | Required |
| y | Optional numeric vector or matrix to correlate with x | NULL |
| method | Correlation method: "pearson", "spearman", or "kendall" | "pearson" |
| use | How to handle missing values: "everything", "complete.obs", "pairwise.complete.obs" | "everything" |
Key Takeaways
- Use cor() to find the correlation between numeric vectors or data frame columns in R.
- Specify the method parameter to choose Pearson, Spearman, or Kendall correlation.
- Handle missing values with the use parameter to avoid NA results.
- cor() returns values between -1 and 1 indicating strength and direction of correlation.
- Always ensure your data is numeric before using cor() to prevent errors.
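As a small sketch of that last point, assuming a hypothetical data frame survey that mixes a character column with numeric ones, one way to keep only the numeric columns before calling cor() is:

```r
# Hypothetical data: one character column and two numeric columns
survey <- data.frame(
  city   = c("Oslo", "Lima", "Pune", "Kyiv"),
  age    = c(23, 35, 41, 29),
  income = c(30, 52, 61, 40)
)

# cor() stops with an error when the input is not numeric;
# try() captures the error so the script can continue
attempt <- try(cor(survey), silent = TRUE)
print(inherits(attempt, "try-error"))

# Keep only the numeric columns, then compute the correlation matrix
numeric_cols <- sapply(survey, is.numeric)
cor_numeric <- cor(survey[numeric_cols])
print(cor_numeric)
```

Filtering with sapply(survey, is.numeric) is only one option; selecting the columns by name works just as well when you know which ones you need.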