How to Calculate Correlation in R: Simple Guide

In R, you calculate the correlation between two numeric vectors with the cor() function. By default it returns the Pearson correlation coefficient, a value between -1 and 1 that measures the strength and direction of the linear relationship between the two variables.

Syntax

The basic syntax to calculate correlation in R is:

cor(x, y, method = "pearson")

Where:

x and y are numeric vectors (or data frame columns) of the same length.
method specifies the correlation type: "pearson" (default), "spearman", or "kendall".
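
As a rough illustration of how the three methods can differ, the sketch below uses made-up vectors where y rises with x along a curve rather than a straight line. The variable names (x, y, pearson, spearman, kendall) are assumptions chosen for this example.

```r
# Hypothetical data: y always increases with x, but not linearly
x <- 1:10
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # measures linear association
spearman <- cor(x, y, method = "spearman")  # computed on ranks
kendall  <- cor(x, y, method = "kendall")   # computed on concordant/discordant pairs

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Because the ranks of x and y are identical here, Spearman and Kendall both come out as 1, while Pearson falls below 1 since the relationship is not a straight line.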

Example

This example shows how to calculate the Pearson correlation between two numeric vectors a and b.

a <- c(1, 2, 3, 4, 5)
b <- c(2, 4, 6, 8, 10)
correlation <- cor(a, b)
print(correlation)

Output

[1] 1
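
The sign of the result shows the direction of the relationship. As a small sketch, the vector b_down and the data frame df below are hypothetical values made up for illustration, not data from a real study.

```r
a <- c(1, 2, 3, 4, 5)

# Hypothetical vector that falls in a straight line as a rises
b_down <- c(10, 8, 6, 4, 2)
negative <- cor(a, b_down)
print(negative)  # perfect negative correlation: -1

# Columns of a data frame are passed the same way as plain vectors
df <- data.frame(
  hours = c(1, 2, 3, 4, 5),
  score = c(52, 55, 61, 60, 68)
)
positive <- cor(df$hours, df$score)
print(positive)  # positive, but below 1 because the points are not on a line
```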

Common Pitfalls

Common mistakes when calculating correlation in R include:

Passing non-numeric data, which causes errors.
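Leaving missing values (NA) in the data, which makes cor() return NA by default.
Using a method that does not suit your data, such as Pearson on ordinal data or on a relationship that is monotonic but not linear.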

To handle missing values, set the use argument, for example use = "complete.obs", which drops every pair in which either value is missing.

x <- c(1, 2, NA, 4)
y <- c(2, 4, 6, NA)

# Wrong: returns NA because of missing values
cor(x, y)

# Right: ignore missing pairs
cor(x, y, use = "complete.obs")

Output

[1] NA
[1] 1
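
The non-numeric pitfall listed above can be sketched in a similar way. The vector x_text is a hypothetical example of numbers stored as text, and tryCatch() is used here only so the script keeps running after the error.

```r
# Hypothetical input: numbers stored as character strings
x_text <- c("1", "2", "3", "4")
y <- c(2, 4, 6, 8)

# Wrong: cor() stops with an error on character input;
# tryCatch() captures the message instead of halting the script
result <- tryCatch(
  cor(x_text, y),
  error = function(e) paste("Error:", conditionMessage(e))
)
print(result)

# Right: convert to numeric before calling cor()
fixed <- cor(as.numeric(x_text), y)
print(fixed)
```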

Quick Reference

Argument	Description
x, y	Numeric vectors or columns to compare
method	"pearson" (default), "spearman", or "kendall" correlation type
use	How to handle missing values: "everything" (default), "all.obs", "complete.obs", "na.or.complete", "pairwise.complete.obs"
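
A minimal sketch of how some of the use options behave on the same input follows. The vectors and result names are assumptions made up for this example, and the fallback string returned inside tryCatch() is a placeholder, not R's own error message.

```r
# Hypothetical vectors with one missing value each
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

# "everything" (the default): any NA makes the result NA
r_everything <- cor(x, y, use = "everything")

# "all.obs": missing values are treated as an error
r_all_obs <- tryCatch(
  cor(x, y, use = "all.obs"),
  error = function(e) "error: missing values present"
)

# "complete.obs": only pairs with both values present are used
r_complete <- cor(x, y, use = "complete.obs")

# "pairwise.complete.obs": same result as "complete.obs" for two vectors
r_pairwise <- cor(x, y, use = "pairwise.complete.obs")

print(r_everything)
print(r_all_obs)
print(r_complete)
print(r_pairwise)
```

With only two vectors, "complete.obs" and "pairwise.complete.obs" keep the same pairs, so they agree; the two options can differ when cor() is given more than two variables at once.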

Key Takeaways

Use the cor() function to calculate correlation between numeric vectors in R.

Choose the correlation method based on your data: Pearson for linear relationships, Spearman or Kendall for rank-based (monotonic) relationships or ordinal data.

Handle missing values with the use argument to avoid NA results.

Input data must be numeric; non-numeric inputs cause errors.

cor() returns a value between -1 and 1 indicating strength and direction of correlation.