What is Pearson correlation in SciPy?


Introduction

Pearson correlation measures how strongly two variables are linearly related. Its sign tells you whether one tends to go up (positive) or down (negative) as the other goes up.

Typical questions it can help answer:

Checking if hours studied and exam scores are related.

Seeing if temperature and ice cream sales move together.

Finding if exercise time and weight loss are connected.

Understanding if advertising budget and product sales increase together.

Syntax

SciPy

from scipy.stats import pearsonr
correlation_coefficient, p_value = pearsonr(x, y)

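x and y are lists or arrays of numbers with the same length, and each needs at least two values.

The function returns a result that unpacks into two values: the correlation coefficient and a p-value for testing whether the correlation differs from zero.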
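As a rough illustration of the input rules above, the sketch below passes a plain list together with a NumPy array, then tries two inputs of different lengths. The numbers and the variable names (x_list, y_array, length_error) are made up for this example.

```python
import numpy as np
from scipy.stats import pearsonr

# Plain lists and NumPy arrays are both accepted.
x_list = [1, 2, 3, 4, 5]
y_array = np.array([2.0, 4.1, 5.9, 8.2, 9.8])  # made-up illustration values

corr, p_value = pearsonr(x_list, y_array)
print(f"r = {corr:.3f}, p = {p_value:.4f}")

# Inputs of different lengths are rejected with a ValueError.
length_error = None
try:
    pearsonr([1, 2, 3, 4, 5], [1, 2, 3, 4])
except ValueError as exc:
    length_error = exc
    print("ValueError:", exc)
```

The exact wording of the error message may differ between SciPy versions, so it is safer to catch the exception type than to match the text.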

Examples

This shows a perfect positive correlation of 1.0 because y is always exactly twice x, so the points lie on a straight rising line.

SciPy

from scipy.stats import pearsonr
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
corr, p = pearsonr(x, y)
print(corr)
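To see where the 1.0 comes from, the same coefficient can be computed by hand from the usual definition: the sum of the products of the deviations from the mean, divided by the square root of the product of the summed squared deviations. This is a minimal sketch using NumPy. The names dx, dy, and manual_r are chosen here for illustration, and this is not necessarily how SciPy computes the value internally.

```python
import numpy as np
from scipy.stats import pearsonr

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

# Deviations of each value from its own mean.
dx = x - x.mean()
dy = y - y.mean()

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
manual_r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

scipy_r, _ = pearsonr(x, y)
print(manual_r, scipy_r)
```

Both numbers should agree up to floating-point rounding.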

This shows a perfect negative correlation of -1.0 because y drops by the same amount every time x goes up by 1, so the points lie on a straight falling line.

SciPy

from scipy.stats import pearsonr
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
corr, p = pearsonr(x, y)
print(corr)

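This shows an undefined correlation: y does not change at all, so it has zero variance and the coefficient cannot be computed. SciPy warns about the constant input and returns nan.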

SciPy

from scipy.stats import pearsonr
x = [1, 2, 3, 4, 5]
y = [5, 5, 5, 5, 5]
corr, p = pearsonr(x, y)
print(corr)
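In a script it can help to detect this case rather than let nan flow into later steps. The sketch below records whatever warnings pearsonr emits and checks the result with math.isnan. The warning class name varies between SciPy versions, so the code prints the category instead of assuming a specific one.

```python
import math
import warnings
from scipy.stats import pearsonr

x = [1, 2, 3, 4, 5]
y = [5, 5, 5, 5, 5]  # constant input: zero variance

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record warnings instead of hiding repeats
    corr, p = pearsonr(x, y)

if math.isnan(corr):
    print("Correlation is undefined for constant input.")

# Show which warnings were raised, whatever their class is called.
for w in caught:
    print(w.category.__name__, "-", w.message)
```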

Sample Program

This program calculates how strongly hours studied and exam scores are related. It prints the correlation number and the p-value to check if the result is meaningful.

SciPy

from scipy.stats import pearsonr

# Example data: hours studied and exam scores
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [50, 55, 65, 70, 80]

corr, p_value = pearsonr(hours_studied, exam_scores)

print(f"Pearson correlation coefficient: {corr:.2f}")
print(f"P-value: {p_value:.4f}")

Output

Pearson correlation coefficient: 0.99
P-value: 0.0006

Important Notes

The correlation value ranges from -1 to 1.

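A value close to 1 means a strong positive linear relation, close to -1 means a strong negative linear relation, and around 0 means little or no linear relation. The two variables can still be related in a non-linear way.

The p-value is the probability of seeing a correlation at least this strong if the two variables were actually uncorrelated. A small p-value (for example, below 0.05) suggests the correlation is unlikely to be due to chance alone. The calculation assumes the data are roughly normally distributed, so treat it with caution for very small or skewed samples.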
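A correlation near 0 does not rule out a relationship. As a small illustration with made-up data, here y is completely determined by x (y is x squared), yet the Pearson coefficient comes out at essentially zero because the relationship is not a straight line.

```python
from scipy.stats import pearsonr

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]  # y depends entirely on x, but not linearly

corr, p_value = pearsonr(x, y)
print(f"r = {corr:.2f}, p = {p_value:.2f}")
```

Plotting the data before computing the correlation is a simple way to catch cases like this.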

Summary

Pearson correlation measures how two sets of numbers move together.

Use pearsonr from scipy.stats to get the correlation and p-value.

Values near 1 or -1 show strong linear relationships; values near 0 mean a weak or no linear relationship.