
Correlation coefficient with np.corrcoef() in NumPy

Introduction

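We use the correlation coefficient (Pearson's r) to measure how strongly two sets of numbers move together in a straight-line fashion. It tells us whether they tend to rise and fall together, move in opposite directions, or show no linear pattern.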

Checking if hours studied and exam scores are related.
Seeing if temperature and ice cream sales change together.
Finding out if advertising budget and product sales move in sync.
Understanding if two stocks tend to rise or fall together.
Syntax
NumPy
np.corrcoef(x, y=None, rowvar=True)

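x is a 1-D or 2-D array of numbers. y is an optional second array with the same layout as x. With the default rowvar=True, each row is one variable and each column is one observation; pass rowvar=False if your variables are stored in columns.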

The function returns a matrix showing correlation coefficients between inputs.
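As a minimal sketch of what the rowvar parameter changes, assuming made-up data with three variables stored as columns (the name `data` and its values are purely illustrative):

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 3 variables (columns)
data = np.array([
    [1.0, 10.0, 4.0],
    [2.0, 20.0, 3.0],
    [3.0, 30.0, 2.0],
    [4.0, 40.0, 1.0],
])

# rowvar=False tells NumPy that each column is a variable
by_columns = np.corrcoef(data, rowvar=False)
print(by_columns.shape)  # (3, 3): one row and column per variable

# The default rowvar=True treats each row as a variable instead
by_rows = np.corrcoef(data)
print(by_rows.shape)  # (4, 4)
```

The result is always square, with one row and one column per variable, so choosing the wrong rowvar value usually shows up as an unexpected matrix size.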

Examples
Calculate correlation between two lists of numbers.
NumPy
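import numpy as np
x = [1, 2, 3]
y = [4, 5, 6]
# y = x + 3 is a perfect straight-line relation,
# so every entry of the 2x2 result is 1 (up to floating-point rounding)
print(np.corrcoef(x, y))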
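Pass a single list. A 1-D input counts as one variable, so the result is just 1.0 (a variable correlates perfectly with itself) rather than a matrix.
NumPy
print(np.corrcoef([1, 2, 3]))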
Calculate correlation matrix for two rows of data.
NumPy
# Each inner list (row) is treated as one variable
print(np.corrcoef([[1, 2, 3], [4, 5, 6]]))
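The two-argument form and the two-row form describe the same computation. A short sketch, using illustrative values that are not from the examples above, suggests how to check this:

```python
import numpy as np

# Illustrative data only
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]

pair_form = np.corrcoef(x, y)       # two separate 1-D inputs
stacked_form = np.corrcoef([x, y])  # one 2-D input with two rows

# Both calls produce the same 2x2 matrix
print(np.allclose(pair_form, stacked_form))  # True
print(pair_form.shape)                       # (2, 2)
```

Use whichever form matches how your data is already stored.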
Sample Program

This program uses the correlation coefficient to show how hours studied and exam scores are related.

NumPy
import numpy as np

# Two sets of data: hours studied and exam scores
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [50, 55, 65, 70, 80]

# Calculate correlation coefficient matrix
corr_matrix = np.corrcoef(hours_studied, exam_scores)

print("Correlation coefficient matrix:")
print(corr_matrix)

# Extract the correlation coefficient between the two variables
corr_value = corr_matrix[0, 1]
print(f"Correlation coefficient between hours studied and exam scores: {corr_value:.2f}")
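To see where the number comes from, the same value can be reproduced by hand. This is a sketch of the standard Pearson formula (sum of the products of deviations, divided by the square root of the product of the summed squared deviations) applied to the sample data; the helper names `dx`, `dy`, `manual_r` and `numpy_r` are illustrative only:

```python
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5], dtype=float)
exam_scores = np.array([50, 55, 65, 70, 80], dtype=float)

# Deviations of each value from its own mean
dx = hours_studied - hours_studied.mean()
dy = exam_scores - exam_scores.mean()

# Pearson's r: co-variation divided by the product of the spreads
manual_r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
numpy_r = np.corrcoef(hours_studied, exam_scores)[0, 1]

print(f"manual: {manual_r:.4f}")  # manual: 0.9934
print(f"numpy:  {numpy_r:.4f}")   # numpy:  0.9934
```

Both approaches agree, which is a quick sanity check that corr_matrix[0, 1] is the value you want.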
Important Notes

The correlation coefficient ranges from -1 to 1.

1 means a perfect positive linear relation, -1 means a perfect negative linear relation, and 0 means no linear relation. A value of 0 does not rule out a non-linear relation.

np.corrcoef returns a matrix; the off-diagonal values show correlation between different inputs.
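The three reference values can be illustrated with a small sketch. The data is made up for the purpose, and the parabola case shows that a coefficient near 0 only rules out a straight-line relation:

```python
import numpy as np

# Illustrative data, symmetric around zero
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

r_positive = np.corrcoef(x, 2 * x + 1)[0, 1]  # straight line going up
r_negative = np.corrcoef(x, -3 * x)[0, 1]     # straight line going down
r_parabola = np.corrcoef(x, x ** 2)[0, 1]     # y depends fully on x, but not linearly

# Expect values close to 1, -1 and 0 (up to floating-point rounding)
print(r_positive, r_negative, r_parabola)
```

Because of floating-point rounding, compare results with np.isclose rather than testing for exact equality with 1, -1 or 0.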

Summary

The correlation coefficient shows how strongly two data sets move together in a linear way.

Use np.corrcoef() to calculate it easily.

Look at the off-diagonal value in the result matrix for the correlation between two variables.