Concept Flow - Correlation coefficient with np.corrcoef()

Start with two data arrays

↓

Call np.corrcoef(x, y)

↓

Calculate means of x and y

↓

Calculate deviations from means

↓

Calculate covariance matrix

↓

Normalize by std deviations

↓

Return 2x2 correlation matrix

↓

Extract correlation coefficient value

This flow shows how np.corrcoef() takes two data arrays, computes their covariance matrix normalized by standard deviations, and returns the correlation matrix.
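The flow above can be sketched by hand as a small function. This is an illustrative sketch only: `manual_corrcoef` is a hypothetical helper name, not part of NumPy, and it assumes two one-dimensional inputs of equal length with non-zero variance.

```python
import numpy as np

def manual_corrcoef(x, y):
    """Pearson correlation following the flow's steps (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Deviations from the means
    x_dev = x - x.mean()
    y_dev = y - y.mean()

    # Sample covariance and sample standard deviations (divide by n - 1)
    cov_xy = (x_dev * y_dev).sum() / (n - 1)
    std_x = np.sqrt((x_dev ** 2).sum() / (n - 1))
    std_y = np.sqrt((y_dev ** 2).sum() / (n - 1))

    # Normalize the covariance by the product of the standard deviations
    return cov_xy / (std_x * std_y)

r = manual_corrcoef([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(r)  # approximately -1.0, up to floating-point rounding
```

The result should agree with `np.corrcoef(x, y)[0, 1]` up to floating-point rounding.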

Execution Sample

NumPy

import numpy as np
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
result = np.corrcoef(x, y)
print(result)

Calculate the correlation matrix between two lists x and y using np.corrcoef.

Execution Table

Step	Action	Intermediate Value	Result
1	Input arrays x and y	x=[1,2,3,4,5], y=[5,4,3,2,1]	Arrays ready
2	Calculate mean of x	mean_x = 3.0	Mean computed
3	Calculate mean of y	mean_y = 3.0	Mean computed
4	Calculate deviations from mean	x_dev=[-2,-1,0,1,2], y_dev=[2,1,0,-1,-2]	Deviations computed
5	Calculate covariance matrix	cov = [[2.5, -2.5], [-2.5, 2.5]]	Covariance matrix computed
6	Calculate sample std deviations (divide by n-1)	std_x = sqrt(2.5) ≈ 1.58, std_y = sqrt(2.5) ≈ 1.58	Standard deviations computed
7	Normalize covariance by std devs	corr[i,j] = cov[i,j] / (std_i * std_j)	Correlation matrix computed
8	Return correlation matrix	[[1.0, -1.0], [-1.0, 1.0]]	Correlation matrix returned
9	Extract correlation coefficient	corr_xy = -1.0	Correlation coefficient between x and y

💡 Finished computing correlation matrix and extracted coefficient
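The intermediate values in the execution table can be reproduced with public NumPy calls. This is a sketch of the same arithmetic, not a copy of NumPy's internal code; the variable names mirror the table.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([5, 4, 3, 2, 1], dtype=float)

# Steps 2-3: means
mean_x, mean_y = x.mean(), y.mean()

# Step 4: deviations from the means (x and y themselves are unchanged)
x_dev, y_dev = x - mean_x, y - mean_y

# Step 5: sample covariance matrix (np.cov divides by n - 1 by default)
cov = np.cov(x, y)

# Step 6: standard deviations are the square roots of the diagonal
std = np.sqrt(np.diag(cov))

# Step 7: divide each entry cov[i, j] by std[i] * std[j]
corr = cov / np.outer(std, std)

# Step 9: the x-y coefficient is the off-diagonal entry
corr_xy = corr[0, 1]

print(cov)
print(std)
print(corr)
```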

Variable Tracker

Variable	Start	After Step 2	After Step 3	After Step 4	After Step 5	After Step 6	After Step 7	Final
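x	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]	[1,2,3,4,5]
y	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]	[5,4,3,2,1]
x_dev	None	None	None	[-2,-1,0,1,2]	[-2,-1,0,1,2]	[-2,-1,0,1,2]	[-2,-1,0,1,2]	[-2,-1,0,1,2]
y_dev	None	None	None	[2,1,0,-1,-2]	[2,1,0,-1,-2]	[2,1,0,-1,-2]	[2,1,0,-1,-2]	[2,1,0,-1,-2]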
mean_x	None	3.0	3.0	3.0	3.0	3.0	3.0	3.0
mean_y	None	None	3.0	3.0	3.0	3.0	3.0	3.0
covariance matrix	None	None	None	None	[[2.5, -2.5], [-2.5, 2.5]]	[[2.5, -2.5], [-2.5, 2.5]]	[[2.5, -2.5], [-2.5, 2.5]]	[[2.5, -2.5], [-2.5, 2.5]]
std_x	None	None	None	None	None	1.58	1.58	1.58
std_y	None	None	None	None	None	1.58	1.58	1.58
correlation matrix	None	None	None	None	None	None	[[1.0, -1.0], [-1.0, 1.0]]	[[1.0, -1.0], [-1.0, 1.0]]
correlation coefficient (x,y)	None	None	None	None	None	None	None	-1.0

Key Moments - 3 Insights

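Why does np.corrcoef return a 2x2 matrix instead of a single number?

It reports the correlation of every pair of inputs, including each array with itself. With two arrays that gives a 2x2 matrix: the diagonal holds the self-correlations (1.0) and the two off-diagonal entries hold the same x-y coefficient (see step 8 of the execution table).

Why is the correlation coefficient between x and y negative here?

y falls whenever x rises, so the deviations in step 4 always have opposite signs and the covariance in step 5 is negative (-2.5). Because the points lie exactly on a straight line, the coefficient is exactly -1.

What does the value 1.58 for std_x and std_y represent?

It is the sample standard deviation of each array, sqrt(2.5) ≈ 1.58, computed by dividing the sum of squared deviations by n - 1 = 4. It equals the square root of the diagonal entries of the covariance matrix in step 5.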
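A short check of where 1.58 comes from may help. Note that `np.std` divides by n by default (`ddof=0`), so it has to be called with `ddof=1` to match the table. The sketch below also shows that the correlation coefficient comes out the same under either convention, as long as the covariance and the standard deviations use the same `ddof`.

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

pop_std = np.std(x)             # divides by n:     sqrt(2.0), about 1.414
sample_std = np.std(x, ddof=1)  # divides by n - 1: sqrt(2.5), about 1.581

# Correlation from population statistics (ddof=0 throughout)
r_pop = np.cov(x, y, ddof=0)[0, 1] / (np.std(x) * np.std(y))

# Correlation from sample statistics (ddof=1 throughout)
r_sample = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(pop_std, sample_std)
print(r_pop, r_sample)  # both approximately -1.0
```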

Visual Quiz - 3 Questions

Test your understanding

Look at the execution table at step 5. What is the covariance between x and y?

A. 2.5

B. -2.5

C. 0

D. 1.0

Concept Snapshot

np.corrcoef(x, y) computes the correlation matrix between arrays x and y.
Returns a 2x2 matrix with 1s on the diagonal and correlation coefficients off-diagonal.
Correlation measures the strength of a linear relationship: +1 is perfect positive, -1 is perfect negative, 0 means no linear relationship.
Calculates covariance normalized by standard deviations.
Use result[0,1] or result[1,0] to get correlation coefficient between x and y.
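As a usage sketch, the coefficient can be read straight from the returned matrix. The second half stacks a third array, `z`, which is a hypothetical addition for illustration and not part of the example above. It relies on the default `rowvar=True`, under which each row of a 2-D input is treated as one variable.

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

# Either off-diagonal entry holds the x-y coefficient
result = np.corrcoef(x, y)
r_xy = result[0, 1]
print(r_xy, result[1, 0])

# Hypothetical third variable: with rows as variables, the result is 3x3
z = [2, 4, 6, 8, 10]
data = np.vstack([x, y, z])
corr3 = np.corrcoef(data)
print(corr3.shape)
print(corr3)
```

Entry `corr3[i, j]` is the coefficient between row `i` and row `j`, so the matrix is symmetric with 1s on the diagonal.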

Full Transcript

This visual execution traces how NumPy's np.corrcoef function calculates the correlation coefficient between two data arrays. Starting with input arrays x and y, it computes their means, then the deviations from those means. Next, it calculates the covariance matrix, which measures how the variables vary together. It then takes the sample standard deviation of each array and divides each covariance entry by the product of the two corresponding standard deviations, which produces the correlation matrix. The diagonal values are 1, because each array is perfectly correlated with itself. The off-diagonal values are the correlation coefficient between x and y. In the example, x increases while y decreases along a straight line, so the coefficient is -1, indicating a perfect negative linear relationship. This step-by-step trace helps beginners see what np.corrcoef computes and what each intermediate value means.