
Correlation with corr() in Data Analysis Python - Step-by-Step Execution

Concept Flow - Correlation with corr()
Start with DataFrame
Call corr() method
Calculate pairwise correlations
Return correlation matrix
Use or visualize results
This flow shows how calling corr() on a DataFrame calculates pairwise correlations and returns a matrix.
Execution Sample
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
correlation_matrix = df.corr()
print(correlation_matrix)
This code creates a DataFrame and calculates the correlation matrix between its columns.
Execution Table
| Step | Action | Calculation | Result |
|------|--------|-------------|--------|
| 1 | Create DataFrame | Columns A, B, C with values | df with 3 rows and 3 columns |
| 2 | Call df.corr() | Calculate pairwise Pearson correlations | Compute correlation for (A,B), (A,C), (B,C) and diagonals |
| 3 | Correlation A vs A | Correlation of A with itself | 1.0 |
| 4 | Correlation A vs B | Calculate correlation coefficient | 1.0 |
| 5 | Correlation A vs C | Calculate correlation coefficient | 1.0 |
| 6 | Correlation B vs B | Correlation of B with itself | 1.0 |
| 7 | Correlation B vs C | Calculate correlation coefficient | 1.0 |
| 8 | Correlation C vs C | Correlation of C with itself | 1.0 |
| 9 | Return matrix | Assemble all correlations into matrix | 3x3 matrix with all 1.0 values |
💡 All pairwise correlations computed; matrix returned.
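The "Calculate correlation coefficient" steps above can be reproduced by hand. The following is a minimal sketch of the textbook Pearson formula, not pandas' internal implementation; the helper name `pearson` is hypothetical and introduced only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def pearson(x, y):
    """Hypothetical helper: sum of products of deviations from the mean,
    divided by the square root of the product of the summed squared deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

manual_ab = pearson(df['A'], df['B'])   # step 4, computed by hand
pandas_ab = df.corr().loc['A', 'B']     # the same cell taken from corr()
print(manual_ab, pandas_ab)
```

For A and B the deviations from the mean are identical ([-1, 0, 1] in both columns), so the numerator and denominator are equal and the coefficient is 1.0.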
Variable Tracker
| Variable | Start | After corr() call | Final |
|----------|-------|-------------------|-------|
| df | {'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]} | Same DataFrame | Same DataFrame |
| correlation_matrix | Not yet defined | 3x3 matrix with 1.0 in all cells | 3x3 matrix with 1.0 in all cells |
Key Moments - 2 Insights
Why are all correlation values 1.0 in the matrix?
Because B = A + 3 and C = A + 6, each column is an exact linear function of the others with a positive slope, so every pairwise Pearson correlation is 1.0, as shown in steps 4, 5, and 7 of the Execution Table.
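To see why a linear relationship pins the coefficient at 1.0 (or -1.0), here is a small sketch. The column names `shifted`, `scaled`, and `reversed` are made up for this illustration and do not appear in the original example.

```python
import pandas as pd

a = pd.Series([1, 2, 3])
demo = pd.DataFrame({
    'A': a,
    'shifted': a + 3,       # same values as column B in the example
    'scaled': 10 * a,       # positive slope, different scale
    'reversed': -2 * a,     # exact linear relation with a negative slope
})

# Correlation of every column with A
r = demo.corr()['A']
print(r)
```

Adding a constant or multiplying by a positive number leaves the Pearson coefficient at 1.0, while a negative slope flips it to -1.0.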
Does corr() change the original DataFrame?
No. corr() calculates and returns a new correlation matrix without modifying the original DataFrame, as the Variable Tracker shows: df stays the same at every step.
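One way to check this yourself is to compare the DataFrame against a copy taken before the call. This is a sketch; the variable names `snapshot`, `unchanged`, and `is_new_object` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
snapshot = df.copy()                 # independent copy taken before corr()

correlation_matrix = df.corr()

unchanged = df.equals(snapshot)              # True if df still holds the same data
is_new_object = correlation_matrix is not df # corr() hands back a separate object
print(unchanged, is_new_object, correlation_matrix.shape)
```

The returned matrix is itself a DataFrame whose row and column labels are the original column names.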
Visual Quiz - 3 Questions
Test your understanding
Look at step 4 of the Execution Table. What is the correlation between columns A and B?
A. 0.5
B. 1.0
C. -1.0
D. 0.0
💡 Hint
Refer to execution_table row with Step 4 showing correlation A vs B.
At which step does the code calculate the correlation of column B with itself?
A. Step 6
B. Step 7
C. Step 3
D. Step 9
💡 Hint
Look at execution_table row labeled 'Correlation B vs B'.
If column C had random values instead of increasing values, how would the correlation_matrix change?
A. All values would still be 1.0
B. Diagonal values would be 0
C. Some off-diagonal values would be less than 1.0
D. Matrix would be empty
💡 Hint
Correlation depends on how columns relate; see variable_tracker and execution_table for perfect correlation case.
Concept Snapshot
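DataFrame.corr() computes pairwise correlation coefficients (Pearson by default).
Returns a new DataFrame holding the correlation for each pair of columns.
Values range from -1 (perfect negative linear relationship) through 0 (no linear relationship) to 1 (perfect positive linear relationship).
Diagonal values are 1 (column with itself), provided the column is not constant.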
Useful to find relationships between numeric columns.
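Two points from the snapshot are worth trying out: Pearson measures linear association specifically, and the coefficient is undefined for a column with zero variance. The sketch below uses made-up columns (`squared`, `constant`) and the `method` parameter of corr(); it is an illustration, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'squared': [1, 4, 9, 16],   # always increases with A, but not linearly
    'constant': [5, 5, 5, 5],   # zero variance
})

pearson = df.corr()                     # default: method='pearson'
spearman = df.corr(method='spearman')   # rank-based alternative

print(pearson)
print(spearman.loc['A', 'squared'])
```

Here Pearson gives roughly 0.98 for A vs squared because the relationship is monotonic but curved, while Spearman, which compares ranks, gives 1.0. The `constant` row and column come back as NaN, including the diagonal cell.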
Full Transcript
We start with a DataFrame containing numeric columns A, B, and C. Calling the corr() method calculates the Pearson correlation coefficient between each pair of columns. The method returns a new DataFrame showing these correlations. In this example, each column is an exact linear function of the others (B = A + 3, C = A + 6), so all correlation values are 1.0. The original DataFrame remains unchanged. This process helps us understand how strongly columns are linearly related to each other.
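The all-1.0 matrix is a property of this particular data, not of corr() itself. As a sketch, the variant below swaps the last two values of column C (a hypothetical change, chosen so the numbers stay easy to check by hand) and recomputes the matrix.

```python
import pandas as pd

# Same A and B as the example; C no longer rises in step with them
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 9, 8]})
correlation_matrix = df.corr()
print(correlation_matrix)
```

The A vs B cell stays at 1.0, the diagonal stays at 1.0, and the cells involving C drop to 0.5.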