
Correlation with corr() in Data Analysis Python

Introduction

Correlation shows how two things change together. It tells you whether one value tends to rise or fall as the other rises, and how strong that tendency is.

To check if study time and test scores are related.
To find if temperature and ice cream sales move together.
To see if height and weight have a connection.
To check whether advertising budget and sales rise together.
To explore relationships between different features in a dataset before modeling.
Syntax
Data Analysis Python
DataFrame.corr(method='pearson')

This method calculates the correlation between every pair of numeric columns in a DataFrame and returns the result as a new DataFrame, the correlation matrix.

The default method is 'pearson', which measures linear correlation.
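To make "linear correlation" concrete, here is a minimal sketch that computes Pearson's coefficient by hand and compares it with the value from corr(). The data and the names df_demo, manual_r, and pandas_r are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data, not part of the tutorial's dataset
df_demo = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 4, 5, 4, 5]})

# Deviations of each value from its column mean
x_dev = df_demo["x"] - df_demo["x"].mean()
y_dev = df_demo["y"] - df_demo["y"].mean()

# Pearson's r: sum of paired deviations, scaled by the spread of each column
manual_r = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())

# The same pair taken from the pandas correlation matrix
pandas_r = df_demo.corr(method="pearson").loc["x", "y"]

print(manual_r, pandas_r)
```

Both numbers should agree up to floating-point rounding.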

Examples
Calculate Pearson correlation between all numeric columns in the DataFrame df.
Data Analysis Python
df.corr()
Explicitly specify Pearson correlation (default).
Data Analysis Python
df.corr(method='pearson')
Calculate Kendall Tau correlation, a rank-based measure that is often preferred for small datasets or data with many tied values.
Data Analysis Python
df.corr(method='kendall')
Calculate Spearman rank correlation, which measures monotonic relationships.
Data Analysis Python
df.corr(method='spearman')
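The difference between the methods shows up when a relationship always increases but is not a straight line. A small sketch with made-up data (the names curve, pearson_r, and spearman_r are illustrative): y is the cube of x, so the ranks match perfectly even though the points do not fall on a line.

```python
import pandas as pd

# Illustrative data: y always grows with x, but along a curve
curve = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6]})
curve["y"] = curve["x"] ** 3

# Pearson looks for a straight-line relationship
pearson_r = curve.corr(method="pearson").loc["x", "y"]

# Spearman compares ranks, so any always-increasing relationship scores 1
spearman_r = curve.corr(method="spearman").loc["x", "y"]

print("pearson :", round(pearson_r, 3))
print("spearman:", round(spearman_r, 3))
```

Spearman comes out as 1 here, while Pearson stays below 1 because the curve bends away from a straight line.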
Sample Program

This code creates a small dataset with study hours, test scores, and sleep hours. Then it calculates the correlation matrix to see how these columns relate.

Data Analysis Python
import pandas as pd

# Create a simple dataset
data = {
    'study_hours': [2, 3, 5, 8, 10],
    'test_score': [50, 55, 65, 80, 90],
    'sleep_hours': [7, 6, 5, 6, 7]
}
df = pd.DataFrame(data)

# Calculate correlation matrix
corr_matrix = df.corr()
print(corr_matrix)
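Once the matrix exists, you usually want to pull out specific values. The sketch below rebuilds the same dataset and reads one coefficient and one column from it; the names study_vs_score and score_links are illustrative. In this sample every test score equals 40 + 5 × study hours, so that pair has a correlation of 1, while sleep hours is only weakly related to either (about 0.12).

```python
import pandas as pd

# Same dataset as the sample program
df = pd.DataFrame({
    "study_hours": [2, 3, 5, 8, 10],
    "test_score": [50, 55, 65, 80, 90],
    "sleep_hours": [7, 6, 5, 6, 7],
})
corr_matrix = df.corr()

# One coefficient: give the row label, then the column label
study_vs_score = corr_matrix.loc["study_hours", "test_score"]

# One column lists every pairing with that variable;
# drop the variable's correlation with itself (always 1)
score_links = corr_matrix["test_score"].drop("test_score").sort_values(ascending=False)

print(round(study_vs_score, 3))
print(score_links.round(3))
```

The matrix is symmetric, so corr_matrix.loc["test_score", "study_hours"] gives the same value.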
Important Notes

Correlation values range from -1 to 1.

A value close to 1 means a strong positive relation, close to -1 means a strong negative relation, and around 0 means no linear relation. The variables can still be related in some other way.
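A value near 0 does not always mean two columns are unrelated. In this small sketch with made-up data (the names u_shape and r are illustrative), y is fully determined by x, yet the Pearson correlation is 0 because the relationship is U-shaped rather than a line.

```python
import pandas as pd

# Illustrative data: y is exactly x squared
u_shape = pd.DataFrame({"x": [-2, -1, 0, 1, 2]})
u_shape["y"] = u_shape["x"] ** 2

# The downward left half and upward right half cancel each other out
r = u_shape.corr().loc["x", "y"]
print(round(r, 3))
```

Plotting the data is a simple way to catch patterns like this that a single number hides.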

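Correlation is calculated only on numeric columns. Older pandas versions (before 2.0) skipped non-numeric columns by default. From pandas 2.0 on, corr() raises an error when the DataFrame contains non-numeric columns unless you pass numeric_only=True.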
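How text columns are handled depends on the pandas version, so it can be safer to be explicit. A minimal sketch, assuming pandas 1.5 or newer (where the numeric_only argument of corr() is available); the student column and the variable names are made up for illustration.

```python
import pandas as pd

# Illustrative data with one text column
mixed = pd.DataFrame({
    "student": ["Ana", "Ben", "Cleo", "Dev", "Eli"],
    "study_hours": [2, 3, 5, 8, 10],
    "test_score": [50, 55, 65, 80, 90],
})

# Option 1 (assumes pandas >= 1.5): tell corr() to ignore non-numeric columns
numeric_corr = mixed.corr(numeric_only=True)

# Option 2: select the numeric columns first, then correlate
selected_corr = mixed.select_dtypes(include="number").corr()

print(numeric_corr)
```

Both options give the same matrix, with the student column left out.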

Summary

Use corr() to find relationships between numeric columns.

Correlation helps understand how variables move together.

Values near 1 or -1 show strong relationships; values near 0 mean a weak or no linear relationship.