
Correlation with corr() in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this correlation calculation?
Given the following DataFrame, what is the output of df.corr()?
Data Analysis Python
import pandas as pd
import numpy as np

data = {'A': [1, 2, 3, 4, 5], 'B': [5, 4, 3, 2, 1], 'C': [2, 3, 2, 3, 2]}
df = pd.DataFrame(data)
print(df.corr())
A
{'A': {'A': 1.0, 'B': -1.0, 'C': 0.0}, 'B': {'A': -1.0, 'B': 1.0, 'C': 0.0}, 'C': {'A': 0.0, 'B': 0.0, 'C': 1.0}}
B
     A    B    C
A  1.0 -1.0  0.0
B -1.0  1.0  0.0
C  0.0  0.0  1.0
C
     A    B    C
A  1.0  1.0  0.0
B  1.0  1.0  0.0
C  0.0  0.0  1.0
D
     A    B    C
A  1.0 -1.0  1.0
B -1.0  1.0 -1.0
C  1.0 -1.0  1.0
💡 Hint
Remember that correlation measures how two variables move together. Negative means opposite directions.
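To make the hint concrete, here is a minimal sketch that computes the Pearson coefficient by hand and compares it with pandas. The series `x` and `y` are made-up illustration data, not part of the challenge.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y falls by 2 whenever x rises by 1
x = pd.Series([1, 2, 3, 4])
y = pd.Series([8, 6, 4, 2])

# Pearson r = sum of products of deviations
#             / sqrt(product of the two sums of squared deviations)
x_dev = x - x.mean()
y_dev = y - y.mean()
manual_r = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())

pandas_r = x.corr(y)  # Series.corr uses the Pearson method by default
print(manual_r, pandas_r)
```

Both values should agree up to floating-point rounding, and the sign reflects that the two series move in opposite directions.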
Data Output
intermediate
How many pairs have correlation above 0.5?
Using the DataFrame below, how many unique pairs of columns have a correlation greater than 0.5?
Data Analysis Python
import pandas as pd
import numpy as np
np.random.seed(0)
data = {'X': np.random.rand(10), 'Y': np.random.rand(10), 'Z': np.random.rand(10)}
df = pd.DataFrame(data)
corr_matrix = df.corr()
# Count unique column pairs with correlation > 0.5.
# The i < j test skips self-correlations and mirrored duplicates.
count = 0
for i in corr_matrix.columns:
    for j in corr_matrix.columns:
        if i < j and corr_matrix.loc[i, j] > 0.5:
            count += 1
print(count)
A. 0
B. 1
C. 2
D. 3
💡 Hint
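Independently generated random numbers usually correlate weakly. The seed only makes the result reproducible, although with just 10 rows a sample correlation can still land well away from 0.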
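The nested loop in this challenge counts each unordered pair once. As a sketch of an alternative, the same count can be taken from the strict upper triangle of the correlation matrix with NumPy. The names `above_threshold` and `vectorized_count` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({
    'X': np.random.rand(10),
    'Y': np.random.rand(10),
    'Z': np.random.rand(10),
})
corr_matrix = df.corr()

# Boolean matrix marking every entry above the threshold
above_threshold = corr_matrix.to_numpy() > 0.5

# k=1 keeps only entries strictly above the diagonal,
# so self-correlations and mirrored duplicates are excluded
vectorized_count = int(np.triu(above_threshold, k=1).sum())
print(vectorized_count)
```

This avoids comparing column labels and works even when the labels are not sortable.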
🔧 Debug
advanced
What error does this code raise?
What error will this code raise when trying to compute correlation?
Data Analysis Python
import pandas as pd
data = {'A': [1, 2, 3], 'B': ['x', 'y', 'z']}
df = pd.DataFrame(data)
print(df.corr())
A. TypeError: unsupported operand type(s) for -: 'str' and 'int'
B. ValueError: could not convert string to float: 'x'
C. Empty DataFrame with columns and index but no data
D. No error, 1x1 correlation matrix with value 1.0
💡 Hint
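Correlation is defined only for numeric columns. Before pandas 2.0, corr() silently dropped non-numeric columns; from 2.0 on, numeric_only defaults to False, so the string column triggers a conversion error unless you pass numeric_only=True.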
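Because the default handling of non-numeric columns differs between pandas versions, it is safer to state the intent explicitly. The sketch below shows two ways to do that, assuming pandas 1.5 or later for the `numeric_only` parameter of corr(); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'z']})

# Option 1: ask corr() to skip non-numeric columns
# (assumes pandas 1.5+, where corr() accepts numeric_only)
numeric_corr = df.corr(numeric_only=True)

# Option 2: select the numeric columns first,
# which also works on older pandas versions
selected_corr = df.select_dtypes(include='number').corr()

print(numeric_corr)
```

Either way, only column A remains, so the result is a 1x1 matrix.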
🚀 Application
advanced
Which column pair has the strongest positive correlation?
Given the DataFrame below, which pair of columns has the strongest positive correlation?
Data Analysis Python
import pandas as pd
data = {'P': [1, 2, 3, 4, 5], 'Q': [2, 4, 6, 8, 10], 'R': [5, 4, 3, 2, 1], 'S': [1, 3, 1, 7, 9]}
df = pd.DataFrame(data)
corr = df.corr()
print(corr)
A. Q and R
B. P and R
C. P and Q
D. P and S
💡 Hint
Look for a pair where one column is a positive constant multiple of the other; a negative multiple gives a correlation of -1 instead.
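Reading the printed matrix by eye works for four columns, but the search can also be done in code. The following is one possible sketch that ranks every unordered column pair by its correlation; `pair_corr` and `strongest_pair` are names chosen here for illustration.

```python
from itertools import combinations

import pandas as pd

data = {'P': [1, 2, 3, 4, 5], 'Q': [2, 4, 6, 8, 10], 'R': [5, 4, 3, 2, 1], 'S': [1, 3, 1, 7, 9]}
df = pd.DataFrame(data)
corr = df.corr()

# Map each unordered column pair to its correlation coefficient
pair_corr = {(a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)}

# The pair with the largest coefficient has the strongest positive correlation
strongest_pair = max(pair_corr, key=pair_corr.get)
print(strongest_pair, round(pair_corr[strongest_pair], 3))
```

Using `combinations` visits each pair once, so the diagonal and the mirrored lower triangle never enter the comparison.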
🧠 Conceptual
expert
What does a correlation value of 0 indicate?
In the context of the corr() method, which computes the Pearson coefficient by default, what does a correlation value of 0 between two variables mean?
A. The two variables have no linear relationship
B. The two variables are independent in all ways
C. The two variables have a perfect positive linear relationship
D. The two variables have a perfect negative linear relationship
💡 Hint
Correlation measures linear relationships only.
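The hint can be illustrated with a small sketch in which one variable is fully determined by the other, yet the relationship is not linear. The data below is made up for illustration.

```python
import pandas as pd

# Hypothetical data: y depends entirely on x, but along a parabola
x = pd.Series([-2, -1, 0, 1, 2])
y = x ** 2

# Pearson correlation only picks up the linear component of the relationship
r = x.corr(y)
print(r)
```

The positive and negative halves of the parabola cancel out, so the coefficient comes out at zero (up to floating-point rounding) even though y is a function of x.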