Challenge - 5 Problems
Heatmap Correlation Master
❓ Predict Output
intermediate, 2:00 remaining
Output of correlation heatmap data
What is the shape of the correlation matrix produced by this code?
Data Analysis Python
import pandas as pd
import numpy as np

data = pd.DataFrame(np.random.rand(10, 5), columns=list('ABCDE'))
corr_matrix = data.corr()
print(corr_matrix.shape)
💡 Hint
The correlation matrix compares each column with every other column.
✅ Explanation
The correlation matrix has shape (number_of_columns, number_of_columns). Here, 5 columns produce a 5x5 matrix, so the code prints (5, 5). The 10 rows do not affect the shape.
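As a minimal sketch of the point above, the following varies the row count while keeping 5 columns and checks the shape each time. The row counts, the `rng` seed, and the variable names are illustrative assumptions, not part of the challenge.

```python
import numpy as np
import pandas as pd

# Illustrative data: the row count varies, the column count stays at 5.
rng = np.random.default_rng(0)
for n_rows in (10, 50, 200):
    frame = pd.DataFrame(rng.random((n_rows, 5)), columns=list("ABCDE"))
    print(n_rows, frame.corr().shape)

# Inspect the last matrix: it should be symmetric with ones on the diagonal.
corr_matrix = frame.corr()
values = corr_matrix.to_numpy()
is_symmetric = np.allclose(values, values.T)
has_unit_diagonal = np.allclose(np.diag(values), 1.0)
print(is_symmetric, has_unit_diagonal)
```

The shape depends only on how many numeric columns the DataFrame has, which is why every iteration reports the same value.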
❓ Data Output
intermediate, 2:00 remaining
Correlation values from heatmap data
What is the correlation value between columns 'A' and 'B' in this dataset?
Data Analysis Python
import pandas as pd
import numpy as np

np.random.seed(0)
data = pd.DataFrame({
    'A': np.arange(5),
    'B': np.arange(5) * 2,
    'C': np.random.rand(5)
})
corr = data.corr()
print(round(corr.loc['A', 'B'], 2))
💡 Hint
Columns 'A' and 'B' have a perfect linear relationship.
✅ Explanation
'B' is exactly twice 'A'. Pearson correlation is unchanged by multiplying a column by a positive constant, so the correlation is 1 and the code prints 1.0.
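A small sketch of why the scale factor does not matter: Pearson correlation stays at 1 under positive scaling or an added offset, and flips to -1 under negative scaling. The column names `doubled`, `shifted`, and `negated` are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

a = np.arange(5)
data = pd.DataFrame({
    "A": a,
    "doubled": 2 * a,        # positive scaling
    "shifted": 2 * a + 10,   # positive scaling plus an offset
    "negated": -3 * a,       # negative scaling
})

# Correlation of every column with A, rounded as in the challenge code.
corr_with_a = data.corr()["A"].round(2)
print(corr_with_a)
```

Only the direction of the linear relationship changes the sign; the size of the multiplier has no effect on the coefficient.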
❓ Visualization
advanced, 2:00 remaining
Identify the correct heatmap color scale
Which heatmap color scale correctly represents correlations from -1 (blue) to 1 (red) with 0 as white?
Data Analysis Python
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1)
data = pd.DataFrame(np.random.randn(10, 4), columns=list('WXYZ'))
corr = data.corr()
sns.heatmap(corr, cmap='coolwarm')
plt.show()
💡 Hint
The color scale should show blue for negative, white for zero, and red for positive.
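✅ Explanation
'coolwarm' is a diverging color map that runs from blue through a near-white midpoint to red, which suits correlation heatmaps. By default seaborn scales the colors to the range of the data, so pass vmin=-1 and vmax=1 to sns.heatmap if 0 must sit exactly at the midpoint.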
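To pin zero to the middle of the color scale, one option is to fix the color limits to the full correlation range. The sketch below reuses the challenge data and assumes `seaborn` and `matplotlib` are installed; the `Agg` backend is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(1)
data = pd.DataFrame(np.random.randn(10, 4), columns=list("WXYZ"))
corr = data.corr()

# vmin/vmax fix the color limits to [-1, 1], so 0 maps to the
# midpoint of the diverging color map regardless of the data range.
ax = sns.heatmap(corr, cmap="coolwarm", vmin=-1, vmax=1)

# Read the limits back from the plotted mesh to confirm they were applied.
color_limits = ax.collections[0].get_clim()
print(color_limits)
plt.close("all")  # use plt.show() instead when working interactively
```

Without the explicit limits, the diagonal of ones and the smallest observed correlation set the range, so the midpoint color would usually not correspond to zero.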
🔧 Debug
advanced, 2:00 remaining
Error in heatmap correlation calculation
What error does this code raise when trying to plot a heatmap of correlations?
Data Analysis Python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'z']})
corr = data.corr()
sns.heatmap(corr)
plt.show()
💡 Hint
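In pandas versions before 2.0, data.corr() silently ignores non-numeric columns.
✅ Explanation
The result depends on the pandas version. Before 2.0, no error is raised: column 'B' is non-numeric, so data.corr() uses only the numeric column 'A' and produces a 1x1 correlation matrix that displays correctly. From pandas 2.0 onward, numeric_only defaults to False, so data.corr() raises a ValueError (could not convert string to float: 'x'). Passing numeric_only=True restores the earlier behavior.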
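One way to make the numeric-only behavior explicit, rather than relying on the default of data.corr() (which has differed between pandas versions), is to select the numeric columns first. This is a sketch of that approach using the challenge data; the variable name `numeric` is an illustrative assumption.

```python
import pandas as pd

data = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})

# Keep only numeric columns before computing correlations, so the
# string column 'B' never reaches corr().
numeric = data.select_dtypes(include="number")
corr = numeric.corr()
print(corr)
```

The result is a 1x1 matrix containing only 'A', which a heatmap can display, although a single cell carries no useful information about relationships between features.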
🚀 Application
expert, 3:00 remaining
Interpreting correlation heatmap for feature selection
Given this correlation heatmap matrix, which feature should be dropped to reduce multicollinearity?
Data Analysis Python
import pandas as pd
import numpy as np

np.random.seed(42)
data = pd.DataFrame({
    'X1': np.random.rand(100),
    'X2': np.random.rand(100),
    # Initial X3 values; overwritten below
    'X3': np.random.rand(100) * 0.5 + np.random.rand(100) * 0.5,
    'X4': np.random.rand(100) * 0.9 + np.random.rand(100) * 0.1
})
# Redefine X3 as a near-copy of X1: 0.95 * X1 plus a small noise term
data['X3'] = data['X1'] * 0.95 + np.random.rand(100) * 0.05
corr = data.corr()
print(corr.round(2))
💡 Hint
Look for pairs with correlation close to 1 or -1.
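✅ Explanation
X3 is built as 0.95 * X1 plus a small noise term, so its correlation with X1 is close to 1 (the 0.95 is a scaling coefficient, not the correlation itself). This near-duplicate pair causes multicollinearity. Dropping X3 removes the redundancy while keeping the original X1.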
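The selection step can be sketched programmatically: scan the upper triangle of the absolute correlation matrix and flag any column that correlates strongly with an earlier one. The 0.9 threshold, the `rng` seed, and the rule of dropping the later column of each pair are assumptions made for illustration, not a rule stated by the challenge.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
x1 = rng.random(100)
data = pd.DataFrame({
    "X1": x1,
    "X2": rng.random(100),
    "X3": 0.95 * x1 + 0.05 * rng.random(100),  # near-copy of X1
})

THRESHOLD = 0.9  # assumed cutoff for "too correlated"

abs_corr = data.corr().abs()
# Keep only entries above the diagonal so each pair is checked once
# and a column is never compared with itself.
mask = np.triu(np.ones(abs_corr.shape, dtype=bool), k=1)
upper = abs_corr.where(mask)

# Flag a column if it is highly correlated with any earlier column.
to_drop = [col for col in upper.columns if (upper[col] > THRESHOLD).any()]
reduced = data.drop(columns=to_drop)
print(to_drop)
print(list(reduced.columns))
```

Because the scan keeps the first column of each correlated pair, column order decides which feature survives; domain knowledge may favor keeping the other one instead.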