Challenge - 5 Problems
KDE Overlay Master
❓ Predict Output
intermediate
What is the output of this KDE overlay plot code?
Look at the code below that creates two KDE plots overlaid on the same axes. What will the plot show?
Matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

np.random.seed(0)
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(2, 1, 100)

kde1 = gaussian_kde(data1)
kde2 = gaussian_kde(data2)

x = np.linspace(-4, 6, 200)
plt.plot(x, kde1(x), label='Data 1')
plt.plot(x, kde2(x), label='Data 2')
plt.legend()
plt.show()
💡 Hint
Think about where the normal distributions are centered and how KDE estimates density.
Explanation
The code draws 100 samples from each of two normal distributions, centered at 0 and 2, both with standard deviation 1. The plot shows two smooth density curves, each peaking near its own center. The curves overlap because the centers are only two standard deviations apart.
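The explanation can be checked numerically without drawing anything. The sketch below reuses the challenge's data and grid; the peak lookup with `np.argmax` and the overlap estimate (a Riemann sum over the pointwise minimum of the two curves) are illustrative additions, not part of the original challenge.

```python
import numpy as np
from scipy.stats import gaussian_kde

np.random.seed(0)
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(2, 1, 100)

x = np.linspace(-4, 6, 200)
density1 = gaussian_kde(data1)(x)
density2 = gaussian_kde(data2)(x)

# Grid location of the highest point of each curve
peak1 = x[np.argmax(density1)]
peak2 = x[np.argmax(density2)]

# Rough shared area under both curves (Riemann sum of the pointwise minimum)
dx = x[1] - x[0]
overlap = np.minimum(density1, density2).sum() * dx

print(f"Data 1 peaks near x = {peak1:.2f}")
print(f"Data 2 peaks near x = {peak2:.2f}")
print(f"Approximate overlap area = {overlap:.2f}")
```

An overlap area well above zero is what shows up in the plot as the two curves crossing between the peaks.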
❓ Data Output
intermediate
How many peaks are visible in the KDE overlay plot?
Two datasets with well-separated centers are each smoothed with a KDE, and the two curves are summed. How many distinct peaks (local maxima) does the combined curve have?
Matplotlib
import numpy as np
from scipy.stats import gaussian_kde

np.random.seed(1)
data1 = np.random.normal(-1, 0.5, 150)
data2 = np.random.normal(3, 0.5, 150)

kde1 = gaussian_kde(data1)
kde2 = gaussian_kde(data2)

x = np.linspace(-3, 5, 300)
kde_values1 = kde1(x)
kde_values2 = kde2(x)
combined = kde_values1 + kde_values2

# Count interior grid points that are strictly higher than both neighbors
peaks = ((combined[1:-1] > combined[:-2]) & (combined[1:-1] > combined[2:])).sum()
print(peaks)
💡 Hint
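The two datasets are centered far apart, so each should create its own peak.
Explanation
The datasets are centered at -1 and 3 with a standard deviation of 0.5, so their centers are eight standard deviations apart. The two KDE curves barely overlap, and their sum should have two distinct peaks, one near each center.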
❓ Visualization
advanced
Which option correctly overlays KDE plots with different bandwidths?
You want to overlay KDE plots of two datasets but use a smaller bandwidth for the second dataset to get a sharper curve. Which code snippet does this correctly?
💡 Hint
A smaller bandwidth gives a sharper, less smoothed curve. Assign the smaller bandwidth to the second dataset.
Explanation
Option C sets a larger bandwidth (0.5) for the first dataset and a smaller bandwidth (0.1) for the second, which produces a sharper KDE curve for the second dataset, as requested.
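The answer options are not reproduced here, so the following is only a sketch of what such a snippet might look like. It assumes the bandwidths are passed through the `bw_method` argument of `scipy.stats.gaussian_kde`; the sample data and variable names are made up for illustration. Note that a scalar `bw_method` is used by SciPy as a scaling factor on the data's spread, not as an absolute bandwidth.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical sample data, not taken from the challenge
rng = np.random.default_rng(0)
data1 = rng.normal(0, 1, 200)
data2 = rng.normal(2, 1, 200)

# A scalar bw_method becomes the KDE's bandwidth factor
kde1 = gaussian_kde(data1, bw_method=0.5)  # smoother curve
kde2 = gaussian_kde(data2, bw_method=0.1)  # sharper, more detailed curve

x = np.linspace(-4, 6, 400)
density1 = kde1(x)
density2 = kde2(x)

print(kde1.factor, kde2.factor)
```

The two density arrays can then be overlaid with `plt.plot(x, density1)` and `plt.plot(x, density2)`, as in the earlier challenges.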
🔧 Debug
advanced
What error does this KDE overlay code raise?
This code tries to overlay KDE plots but raises an error. What is the error?
Matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

np.random.seed(0)
data1 = np.random.normal(0, 1, 100)
data2 = []

kde1 = gaussian_kde(data1)
kde2 = gaussian_kde(data2)

x = np.linspace(-4, 4, 100)
plt.plot(x, kde1(x), label='Data 1')
plt.plot(x, kde2(x), label='Data 2')
plt.legend()
plt.show()
💡 Hint
Check the shape and content of data2 before passing it to gaussian_kde.
Explanation
The second dataset is empty, so gaussian_kde raises a ValueError: it requires a dataset with more than one element to estimate a density. The error occurs when kde2 is constructed, before anything is plotted.
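One way to see the failure and guard against it is sketched below. The helper `safe_kde` is a hypothetical name introduced here for illustration; it only checks the number of points, so other problem inputs (for example, data where every value is identical) may still fail.

```python
import numpy as np
from scipy.stats import gaussian_kde

def safe_kde(data):
    """Return a gaussian_kde for data, or None if there are fewer than two points."""
    values = np.asarray(data, dtype=float)
    if values.size < 2:
        return None
    return gaussian_kde(values)

# Reproduce the error from the challenge in isolation
try:
    gaussian_kde([])
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__
    print(f"{error_name}: {exc}")

# With the guard, the empty dataset is skipped instead of crashing the script
kde2 = safe_kde([])
if kde2 is None:
    print("Data 2 has too few points; skipping its KDE curve")
```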
🚀 Application
expert
How do you overlay KDE curves for three datasets with distinct colors and transparency?
You have three datasets and want to overlay their KDE curves on the same axes. Each curve should have a distinct color and some transparency so that overlaps remain visible. Which code snippet achieves this?
💡 Hint
Use both color and alpha parameters in plt.plot for distinct colors and transparency.
Explanation
Option A gives each curve a distinct color and sets alpha=0.5 for transparency, so the curves stay distinguishable and their overlaps remain visible.
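Since the answer options are not shown here, the following is a minimal sketch of the approach the explanation describes, not the actual Option A. The datasets, labels, and color names are assumptions chosen for illustration, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical datasets, not taken from the challenge
rng = np.random.default_rng(0)
datasets = {
    "Data 1": rng.normal(-2, 1.0, 200),
    "Data 2": rng.normal(0, 1.0, 200),
    "Data 3": rng.normal(2, 1.0, 200),
}
colors = ["tab:blue", "tab:orange", "tab:green"]

x = np.linspace(-6, 6, 400)
fig, ax = plt.subplots()
for (label, data), color in zip(datasets.items(), colors):
    # One KDE curve per dataset, each with its own color and 50% opacity
    ax.plot(x, gaussian_kde(data)(x), color=color, alpha=0.5, label=label)
ax.legend()
```

Call `plt.show()` (with an interactive backend) or `fig.savefig(...)` afterwards to view the result.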