Challenge - 5 Problems
Percentiles and Quantiles Master
❓ Predict Output
Difficulty: intermediate
Output of percentile calculation with scipy
What is the output of this code snippet that calculates the 75th percentile of the data array?
SciPy
import numpy as np
from scipy import stats

data = np.array([10, 20, 30, 40, 50])
result = stats.scoreatpercentile(data, 75)
print(result)
💡 Hint
Remember that the 75th percentile is the value below which 75% of the data fall. scipy's scoreatpercentile uses linear interpolation by default.
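Explanation
With linear interpolation the percentile position is 0.75 * (5 - 1) = 3, which lands exactly on the element at index 3 of the sorted data. No interpolation between neighbours is needed, so the output is 40.0.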
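To check the arithmetic by hand, the position-based rule can be sketched in plain Python. The helper name linear_percentile is hypothetical, and the sketch assumes the position is computed as p / 100 * (n - 1) on the sorted data, which is how the default linear behaviour works for this input; it is an illustration, not SciPy's implementation.

```python
import math

def linear_percentile(values, p):
    """Hypothetical helper: percentile p (0-100) via linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted data.
    pos = p / 100 * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    # Blend the two bracketing values according to the fractional part.
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

p75 = linear_percentile([10, 20, 30, 40, 50], 75)
print(p75)  # 40.0
```

Because the position here is the whole number 3, the fractional part is zero and the result is simply the fourth sorted value.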
❓ Data Output
Difficulty: intermediate
Number of elements below the 60th percentile
Given this dataset, how many elements are below the 60th percentile calculated using scipy?
SciPy
import numpy as np
from scipy import stats

data = np.array([5, 15, 25, 35, 45, 55, 65])
percentile_60 = stats.scoreatpercentile(data, 60)
count_below = np.sum(data < percentile_60)
print(count_below)
💡 Hint
Calculate the 60th percentile value first, then count how many data points are strictly less than that value.
Explanation
The 60th percentile is 41.0. Data points less than 41.0 are 5, 15, 25, 35 (4 elements).
❓ Visualization
Difficulty: advanced
Visualizing quantiles with a boxplot
Which option produces a boxplot showing the quartiles of the data array using matplotlib and scipy?
SciPy
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

data = np.array([12, 7, 3, 15, 8, 10, 6, 9, 11, 14])
q1 = stats.scoreatpercentile(data, 25)
q2 = stats.scoreatpercentile(data, 50)
q3 = stats.scoreatpercentile(data, 75)

plt.boxplot(data)
plt.title(f"Quartiles: Q1={q1}, Median={q2}, Q3={q3}")
plt.show()
💡 Hint
Check the quartile values calculated by scipy and compare with the boxplot box edges and median.
Explanation
The 25th percentile (Q1) is 7.25, 50th percentile (median) is 9.5, and 75th percentile (Q3) is 11.75. The boxplot box edges correspond to Q1 and Q3, and the line inside is the median.
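As an independent cross-check that needs no plotting, the quartiles can be recomputed with the Python standard library. This sketch relies on statistics.quantiles with method="inclusive", which interpolates on (n - 1)-based positions and so should agree with the linear rule for this data; it is a verification aid, not part of the original challenge.

```python
import statistics

data = [12, 7, 3, 15, 8, 10, 6, 9, 11, 14]

# n=4 asks for the three cut points that split the data into quartiles.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
print(q1, median, q3)  # 7.25 9.5 11.75
```

The default method="exclusive" uses a different positioning rule and generally gives different cut points, so the method argument matters when comparing against a boxplot.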
🧠 Conceptual
Difficulty: advanced
Understanding interpolation methods in percentile calculation
Which statement correctly describes the effect of changing the interpolation method in scipy's percentile calculation?
💡 Hint
Think about how each interpolation method chooses the percentile value relative to the data points.
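Explanation
'Linear' interpolates between the two data points that bracket the percentile position. 'Lower' picks the smaller of those two points, 'higher' picks the larger, and 'nearest' picks whichever of the two is closer to the position. These names match the method options of numpy.percentile; scipy.stats.scoreatpercentile exposes the same idea through its interpolation_method argument, which accepts 'fraction' (linear, the default), 'lower', and 'higher'.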
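The difference between the methods is easiest to see when the percentile position falls between two data points. The sketch below is a plain-Python illustration with a hypothetical helper, percentile_by_method; it assumes the position is p / 100 * (n - 1) and breaks 'nearest' ties toward the lower point, which may not match a library's tie-breaking rule.

```python
import math

def percentile_by_method(values, p, method="linear"):
    """Hypothetical helper illustrating four interpolation rules."""
    ordered = sorted(values)
    pos = p / 100 * (len(ordered) - 1)   # fractional position
    lo, hi = math.floor(pos), math.ceil(pos)
    if method == "lower":
        return ordered[lo]
    if method == "higher":
        return ordered[hi]
    if method == "nearest":
        # Assumption: ties go to the lower neighbour in this sketch.
        return ordered[lo] if pos - lo <= hi - pos else ordered[hi]
    if method == "linear":
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])
    raise ValueError(f"unknown method: {method}")

data = [10, 20, 30, 40, 50]
# The 40th percentile sits at position 1.6, between 20 and 30.
results = {m: percentile_by_method(data, 40, m)
           for m in ("lower", "higher", "nearest", "linear")}
print({m: round(v, 6) for m, v in results.items()})
```

For this input, 'lower' returns 20, 'higher' and 'nearest' return 30, and 'linear' returns a value of about 26, since the position is 60% of the way from 20 to 30.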
🔧 Debug
Difficulty: expert
Identify the error in percentile calculation code
What error will this code raise when executed?
SciPy
import numpy as np
from scipy import stats

data = np.array([1, 2, 3, 4, 5])
result = stats.scoreatpercentile(data, 110)
print(result)
💡 Hint
Percentiles must be between 0 and 100 inclusive.
Explanation
Passing a percentile value greater than 100 causes a ValueError because percentiles are defined only between 0 and 100.
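When the percentile rank comes from user input, it can help to validate it before calling the library and to handle the failure explicitly. The following is a minimal sketch of that pattern; checked_percentile_rank is a hypothetical guard written for illustration and is not part of SciPy.

```python
def checked_percentile_rank(p):
    """Hypothetical guard: reject ranks outside the closed range [0, 100]."""
    if not 0 <= p <= 100:
        raise ValueError(f"percentile must be in the range [0, 100], got {p}")
    return p

try:
    checked_percentile_rank(110)
    outcome = "accepted"
except ValueError as exc:
    # The out-of-range rank is reported instead of crashing the program.
    outcome = f"rejected: {exc}"

print(outcome)
```

The same try/except ValueError structure can be wrapped around the library call itself if a separate guard is not wanted.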