
Percentiles and quantiles in SciPy - Step-by-Step Execution

Concept Flow - Percentiles and quantiles
Start with data array
Sort data in ascending order
Choose percentile or quantile value
Calculate position in sorted data
Interpolate if needed
Return value at position
End
We start with data, sort it, pick a percentile or quantile, find its position, interpolate if needed, and return the value.
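The flow above can be sketched by hand in plain Python. This is a minimal illustration of linear interpolation between neighboring elements, not the actual NumPy or SciPy implementation; the function name `linear_percentile` is hypothetical.

```python
import math

def linear_percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) using linear interpolation.

    Hypothetical helper for illustration only.
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    ordered = sorted(values)                   # sort in ascending order
    pos = (p / 100) * (len(ordered) - 1)       # fractional zero-based position
    lower = math.floor(pos)                    # index just below the position
    upper = min(lower + 1, len(ordered) - 1)   # index just above, clamped to the end
    frac = pos - lower                         # how far the position is past `lower`
    # Interpolate; when frac is 0 this simply returns ordered[lower].
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

result = linear_percentile([10, 20, 30, 40, 50], 40)
print(result)  # approximately 26.0
```

In practice `numpy.percentile` does this work; the sketch only makes the individual steps visible.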
Execution Sample
SciPy
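import numpy as np
from scipy import stats  # imported by the lesson; not used in this sample

data = np.array([10, 20, 30, 40, 50])
# 40th percentile: q is given on a 0-100 scale
percentile_40 = np.percentile(data, 40)
# 0.4 quantile: q is given on a 0-1 scale
quantile_0_4 = np.quantile(data, 0.4)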
Calculate the 40th percentile and 0.4 quantile of a small data array.
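The sample imports `scipy.stats` but computes everything with NumPy. As a hedged comparison, SciPy has its own percentile helper, `scipy.stats.scoreatpercentile`, which the SciPy documentation treats as legacy. The sketch below assumes its default interpolation setting, which should agree with NumPy's default on this data.

```python
import numpy as np
from scipy import stats

data = np.array([10, 20, 30, 40, 50])

# SciPy's legacy helper; `per` is on a 0-100 scale like np.percentile.
scipy_p40 = stats.scoreatpercentile(data, 40)

# NumPy's equivalent, used in the rest of this lesson.
numpy_p40 = np.percentile(data, 40)

print(scipy_p40, numpy_p40)
```

For new code, `numpy.percentile` and `numpy.quantile` are the usual choice.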
Execution Table
Step | Action | Data State | Calculation | Result
1 | Start with data array | [10, 20, 30, 40, 50] | N/A | N/A
2 | Sort data | [10, 20, 30, 40, 50] | Already sorted | [10, 20, 30, 40, 50]
3 | Calculate 40th percentile position | N/A | pos = (40/100)*(5-1) = 1.6 | Position 1.6
4 | Interpolate between index 1 and 2 | Values at 1=20, 2=30 | 20 + 0.6*(30-20) | 26.0
5 | Calculate 0.4 quantile position | N/A | pos = 0.4*(5-1) = 1.6 | Position 1.6
6 | Interpolate between index 1 and 2 | Values at 1=20, 2=30 | 20 + 0.6*(30-20) | 26.0
7 | Return results | N/A | N/A | percentile_40 = 26.0, quantile_0_4 = 26.0
8 | End | N/A | N/A | Calculation complete
💡 Finished calculating requested percentile and quantile values.
Variable Tracker
Variable | Start | After Step 3 | After Step 4 | After Step 5 | After Step 6 | Final
data | [10,20,30,40,50] | [10,20,30,40,50] | [10,20,30,40,50] | [10,20,30,40,50] | [10,20,30,40,50] | [10,20,30,40,50]
percentile_40 | N/A | N/A | 26.0 | 26.0 | 26.0 | 26.0
quantile_0_4 | N/A | N/A | N/A | N/A | 26.0 | 26.0
Key Moments - 2 Insights
Why do we interpolate between two data points for the percentile?
Because the calculated position (1.6) is not an integer index, no single element sits exactly there. We take a weighted value between the two neighboring elements instead, as shown in steps 3 and 4.
Are percentile and quantile calculations different here?
No, both use the same position calculation and interpolation method, so the 40th percentile and 0.4 quantile give the same result, as seen in steps 3-6.
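Both insights can be checked with a short sketch. It compares `np.quantile` and `np.percentile` across several fractions, and shows that a whole-number position (the 50th percentile here) needs no interpolation. The variable names are illustrative.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# The same cut points expressed on a 0-1 scale and a 0-100 scale.
fractions = np.array([0.0, 0.25, 0.4, 0.5, 0.9, 1.0])
via_quantile = np.quantile(data, fractions)
via_percentile = np.percentile(data, fractions * 100)

# True when both functions agree at every cut point.
same = np.allclose(via_quantile, via_percentile)

# Position 0.5 * (5 - 1) = 2.0 is a whole number, so the result is data[2].
median = np.percentile(data, 50)

print(same, median)
```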
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the calculated position for the 40th percentile?
A. 2.0
B. 1.6
C. 0.4
D. 4.0
💡 Hint
Check Step 3 in the execution table where position is calculated.
At which step do we interpolate between data points for the 40th percentile?
A. Step 2
B. Step 6
C. Step 4
D. Step 7
💡 Hint
Look for where values at indices 1 and 2 are used for interpolation.
If the data array had 10 elements instead of 5, how would the position for the 40th percentile change?
A. It would be 3.6
B. It would be 4.0
C. It would be 1.6
D. It would be 0.4
💡 Hint
Position = (percentile/100)*(n-1), so with n=10, calculate accordingly.
Concept Snapshot
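Percentiles and quantiles return the value below which a given share of the sorted data falls.
Sort the data first.
Position = (percentile / 100) * (n - 1), or quantile * (n - 1), using zero-based indices.
Interpolate linearly if the position is not a whole number.
Use numpy.percentile or numpy.quantile to compute this directly; both interpolate linearly by default.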
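The snapshot can be tried out as follows. The helper `position` is a hypothetical name for the formula above. The sketch also passes several percentiles to `np.percentile` in one call, which it accepts as a sequence.

```python
import numpy as np

def position(fraction, n):
    """Zero-based fractional position of a quantile in n sorted values."""
    return fraction * (n - 1)

pos_5 = position(0.4, 5)    # five elements, as in the lesson's data
pos_10 = position(0.4, 10)  # the position grows with the array length

data = np.array([10, 20, 30, 40, 50])
# Positions 1.0, 2.0 and 3.0 are whole numbers, so no interpolation is needed.
quartiles = np.percentile(data, [25, 50, 75])

print(pos_5, pos_10, quartiles)
```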
Full Transcript
We start with a data array and sort it. To find a percentile or quantile, we calculate its position in the sorted data using the formula position = (percentile/100) * (n-1) or position = quantile * (n-1). If the position is not an integer, we interpolate between the two closest data points. For example, the 40th percentile of [10,20,30,40,50] is at position 1.6, so we interpolate between the values at index 1 (20) and index 2 (30), resulting in 26.0. The same applies for the 0.4 quantile. This method gives a precise value representing the percentile or quantile in the data.
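The transcript's arithmetic can be written out as a worked equation. The symbols are assumptions introduced here for illustration: $x_i$ is the element at zero-based index $i$ of the sorted data, $p$ is the percentile, and $n$ is the number of elements.

```latex
\[
\text{pos} = \frac{p}{100}\,(n-1), \qquad
i = \lfloor \text{pos} \rfloor, \qquad
f = \text{pos} - i
\]
\[
\text{value} = x_i + f\,(x_{i+1} - x_i)
\]
\[
\text{pos} = \frac{40}{100}\,(5-1) = 1.6, \quad
i = 1, \quad f = 0.6, \quad
\text{value} = 20 + 0.6\,(30 - 20) = 26
\]
```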