How to Calculate Percentile Using NumPy in Python
Use `numpy.percentile(array, percentile_value)` to calculate the percentile of data in a NumPy array. The function returns the value below which the given percentage of data falls.
Syntax
The basic syntax of the NumPy percentile function is:
```python
numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False)
```
Where:
- `a` is the input data array.
- `q` is the percentile or sequence of percentiles to compute (0-100).
- `axis` specifies the axis along which to compute percentiles (by default, the flattened array is used).
- Other parameters control output and interpolation method.
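To make the roles of `q` and `axis` concrete, here is a minimal sketch. The array values and variable names are illustrative assumptions, not part of the original article.

```python
import numpy as np

# Illustrative 2x3 array (values chosen only for this sketch)
data = np.array([[1, 2, 3], [4, 5, 6]])

# Scalar q with the default axis=None: the array is flattened and one value is returned
overall_median = np.percentile(data, 50)

# Sequence q: one result per requested percentile
quartiles = np.percentile(data, [25, 50, 75])

# axis=1 computes the percentile along each row
row_medians = np.percentile(data, 50, axis=1)

print(overall_median)        # 3.5
print(quartiles.tolist())    # [2.25, 3.5, 4.75]
print(row_medians.tolist())  # [2.0, 5.0]
```

A scalar `q` yields a scalar (or one value per slice when `axis` is given), while a sequence `q` adds a leading dimension with one entry per percentile.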
Example
This example shows how to calculate the 25th, 50th, and 75th percentiles of a NumPy array.
```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

percentiles = np.percentile(data, [25, 50, 75])
print(percentiles)
```
Output
[20. 30. 40.]
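In the example above every requested percentile lands exactly on a data point. When it falls between two points, the default 'linear' method interpolates between the neighbors. The sketch below reproduces that calculation by hand. The helper `linear_percentile` is hypothetical (it is not a NumPy function) and assumes the default method on 1-D input.

```python
import numpy as np

def linear_percentile(values, q):
    """Hypothetical helper: percentile via linear interpolation between sorted neighbors."""
    ordered = np.sort(np.asarray(values, dtype=float))
    position = (q / 100) * (len(ordered) - 1)  # fractional index into the sorted data
    lower = int(np.floor(position))
    upper = min(lower + 1, len(ordered) - 1)   # clamp at the last element
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

data = [10, 20, 30, 40]

manual = linear_percentile(data, 50)   # halfway between 20 and 30
builtin = np.percentile(data, 50)

print(manual, builtin)  # 25.0 25.0
```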
Common Pitfalls
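Common mistakes when using `numpy.percentile` include:
- Passing percentile values outside the 0-100 range, which raises a ValueError.
- Not specifying `axis` when working with multi-dimensional arrays. With the default `axis=None` the array is flattened and a single value is returned, which is rarely what you want for per-row or per-column percentiles.
- Using the deprecated `interpolation` keyword. Since NumPy 1.22 the argument is named `method`; use the default or specify valid options like 'linear', 'nearest', 'midpoint', 'lower', or 'higher'.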
```python
import numpy as np

data = np.array([[10, 20, 30], [40, 50, 60]])

# Wrong: percentile value out of range
# np.percentile(data, 110)  # This will raise an error

# Wrong: no axis specified, but want percentiles per column
percentiles_wrong = np.percentile(data, 50)

# Right: specify axis=0 to get median per column
percentiles_right = np.percentile(data, 50, axis=0)

print('Without axis:', percentiles_wrong)
print('With axis=0:', percentiles_right)
```
Output
Without axis: 35.0
With axis=0: [25. 35. 45.]
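The other two pitfalls can be illustrated the same way. This sketch assumes NumPy 1.22 or later, where the keyword is `method`. The sample data is an assumption chosen so that the 50th percentile falls between two values.

```python
import numpy as np

# Four values, so the 50th percentile lies between 20 and 30
data = np.array([10, 20, 30, 40])

# Compare several valid `method` options on the same input
results = {
    name: float(np.percentile(data, 50, method=name))
    for name in ('linear', 'lower', 'higher', 'midpoint')
}
print(results)
# {'linear': 25.0, 'lower': 20.0, 'higher': 30.0, 'midpoint': 25.0}

# An out-of-range q raises ValueError instead of returning a value
out_of_range_rejected = False
try:
    np.percentile(data, 110)
except ValueError as exc:
    out_of_range_rejected = True
    print('Rejected:', exc)
```

Catching the ValueError explicitly is useful when `q` comes from user input. Otherwise it is usually simpler to validate the range up front.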
Quick Reference
| Parameter | Description |
|---|---|
| a | Input data array |
| q | Percentile or list of percentiles (0-100) |
| axis | Axis along which to compute percentiles (default None) |
| method | Method to use when the desired percentile lies between two data points (default 'linear') |
| keepdims | If True, retains reduced dimensions with length 1 |
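The `keepdims` row is easiest to understand by comparing output shapes. A minimal sketch, reusing the 2x3 array from the pitfalls example; the variable names are illustrative.

```python
import numpy as np

data = np.array([[10, 20, 30], [40, 50, 60]])

# Median of each row, with and without the reduced axis kept
row_medians = np.percentile(data, 50, axis=1)
row_medians_kept = np.percentile(data, 50, axis=1, keepdims=True)

print(row_medians.shape)       # (2,)
print(row_medians_kept.shape)  # (2, 1)

# The kept length-1 axis lets the result broadcast against the original array
centered = data - row_medians_kept
print(centered.tolist())  # [[-10.0, 0.0, 10.0], [-10.0, 0.0, 10.0]]
```

Without `keepdims=True`, the shape `(2,)` result would not broadcast against the `(2, 3)` array in the subtraction.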
Key Takeaways
- Use `numpy.percentile(array, percentile)` to find the value below which the given percent of data falls.
- Percentile values must be between 0 and 100 inclusive.
- Specify the `axis` parameter when working with multi-dimensional arrays to get correct results.
- Avoid the deprecated `interpolation` keyword; use `method` with the default or valid options like 'linear' or 'nearest'.
- The function works on flattened data by default if `axis` is not specified.