
How to Calculate Percentile Using NumPy in Python

Use numpy.percentile(array, percentile_value) to calculate the percentile of data in a NumPy array. The function returns the value below which the given percentage of data falls.

Syntax

The basic syntax of the NumPy percentile function is:

  • numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False)

Where:

  • a is the input data array.
  • q is the percentile or sequence of percentiles to compute (0-100).
  • axis specifies the axis along which to compute percentiles (default None, which flattens the array before computing).
  • Other parameters control output and interpolation method.
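To make the q, axis, and keepdims parameters more concrete, here is a minimal sketch of how they affect the shape of the result. The array grid is a hypothetical sample invented for illustration, not data from this article.

```python
import numpy as np

# Hypothetical 3x4 sample array: rows are [0..3], [4..7], [8..11].
grid = np.arange(12).reshape(3, 4)

# Scalar q with axis=1: one value per row.
row_medians = np.percentile(grid, 50, axis=1)

# A sequence of q values adds a leading axis, one slice per percentile.
row_quartiles = np.percentile(grid, [25, 75], axis=1)

# keepdims=True keeps the reduced axis with length 1.
row_medians_kept = np.percentile(grid, 50, axis=1, keepdims=True)

print(row_medians)             # medians of the three rows
print(row_quartiles.shape)     # (2, 3): 2 percentiles x 3 rows
print(row_medians_kept.shape)  # (3, 1)
```

Keeping the reduced axis can be convenient when the result needs to broadcast against the original array, for example when subtracting each row's median from that row.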

Example

This example shows how to calculate the 25th, 50th, and 75th percentiles of a NumPy array.

python
import numpy as np

data = np.array([10, 20, 30, 40, 50])
percentiles = np.percentile(data, [25, 50, 75])
print(percentiles)
Output
[20. 30. 40.]
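The results above land exactly on data points. For other percentiles, the default 'linear' method interpolates between the two nearest sorted values. The sketch below re-implements that rule by hand for a 1-D list and compares it with np.percentile. The helper percentile_linear is a hypothetical name written for illustration, not part of NumPy.

```python
import math
import numpy as np

def percentile_linear(values, q):
    """Hand-rolled percentile with linear interpolation (illustrative only)."""
    ordered = sorted(values)
    # Fractional position of the q-th percentile within the sorted data.
    position = (q / 100) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    # Move 'fraction' of the way from the lower neighbour to the upper one.
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

data = [10, 20, 30, 40, 50]
for q in (25, 40, 50, 90):
    manual = percentile_linear(data, q)
    # The hand-rolled value should agree with NumPy's default method.
    assert math.isclose(manual, np.percentile(data, q))
    print(q, manual)
```

For q=40 the position is 0.4 * 4 = 1.6, so the result lies 60% of the way from 20 to 30, which is about 26.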

Common Pitfalls

Common mistakes when using numpy.percentile include:

  • Passing percentile values outside the 0-100 range, which raises a ValueError.
  • Not specifying the axis when working with multi-dimensional arrays, so the array is flattened and a single value is returned instead of one value per row or column.
  • Using the deprecated interpolation keyword; since NumPy 1.22 the argument is named method, with options such as 'linear', 'lower', 'higher', 'midpoint', and 'nearest'.
python
import numpy as np

data = np.array([[10, 20, 30], [40, 50, 60]])

# Wrong: percentile value out of range
# np.percentile(data, 110)  # Raises ValueError (q must be within 0-100)

# Wrong: no axis specified, but want percentiles per column
percentiles_wrong = np.percentile(data, 50)

# Right: specify axis=0 to get median per column
percentiles_right = np.percentile(data, 50, axis=0)

print('Without axis:', percentiles_wrong)
print('With axis=0:', percentiles_right)
Output
Without axis: 35.0
With axis=0: [25. 35. 45.]
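The other two pitfalls can be checked in the same way. The following sketch assumes NumPy 1.22 or newer, where the keyword is called method, and uses a small hypothetical array in which the 25th percentile falls between two data points. It also shows that an out-of-range q can be caught as a ValueError.

```python
import numpy as np

# Hypothetical sample: the 25th percentile sits between 10 and 20.
data = np.array([10, 20, 30, 40])

# Compare a few interpolation choices (assumes NumPy >= 1.22 for 'method').
results = {
    name: float(np.percentile(data, 25, method=name))
    for name in ("linear", "lower", "higher", "midpoint")
}
for name, value in results.items():
    print(name, value)

# An out-of-range percentile raises ValueError instead of returning a value.
out_of_range_rejected = False
try:
    np.percentile(data, 110)
except ValueError as exc:
    out_of_range_rejected = True
    print("ValueError:", exc)
```

Here 'lower' and 'higher' return the neighbouring data points themselves, 'midpoint' returns their average, and 'linear' weights them by how far the percentile position lies between them.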

Quick Reference

| Parameter | Description |
|-----------|-------------|
| a | Input data array |
| q | Percentile or list of percentiles (0-100) |
| axis | Axis along which to compute percentiles (default None) |
| method | Method to use when the desired percentile lies between two data points (default 'linear') |
| keepdims | If True, retains reduced dimensions with length 1 |

Key Takeaways

  • Use numpy.percentile(array, percentile) to find the value below which the given percentage of the data falls.
  • Percentile values must be between 0 and 100 inclusive.
  • Specify the axis parameter when working with multi-dimensional arrays to get one result per row or column.
  • Avoid the deprecated interpolation keyword; use the method parameter with the default or another valid option such as 'linear' or 'nearest'.
  • If axis is not specified, the function works on the flattened array.