How to Calculate Median Using NumPy in Python
Use numpy.median() to calculate the median of an array or list. Pass your data as the first argument, and optionally specify the axis to calculate the median along rows or columns.
Syntax
The basic syntax of numpy.median() is:
numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)
Where:
- a: Input array or list of numbers.
- axis: Axis along which to compute the median. Default is None (median of the flattened array).
- out: Optional output array in which to store the result.
- overwrite_input: If True, allows NumPy to modify the input array for speed.
- keepdims: If True, retains the reduced dimensions with size 1.
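The keepdims and out parameters described above are easier to follow with a concrete case. The following is a minimal sketch, assuming a small 2x3 integer array; the variable names (col_median_kept, centered, out_buf) are illustrative and not part of NumPy.

```python
import numpy as np

arr = np.array([[1, 3, 5], [2, 4, 6]])

# Without keepdims, the reduced axis disappears from the result.
col_median = np.median(arr, axis=0)
print(col_median.shape)       # (3,)

# With keepdims=True, the reduced axis is kept with size 1.
col_median_kept = np.median(arr, axis=0, keepdims=True)
print(col_median_kept.shape)  # (1, 3)

# A kept dimension broadcasts cleanly against the original array,
# e.g. to subtract each row's median from that row.
centered = arr - np.median(arr, axis=1, keepdims=True)
# Each row becomes [-2, 0, 2].

# out: write the result into a preallocated array (illustrative buffer).
out_buf = np.empty(3)
np.median(arr, axis=0, out=out_buf)
# out_buf now holds the column medians 1.5, 3.5, 5.5.
```

keepdims=True is mainly useful when the result is combined with the original array afterwards, since the shapes stay compatible for broadcasting.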
Example
This example shows how to calculate the median of 1D and 2D arrays using numpy.median(). It also demonstrates calculating the median along a specific axis.
python
import numpy as np

# 1D array median
arr1d = np.array([10, 20, 30, 40, 50])
median_1d = np.median(arr1d)

# 2D array median (flattened)
arr2d = np.array([[1, 3, 5], [2, 4, 6]])
median_2d_flat = np.median(arr2d)

# Median along axis 0 (columns)
median_axis0 = np.median(arr2d, axis=0)

# Median along axis 1 (rows)
median_axis1 = np.median(arr2d, axis=1)

print(f"Median of 1D array: {median_1d}")
print(f"Median of 2D array (flattened): {median_2d_flat}")
print(f"Median along axis 0: {median_axis0}")
print(f"Median along axis 1: {median_axis1}")
Output
Median of 1D array: 30.0
Median of 2D array (flattened): 3.5
Median along axis 0: [1.5 3.5 5.5]
Median along axis 1: [3. 4.]
Common Pitfalls
Common mistakes when calculating median with NumPy include:
- Passing a list without converting to a NumPy array is allowed but slower.
- Not specifying axis when working with multi-dimensional arrays can lead to unexpected results, because the array is flattened first.
- For even-length data, the median is the average of the two middle values, which might be surprising.
- Using overwrite_input=True lets NumPy reorder the original array in place, which can cause bugs if you reuse the data.
python
import numpy as np

# Wrong: median of 2D array without axis might confuse beginners
arr = np.array([[1, 2], [3, 4]])
median_wrong = np.median(arr)  # median of flattened array

# Right: specify axis to get median per row or column
median_axis0 = np.median(arr, axis=0)  # median per column
median_axis1 = np.median(arr, axis=1)  # median per row

print(f"Median without axis (flattened): {median_wrong}")
print(f"Median along axis 0: {median_axis0}")
print(f"Median along axis 1: {median_axis1}")
Output
Median without axis (flattened): 2.5
Median along axis 0: [2. 3.]
Median along axis 1: [1.5 3.5]
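The overwrite_input pitfall listed above can be avoided by running the faster call on a scratch copy. This is a minimal sketch under that assumption; the names data, original, and scratch are illustrative.

```python
import numpy as np

data = np.array([7, 1, 5, 3, 9])
original = data.copy()

# Default behaviour (overwrite_input=False): the input is left untouched.
safe_median = np.median(data)
print(safe_median)                      # 5.0
print(np.array_equal(data, original))   # True

# With overwrite_input=True, NumPy may reorder the array it is given,
# so pass a copy if the original order still matters afterwards.
scratch = data.copy()
fast_median = np.median(scratch, overwrite_input=True)
print(fast_median)                      # 5.0

# The contents of scratch should now be treated as undefined;
# data itself still holds the values in their original order.
print(np.array_equal(data, original))   # True
```

Copying gives up part of the memory saving, so overwrite_input=True pays off mainly when the input is a temporary array that is not needed again.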
Quick Reference
Summary tips for using numpy.median():
- Use axis=None for median of all elements.
- Use axis=0 for median of each column in 2D arrays.
- Use axis=1 for median of each row in 2D arrays.
- Input can be a list or NumPy array; arrays are faster.
- Median of even-length data is the average of middle two values.
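To make the even-length rule in the tips above concrete, here is a rough hand-written equivalent for 1D, non-empty input. manual_median is a hypothetical helper written for illustration only; it is not part of NumPy and is not how NumPy implements the function internally.

```python
import numpy as np

def manual_median(values):
    """Hypothetical helper: sort the values, then take the middle one(s).

    Assumes a non-empty, one-dimensional list or array of numbers.
    """
    ordered = np.sort(np.asarray(values, dtype=float))
    n = ordered.size
    mid = n // 2
    if n % 2 == 1:
        # Odd length: the single middle value.
        return ordered[mid]
    # Even length: the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd = [10, 30, 20]    # plain lists work as input too
even = [4, 1, 3, 2]

print(manual_median(odd), np.median(odd))    # 20.0 20.0
print(manual_median(even), np.median(even))  # 2.5 2.5
```

For even, the sorted values are 1, 2, 3, 4, so the result is (2 + 3) / 2 = 2.5, a value that does not appear in the data itself.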
Key Takeaways
Use numpy.median() to find the median of data arrays easily.
Specify the axis parameter to calculate median along rows or columns in multi-dimensional arrays.
Median of even-length data is the average of the two middle values.
Passing a NumPy array is faster than a list but both work.
Avoid overwrite_input=True unless you want to modify the original data.