
How to Use Median in pandas: Syntax and Examples

In pandas, you can use the median() method on a DataFrame or Series to find the middle value of numeric data. By default it ignores missing values and returns one median per column for a DataFrame, or a single value for a Series.
📐

Syntax

The median() method can be used on a pandas DataFrame or Series. It calculates the median value of numeric data.

  • DataFrame.median(axis=0, skipna=True, numeric_only=False)
  • Series.median(skipna=True)

Parameters:

  • axis: 0 computes one median per column (default), 1 computes one median per row (only meaningful for a DataFrame)
  • skipna: whether to ignore missing values (default is True)
  • numeric_only: whether to include only numeric columns (default is False since pandas 2.0; earlier versions used None)
python
df.median(axis=0, skipna=True, numeric_only=False)
series.median(skipna=True)
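To make the axis parameter concrete, here is a minimal sketch that computes both column-wise and row-wise medians. The scores DataFrame and its labels are made up for illustration and are not part of the examples in this article.

```python
import pandas as pd

# Hypothetical data: rows are students, columns are tests.
scores = pd.DataFrame(
    {"test1": [70, 80, 90], "test2": [60, 85, 95], "test3": [75, 70, 100]},
    index=["ann", "bob", "cy"],
)

# axis=0 (default): one median per column
per_test = scores.median(axis=0)

# axis=1: one median per row
per_student = scores.median(axis=1)

print(per_test)
print(per_student)
```

The first result is indexed by column name (test1, test2, test3) and the second by row label (ann, bob, cy), which is a quick way to check that you picked the axis you intended.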
💻

Example

This example shows how to calculate the median of each column in a DataFrame and the median of a Series.

python
import pandas as pd

data = {'A': [1, 2, 3, 4, 5], 'B': [5, 6, None, 8, 9]}
df = pd.DataFrame(data)

# Median of each column
median_df = df.median()

# Median of a Series
series = pd.Series([10, 20, 30, None, 40])
median_series = series.median()

print('Median of DataFrame columns:\n', median_df)
print('\nMedian of Series:', median_series)
Output
Median of DataFrame columns:
 A    3.0
B    7.0
dtype: float64

Median of Series: 25.0
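Column B has an even number of non-missing values, so its median is the average of the two middle values. The sketch below recomputes it by hand to show where 7.0 comes from. It only illustrates the definition; it does not describe how pandas computes the value internally.

```python
import pandas as pd

col_b = pd.Series([5, 6, None, 8, 9])

# Drop the missing value and sort what remains: [5.0, 6.0, 8.0, 9.0]
values = sorted(col_b.dropna())

n = len(values)
mid = n // 2
if n % 2 == 1:
    # Odd count: the single middle value
    manual_median = values[mid]
else:
    # Even count: the average of the two middle values
    manual_median = (values[mid - 1] + values[mid]) / 2

print(manual_median)    # 7.0
print(col_b.median())   # 7.0
```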
⚠️

Common Pitfalls

Common mistakes when using median() include:

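  • Not handling missing values properly. By default, median() skips NaN values, but with skipna=False it returns NaN for every column (or Series) that contains a missing value.
  • Calling median() on a DataFrame with non-numeric columns without setting numeric_only=True. In pandas 2.0 and later this raises a TypeError.
  • Confusing the axis parameter when working with DataFrames: axis=0 returns one value per column, axis=1 returns one value per row.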

Wrong way:

df.median(skipna=False)

This returns NaN for every column that contains a missing value, which is only useful when missing data should invalidate the result.

Right way:

df.median(skipna=True)
python
import pandas as pd

data = {'A': [1, 2, None, 4], 'B': [None, 6, 7, 8]}
df = pd.DataFrame(data)

# Wrong: skipna=False returns NaN if missing values exist
median_wrong = df.median(skipna=False)

# Right: skipna=True ignores missing values
median_right = df.median(skipna=True)

print('Median with skipna=False:\n', median_wrong)
print('\nMedian with skipna=True:\n', median_right)
Output
Median with skipna=False:
 A   NaN
B   NaN
dtype: float64

Median with skipna=True:
 A    2.0
B    7.0
dtype: float64
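The non-numeric column pitfall can be sketched in the same way. The mixed DataFrame below is a made-up example. The assumption here is pandas 2.0 or later, where calling median() on a string column raises a TypeError; older versions may silently drop the column with a warning instead.

```python
import pandas as pd

# Hypothetical mixed-type data: 'name' holds strings, 'score' holds numbers.
mixed = pd.DataFrame({"name": ["x", "y", "z"], "score": [1.0, 5.0, 3.0]})

# Restrict the calculation to numeric columns
numeric_median = mixed.median(numeric_only=True)
print(numeric_median)

# Without numeric_only=True, recent pandas versions raise a TypeError
try:
    mixed.median()
except TypeError as exc:
    print("median() failed on mixed dtypes:", exc)
```

With numeric_only=True the result contains only the score column, so the string column never reaches the calculation.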
📊

Quick Reference

Summary tips for using median() in pandas:

  • Use median() on DataFrames or Series to get the middle value.
  • Set skipna=True (default) to ignore missing values.
  • Use axis=0 for column-wise median, axis=1 for row-wise median in DataFrames.
  • Ensure data is numeric or use numeric_only=True to avoid errors.

Key Takeaways

Use the pandas median() method to find the middle value of numeric data in Series or DataFrames.
By default, median() ignores missing values with skipna=True.
Set axis=0 for column medians and axis=1 for row medians in DataFrames.
Ensure your data is numeric or use numeric_only=True to avoid errors.
Be careful with missing values; setting skipna=False returns NaN if any missing data exists.