How to Use Median in pandas: Syntax and Examples
In pandas, you can call the median() method on a DataFrame or Series to find the middle value of numeric data. By default, it ignores missing values and returns one median per column for a DataFrame, or a single value for a Series.
Syntax
The median() method can be used on a pandas DataFrame or Series. It calculates the median value of numeric data.
DataFrame.median(axis=0, skipna=True, numeric_only=False)
Series.median(skipna=True)
Parameters:
- axis: 0 for columns (default), 1 for rows (DataFrame only)
- skipna: whether to ignore missing values (default is True)
- numeric_only: include only numeric columns (default is False in pandas 2.0 and later; older versions used None)
python
df.median(axis=0, skipna=True, numeric_only=False)
series.median(skipna=True)
Example
This example shows how to calculate the median of each column in a DataFrame and the median of a Series.
python
import pandas as pd

data = {'A': [1, 2, 3, 4, 5], 'B': [5, 6, None, 8, 9]}
df = pd.DataFrame(data)

# Median of each column
median_df = df.median()

# Median of a Series
series = pd.Series([10, 20, 30, None, 40])
median_series = series.median()

print('Median of DataFrame columns:\n', median_df)
print('\nMedian of Series:', median_series)
Output
Median of DataFrame columns:
A 3.0
B 7.0
dtype: float64
Median of Series: 25.0
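The example above only covers the default column-wise case. As a minimal sketch of the row-wise case, the snippet below passes axis=1 so that one median is computed per row. The scores DataFrame and its column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
scores = pd.DataFrame({
    'math': [70, 85, 90],
    'physics': [80, None, 60],
    'chemistry': [90, 75, 30],
})

# axis=1 computes one median per row; missing values are skipped by default
row_medians = scores.median(axis=1)

print(row_medians)
```

The second row has only two usable values (85 and 75), so its median is their average, 80.0.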
Common Pitfalls
Common mistakes when using median() include:
- Not handling missing values properly. By default, median() skips NaN values, but with skipna=False it returns NaN for any column (or Series) that contains a missing value.
- Using median() on non-numeric columns without setting numeric_only=True, which can cause errors.
- Confusing the axis parameter when working with DataFrames.
Wrong way:
df.median(skipna=False)
This returns NaN for every column that contains a missing value.
Right way:
df.median(skipna=True)
python
import pandas as pd

data = {'A': [1, 2, None, 4], 'B': [None, 6, 7, 8]}
df = pd.DataFrame(data)

# Wrong: skipna=False returns NaN if missing values exist
median_wrong = df.median(skipna=False)

# Right: skipna=True ignores missing values
median_right = df.median(skipna=True)

print('Median with skipna=False:\n', median_wrong)
print('\nMedian with skipna=True:\n', median_right)
Output
Median with skipna=False:
A NaN
B NaN
dtype: float64
Median with skipna=True:
A 2.0
B 7.0
dtype: float64
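The non-numeric pitfall listed above can be illustrated in the same way. The following is a minimal sketch, assuming a hypothetical DataFrame named mixed with one text column. Passing numeric_only=True restricts the calculation to numeric columns. What happens without it depends on the pandas version; recent versions are expected to raise an error on the text column.

```python
import pandas as pd

# Hypothetical mixed-type data, used only for illustration
mixed = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'price': [10.0, 20.0, 60.0],
    'qty': [1, 2, 3],
})

# numeric_only=True drops the text column before computing medians
numeric_medians = mixed.median(numeric_only=True)

# An equivalent approach: select the numeric columns explicitly first
selected_medians = mixed.select_dtypes(include='number').median()

print(numeric_medians)
```

Both approaches leave the name column out of the result and return medians for price and qty only.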
Quick Reference
Summary tips for using median() in pandas:
- Use median() on DataFrames or Series to get the middle value.
- Set skipna=True (default) to ignore missing values.
- Use axis=0 for column-wise median, axis=1 for row-wise median in DataFrames.
- Ensure data is numeric or use numeric_only=True to avoid errors.
Key Takeaways
- Use the pandas median() method to find the middle value of numeric data in Series or DataFrames.
- By default, median() ignores missing values with skipna=True.
- Set axis=0 for column medians and axis=1 for row medians in DataFrames.
- Ensure your data is numeric or use numeric_only=True to avoid errors.
- Be careful with missing values; setting skipna=False returns NaN if any missing data exists.