
How to Use std in pandas: Calculate Standard Deviation Easily

In pandas, you can use the std() method on a DataFrame or Series to calculate the standard deviation of numeric data. It measures how spread out the numbers are around the mean. Use df.std() for columns or series.std() for a single column.

📐

Syntax

The std() method calculates the standard deviation of data in a pandas Series or DataFrame.

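DataFrame.std(axis=0, skipna=True, ddof=1, numeric_only=False)
Series.std(skipna=True, ddof=1)

Parameters:

axis: 0 for columns (default), 1 for rows (only for DataFrame)
skipna: True (default) to ignore missing values (NaN); with False, any NaN makes the result NaN
ddof: Delta degrees of freedom; the divisor is N - ddof. The default of 1 gives the sample standard deviation
numeric_only: True to include only numeric columns (False by default in pandas 2.x)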

python

df.std(axis=0, skipna=True, ddof=1)
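The axis parameter is easiest to see side by side. The following is a minimal sketch with made-up scores (the data and variable names are illustrative, not from a real dataset), comparing the column-wise default with the row-wise axis=1:

```python
import pandas as pd

# Hypothetical scores: one row per student, one column per subject
df = pd.DataFrame({'math': [90, 80, 70], 'english': [85, 75, 95]})

# axis=0 (default): one standard deviation per column
col_std = df.std(axis=0)

# axis=1: one standard deviation per row, computed across that row's columns
row_std = df.std(axis=1)

print('Per column:')
print(col_std)
print('\nPer row:')
print(row_std)
```

With axis=0 the result is indexed by column name; with axis=1 it is indexed by the DataFrame's row labels.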

💻

Example

This example shows how to calculate the standard deviation of each column in a DataFrame and a single Series.

python

import pandas as pd

data = {'math': [90, 80, 70, 60], 'english': [85, 75, 65, 55]}
df = pd.DataFrame(data)

# Standard deviation of each column
std_columns = df.std()

# Standard deviation of the 'math' column (Series)
std_math = df['math'].std()

print('Standard deviation of each column:')
print(std_columns)
print('\nStandard deviation of math scores:')
print(std_math)

Output

Standard deviation of each column:
math       12.909944
english    12.909944
dtype: float64

Standard deviation of math scores:
12.909944487358056
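To see where 12.909944 comes from, the math scores can be checked by hand. This sketch recomputes the sample standard deviation with plain Python (the variable names are illustrative) and compares it with what pandas returns:

```python
import math
import pandas as pd

scores = [90, 80, 70, 60]
n = len(scores)
mean = sum(scores) / n  # 75.0

# Sum of squared deviations from the mean: 225 + 25 + 25 + 225
squared_dev = sum((x - mean) ** 2 for x in scores)  # 500.0

# Sample standard deviation divides by n - ddof, with ddof=1 by default
manual_std = math.sqrt(squared_dev / (n - 1))

pandas_std = pd.Series(scores).std()

print(f'Manual: {manual_std:.6f}')
print(f'pandas: {pandas_std:.6f}')
```

Both values agree because std() uses ddof=1 unless told otherwise.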

⚠️

Common Pitfalls

Common mistakes when using std() include:

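Forgetting how missing values (NaN) are handled. With the default skipna=True they are ignored; with skipna=False a single NaN makes the whole result NaN.
Confusing population vs sample standard deviation. The default ddof=1 calculates the sample standard deviation. Use ddof=0 for the population standard deviation.
Applying std() to a DataFrame with non-numeric columns. Depending on the pandas version, those columns are silently dropped or raise a TypeError; pass numeric_only=True to restrict the calculation to numeric columns.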

python

import pandas as pd

data = {'scores': [10, 20, None, 40]}
df = pd.DataFrame(data)

# Wrong: missing values included (skipna=False)
std_wrong = df['scores'].std(skipna=False)

# Right: skip missing values (skipna=True is also the default)
std_right = df['scores'].std(skipna=True)

print(f'Wrong std (skipna=False): {std_wrong}')
print(f'Right std (skipna=True): {std_right}')

Output

Wrong std (skipna=False): nan
Right std (skipna=True): 15.275252316519467
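The other two pitfalls can be sketched in the same way. The example below uses a made-up DataFrame with a text column (the 'name' column and its values are assumptions for illustration) to compare ddof=1 with ddof=0 and to restrict the calculation to numeric columns:

```python
import pandas as pd

# Hypothetical data with one non-numeric column
df = pd.DataFrame({
    'name': ['Ann', 'Ben', 'Cal', 'Dee'],
    'math': [90, 80, 70, 60],
})

# Sample vs population standard deviation
sample_std = df['math'].std()            # ddof=1 (default): divides by n - 1
population_std = df['math'].std(ddof=0)  # divides by n

# Ask explicitly for numeric columns only, so 'name' is left out
numeric_std = df.std(numeric_only=True)

print(f'Sample std (ddof=1): {sample_std:.6f}')
print(f'Population std (ddof=0): {population_std:.6f}')
print(numeric_std)
```

The population value is always the smaller of the two because it divides by n rather than n - 1. Passing numeric_only=True avoids relying on version-specific behavior for the text column.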

📊

Quick Reference

Summary tips for using std() in pandas:

Use df.std() to get std deviation of each numeric column.
Use series.std() for a single column.
Missing values are ignored by default (skipna=True); set skipna=False to get NaN when data is missing.
Set ddof=0 for population std deviation, ddof=1 for sample (default).
Pass numeric_only=True when a DataFrame has non-numeric columns; otherwise they are dropped or raise an error, depending on the pandas version.

✅

Key Takeaways

Use std() on DataFrame or Series to calculate standard deviation of numeric data.

Missing values are ignored by default (skipna=True); with skipna=False, any NaN makes the result NaN.

Default ddof=1 calculates sample standard deviation; use ddof=0 for population.

Non-numeric columns are dropped or raise a TypeError depending on the pandas version; pass numeric_only=True to be explicit.

You can calculate std by columns (axis=0) or rows (axis=1) in DataFrames.