
How to Use std in pandas: Calculate Standard Deviation Easily

In pandas, call the std() method on a DataFrame or Series to calculate the standard deviation of numeric data, a measure of how spread out the values are around the mean. df.std() returns one value per column; series.std() returns a single value for one column.

Syntax

The std() method calculates the standard deviation of data in a pandas Series or DataFrame.

  • DataFrame.std(axis=0, skipna=True, ddof=1)
  • Series.std(skipna=True, ddof=1)

Parameters:

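  • axis: 0 (default) returns one value per column; 1 returns one value per row (DataFrame only)
  • skipna: True (default) ignores missing values (NaN); False returns NaN if any value is missing
  • ddof: Delta degrees of freedom; the divisor is N - ddof. The default of 1 gives the sample std deviation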
python
df.std(axis=0, skipna=True, ddof=1)
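
To make the axis parameter concrete, here is a minimal sketch that runs the same call along both axes. The column names and values are illustrative only.

```python
import pandas as pd

# Illustrative data: two score columns, four rows
df = pd.DataFrame({'math': [90, 80, 70, 60], 'english': [85, 75, 65, 55]})

# axis=0 (default): one standard deviation per column
col_std = df.std(axis=0)

# axis=1: one standard deviation per row, computed across the columns
row_std = df.std(axis=1)

print(col_std)
print(row_std)
```

With axis=0 the result is indexed by column name; with axis=1 it is indexed by the row labels of the DataFrame.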

Example

This example calculates the standard deviation of each column in a DataFrame, and then of a single column selected as a Series.

python
import pandas as pd

data = {'math': [90, 80, 70, 60], 'english': [85, 75, 65, 55]}
df = pd.DataFrame(data)

# Standard deviation of each column
std_columns = df.std()

# Standard deviation of the 'math' column (Series)
std_math = df['math'].std()

print('Standard deviation of each column:')
print(std_columns)
print('\nStandard deviation of math scores:')
print(std_math)
Output
Standard deviation of each column:
math       12.909944
english    12.909944
dtype: float64

Standard deviation of math scores:
12.909944487358056
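
As a cross-check, the same number can be reproduced by hand. The sketch below assumes the default ddof=1, so the sum of squared deviations is divided by n - 1; the variable names are illustrative.

```python
import math
import pandas as pd

scores = pd.Series([90, 80, 70, 60])

mean = scores.mean()                               # average of the values
squared_dev = (scores - mean) ** 2                 # squared distance from the mean
variance = squared_dev.sum() / (len(scores) - 1)   # divide by n - ddof, with ddof=1
manual_std = math.sqrt(variance)

print(manual_std)
print(scores.std())   # should agree with manual_std up to floating-point error
```

Changing the divisor to len(scores) corresponds to ddof=0, the population standard deviation.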

Common Pitfalls

Common mistakes when using std() include:

  • Overlooking missing values (NaN). With the default skipna=True they are ignored; with skipna=False a single NaN makes the result NaN.
  • Confusing population and sample standard deviation. The default ddof=1 calculates the sample std deviation. Use ddof=0 for the population std deviation.
  • Applying std() to non-numeric columns, which can raise an error or produce NaN depending on the pandas version. Pass numeric_only=True to DataFrame.std() to restrict the calculation to numeric columns.
python
import pandas as pd

data = {'scores': [10, 20, None, 40]}
df = pd.DataFrame(data)

# Wrong: skipna=False lets the NaN propagate, so the result is NaN
std_wrong = df['scores'].std(skipna=False)

# Right: skipna=True (the default) ignores the missing value
std_right = df['scores'].std(skipna=True)

print(f'Wrong std (skipna=False): {std_wrong}')
print(f'Right std (skipna=True): {std_right}')
Output
Wrong std (skipna=False): nan
Right std (skipna=True): 15.275252316519467
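
The other two pitfalls can be sketched the same way. The 'name' column below is a hypothetical text column added for illustration, and how a plain df.std() treats it depends on the pandas version, so the sketch passes numeric_only=True explicitly.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Ann', 'Ben', 'Cat', 'Dan'],   # hypothetical non-numeric column
    'scores': [10, 20, 30, 40],
})

# Sample standard deviation (default ddof=1): divides by n - 1
sample_std = df['scores'].std()

# Population standard deviation (ddof=0): divides by n
population_std = df['scores'].std(ddof=0)

# Restrict the DataFrame-wide call to numeric columns
numeric_std = df.std(numeric_only=True)

print(f'Sample std (ddof=1): {sample_std}')
print(f'Population std (ddof=0): {population_std}')
print(numeric_std)
```

The population value is always the smaller of the two, and the gap shrinks as the number of rows grows.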

Quick Reference

Summary tips for using std() in pandas:

  • Use df.std() to get std deviation of each numeric column.
  • Use series.std() for a single column.
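  • Keep skipna=True (the default) to ignore missing values.
  • Set ddof=0 for population std deviation, ddof=1 for sample (default).
  • std() applies to numeric data; pass numeric_only=True to skip non-numeric columns, which can otherwise raise an error or produce NaN depending on the pandas version.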

Key Takeaways

Use std() on DataFrame or Series to calculate standard deviation of numeric data.
skipna=True (the default) ignores missing values; with skipna=False, any NaN makes the result NaN.
Default ddof=1 calculates sample standard deviation; use ddof=0 for population.
Non-numeric columns can raise an error or produce NaN depending on the pandas version; pass numeric_only=True to skip them.
You can calculate std by columns (axis=0) or rows (axis=1) in DataFrames.