How to Use std in pandas: Calculate Standard Deviation Easily
In pandas, you can use the std() method on a DataFrame or Series to calculate the standard deviation of numeric data. It measures how spread out the numbers are around the mean. Use df.std() for columns or series.std() for a single column.
Syntax
The std() method calculates the standard deviation of data in a pandas Series or DataFrame.
DataFrame.std(axis=0, skipna=True, ddof=1)
Series.std(skipna=True, ddof=1)
Parameters:
- axis: 0 for columns (default), 1 for rows (DataFrame only)
- skipna: True (default) to ignore missing values (NaN)
- ddof: delta degrees of freedom; the default of 1 gives the sample standard deviation
python
df.std(axis=0, skipna=True, ddof=1)
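To make the axis parameter concrete, here is a minimal sketch that computes the standard deviation down each column and across each row. The scores are made-up illustration data, not taken from the example later in this article.

```python
import pandas as pd

# Hypothetical scores for three students in two subjects
df = pd.DataFrame({'math': [90, 80, 70], 'english': [70, 80, 90]})

# axis=0 (default): one value per column
col_std = df.std(axis=0)

# axis=1: one value per row, computed across the columns
row_std = df.std(axis=1)

print(col_std)  # 10.0 for both columns
print(row_std)  # about 14.142136, 0.0, 14.142136
```

With only two columns, each row-wise result is based on just two values, so axis=1 is mostly useful when the columns hold comparable measurements.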
Example
This example shows how to calculate the standard deviation of each column in a DataFrame and a single Series.
python
import pandas as pd

data = {'math': [90, 80, 70, 60], 'english': [85, 75, 65, 55]}
df = pd.DataFrame(data)

# Standard deviation of each column
std_columns = df.std()

# Standard deviation of the 'math' column (Series)
std_math = df['math'].std()

print('Standard deviation of each column:')
print(std_columns)
print('\nStandard deviation of math scores:')
print(std_math)
Output
Standard deviation of each column:
math 12.909944
english 12.909944
dtype: float64
Standard deviation of math scores:
12.909944487358056
Common Pitfalls
Common mistakes when using std() include:
- Passing skipna=False when the data contains missing values (NaN), which makes the result NaN. The default skipna=True ignores them.
- Confusing population and sample standard deviation. The default ddof=1 calculates the sample standard deviation. Use ddof=0 for the population standard deviation.
- Applying std() to non-numeric columns. Depending on the pandas version and the data, this raises an error or the columns are left out of the result.
python
import pandas as pd

data = {'scores': [10, 20, None, 40]}
df = pd.DataFrame(data)

# Wrong: missing values included (skipna=False)
std_wrong = df['scores'].std(skipna=False)

# Right: skip missing values
std_right = df['scores'].std(skipna=True)

print(f'Wrong std (skipna=False): {std_wrong}')
print(f'Right std (skipna=True): {std_right}')
Output
Wrong std (skipna=False): nan
Right std (skipna=True): 15.275252316519467
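The sample versus population pitfall can be illustrated the same way. The sketch below reuses the math scores from the earlier example and checks both results against the textbook formulas by hand. The variable names are illustrative only.

```python
import pandas as pd

scores = pd.Series([90, 80, 70, 60])

# Default ddof=1: divide the squared deviations by n - 1 (sample)
sample_std = scores.std()

# ddof=0: divide the squared deviations by n (population)
population_std = scores.std(ddof=0)

# Manual check of both formulas
n = len(scores)
squared_dev = ((scores - scores.mean()) ** 2).sum()
manual_sample = (squared_dev / (n - 1)) ** 0.5
manual_population = (squared_dev / n) ** 0.5

print(sample_std, manual_sample)          # both about 12.909944
print(population_std, manual_population)  # both about 11.180340
```

The population value is always the smaller of the two, and the gap shrinks as the number of observations grows.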
Quick Reference
Summary tips for using std() in pandas:
- Use df.std() to get the standard deviation of each numeric column.
- Use series.std() for a single column.
- Keep skipna=True (the default) to ignore missing values.
- Set ddof=0 for the population standard deviation; ddof=1 (the default) gives the sample standard deviation.
- Only numeric data can be used; non-numeric columns are ignored or raise an error, depending on the pandas version.
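For mixed DataFrames, one way to avoid version-dependent behavior with non-numeric columns is to restrict the calculation to numeric columns explicitly. The sketch below shows two options, the numeric_only parameter and select_dtypes. The 'name' column and its values are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame with one text column and one numeric column
df = pd.DataFrame({
    'name': ['Ann', 'Ben', 'Cal'],
    'math': [90, 80, 70],
})

# Option 1: ask std() to consider numeric columns only
numeric_std = df.std(numeric_only=True)

# Option 2: select the numeric columns first, then call std()
selected_std = df.select_dtypes(include='number').std()

print(numeric_std)   # math    10.0
print(selected_std)  # math    10.0
```

Both approaches leave the 'name' column out of the result, so the outcome does not depend on how a given pandas version treats text columns by default.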
Key Takeaways
- Use std() on a DataFrame or Series to calculate the standard deviation of numeric data.
- Keep skipna=True (the default) to ignore missing values; with skipna=False, any NaN makes the result NaN.
- The default ddof=1 calculates the sample standard deviation; use ddof=0 for the population standard deviation.
- Non-numeric columns are ignored or raise an error, depending on the pandas version.
- You can calculate the standard deviation by columns (axis=0) or by rows (axis=1) in DataFrames.