How to Use ewm in pandas for Exponential Weighted Calculations
Use pandas.DataFrame.ewm() or pandas.Series.ewm() to create an exponentially weighted object, then apply functions like mean() to get exponentially weighted moving averages. This method gives more weight to recent data points, which is useful for smoothing time series data.
Syntax
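The ewm() method creates an exponentially weighted object on a Series or DataFrame. Pass exactly one decay parameter (span, com, halflife, or alpha directly) to control how quickly the weights decay.
- span: Decay in terms of span, with alpha = 2 / (span + 1); a higher span means slower decay.
- com: Decay in terms of center of mass, with alpha = 1 / (1 + com).
- halflife: Number of periods over which a weight decays to half its value, with alpha = 1 - exp(-ln(2) / halflife).
- adjust: If True (default), the result is divided by a decaying adjustment factor so the weights are normalized, which corrects the imbalance in early periods; if False, the recursive formula is used.
- ignore_na: If True, weights are based on the relative positions of non-missing values; if False (default), they are based on absolute positions.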
After calling ewm(), use aggregation functions like mean(), std(), or var() to get results.
```python
df.ewm(span=span_value, adjust=True, ignore_na=False).mean()
```
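The decay parameters are interchangeable ways of setting one smoothing factor, which pandas calls alpha. The following is a minimal sketch, assuming an arbitrary sample series and alpha = 0.5 chosen only for illustration. It converts that alpha to the equivalent span, com, and halflife and checks that all four produce the same mean:

```python
import math

import pandas as pd

# Arbitrary sample data, chosen only for illustration
s = pd.Series([1.0, 2.0, 4.0, 8.0, 16.0])

alpha = 0.5
span = 2 / alpha - 1                             # from alpha = 2 / (span + 1)
com = 1 / alpha - 1                              # from alpha = 1 / (1 + com)
halflife = math.log(0.5) / math.log(1 - alpha)   # from (1 - alpha) ** halflife = 0.5

by_alpha = s.ewm(alpha=alpha).mean()
by_span = s.ewm(span=span).mean()
by_com = s.ewm(com=com).mean()
by_halflife = s.ewm(halflife=halflife).mean()

# Largest difference between any of the parameterizations and the alpha version
max_diff = max(
    (by_span - by_alpha).abs().max(),
    (by_com - by_alpha).abs().max(),
    (by_halflife - by_alpha).abs().max(),
)
print(f"span={span}, com={com}, halflife={halflife}, max difference={max_diff}")
```

Because these parameters describe the same decay, only one of them should be passed per call.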
Example
This example calculates the exponentially weighted moving average (EWMA) of a simple numeric series using span=3 and adjust=False. With span=3 the smoothing factor is alpha = 2 / (3 + 1) = 0.5, so each result is the average of the previous result and the new value.
```python
import pandas as pd

# Create a simple data series
data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Calculate EWMA with span=3
ewm_result = data.ewm(span=3, adjust=False).mean()
print(ewm_result)
```
Output
0 1.000000
1 1.500000
2 2.250000
3 3.125000
4 4.062500
5 5.031250
6 6.015625
7 7.007812
8 8.003906
9 9.001953
dtype: float64
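To see where these numbers come from, the recursion that adjust=False uses can be written out by hand. This is a sketch for checking understanding rather than a replacement for ewm(); the names manual and pandas_result are introduced here for illustration:

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
alpha = 2 / (3 + 1)  # span=3 corresponds to alpha = 0.5

# y_0 = x_0, then y_t = (1 - alpha) * y_{t-1} + alpha * x_t
manual = [float(data.iloc[0])]
for x in data.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)
manual = pd.Series(manual, index=data.index)

pandas_result = data.ewm(span=3, adjust=False).mean()

# The hand-written loop and pandas should agree up to floating-point error
print((manual - pandas_result).abs().max())
```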
Common Pitfalls
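Common mistakes when using ewm() include:
- Confusing adjust=True and adjust=False. adjust=True computes a weighted average whose weights are normalized by their sum, while adjust=False uses the recursion y_t = (1 - alpha) * y_(t-1) + alpha * x_t. The two differ most in the first few periods and converge as more data arrives.
- Misreading how missing values are handled. NaN values never enter the average; ignore_na only controls the weights. With the default ignore_na=False, a gap still decays the weights of older observations (absolute positions). With ignore_na=True, the weights depend only on the relative positions of the non-missing values.
- Using incompatible parameters together, like setting both span and halflife at the same time. Pass exactly one decay parameter; otherwise pandas raises a ValueError.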
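The difference between the two adjust modes is easiest to see by reproducing the last value of each by hand. The sketch below assumes a short made-up series and alpha = 0.5; the manual_* names are illustrative:

```python
import pandas as pd

# Made-up series for illustration
s = pd.Series([10.0, 20.0, 30.0, 40.0])
alpha = 0.5

adjusted = s.ewm(alpha=alpha, adjust=True).mean()
recursive = s.ewm(alpha=alpha, adjust=False).mean()

# adjust=True, last position: weights (1 - alpha) ** i with the newest value first,
# divided by the sum of the weights
weights = [(1 - alpha) ** i for i in range(len(s))]
newest_first = s.iloc[::-1].tolist()
manual_adjusted = sum(w * x for w, x in zip(weights, newest_first)) / sum(weights)

# adjust=False, last position: y_t = (1 - alpha) * y_{t-1} + alpha * x_t
manual_recursive = s.iloc[0]
for x in s.iloc[1:]:
    manual_recursive = (1 - alpha) * manual_recursive + alpha * x

print(adjusted.iloc[-1], manual_adjusted)
print(recursive.iloc[-1], manual_recursive)
```

On a series this short the two modes give visibly different final values, which is why the choice should be made deliberately rather than left to the default.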
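Example showing how ignore_na changes the result when the data contains NaNs:
```python
import pandas as pd

s = pd.Series([1, None, 3, 4, None, 6])

# Default (ignore_na=False): weights use absolute positions, so a NaN gap
# still decays the weight of older observations
default_ewm = s.ewm(span=2).mean()

# ignore_na=True: weights use the relative positions of non-missing values
relative_ewm = s.ewm(span=2, ignore_na=True).mean()

print('EWMA with ignore_na=False (default):')
print(default_ewm)
print('\nEWMA with ignore_na=True:')
print(relative_ewm)
```
Output
EWMA with ignore_na=False (default):
0 1.000000
1 1.000000
2 2.800000
3 3.675676
4 3.675676
5 5.692857
dtype: float64
EWMA with ignore_na=True:
0 1.000000
1 1.000000
2 2.500000
3 3.538462
4 3.538462
5 5.200000
dtype: float64
In both cases the NaN positions repeat the previous result instead of producing NaN. The values differ because, by default, each gap still ages the older observations, so the most recent value dominates more strongly.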
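Passing more than one decay parameter, as in the last pitfall listed above, fails instead of silently picking one. The following is a small sketch with a throwaway series. The exact error message may differ between pandas versions, so only the exception type is relied on here:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0])

# Two decay parameters at once: expected to raise ValueError
try:
    s.ewm(span=3, halflife=2).mean()
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")

# Exactly one decay parameter: works as usual
ok = s.ewm(span=3).mean()
print(ok)
```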
Quick Reference
| Parameter | Description | Example |
|---|---|---|
| span | Decay in terms of span; higher means slower decay | ewm(span=3) |
| com | Center of mass, alternative decay control | ewm(com=0.5) |
| halflife | Number of periods for a weight to decay to half its value | ewm(halflife=2) |
| adjust | If True, weights are normalized; if False, recursive calculation | ewm(adjust=False) |
| ignore_na | If True, weights use relative positions of non-missing values; if False, absolute positions | ewm(ignore_na=True) |
Key Takeaways
Use pandas ewm() to calculate exponentially weighted statistics that give more weight to recent data.
Choose one decay parameter (span, com, or halflife) to control how fast the weights decrease.
Set adjust=False for the plain recursive calculation, or keep adjust=True for normalized weights.
Remember that NaNs are never averaged in; ignore_na=True only makes the weights depend on the relative positions of non-missing values.
After ewm(), apply aggregation functions like mean(), std(), or var() to get results.
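As a closing sketch, the same exponentially weighted object can feed several aggregations, and on a DataFrame each column is processed independently. The frame and its column names below are hypothetical, made up for this illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame used only for illustration
df = pd.DataFrame({
    "price": [10.0, 11.0, 9.0, 12.0, 13.0],
    "volume": [100.0, 120.0, 90.0, 150.0, 130.0],
})

ew = df.ewm(span=3)  # one exponentially weighted object, reused below

ew_mean = ew.mean()  # exponentially weighted moving average per column
ew_std = ew.std()    # exponentially weighted standard deviation per column
ew_var = ew.var()    # exponentially weighted variance per column

print(ew_mean)
print(ew_std)

# std() is the square root of var(), compared from the second row onward
print(np.allclose(ew_std.iloc[1:] ** 2, ew_var.iloc[1:]))
```

With the default bias=False, the first row of std() and var() is NaN because a single observation has no unbiased variance.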