Time series components (trend, seasonality) in ML Python - Model Metrics & Evaluation

Metrics & Evaluation - Time series components (trend, seasonality)
Which metric matters for this concept and WHY

For time series components like trend and seasonality, the key metrics are Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). Both measure how close the model's predictions are to the actual values over time; RMSE penalizes large errors more heavily because it squares them.

We focus on these because time series data changes over time, and we want to capture patterns like upward trends or repeating seasonal effects accurately.

Confusion matrix or equivalent visualization (ASCII)
Actual Values:      100, 120, 130, 150, 170, 160, 140, 130
Predicted Values:   105, 118, 128, 155, 165, 158, 138, 135

Error (Actual - Predicted): -5, 2, 2, -5, 5, 2, 2, -5

MAE = (|−5| + |2| + |2| + |−5| + |5| + |2| + |2| + |−5|) / 8 = 28 / 8 = 3.5
RMSE = sqrt((25 + 4 + 4 + 25 + 25 + 4 + 4 + 25) / 8) = sqrt(14.5) ≈ 3.81
    

This shows how well the model captures trend and seasonality by measuring prediction errors.
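
The arithmetic for the eight values listed in this section can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import math

# Actual and predicted values from the example in this section.
actual = [100, 120, 130, 150, 170, 160, 140, 130]
predicted = [105, 118, 128, 155, 165, 158, 138, 135]

# Error is defined as actual minus predicted.
errors = [a - p for a, p in zip(actual, predicted)]

# MAE: mean of the absolute errors.
mae = sum(abs(e) for e in errors) / len(errors)

# RMSE: square root of the mean squared error.
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")  # MAE = 3.500, RMSE = 3.808
```

RMSE comes out a little higher than MAE because squaring gives the three errors of size 5 more weight than the errors of size 2.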

Precision vs Recall (or equivalent tradeoff) with concrete examples

In time series, instead of precision and recall, we balance bias and variance.

  • High bias (underfitting): Model misses trend or seasonality, so errors are large and consistent.
  • High variance (overfitting): Model fits noise, causing errors to vary wildly on new data.

Example: A sales forecast model that ignores holiday seasonality (high bias) will miss sales spikes. A model that fits every small fluctuation (high variance) will fail to predict future sales well.
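
The high-bias half of this example can be sketched with made-up numbers. The sales figures below are hypothetical and chosen only to show a December spike; the "flat" forecast ignores seasonality, while the seasonal-naive forecast reuses the same month from the previous year. The sketch does not illustrate the high-variance case.

```python
# Hypothetical monthly sales for two years, with a December spike each year.
year1 = [100] * 11 + [200]
year2 = [102] * 11 + [205]


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# High-bias forecast: predict the year-1 average for every month of year 2.
flat = [sum(year1) / len(year1)] * 12

# Seasonal-naive forecast: predict the value of the same month in year 1.
seasonal_naive = list(year1)

mae_flat = mae(year2, flat)
mae_seasonal = mae(year2, seasonal_naive)

# Error on the December spike alone, where ignoring seasonality hurts most.
december_error_flat = abs(year2[-1] - flat[-1])
december_error_seasonal = abs(year2[-1] - seasonal_naive[-1])

print(f"flat MAE = {mae_flat:.2f}, seasonal-naive MAE = {mae_seasonal:.2f}")
print(f"December error: flat = {december_error_flat:.2f}, "
      f"seasonal-naive = {december_error_seasonal:.2f}")
```

Under these assumed numbers the flat forecast misses the December spike by a wide margin, which is the kind of systematic error that high bias produces.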

What "good" vs "bad" metric values look like for this use case

Good: Low MAE and RMSE values close to zero, meaning predictions closely follow actual data including trend and seasonality.

Bad: High MAE and RMSE values, indicating the model misses important patterns like steady growth or repeating seasonal peaks.

For example, if monthly sales range from 100 to 200, an MAE of 5 is good, but an MAE of 50 is bad.

Metrics pitfalls (accuracy paradox, data leakage, overfitting indicators)
  • Ignoring seasonality: Leads to systematic errors during seasonal peaks or drops.
  • Data leakage: Using future data to train the model inflates performance metrics falsely.
  • Overfitting: Very low training error but high error on new data means the model learned noise, not true patterns.
  • Accuracy paradox: A low overall error can hide poor performance during important seasonal events, so check the errors on peak periods separately.
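
One common way to guard against the data leakage pitfall is to split the series in time order, so the test set is strictly later than the training set. The helper below is a hypothetical sketch (the function name `chronological_split` is not from any library), shown on the eight actual values used earlier in this section.

```python
def chronological_split(values, test_size):
    """Split an ordered series so the test part comes strictly after the training part.

    Hypothetical helper for illustration; it never shuffles the data.
    """
    if not 0 < test_size < len(values):
        raise ValueError("test_size must be between 1 and len(values) - 1")
    return values[:-test_size], values[-test_size:]


series = [100, 120, 130, 150, 170, 160, 140, 130]
train, test = chronological_split(series, test_size=2)

print(train)  # [100, 120, 130, 150, 170, 160]
print(test)   # [140, 130]
```

A random shuffle before splitting would let the model see observations that come after some of the test points, which is exactly the leakage described above.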
Self-check

Your time series model has an MAE of 2 on training data but 20 on new data. Is it good? Why or why not?

Answer: No, this shows overfitting. The model fits training data well but fails to generalize to new data, missing true trend or seasonality patterns.

Key Result
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) best measure how well a model captures trend and seasonality in time series.

Practice

1. Which component of a time series shows the long-term upward or downward movement over time?
easy
A. Trend
B. Seasonality
C. Noise
D. Residual

Solution

  1. Step 1: Understand the meaning of trend

    The trend component represents the overall direction or pattern in the data over a long period, such as increasing sales over years.
  2. Step 2: Differentiate from seasonality and noise

    Seasonality repeats in fixed cycles (like monthly), and noise is random variation. Trend is the smooth long-term movement.
  3. Final Answer:

    Trend -> Option A
  4. Quick Check:

    Long-term direction = Trend [OK]
Hint: Trend = overall direction over time, not repeating cycles [OK]
Common Mistakes:
  • Confusing seasonality with trend
  • Thinking noise is trend
  • Mixing residual with trend
2. Which of the following is the correct Python code to plot seasonality in a time series using pandas?
easy
A. df['value'].plot()
B. df['value'].rolling(window=12).mean().plot()
C. df['value'].groupby(df.index.month).mean().plot()
D. df['value'].diff().plot()

Solution

  1. Step 1: Identify how to extract seasonality

    Seasonality repeats at fixed intervals such as months, so grouping by month and averaging reveals the seasonal pattern.
  2. Step 2: Check code options

    df['value'].groupby(df.index.month).mean().plot() groups by month and plots mean, revealing seasonality. Others plot raw data, trend (rolling mean), or differences.
  3. Final Answer:

    df['value'].groupby(df.index.month).mean().plot() -> Option C
  4. Quick Check:

    Group by time period for seasonality plot [OK]
Hint: Group data by time unit (month) to see seasonality [OK]
Common Mistakes:
  • Plotting raw data only
  • Using rolling mean for seasonality
  • Plotting differences instead of seasonal groups
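
The group-by-month idea from question 2 can be tried on synthetic data. The sketch below assumes a DataFrame `df` with a `DatetimeIndex` and a `value` column, as in the question; the data itself is made up (a constant level plus a sine-shaped yearly pattern), and the plot call is left as a comment so the snippet runs without a plotting backend.

```python
import numpy as np
import pandas as pd

# Synthetic data: three years of monthly values with a yearly sine pattern.
index = pd.date_range("2020-01-01", periods=36, freq="MS")
months = index.month.to_numpy()
seasonal_pattern = 10 * np.sin(2 * np.pi * (months - 1) / 12)
df = pd.DataFrame({"value": 100 + seasonal_pattern}, index=index)

# Average all Januaries together, all Februaries together, and so on.
monthly_profile = df["value"].groupby(df.index.month).mean()

print(monthly_profile.round(2))
# monthly_profile.plot() would draw this 12-point seasonal profile.
```

The result has one row per calendar month, so the peak and trough months of the seasonal cycle can be read off directly.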
3. Given this Python code snippet, what will be the output type of seasonal?
import pandas as pd
import numpy as np
index = pd.date_range('2023-01-01', periods=12, freq='MS')
data = np.sin(np.linspace(0, 2 * np.pi, 12))
df = pd.Series(data, index=index)
seasonal = df.groupby(df.index.month).transform('mean')
medium
A. A numpy array of length 12
B. A pandas Series with same length as df
C. A pandas DataFrame with 12 rows and 1 column
D. A single float value representing mean

Solution

  1. Step 1: Understand groupby with transform

    Using groupby with transform('mean') returns a Series aligned with original index, same length as df.
  2. Step 2: Check output type

    Since df is a Series, seasonal is also a Series with same length, each value replaced by group mean.
  3. Final Answer:

    A pandas Series with same length as df -> Option B
  4. Quick Check:

    groupby + transform returns Series matching original length [OK]
Hint: groupby + transform keeps original length Series [OK]
Common Mistakes:
  • Thinking transform returns single value
  • Confusing transform with aggregate
  • Expecting DataFrame instead of Series
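
The difference between aggregating and transforming in question 3 is easier to see when each month has more than one observation. The following is a small sketch on made-up data (two years of monthly values 0 to 23); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Two years of monthly data so every calendar month appears twice.
index = pd.date_range("2022-01-01", periods=24, freq="MS")
series = pd.Series(np.arange(24, dtype=float), index=index)

by_month = series.groupby(series.index.month)

# Aggregation: one value per group, indexed by month number (length 12).
aggregated = by_month.mean()

# Transform: the group mean broadcast back onto the original index (length 24).
broadcast = by_month.transform("mean")

print(type(broadcast).__name__, len(aggregated), len(broadcast))
```

Both Januaries (values 0 and 12) receive the same group mean in `broadcast`, while `aggregated` holds that mean once under the key 1.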
4. You have this code to extract trend using rolling mean:
trend = df['value'].rolling(window=3).mean()
But the output has NaN values at the start. How can you fix this?
medium
A. Use diff() instead of rolling mean
B. Change window to 1
C. Drop NaN values after rolling mean
D. Use min_periods=1 in rolling to reduce NaNs

Solution

  1. Step 1: Understand rolling mean NaNs

    Rolling mean with window=3 needs 3 values to compute, so first 2 are NaN by default.
  2. Step 2: Use min_periods to allow fewer values

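    Setting min_periods=1 lets the rolling mean compute a value from as few as one point, which removes the NaNs at the start. Those early values average fewer than 3 points, so they are less smoothed than the rest.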
  3. Final Answer:

    Use min_periods=1 in rolling to reduce NaNs -> Option D
  4. Quick Check:

    min_periods controls minimum data points for rolling [OK]
Hint: Set min_periods=1 in rolling to avoid initial NaNs [OK]
Common Mistakes:
  • Changing window to 1 loses smoothing
  • Dropping NaNs loses early data
  • Using diff() does not fix NaNs
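
The effect of `min_periods` in question 4 can be seen on a five-point series. This is a minimal sketch with made-up values.

```python
import pandas as pd

values = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Default: a value is produced only once 3 observations are available.
default_trend = values.rolling(window=3).mean()

# min_periods=1: early positions average whatever points exist so far.
filled_trend = values.rolling(window=3, min_periods=1).mean()

print(default_trend.tolist())  # starts with two NaN values
print(filled_trend.tolist())   # no NaN values
```

The first two entries of `filled_trend` are averages over one and two points respectively, so they follow the raw data more closely than the fully smoothed entries after them.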
5. You have monthly sales data with a strong yearly seasonality and an upward trend. Which method best separates trend and seasonality components?
hard
A. Use moving average with window=12 for trend, then subtract to get seasonality
B. Use differencing with lag=1 to remove seasonality
C. Apply Fourier transform to remove trend
D. Use rolling mean with window=3 to capture seasonality

Solution

  1. Step 1: Understand yearly seasonality and trend

    Yearly seasonality repeats every 12 months; trend is slow upward movement.
  2. Step 2: Choose method to separate components

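    A moving average with window=12 averages over one full yearly cycle, so the seasonal effects cancel and the trend remains. Subtracting that trend from the data leaves seasonality plus residual noise; averaging the result by calendar month gives the seasonal pattern.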
  3. Step 3: Evaluate other options

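    Differencing with lag=1 removes level changes between consecutive months, but it does not remove a 12-month seasonal cycle (that would need lag=12). A Fourier transform describes periodic components rather than removing trend. A rolling mean with window=3 only smooths short-term noise and is too short to isolate a yearly pattern.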
  4. Final Answer:

    Use moving average with window=12 for trend, then subtract to get seasonality -> Option A
  5. Quick Check:

    Window matches season length to isolate trend [OK]
Hint: Match moving average window to season length to isolate trend [OK]
Common Mistakes:
  • Using too short window for moving average
  • Confusing differencing lag with season length
  • Ignoring trend when extracting seasonality
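
The approach from question 5 can be sketched on synthetic monthly sales. The data below is made up (a linear trend plus a sine-shaped yearly pattern, with no noise), and the steps are a simplified additive decomposition rather than a full library implementation. With an even window, a centered 12-month average is offset by half a month, so the seasonal estimate is re-centered at the end.

```python
import numpy as np
import pandas as pd

# Synthetic data: four years of monthly sales = linear trend + yearly pattern.
index = pd.date_range("2020-01-01", periods=48, freq="MS")
t = np.arange(48)
true_seasonal = 10 * np.sin(2 * np.pi * t / 12)
sales = pd.Series(100 + 2 * t + true_seasonal, index=index)

# Step 1: a 12-month moving average spans one full cycle,
# so the seasonal effects cancel and the trend remains.
trend = sales.rolling(window=12, center=True).mean()

# Step 2: subtracting the trend leaves seasonality (plus residual, if any).
detrended = sales - trend

# Step 3: average the detrended values by calendar month, then
# re-center so the seasonal effects sum to zero over the year.
seasonal = detrended.groupby(detrended.index.month).mean()
seasonal = seasonal - seasonal.mean()

print(seasonal.round(2))
```

The trend estimate is NaN near the two ends of the series because a full 12-month window is not available there; the monthly averages in step 3 skip those positions.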