Bird
Raised Fist0
ML Pythonml~5 mins

Stationarity and differencing in ML Python - Cheat Sheet & Quick Revision

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Recall & Review
beginner
What does it mean for a time series to be stationary?
A stationary time series has constant mean, constant variance, and constant autocovariance over time. This means its behavior does not change as time passes.
Click to reveal answer
beginner
Why is stationarity important in time series analysis?
Many forecasting models assume stationarity because it makes patterns stable and predictable. Without stationarity, models may give unreliable predictions.
Click to reveal answer
beginner
What is differencing in time series?
Differencing means subtracting the previous value from the current value to remove trends or seasonality, helping to make a time series stationary.
Click to reveal answer
beginner
How do you perform first-order differencing on a time series?
First-order differencing subtracts each value by the value immediately before it: new_value = current_value - previous_value.
Click to reveal answer
intermediate
What is the effect of over-differencing a time series?
Over-differencing can remove important information and add unnecessary noise, making the series harder to model and interpret.
Click to reveal answer
Which of the following is NOT a characteristic of a stationary time series?
AConstant autocorrelation structure
BConstant variance over time
CConstant mean over time
DIncreasing trend over time
What is the main purpose of differencing a time series?
ATo make the series stationary
BTo add noise to the series
CTo smooth the series
DTo increase the series length
How is first-order differencing calculated?
ASubtracting the previous value from the current value
BAdding the current and previous values
CSubtracting the current value from the next value
DDividing the current value by the previous value
If a time series is already stationary, what happens if you apply differencing?
AIt makes the series non-stationary
BIt improves the model accuracy
CIt may add unnecessary noise
DIt has no effect
Which test is commonly used to check stationarity?
AT-test
BAugmented Dickey-Fuller test
CChi-square test
DANOVA
Explain in your own words what stationarity means and why it matters in time series forecasting.
Think about how a series that changes its behavior over time might confuse a forecasting model.
You got /4 concepts.
    Describe how differencing helps to make a time series stationary and what risks come with over-differencing.
    Consider differencing as a way to flatten the series but too much can harm it.
    You got /4 concepts.

      Practice

      (1/5)
      1. What does it mean when a time series is stationary?
      easy
      A. It has missing values that need to be filled
      B. It has a clear upward or downward trend
      C. It contains seasonal patterns repeating over fixed intervals
      D. Its statistical properties like mean and variance do not change over time

      Solution

      1. Step 1: Understand stationarity definition

        Stationarity means the data's mean, variance, and other statistics stay constant over time.
      2. Step 2: Compare options to definition

        Only Its statistical properties like mean and variance do not change over time describes constant statistical properties; others describe trends, seasonality, or missing data.
      3. Final Answer:

        Its statistical properties like mean and variance do not change over time -> Option D
      4. Quick Check:

        Stationary = constant mean/variance [OK]
      Hint: Stationary means stats don't change over time [OK]
      Common Mistakes:
      • Confusing stationarity with trend presence
      • Thinking seasonality means stationarity
      • Assuming missing data affects stationarity
      2. Which Python code correctly applies first-order differencing to a pandas Series data?
      easy
      A. data.dropna()
      B. data.diff(1)
      C. data.cumsum()
      D. data.shift(1)

      Solution

      1. Step 1: Recall differencing method in pandas

        The diff(1) method calculates the difference between current and previous values, performing first-order differencing.
      2. Step 2: Check other options

        shift(1) shifts data, cumsum() sums cumulatively, and dropna() removes missing values, none perform differencing.
      3. Final Answer:

        data.diff(1) -> Option B
      4. Quick Check:

        First difference = diff(1) [OK]
      Hint: Use diff(1) for first-order differencing in pandas [OK]
      Common Mistakes:
      • Using shift instead of diff for differencing
      • Confusing cumulative sum with differencing
      • Dropping NaNs instead of differencing
      3. Given this code snippet:
      import pandas as pd
      series = pd.Series([10, 12, 15, 20, 25])
      diff_series = series.diff(1).dropna()
      print(diff_series.tolist())

      What is the output?
      medium
      A. [0, 2, 3, 5, 5]
      B. [10, 12, 15, 20, 25]
      C. [2.0, 3.0, 5.0, 5.0]
      D. [nan, 2, 3, 5, 5]

      Solution

      1. Step 1: Calculate first differences

        Differences: 12-10=2, 15-12=3, 20-15=5, 25-20=5.
      2. Step 2: Drop NaN and print list

        The first difference is NaN, dropped by dropna(), so output is [2.0, 3.0, 5.0, 5.0].
      3. Final Answer:

        [2.0, 3.0, 5.0, 5.0] -> Option C
      4. Quick Check:

        Diff values = [2.0,3.0,5.0,5.0] [OK]
      Hint: First diff drops first NaN, output is differences list [OK]
      Common Mistakes:
      • Including NaN in output list
      • Printing original series instead of differences
      • Confusing shift with diff output
      4. You applied first-order differencing to a time series but it still shows a trend. What is the likely issue?
      medium
      A. The series needs second-order differencing to remove the trend
      B. You should use cumulative sum instead of differencing
      C. The series is already stationary and differencing added noise
      D. You forgot to normalize the data before differencing

      Solution

      1. Step 1: Understand differencing orders

        First-order differencing removes linear trends; if trend remains, higher order differencing may be needed.
      2. Step 2: Evaluate other options

        Cumulative sum adds trend, normalization doesn't remove trend, and differencing adding noise means series was not stationary before.
      3. Final Answer:

        The series needs second-order differencing to remove the trend -> Option A
      4. Quick Check:

        Trend remains -> try second differencing [OK]
      Hint: If trend remains, increase differencing order [OK]
      Common Mistakes:
      • Using cumulative sum instead of differencing
      • Assuming normalization removes trend
      • Stopping at first differencing without checking stationarity
      5. You have a monthly sales time series with a yearly seasonal pattern and an upward trend. Which differencing approach should you apply to make it stationary?
      hard
      A. Apply first-order differencing followed by seasonal differencing with lag 12
      B. Apply only first-order differencing
      C. Apply only seasonal differencing with lag 12
      D. Apply logarithm transformation without differencing

      Solution

      1. Step 1: Identify components to remove

        The series has both trend and yearly seasonality, so both need to be removed for stationarity.
      2. Step 2: Choose differencing methods

        First-order differencing removes trend; seasonal differencing with lag 12 removes yearly seasonality.
      3. Step 3: Combine differencing steps

        Applying first-order differencing then seasonal differencing is the correct approach to achieve stationarity.
      4. Final Answer:

        Apply first-order differencing followed by seasonal differencing with lag 12 -> Option A
      5. Quick Check:

        Trend + seasonality -> first + seasonal differencing [OK]
      Hint: Remove trend then seasonality with two differencing steps [OK]
      Common Mistakes:
      • Applying only one differencing type ignoring trend or seasonality
      • Using log transform alone to fix non-stationarity
      • Confusing seasonal lag with differencing order