Source freshness checks in dbt - Time & Space Complexity

Time Complexity: Source freshness checks
O(n)
Understanding Time Complexity

We want to understand how the time it takes to check source freshness grows as the number of source tables increases.

How does dbt handle checking many sources and their freshness efficiently?

Scenario Under Consideration

Analyze the time complexity of the following dbt source freshness check snippet.

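sources:
  - name: my_source
    # Assumed timestamp column name; dbt generally needs a loaded_at_field
    # (or adapter metadata) to compute freshness for a table
    loaded_at_field: _etl_loaded_at
    freshness:
      warn_after:
        count: 12
        period: hour
      error_after:
        count: 24
        period: hour
    # Hypothetical table names; each table inherits the rules above
    tables:
      - name: orders
      - name: customers

# dbt runs freshness checks for each source table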

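This snippet defines freshness rules for a source: warn when the data is more than 12 hours old, and raise an error when it is more than 24 hours old. dbt compares each source table's most recent load time against these thresholds.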
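
To make the threshold comparison concrete, here is a minimal Python sketch of the check applied to a single table. It is an illustration, not dbt's internal code: the helper name `freshness_status`, the exact boundary handling (strictly greater than), and the sample timestamps are all assumptions.

```python
from datetime import datetime, timedelta, timezone

# Thresholds mirroring the snippet: warn after 12 hours, error after 24 hours.
WARN_AFTER = timedelta(hours=12)
ERROR_AFTER = timedelta(hours=24)


def freshness_status(max_loaded_at: datetime, now: datetime) -> str:
    """Classify one table by the age of its newest row (hypothetical helper)."""
    age = now - max_loaded_at
    if age > ERROR_AFTER:
        return "error"
    if age > WARN_AFTER:
        return "warn"
    return "pass"


# Fixed "current time" so the example is reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(freshness_status(now - timedelta(hours=3), now))   # pass
print(freshness_status(now - timedelta(hours=15), now))  # warn
print(freshness_status(now - timedelta(hours=30), now))  # error
```

The comparison itself is constant-time work per table; what matters for the analysis is how many times it has to be repeated.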

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: Checking the freshness timestamp for each source table.
  • How many times: Once per source table configured in dbt.
How Execution Grows With Input

As the number of source tables grows, dbt checks each one individually.

Input Size (n)    Approx. Operations
10                10 freshness checks
100               100 freshness checks
1000              1000 freshness checks

Pattern observation: The number of freshness checks grows directly with the number of source tables.
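
The pattern can be reproduced with a small Python simulation. This is only a sketch: `run_freshness_checks` is a hypothetical stand-in that counts one check per table instead of querying a warehouse, and the table names are made up.

```python
def run_freshness_checks(tables):
    """Run one stubbed check per table and return how many checks ran."""
    checks = 0
    for _table in tables:
        # Stand-in for the single freshness query issued for this table.
        checks += 1
    return checks


for n in (10, 100, 1000):
    tables = [f"table_{i}" for i in range(n)]
    print(n, run_freshness_checks(tables))
# prints:
# 10 10
# 100 100
# 1000 1000
```

The loop body runs exactly once per table, so the count of checks always equals the input size.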

Final Time Complexity

Time Complexity: O(n)

This means the total time to check freshness grows linearly with the number of source tables, n: doubling the number of tables roughly doubles the number of checks.

Common Mistake

[X] Wrong: "Checking freshness is constant time no matter how many sources there are."

[OK] Correct: Each source table requires its own check, so more sources mean more checks and more time.

Interview Connect

Understanding how operations scale with input size helps you explain efficiency clearly and confidently in real projects.

Self-Check

"What if dbt cached freshness results and only checked sources updated recently? How would that affect time complexity?"
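
One way to explore this question is with a toy model. The sketch below assumes a hypothetical caching scheme (not a dbt feature described here) in which a table is rechecked only if it has no cached result or is known to have been updated recently. The function name `checks_with_cache` and the way "recently updated" tables are supplied are both assumptions.

```python
def checks_with_cache(tables, cache, recently_updated):
    """Count real checks when cached results are reused (hypothetical scheme)."""
    checks = 0
    for table in tables:
        if table in cache and table not in recently_updated:
            continue  # reuse the cached result; no query is issued
        cache[table] = "checked"
        checks += 1
    return checks


tables = [f"table_{i}" for i in range(100)]
cache = {}

# First run: the cache is empty, so every table is checked.
first = checks_with_cache(tables, cache, recently_updated=set())

# Second run: only the two recently updated tables are rechecked.
second = checks_with_cache(tables, cache, recently_updated={"table_3", "table_7"})

print(first, second)  # 100 2
```

In this model the expensive checks drop from n to k, the number of recently updated tables. The loop still visits every table, and if every table was updated then k equals n, so the worst case remains O(n).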