Configuring sources in YAML in dbt - Performance & Efficiency

Time Complexity: Configuring sources in YAML is O(n)

Understanding Time Complexity

When we configure sources in YAML for dbt, we declare where our raw data comes from: the existing tables that our models will read.

We want to understand how the time dbt needs to process these configurations grows as we add more sources or tables.

Scenario Under Consideration

Analyze the time complexity of this YAML source configuration snippet.


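version: 2  # needed on older dbt versions; optional in recent ones

sources:
  - name: sales_db        # source name, as used in source('sales_db', ...)
    tables:
      - name: customers   # each entry is one table dbt has to read
      - name: orders
      - name: products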

This snippet defines one source with three tables listed under it.

Identify Repeating Operations

Look at what repeats when dbt reads this YAML configuration.

  • Primary operation: Reading each table entry under a source.
  • How many times: Once for each table listed in the source.

How Execution Grows With Input

As you add more tables to a source, dbt reads each one in turn.

Input Size (n) | Approx. Operations
10 tables      | 10 reads
100 tables     | 100 reads
1000 tables    | 1000 reads

Pattern observation: The work grows directly with the number of tables.
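
The counts in the table can be reproduced with a small Python sketch. This is only a toy model of the counting argument, not dbt's actual parser: the helper names `count_table_reads` and `make_source` are hypothetical, and a source is represented as a plain dictionary shaped like the YAML above.

```python
# Toy model of the counting argument; this is NOT dbt's actual parser.
def count_table_reads(source: dict) -> int:
    """Visit every table entry once and return how many visits were made."""
    reads = 0
    for _table in source.get("tables", []):
        reads += 1  # one "read" per table entry
    return reads


def make_source(n: int) -> dict:
    """Build a hypothetical source dict with n table entries."""
    return {"name": "sales_db", "tables": [{"name": f"table_{i}"} for i in range(n)]}


for n in (10, 100, 1000):
    print(n, "tables ->", count_table_reads(make_source(n)), "reads")
```

In this model the loop body runs exactly once per table entry, so the number of reads always equals n.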

Final Time Complexity

Time Complexity: O(n)

This means the time to process source configurations grows linearly with n, the number of table entries: doubling the number of tables roughly doubles the work.

Common Mistake

[X] Wrong: "Adding more tables won't affect processing time much because YAML is just text."

[OK] Correct: Even though YAML is text, dbt must read and process each table entry, so more tables mean more work.

Interview Connect

Being able to explain how processing time scales with the number of configured sources and tables is a practical way to reason about efficiency in real dbt projects.

Self-Check

What if we added multiple sources each with many tables? How would the time complexity change?
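
One way to explore this question is with a toy model in Python. It is a sketch under stated assumptions, not how dbt is implemented: sources are plain dictionaries, every source has the same number of tables, and the helper names `count_table_reads` and `make_sources` are hypothetical.

```python
# Toy model only: this is NOT how dbt is implemented internally.
def count_table_reads(sources: list) -> int:
    """Visit every table entry in every source once; return the visit count."""
    reads = 0
    for source in sources:                       # outer loop: each source
        for _table in source.get("tables", []):  # inner loop: its tables
            reads += 1
    return reads


def make_sources(num_sources: int, tables_per_source: int) -> list:
    """Build hypothetical sources, each with the same number of tables."""
    return [
        {
            "name": f"source_{s}",
            "tables": [{"name": f"table_{t}"} for t in range(tables_per_source)],
        }
        for s in range(num_sources)
    ]


for s, t in [(1, 100), (10, 100), (10, 1000)]:
    print(f"{s} sources x {t} tables -> {count_table_reads(make_sources(s, t))} reads")
```

In this model the count equals the total number of table entries across all sources (s × t when each source has t tables), so the work is still linear in the total number of tables.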