Configuring sources in YAML in dbt - Visual Walkthrough

Concept Flow - Configuring sources in YAML
1. Start YAML file
2. Define 'sources' key
3. Add source name
4. Add tables under source
5. Specify table details (name, description)
6. Save YAML
7. dbt reads source config
8. Sources available for models
This flow shows how to write a YAML file to define data sources and tables for dbt to use in models.
Execution Sample
sources:
  - name: raw_data
    tables:
      - name: users
        description: 'User data from app'
Defines a source named 'raw_data' with a table 'users' and a description.
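For context, here is a slightly fuller sketch of the same source as it might appear in a project. Older dbt versions expect `version: 2` at the top of the file. The file path and the `database` and `schema` values (`analytics`, `app_raw`) are hypothetical and not part of the walkthrough. If `schema` is omitted, dbt uses the source name as the schema.

```yaml
# models/sources.yml  (hypothetical path; any .yml file under the models directory works)
version: 2

sources:
  - name: raw_data
    database: analytics    # assumed database name, not from the walkthrough
    schema: app_raw        # assumed schema name; defaults to the source name if omitted
    tables:
      - name: users
        description: 'User data from app'

# A model can then select from this table with:
#   select * from {{ source('raw_data', 'users') }}
```

The `source()` call takes the source name first and the table name second, matching the two `name:` values in the YAML.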
Execution Table
| Step | YAML Line | Action | State Change | Result |
|------|-----------|--------|--------------|--------|
| 1 | sources: | Start defining sources | Create 'sources' key | Empty list for sources |
| 2 | - name: raw_data | Add source name | Append source dict with name 'raw_data' | sources = [{'name': 'raw_data'}] |
| 3 | tables: | Add tables key | Add empty 'tables' list to source | sources[0]['tables'] = [] |
| 4 | - name: users | Add table name | Append table dict with name 'users' | sources[0]['tables'] = [{'name': 'users'}] |
| 5 | description: 'User data from app' | Add description | Add description to table dict | sources[0]['tables'][0]['description'] = 'User data from app' |
| 6 | End of YAML | Finish parsing | YAML fully parsed | Source config ready for dbt |
💡 Reached end of YAML file, source configuration complete.
Variable Tracker
| Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | Final |
|----------|-------|--------------|--------------|--------------|--------------|-------|
| sources | undefined | [{'name': 'raw_data'}] | [{'name': 'raw_data', 'tables': []}] | [{'name': 'raw_data', 'tables': [{'name': 'users'}]}] | [{'name': 'raw_data', 'tables': [{'name': 'users', 'description': 'User data from app'}]}] | [{'name': 'raw_data', 'tables': [{'name': 'users', 'description': 'User data from app'}]}] |
Key Moments - 2 Insights
Why do we indent 'tables' under the source name?
Because YAML uses indentation to show hierarchy. 'tables' belongs to the source 'raw_data', so it must be indented under it as shown in step 3 of the execution table.
What happens if we forget to add a description for a table?
dbt still recognizes the table, but it has no description metadata, so nothing explains it in the generated documentation. The description added in step 5 is optional but helpful.
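As a small illustration of the optional description, the sketch below adds a second table, `orders`, which is a hypothetical name not used elsewhere in this walkthrough. It is declared with only a `name:`, which is enough for dbt to register it.

```yaml
sources:
  - name: raw_data
    tables:
      - name: users
        description: 'User data from app'
      - name: orders       # hypothetical second table; valid without a description
```

Both tables can be referenced from models. Only `users` carries descriptive text.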
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the state of 'sources' after step 4?
A. [{'name': 'raw_data', 'tables': []}]
B. [{'name': 'raw_data', 'tables': [{'name': 'users'}]}]
C. [{'name': 'raw_data'}]
D. undefined
💡 Hint
Check the 'State Change' column at step 4 in the execution table.
At which step is the table description added?
A. Step 3
B. Step 2
C. Step 5
D. Step 6
💡 Hint
Look for the 'Add description' action in the execution table.
If we remove the 'tables' key, what would happen to the source configuration?
A. dbt will have no tables under the source, so no tables to reference.
B. The source will be invalid and cause an error.
C. The source will automatically add default tables.
D. The source will be ignored completely.
💡 Hint
Think about the role of 'tables' in the variable tracker and the execution table.
Concept Snapshot
Configuring sources in YAML for dbt:
- Use 'sources:' at top level
- Define each source with '- name:'
- Under each source, add 'tables:' list
- Each table has '- name:' and optional 'description:'
- Indentation shows hierarchy
- dbt reads this to know where data comes from
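The points above can be sketched with more than one source in the same file. The second source, `billing`, and its `invoices` table are hypothetical names added only to show that each source is its own `- name:` entry with its own `tables:` list.

```yaml
sources:
  - name: raw_data
    tables:
      - name: users
        description: 'User data from app'
  - name: billing          # hypothetical second source
    tables:
      - name: invoices     # hypothetical table
        description: 'Invoices exported from the billing system'
```

Each `- name:` at the source level sits at the same indentation, and each table entry is indented under its own source's `tables:` key.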
Full Transcript
This visual execution shows how to configure sources in YAML for dbt. We start by creating a 'sources' key, then add a source name. Under that source, we add a 'tables' list. Each table has a name and can have a description. Indentation is important to show the structure. The execution table traces each step of parsing the YAML lines and how the internal data structure changes. The variable tracker shows how the 'sources' variable builds up step by step. Key moments clarify why indentation matters and the role of descriptions. The quiz tests understanding of the state after steps and the effect of missing keys. This helps beginners see exactly how dbt reads source configs from YAML.