What if you never had to copy-paste static data again and could trust it was always correct?
Why Seeds for static reference data in dbt? - Purpose & Use Cases
Imagine you have a list of country codes and names stored in a spreadsheet. Every time you build your data models, you manually copy and paste this list into your queries or scripts.
It feels like a small task, but it happens over and over, and you worry about typos or outdated info.
Manually copying static reference data is slow and error-prone. You might forget to update the list, causing wrong results.
It also makes your data models messy and hard to maintain because the same data is repeated in many places.
Using seeds in dbt lets you store static reference data as CSV files inside your project.
dbt automatically loads this data as tables you can join with your models, keeping everything clean, consistent, and easy to update.
Before, with the reference data hard-coded in the query:
SELECT * FROM sales JOIN (VALUES ('US', 'United States'), ('CA', 'Canada')) AS countries(code, name) ON sales.country_code = countries.code
After, with the same data loaded from a seed:
SELECT * FROM sales JOIN {{ ref('countries_seed') }} AS countries_seed ON sales.country_code = countries_seed.code
You can manage static reference data in one place and reuse it across your entire project without duplicating it in every query.
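To make the mechanism concrete, here is a rough sketch of what ends up in the warehouse once a seed file has been loaded. It assumes a hypothetical seeds/countries_seed.csv with a header row and two data rows. The statements dbt actually generates depend on the adapter, and the column types shown are assumptions, since dbt infers types from the CSV unless you configure them.

```sql
-- Assumed contents of seeds/countries_seed.csv (hypothetical file):
--   code,name
--   US,United States
--   CA,Canada

-- Roughly equivalent plain SQL for the table the seed becomes.
-- The table takes its name from the CSV file name, without the extension.
-- Column types below are assumptions for illustration.
CREATE TABLE countries_seed (
    code VARCHAR(2),
    name VARCHAR(100)
);

INSERT INTO countries_seed (code, name) VALUES
    ('US', 'United States'),
    ('CA', 'Canada');
```

Because the data lives in one CSV file under version control, adding a country means editing that file and reloading the seed, rather than hunting down every query that embeds the list.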
A retail company uses seeds to store product categories and tax rates as static data, ensuring all sales reports use the same consistent info.
Manual copying of static data is slow and risky.
Seeds let you store static data as CSV files inside dbt projects.
This keeps your data models clean, consistent, and easy to maintain.
Practice
seeds in dbt?
Solution
Step 1: Understand what seeds are in dbt
Seeds are CSV files that contain static reference data you want to load into your database.
Step 2: Identify the main use of seeds
Seeds let you easily add fixed data tables without writing SQL queries.
Final Answer:
To load static reference data from CSV files into your database -> Option B
Quick Check:
Seeds = static CSV data load [OK]
- Confusing seeds with models that run SQL
- Thinking seeds schedule dbt runs
- Assuming seeds are for dynamic data
Solution
Step 1: Recall dbt commands related to seeds
The command dbt seed loads CSV seed files into the database as tables.
Step 2: Differentiate from other commands
dbt run builds models, dbt test runs tests, and dbt compile only renders SQL. None of these three commands loads seed files.
Final Answer:
dbt seed -> Option C
Quick Check:
Load seeds = dbt seed [OK]
- Using 'dbt run' to load seeds
- Confusing 'dbt test' with loading data
- Thinking 'dbt compile' loads data
countries.csv with columns id and name, what will be the output of this dbt model SQL?
select * from {{ ref('countries') }}
Solution
Step 1: Understand how seeds are referenced in dbt
Seeds become tables in the database and can be referenced using ref() like models.
Step 2: Predict the query output
The query selects all columns and rows from the seed table countries, so it returns the full CSV data.
Final Answer:
A table with all rows and columns from countries.csv -> Option A
Quick Check:
ref(seed) = full seed table [OK]
- Thinking seeds cannot be referenced
- Assuming seeds load empty tables
- Expecting partial columns only
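As a sketch of why this works: when dbt compiles the model select * from {{ ref('countries') }}, it replaces the ref() call with the qualified name of the table that the seed was loaded into. The database and schema names below (analytics, dbt_dev) are hypothetical. The real ones come from your profile and project configuration, and identifier quoting varies by adapter.

```sql
-- Roughly what the compiled model looks like.
-- "analytics" and "dbt_dev" are assumed names, not values from this lesson.
select * from analytics.dbt_dev.countries
```

The seed has to be loaded before the model runs, otherwise the compiled query points at a table that does not exist yet.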
dbt seed but your seed table did not update. Which of these is the most likely cause?
Solution
Step 1: Check seed discovery mechanism
dbt automatically discovers CSV files in the seeds/ folder (the default seed path) and loads them when you run dbt seed.
Step 2: Identify why table doesn't update
If the CSV file is missing from the seeds/ folder, dbt has nothing to load for it: dbt seed completes without an error for that file and the existing table is left unchanged.
Final Answer:
You forgot to add the seed CSV file in the seeds folder -> Option A
Quick Check:
Seeds folder missing CSV = no update [OK]
- Thinking seeds require config in dbt_project.yml
- Confusing dbt run with dbt seed
- CSV syntax errors (would cause explicit failure)
currencies.csv with columns code and symbol inside a model to join with a transactions table on currency_code. Which is the correct way to write the join in your model SQL?
Solution
Step 1: Recall how to reference seeds in dbt models
Seeds are referenced using {{ ref('seed_name') }} to get the table name in SQL.
Step 2: Identify the correct join syntax
Joining transactions with {{ ref('currencies') }} correctly uses the seed table in the join.
Final Answer:
select t.*, c.symbol from transactions t join {{ ref('currencies') }} c on t.currency_code = c.code -> Option D
Quick Check:
Join seed with ref() = correct [OK]
- Using raw CSV filename in SQL
- Forgetting to use ref() for seeds
- Trying to use a non-existent seed() function
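The same join, laid out as it might appear in a model file. This is a sketch with two illustrative choices that are not part of the quiz answer: it assumes transactions is itself a dbt model (hence ref('transactions'), where the answer uses a bare table name), and it uses a left join so that transactions whose currency code is missing from the seed are kept instead of being dropped. The file name is hypothetical.

```sql
-- Hypothetical model file: models/transactions_with_symbol.sql
select
    t.*,
    c.symbol  -- null when the currency code is absent from the seed
from {{ ref('transactions') }} t  -- assumes a model named transactions exists
left join {{ ref('currencies') }} c
    on t.currency_code = c.code
```

With an inner join, a currency code that is missing from the seed would quietly remove rows from the result. With the left join, those rows remain and show a null symbol, which makes gaps in the reference data easier to spot.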
