Seeds for static reference data in dbt - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of a dbt seed CSV import
Given a seed CSV file named countries.csv with the following content:
country_code,country_name
US,United States
CA,Canada
MX,Mexico

and a dbt project configured to load this seed, what will be the output of the SQL query select * from {{ ref('countries') }}?
A. [{'country_code': 'US', 'country_name': 'United States'}, {'country_code': 'CA', 'country_name': 'Canada'}, {'country_code': 'MX', 'country_name': 'Mexico'}]
B. [{'country_code': 'US', 'country_name': 'United States'}, {'country_code': 'CA', 'country_name': 'Canada'}]
C. []
D. SyntaxError: table 'countries' does not exist
💡 Hint
Think about what dbt seeds do with CSV files and how they become tables.
Data Output
intermediate
Number of rows in a seeded table after dbt seed
If you have a seed CSV file with 10 rows (excluding header) and run dbt seed, how many rows will the resulting table contain?
A. 9
B. 11
C. 10
D. 0
💡 Hint
Remember that the header row is not counted as data.
🔧 Debug
advanced
Why does dbt seed fail to load a CSV with inconsistent columns?
You have a seed CSV file where some rows have fewer columns than the header row. Running dbt seed results in an error. What is the most likely cause?
A. dbt seed requires the CSV file to be sorted alphabetically.
B. The CSV file has an inconsistent number of columns per row, which causes parsing errors.
C. The CSV file must have no header row for dbt seed to work.
D. dbt seed only supports CSV files with exactly 5 columns.
💡 Hint
Think about how CSV parsers handle rows with missing columns.
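
To make the hint concrete, here is a minimal sketch of a pre-flight check for ragged rows, written with Python's standard csv module. It is not dbt's own parser: the find_ragged_rows helper and the sample data are hypothetical. Python's csv.reader does not raise on a short row, it simply returns a shorter list, so the width comparison has to be explicit.

```python
import csv
import io


def find_ragged_rows(csv_text):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first line defines the expected column count
    problems = []
    for row in reader:
        # Blank lines come back as empty lists; ignore them here.
        if row and len(row) != len(header):
            problems.append((reader.line_num, len(row)))
    return problems


# Hypothetical seed content: the CA row is missing its third column.
SAMPLE = (
    "country_code,country_name,region\n"
    "US,United States,Americas\n"
    "CA,Canada\n"
    "MX,Mexico,Americas\n"
)

ragged = find_ragged_rows(SAMPLE)
print(ragged)  # [(3, 2)]
```

Running a check like this before loading makes the offending line number visible instead of leaving it to the loader's error message.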
🚀 Application
advanced
Using seeds for static lookup tables in dbt models
You want to use a static lookup table for country codes in your dbt models. Which approach correctly uses a seed for this purpose?
A. Create a CSV file with country codes, add it to the seeds folder, run dbt seed, then reference it in models using {{ ref('filename_without_extension') }}.
B. Write the country codes directly inside the model SQL using a VALUES clause.
C. Create a separate model with hardcoded country codes and join it in other models.
D. Use an external API call inside the dbt model to fetch country codes dynamically.
💡 Hint
Seeds are designed to load static CSV data as tables for easy reference.
🧠 Conceptual
expert
Best practice for updating static reference data with dbt seeds
You have a seed CSV file used for static reference data in production. What is the best practice to update this data safely?
A. Rename the seed CSV file and add a new seed to avoid overwriting existing data.
B. Manually delete the seed table in the warehouse, then run dbt seed.
C. Edit the seed table directly in the warehouse using SQL UPDATE statements.
D. Update the CSV file, then run dbt seed --full-refresh to reload the table with the new data.
💡 Hint
Think about how dbt manages seed tables and refreshing data.

Practice

1. What is the main purpose of using seeds in dbt?
easy
A. To create dynamic tables based on SQL queries
B. To load static reference data from CSV files into your database
C. To schedule dbt runs automatically
D. To write Python scripts for data transformation

Solution

  1. Step 1: Understand what seeds are in dbt

    Seeds are CSV files that contain static reference data you want to load into your database.
  2. Step 2: Identify the main use of seeds

    Seeds let you easily add fixed data tables without writing SQL queries.
  3. Final Answer:

    To load static reference data from CSV files into your database -> Option B
  4. Quick Check:

    Seeds = static CSV data load [OK]
Hint: Seeds = fixed CSV data loaded as tables [OK]
Common Mistakes:
  • Confusing seeds with models that run SQL
  • Thinking seeds schedule dbt runs
  • Assuming seeds are for dynamic data
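
As a rough mental model of what loading a seed amounts to, the sketch below reads a small CSV and materializes it as a table. It uses Python's csv and sqlite3 modules as stand-ins and is not how dbt itself is implemented. The load_seed helper, the in-memory database, and the all-TEXT column types are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Contents of a hypothetical seeds/countries.csv file.
COUNTRIES_CSV = """country_code,country_name
US,United States
CA,Canada
MX,Mexico
"""


def load_seed(conn, table_name, csv_text):
    """Create a table named after the seed and insert every data row.

    The header row supplies the column names; each later row becomes
    one table row. Returns the number of rows inserted.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # column names, not data
    rows = [row for row in reader if row]  # skip blank lines
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_seed(conn, "countries", COUNTRIES_CSV)
print(loaded)  # data rows only; the header is excluded
print(conn.execute("SELECT * FROM countries").fetchall())
```

The table takes its name from the file (countries.csv becomes countries), which is why no SQL has to be written to create it.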
2. Which command do you run to load or refresh seed data in your database?
easy
A. dbt test
B. dbt run
C. dbt seed
D. dbt compile

Solution

  1. Step 1: Recall dbt commands related to seeds

    The command dbt seed loads CSV seed files into the database as tables.
  2. Step 2: Differentiate from other commands

    dbt run runs models, dbt test runs tests, and dbt compile compiles SQL but does not load seeds.
  3. Final Answer:

    dbt seed -> Option C
  4. Quick Check:

    Load seeds = dbt seed [OK]
Hint: Use 'dbt seed' to load CSV data tables [OK]
Common Mistakes:
  • Using 'dbt run' to load seeds
  • Confusing 'dbt test' with loading data
  • Thinking 'dbt compile' loads data
3. Given a seed CSV file countries.csv with columns id and name, what will be the output of this dbt model SQL?
select * from {{ ref('countries') }}
medium
A. A table with all rows and columns from countries.csv
B. Only the id column from countries.csv
C. An empty table because seeds are not loaded automatically
D. An error because seeds cannot be referenced

Solution

  1. Step 1: Understand how seeds are referenced in dbt

    Seeds become tables in the database and can be referenced using ref() like models.
  2. Step 2: Predict the query output

    The query selects all columns and rows from the seed table countries, so it returns the full CSV data.
  3. Final Answer:

    A table with all rows and columns from countries.csv -> Option A
  4. Quick Check:

    ref(seed) = full seed table [OK]
Hint: ref(seed_name) returns full seed table [OK]
Common Mistakes:
  • Thinking seeds cannot be referenced
  • Assuming seeds load empty tables
  • Expecting partial columns only
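
The idea that ref() simply resolves to the seed's table can be sketched with a toy renderer. This is a simplified illustration, not dbt's Jinja compiler: the render helper, the RELATIONS mapping, and the SQLite database standing in for the warehouse are all assumptions.

```python
import re
import sqlite3

# Stand-in warehouse holding an already-loaded seed table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [(1, "United States"), (2, "Canada"), (3, "Mexico")],
)

# Hypothetical mapping from seed name to a schema-qualified relation.
RELATIONS = {"countries": '"main"."countries"'}
REF_PATTERN = re.compile(r"\{\{\s*ref\('([^']+)'\)\s*\}\}")


def render(model_sql):
    """Replace each {{ ref('name') }} with its relation name (toy version)."""
    return REF_PATTERN.sub(lambda match: RELATIONS[match.group(1)], model_sql)


model_sql = "select * from {{ ref('countries') }}"
compiled = render(model_sql)
rows = conn.execute(compiled).fetchall()

print(compiled)  # select * from "main"."countries"
print(rows)      # every row and both columns of the seed table
```

Once the placeholder is replaced, the query is an ordinary select against an ordinary table, so it returns the full contents of the seed.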
4. You ran dbt seed but your seed table did not update. Which of these is the most likely cause?
medium
A. You forgot to add the seed CSV file in the seeds folder
B. You ran dbt run instead of dbt seed
C. Your seed CSV file has syntax errors
D. You did not configure the seed in dbt_project.yml

Solution

  1. Step 1: Check seed discovery mechanism

    dbt automatically discovers and loads CSV files from the seeds/ folder with dbt seed.
  2. Step 2: Identify why table doesn't update

    If the CSV file is not in the seeds/ folder, dbt never discovers it. dbt seed still completes without error, but it does not touch that table, so the table stays unchanged.
  3. Final Answer:

    You forgot to add the seed CSV file in the seeds folder -> Option A
  4. Quick Check:

    Seeds folder missing CSV = no update [OK]
Hint: Place seed CSVs in seeds/ folder for dbt seed [OK]
Common Mistakes:
  • Thinking seeds require config in dbt_project.yml
  • Confusing dbt run with dbt seed
  • Blaming CSV syntax errors (these would cause an explicit failure, not a silent no-op)
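
The discovery step can be approximated in a few lines. The sketch below is an illustration rather than dbt's actual code: discover_seeds is a hypothetical helper, the directory names are made up, and it assumes the default seed location of seeds/ (configurable through seed-paths in dbt_project.yml).

```python
import tempfile
from pathlib import Path


def discover_seeds(project_dir, seed_dir="seeds"):
    """Return seed names (file stems) for CSV files under the seed directory."""
    root = Path(project_dir) / seed_dir
    return sorted(path.stem for path in root.rglob("*.csv"))


with tempfile.TemporaryDirectory() as project:
    project_path = Path(project)
    (project_path / "seeds").mkdir()
    (project_path / "data_drafts").mkdir()
    # This file is inside seeds/, so it is found.
    (project_path / "seeds" / "countries.csv").write_text("id,name\n1,Canada\n")
    # This file is outside seeds/, so it is never considered.
    (project_path / "data_drafts" / "currencies.csv").write_text("code,symbol\nUSD,$\n")
    found = discover_seeds(project_path)

print(found)  # ['countries']
```

A file that never shows up in the discovered list produces no error and no update, which matches the symptom described in the question.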
5. You want to use a seed file currencies.csv with columns code and symbol inside a model to join with a transactions table on currency_code. Which is the correct way to write the join in your model SQL?
hard
A. select t.*, c.symbol from transactions t join currencies c on t.currency_code = c.code
B. select t.*, c.symbol from transactions t join currencies.csv c on t.currency_code = c.code
C. select t.*, c.symbol from transactions t join seed('currencies') c on t.currency_code = c.code
D. select t.*, c.symbol from transactions t join {{ ref('currencies') }} c on t.currency_code = c.code

Solution

  1. Step 1: Recall how to reference seeds in dbt models

    Seeds are referenced using {{ ref('seed_name') }} to get the table name in SQL.
  2. Step 2: Identify the correct join syntax

    Joining transactions with {{ ref('currencies') }} correctly uses the seed table in the join.
  3. Final Answer:

    select t.*, c.symbol from transactions t join {{ ref('currencies') }} c on t.currency_code = c.code -> Option D
  4. Quick Check:

    Join seed with ref() = correct [OK]
Hint: Use ref('seed_name') to join seed tables in models [OK]
Common Mistakes:
  • Using raw CSV filename in SQL
  • Forgetting to use ref() for seeds
  • Trying to use a non-existent seed() function
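
To see why the ref() form works while the raw filename does not, the following sketch runs both against a stand-in database. It assumes SQLite in place of the warehouse and a toy render helper in place of dbt's compiler; the sample rows and the RELATIONS mapping are hypothetical.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER, amount REAL, currency_code TEXT)"
)
conn.execute("CREATE TABLE currencies (code TEXT, symbol TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 10.0, "USD"), (2, 25.5, "EUR"), (3, 7.0, "USD")],
)
conn.executemany(
    "INSERT INTO currencies VALUES (?, ?)", [("USD", "$"), ("EUR", "€")]
)

# Hypothetical mapping from seed name to relation name.
RELATIONS = {"currencies": '"main"."currencies"'}
REF_PATTERN = re.compile(r"\{\{\s*ref\('([^']+)'\)\s*\}\}")


def render(model_sql):
    """Replace each {{ ref('name') }} with its relation name (toy version)."""
    return REF_PATTERN.sub(lambda match: RELATIONS[match.group(1)], model_sql)


# Option D: the seed is referenced through ref().
good_sql = render(
    "select t.*, c.symbol from transactions t "
    "join {{ ref('currencies') }} c on t.currency_code = c.code "
    "order by t.id"
)
rows = conn.execute(good_sql).fetchall()
print(rows)

# Option B: the raw filename is not a table the database knows about.
try:
    conn.execute(
        "select t.*, c.symbol from transactions t "
        "join currencies.csv c on t.currency_code = c.code"
    )
    raw_filename_failed = False
except sqlite3.Error as error:
    raw_filename_failed = True
    print("raw filename failed:", error)
```

In a real project, ref() also records the dependency, so dbt knows the model depends on the seed. A hardcoded table name gives up that dependency tracking even where the name happens to resolve.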