
What Is a Seed in dbt: Definition, Example, and Use Cases

In dbt, a seed is a CSV file that you add to your project to load static data directly into your data warehouse. Seeds let you manage small reference tables or lookup data as part of your dbt workflow without needing external sources.
⚙️

How It Works

Think of a seed in dbt like a small spreadsheet you keep inside your project folder. Instead of connecting to an external database or API, you store this data as a CSV file. When you run dbt seed, dbt reads the CSV and loads it into your warehouse as a table named after the file.

This is useful because it keeps your reference data version-controlled and easy to update alongside your transformations. It’s like having a mini database table that you manage directly in your code, making your project more self-contained and reproducible.
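The loading step described above can be pictured with a minimal sketch. This is only an analogy, not dbt's actual implementation: it uses Python's standard csv and sqlite3 modules, with an in-memory SQLite database standing in for the warehouse. The helper name load_seed is hypothetical, and unlike dbt the sketch does not infer column types.

```python
import csv
import io
import sqlite3

# Sample seed contents, held in a string so the sketch is self-contained.
SEED_CSV = """id,name,continent
1,Canada,North America
2,Germany,Europe
3,Japan,Asia
"""


def load_seed(conn, table_name, csv_text):
    """Hypothetical helper: create a table named after the seed and insert every CSV row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first CSV line supplies the column names
    columns = ", ".join(f'"{col}"' for col in header)
    placeholders = ", ".join("?" for _ in header)
    # Rebuilding the table on every load is a choice made for this sketch.
    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
loaded = load_seed(conn, "countries", SEED_CSV)
print(loaded)  # 3
print(conn.execute("SELECT name FROM countries ORDER BY id").fetchall())
```

The point of the sketch is the shape of the operation: the file name becomes the table name, the header row becomes the columns, and every remaining line becomes a row.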

💻

Example

This example shows how to add a seed CSV file and load it into your warehouse using dbt.

plaintext
/* 1. Create a CSV file named countries.csv in the 'seeds' folder of your dbt project (the default folder was 'data' in dbt versions before 1.0) */

id,name,continent
1,Canada,North America
2,Germany,Europe
3,Japan,Asia

/* 2. In your dbt project, run the command to load the seed data */
dbt seed

/* 3. After running, dbt creates a table named 'countries' in your warehouse with the CSV data */
Output
Table 'countries' created with 3 rows:

| id | name    | continent     |
|----|---------|---------------|
| 1  | Canada  | North America |
| 2  | Germany | Europe        |
| 3  | Japan   | Asia          |
🎯

When to Use

Use seeds when you have small, static datasets that rarely change but are needed for your transformations. Examples include country codes, product categories, or fixed lookup tables.

Seeds are great for keeping your project self-contained and avoiding dependencies on external data sources for reference data. They also help with version control since the CSV files live in your project repository.
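To make the lookup-table use case concrete, here is a small sketch of the kind of join a transformation might run against a seeded table. It again uses an in-memory SQLite database as a stand-in for the warehouse. The orders table, its columns, and its values are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse

# Stand-in for the table a seed would create from a countries CSV.
conn.execute("CREATE TABLE countries (id INTEGER, name TEXT, continent TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [(1, "Canada", "North America"), (2, "Germany", "Europe"), (3, "Japan", "Asia")],
)

# Hypothetical fact table; 'orders' and its rows are assumptions for this sketch.
conn.execute("CREATE TABLE orders (order_id INTEGER, country_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 3, 40.0), (102, 3, 10.0)],
)

# Enrich the fact rows with the reference data, then aggregate by continent.
revenue_by_continent = conn.execute(
    """
    SELECT c.continent, SUM(o.amount) AS total
    FROM orders AS o
    JOIN countries AS c ON c.id = o.country_id
    GROUP BY c.continent
    ORDER BY c.continent
    """
).fetchall()
print(revenue_by_continent)  # [('Asia', 50.0), ('North America', 25.0)]
```

In a real dbt model the same join would be written in SQL, and the seed would be referenced with {{ ref('countries') }} rather than by a hard-coded table name.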

Key Points

  • Seeds are CSV files stored in your dbt project under the seeds folder (named data in dbt versions before 1.0).
  • Running dbt seed loads these CSVs as tables in your data warehouse.
  • Seeds are ideal for small, static reference data.
  • They help keep your project version-controlled and self-contained.

Key Takeaways

  • A seed in dbt is a CSV file loaded as a table in your warehouse.
  • Seeds store small, static reference data inside your dbt project.
  • Use seeds to keep your project self-contained and version-controlled.
  • Run 'dbt seed' to load seed files into your warehouse.
  • Seeds simplify managing lookup tables without external dependencies.