What is dbt Project Structure: Overview and Example
A dbt project structure is the folder and file layout that organizes the data transformation code, configurations, and tests in a dbt project. It includes folders such as models and macros, plus configuration files such as dbt_project.yml, which together tell dbt how to run and manage your analytics workflows.
How It Works
Think of a dbt project structure like a well-organized kitchen where each tool and ingredient has its place. This structure helps you keep your SQL models, tests, and macros tidy and easy to find. When dbt runs, it looks into these folders to know what transformations to apply and how to build your data tables.
The main folder is models, where you write SQL files that define how raw data should be transformed. Other folders like macros hold reusable snippets of SQL code written with Jinja, similar to kitchen recipes you use often. The dbt_project.yml file acts like a kitchen blueprint: it names the project, tells dbt which folders to read, and sets default configurations for your models.
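To make the role of dbt_project.yml concrete, here is a minimal sketch of what such a file might contain for the layout described in this article. The project name, profile name, and materialization choices are assumptions made for illustration, not settings taken from a real project.

```yaml
# dbt_project.yml (illustrative sketch; names and values are assumptions)
name: my_dbt_project
version: "1.0.0"
config-version: 2

# Connection profile defined in profiles.yml (assumed to share the project name)
profile: my_dbt_project

# Folders dbt scans for each type of resource
model-paths: ["models"]
macro-paths: ["macros"]
test-paths: ["tests"]

# Default configuration applied per folder under models/
models:
  my_dbt_project:
    staging:
      +materialized: view    # assumed choice: lightweight views for staging
    marts:
      +materialized: table   # assumed choice: tables for reporting models
```

The path settings shown match dbt's defaults, so they can usually be left out; they are spelled out here only to show how the file maps to the folders.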
Example
my_dbt_project/
├── dbt_project.yml
├── models/
│   ├── staging/
│   │   └── stg_customers.sql
│   └── marts/
│       └── customers.sql
├── macros/
│   └── my_macros.sql
└── tests/
    └── unique_customer_id.sql
When to Use
Use a dbt project structure whenever you want to build, organize, and maintain data transformations in a clear and scalable way. It is especially useful in teams where multiple people work on analytics code, as it keeps everything consistent and easy to understand.
For example, if you are building a data warehouse and need to transform raw data into clean tables for reporting, dbt’s project structure helps you separate raw staging models from business logic models. It also supports adding tests and documentation to ensure data quality.
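As a rough illustration of how tests and documentation can sit next to the models, the sketch below shows a YAML properties file for the staging model. The file name and location (models/staging/schema.yml), the descriptions, and the customer_id column are assumptions inferred from the example project, not confirmed details.

```yaml
# models/staging/schema.yml (hypothetical file name and location)
version: 2

models:
  - name: stg_customers
    description: "Cleaned customer records from the raw source."
    columns:
      - name: customer_id        # assumed column name
        description: "Primary key for a customer."
        tests:                   # newer dbt releases also accept data_tests
          - unique
          - not_null
```

With a file like this in place, running dbt test would check that customer_id is unique and never null in stg_customers.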
Key Points
- The models folder contains SQL files for data transformations.
- dbt_project.yml configures the project settings and folder paths.
- macros hold reusable SQL snippets to avoid repetition.
- Tests and documentation can be organized alongside models for data quality.
- The structure helps teams collaborate and maintain analytics code efficiently.