Overview - dbt project structure

What is it?

A dbt project structure is the organized way files and folders are arranged to build, test, and document data transformations using dbt. It includes folders for models, tests, macros, and configurations that work together to create a clear, maintainable data pipeline. This structure helps teams collaborate and ensures data workflows are easy to understand and update. It acts like a blueprint for how dbt runs and manages your data transformations.

Why it matters

Without a clear dbt project structure, data transformations become messy and hard to manage, leading to errors and confusion. A well-organized structure saves time, reduces mistakes, and makes it easier for teams to work together on data projects. It also helps ensure data quality and consistency, which is critical for making reliable business decisions. Imagine trying to build a house without a blueprint; the project would be chaotic and inefficient.

Where it fits

Before learning dbt project structure, you should understand basic SQL and the concept of data transformation. After mastering the structure, you can learn advanced dbt features like hooks, packages, and deployment automation. This topic fits early in the dbt learning path, right after setting up dbt and before building complex models and tests.

Mental Model

Core Idea

A dbt project structure is like a well-organized kitchen where every tool and ingredient has its place, making cooking (data transformation) efficient and error-free.

Think of it like...

Think of a dbt project structure as a kitchen layout: the stove is where cooking happens (models folder), the pantry stores ingredients (data sources), the recipe book holds instructions (macros), and the cleaning supplies (tests) ensure everything stays clean and safe. If these are scattered randomly, cooking becomes slow and mistakes happen.

┌───────────────────────────────────┐
│            dbt Project            │
├───────────────────┬───────────────┤
│ models/           │ SQL files     │
│                   │ (transform)   │
├───────────────────┼───────────────┤
│ tests/            │ Data checks   │
├───────────────────┼───────────────┤
│ macros/           │ Reusable      │
│                   │ SQL snippets  │
├───────────────────┼───────────────┤
│ snapshots/        │ Data version  │
│                   │ history       │
├───────────────────┼───────────────┤
│ seeds/            │ Static CSV    │
│                   │ data          │
├───────────────────┼───────────────┤
│ dbt_project.yml   │ Config file   │
└───────────────────┴───────────────┘

Build-Up - 6 Steps

1

Foundation: Understanding dbt Project Basics

Concept: Learn what a dbt project is and its main components.

A dbt project is a folder containing all files needed to build your data transformations. The main file is dbt_project.yml, which tells dbt how to run your project. Inside the project, you have folders like models for SQL files that transform data, and tests to check data quality.

Result

You can identify the key files and folders in a dbt project and understand their basic roles.

Knowing the basic layout helps you navigate and organize your work efficiently from the start.
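
As a rough illustration of the layout described above, the following Python sketch creates an empty project skeleton in a temporary directory. The project name my_project and the profile name my_profile are placeholders, and the path settings are assumed to match the defaults of recent dbt versions. In practice, dbt init generates a starter project for you; this only shows which pieces end up where.

```python
from pathlib import Path
import tempfile

# Conventional top-level folders of a dbt project.
FOLDERS = ["models", "tests", "macros", "snapshots", "seeds"]

# Minimal config file; the project and profile names are hypothetical.
DBT_PROJECT_YML = """\
name: my_project
version: "1.0.0"
config-version: 2
profile: my_profile

model-paths: ["models"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
seed-paths: ["seeds"]
"""


def scaffold(root: Path) -> list[str]:
    """Create the skeleton under root and return its sorted top-level entries."""
    root.mkdir(parents=True, exist_ok=True)
    for folder in FOLDERS:
        (root / folder).mkdir(exist_ok=True)
    (root / "dbt_project.yml").write_text(DBT_PROJECT_YML)
    return sorted(p.name for p in root.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    entries = scaffold(Path(tmp) / "my_project")
    print(entries)
```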

2

Foundation: Role of the Models Folder

3

Intermediate: Using Tests and Seeds Folders

4

Intermediate: Macros Folder and Reusable SQL

5

Advanced: Configuring dbt_project.yml

6

Expert: Advanced Folder Structures and Modularization

Under the Hood

dbt reads the project files and compiles each model into executable SQL. It uses dbt_project.yml to find configurations and folder paths. Models are built in dependency order; tests, snapshots, and seeds are handled by their own commands (dbt test, dbt snapshot, dbt seed) or all together with dbt build. Macros are expanded inline during compilation, which allows dynamic SQL generation. Because the whole project is plain text, it can be kept in version control, which helps keep transformations reproducible.
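
To make "dependency order" concrete, here is a minimal Python sketch. It assumes models declare their upstream dependencies with ref() calls, and it extracts them with a simple regular expression; dbt itself renders the Jinja rather than pattern-matching, so treat this as an approximation of the idea, not of dbt's internals. The model names and SQL are invented for the example.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL text (not taken from a real project).
MODELS = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "customer_orders": """
        select c.customer_id, count(o.order_id) as order_count
        from {{ ref('stg_customers') }} c
        left join {{ ref('stg_orders') }} o using (customer_id)
        group by 1
    """,
    "top_customers": (
        "select * from {{ ref('customer_orders') }} where order_count > 10"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"""\{\{\s*ref\(\s*['"](\w+)['"]\s*\)\s*\}\}""")


def run_order(models: dict[str, str]) -> list[str]:
    """Return model names so that every model comes after the models it refs."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = run_order(MODELS)
print(order)
```

The two staging models have no dependencies, so their relative order is arbitrary; what matters is that customer_orders comes after both of them and top_customers comes last.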

Why designed this way?

dbt was designed to bring software engineering best practices to data transformation. The project structure enforces organization, modularity, and clarity, making complex data workflows manageable. Alternatives like unstructured SQL scripts were error-prone and hard to maintain, so dbt’s structure solves these problems by standardizing project layout and behavior.

┌───────────────────────────────┐
│        dbt CLI Command        │
└───────────────┬───────────────┘
                │
      Reads dbt_project.yml
                │
┌───────────────▼───────────────┐
│  Loads Models, Macros, Tests  │
│  from project folders         │
└───────────────┬───────────────┘
                │
    Compiles SQL with macros
                │
┌───────────────▼───────────────┐
│ Executes SQL in Data Warehouse│
│ Runs Tests and Snapshots      │
└───────────────────────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Do you think the models folder can contain any file type, like images or text? Commit to yes or no.

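Common Belief: The models folder can hold any files related to the project.

Reality: The models folder is meant for model files (SQL) and their YAML property files; unrelated files such as images or stray text notes belong elsewhere in the repository.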

Quick: Do you think tests in dbt only run once during project setup? Commit to yes or no.

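Common Belief: Tests are one-time checks to validate data when the project is created.

Reality: Tests run every time you invoke dbt test, so they act as ongoing data quality checks rather than a one-time setup step.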

Quick: Do you think macros are just comments or documentation? Commit to yes or no.

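Common Belief: Macros are only for documenting SQL code and do not affect execution.

Reality: Macros are executable Jinja code; they can take arguments and are expanded into SQL at compile time, so they directly affect what runs.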

Quick: Do you think dbt_project.yml only sets folder names and nothing else? Commit to yes or no.

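Common Belief: dbt_project.yml is just a simple file to tell dbt where folders are.

Reality: dbt_project.yml also controls project-wide settings and behavior, including folder-specific model configurations.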

Expert Zone

1

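Model folder structure mainly affects clarity and how configurations cascade from dbt_project.yml: flatter structures are simpler to navigate at small scale, while deeper ones let you apply settings per folder. Any effect on compilation time is minor compared with the number of models.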

2

Macros can accept arguments and use Jinja control flow, enabling complex dynamic SQL generation beyond simple reuse.

3

dbt_project.yml can render Jinja, so settings can vary by environment (for example, by checking target.name), allowing different behavior in development and production.
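
One reason folder layout matters is that dbt_project.yml can set configurations per folder, with more specific folders overriding their parents. The Python sketch below imitates that resolution logic on a nested dictionary shaped like a models: block. It is an illustration of the cascading idea only, not dbt's actual implementation, and the project and folder names (my_project, staging, marts, finance) are hypothetical.

```python
# Shaped like the `models:` block of dbt_project.yml.
# Keys starting with "+" are configs; all other keys are subfolders.
MODELS_CONFIG = {
    "my_project": {
        "+materialized": "view",
        "staging": {"+schema": "staging"},
        "marts": {
            "+materialized": "table",
            "finance": {"+schema": "finance"},
        },
    }
}


def resolve_config(config: dict, path_parts: list[str]) -> dict:
    """Walk from the project root down a folder path; deeper levels override."""
    resolved = {}
    node = config
    for part in path_parts:
        node = node.get(part)
        if node is None:
            # No config declared for this folder; keep what was inherited.
            break
        for key, value in node.items():
            if key.startswith("+"):
                resolved[key[1:]] = value
    return resolved


staging_cfg = resolve_config(MODELS_CONFIG, ["my_project", "staging"])
finance_cfg = resolve_config(MODELS_CONFIG, ["my_project", "marts", "finance"])
print(staging_cfg)
print(finance_cfg)
```

Models under staging inherit the project-wide view materialization, while models under marts/finance pick up the table materialization set on marts plus their own schema.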

When NOT to use

For very simple or one-off SQL scripts, using dbt and its project structure may be overkill. Alternatives like direct SQL scripts or lightweight ETL tools might be better. Also, if your data transformations require complex procedural logic, a full ETL tool or custom code might be more suitable.

Production Patterns

In production, teams use modular folder structures to separate business domains, enforce strict testing in tests folders, and use macros for common logic like date handling. They automate dbt runs with CI/CD pipelines and use dbt packages to share reusable code across projects.

Connections

Software Engineering Project Structure

dbt project structure builds on the same principles of organizing code and resources for clarity and maintainability.

Understanding software project organization helps grasp why dbt enforces a clear folder and file layout.

Modular Programming

Macros and folder modularization in dbt mirror modular programming concepts in software development.

Knowing modular programming clarifies how to write reusable and maintainable SQL code in dbt.

Kitchen Organization

Like organizing a kitchen for efficient cooking, dbt project structure organizes files for efficient data transformation.

This cross-domain connection highlights the universal value of good organization for complex tasks.

Common Pitfalls

#1 Placing unrelated files inside the models folder.

Wrong approach:
models/readme.txt
models/image.png

Correct approach:
docs/readme.txt
assets/image.png

Root cause: Not realizing that dbt expects the models folder to hold model files (SQL) and their YAML property files, not arbitrary project assets.

#2 Not running tests regularly, assuming data is always clean.

Wrong approach:
dbt run   # never runs dbt test

Correct approach:
dbt run
dbt test

Root cause: Underestimating the importance of continuous data quality checks.

#3 Writing repeated SQL code instead of using macros.

Wrong approach:
SELECT date_trunc('month', order_date) FROM orders  -- repeated in many models

Correct approach:
{% macro month_start(date) %}
    date_trunc('month', {{ date }})
{% endmacro %}

SELECT {{ month_start('order_date') }} FROM orders

Root cause: Not knowing how to create and use macros for reusable SQL.

Key Takeaways

A clear dbt project structure organizes your data transformation files for easy understanding and maintenance.

The models folder holds SQL files that define how raw data becomes useful insights.

The tests and seeds folders help ensure data quality and provide static data inputs.

Macros enable reusable SQL code, reducing repetition and errors.

The dbt_project.yml file controls project-wide settings and behavior, making your project flexible and powerful.

