
Organizing models in directories in dbt - Deep Dive

Overview - Organizing models in directories
What is it?
Organizing models in directories means arranging your dbt models into folders inside your project. Each folder can hold related SQL files that build parts of your data pipeline. This helps keep your project tidy and easier to understand. Instead of one big folder with many files, you group models by theme or function.
Why it matters
Without organizing models in directories, your dbt project can become messy and hard to navigate as it grows. This slows down development and increases mistakes. Good organization saves time, helps teams collaborate, and makes it easier to find and update models. It also lets you configure and select whole groups of models by folder instead of one file at a time.
Where it fits
Before this, you should know basic dbt model creation and how dbt runs SQL files. After learning this, you can explore advanced dbt features like model configurations, macros, and testing. Organizing models is a foundational skill that supports scaling your dbt projects.
Mental Model
Core Idea
Organizing models in directories is like sorting your tools into labeled drawers so you can quickly find and use the right one when building your data pipeline.
Think of it like...
Imagine a kitchen drawer where all your cooking tools are mixed together. Finding a whisk or a spatula takes time. But if you have separate drawers for utensils, knives, and gadgets, cooking becomes faster and less frustrating. Similarly, organizing dbt models into folders groups related work together for easy access.
dbt_project/
├── models/
│   ├── staging/
│   │   ├── customers.sql
│   │   └── orders.sql
│   ├── marts/
│   │   ├── sales/
│   │   │   └── sales_summary.sql
│   │   └── finance/
│   │       └── revenue.sql
│   └── README.md
Build-Up - 7 Steps
1. Foundation: What is a dbt model file?
Concept: Introduce the basic unit of work in dbt: the model SQL file.
A dbt model is a single SQL file containing one SELECT statement that defines a transformation. When you run dbt, it materializes each model as a table or view in your database, named after the file by default. Model files live inside the 'models' folder by default.
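As a minimal sketch of what such a file looks like, the shell snippet below writes one model into a scratch directory. The directory name `demo_model`, the model name `customers`, and the table `raw.customers` with its columns are placeholders invented for illustration, not names from any real project or warehouse.

```shell
# Create a scratch project with the default models/ folder.
# demo_model is a hypothetical name used only for this example.
mkdir -p demo_model/models

# A model is one SELECT statement saved as a .sql file.
# raw.customers and its columns are assumed to exist in your warehouse.
cat > demo_model/models/customers.sql <<'SQL'
select
    id as customer_id,
    first_name,
    last_name
from raw.customers
SQL

# Print the file that dbt would compile and materialize.
cat demo_model/models/customers.sql
```

The file contains no CREATE TABLE or INSERT statement; dbt wraps the SELECT in the appropriate DDL when it runs.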
Result
You understand that each SQL file is a building block of your data pipeline.
Knowing that models are individual SQL files helps you see why organizing them matters as projects grow.
2. Foundation: Default model folder structure
Concept: Explain the default location and flat structure of models in dbt.
By default, dbt looks for model files inside the 'models' folder at the root of your project. If you put all models directly here, they all live together without subfolders. This works for small projects but can get cluttered.
Result
You see how a flat folder can quickly become hard to manage.
Understanding the default helps you appreciate the need for better organization.
3. Intermediate: Creating subdirectories for model grouping
Before reading on: do you think dbt automatically recognizes models inside subfolders without extra config? Commit to yes or no.
Concept: Show how to create folders inside 'models' to group related models.
You can create folders inside the 'models' directory, like 'staging' or 'marts/sales'. Put related SQL files inside these folders. dbt automatically finds models in subdirectories without extra setup.
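The layout described above can be scaffolded in a few commands. This is only a sketch: `demo_layout` is a hypothetical scratch directory, and the model files are left empty where real ones would each hold a SELECT statement. The final `find` call approximates the set of files that recursive discovery would pick up.

```shell
# Scaffold the folder layout from the tree shown earlier.
# demo_layout is a hypothetical scratch directory.
mkdir -p demo_layout/models/staging \
         demo_layout/models/marts/sales \
         demo_layout/models/marts/finance

# Empty placeholder files; real models would each hold a SELECT statement.
touch demo_layout/models/staging/customers.sql \
      demo_layout/models/staging/orders.sql \
      demo_layout/models/marts/sales/sales_summary.sql \
      demo_layout/models/marts/finance/revenue.sql

# List every .sql file under models/, at any depth.
find demo_layout/models -type f -name '*.sql' | sort
```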
Result
Your project structure becomes hierarchical, making it easier to find models by category.
Knowing dbt auto-discovers models in subfolders lets you organize freely without config overhead.
4. Intermediate: Using path-based model selection
Before reading on: can you run only models in a specific folder using dbt commands? Commit to yes or no.
Concept: Explain how to run or test models selectively by folder path.
dbt lets you run models in a folder using selectors like 'dbt run --select staging'. This helps focus runs on parts of your project, speeding up development and testing.
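A few variations on folder selection are sketched below. The `DBT` variable is a convention made up for this example, not a dbt feature: it defaults to `echo dbt`, so the script only prints each command. Set `DBT=dbt` inside a configured project to actually execute them. The folder names follow the example tree and are assumed to exist in your project.

```shell
# Dry run by default: DBT expands to "echo dbt", so each line only prints
# the command. Run with DBT=dbt in a configured project to execute them.
DBT="${DBT:-echo dbt}"

# The bare folder name, as in the text above.
$DBT run --select "staging"

# The same folder, addressed explicitly with the path: selection method.
$DBT run --select "path:models/staging"

# A nested folder, given as a path relative to the project root.
$DBT run --select "models/marts/sales"

# The same selection syntax works for tests.
$DBT test --select "path:models/marts/sales"
```

The explicit `path:` form is a little longer but leaves no doubt that a folder, rather than a model or package with the same name, is meant.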
Result
You can efficiently work on subsets of your models without running the entire project.
Understanding path-based selection improves your workflow and saves time.
5. Intermediate: Configuring models by directory
Concept: Introduce how to apply configurations to all models in a folder using dbt_project.yml.
In your dbt_project.yml file, you can set configs for all models in a folder. For example, you can set materializations or tags for all models in 'models/marts/sales'. This avoids repeating configs in each model file.
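The snippet below sketches what such a configuration might look like. It writes an example `dbt_project.yml` into a scratch directory (`demo_config`, a hypothetical name) so that it cannot overwrite a real project file. The project name `my_project` and profile name `my_profile` are assumptions; substitute your own.

```shell
# Write an example dbt_project.yml into a scratch directory.
# my_project and my_profile are hypothetical names; use your own.
mkdir -p demo_config
cat > demo_config/dbt_project.yml <<'YAML'
name: my_project
version: "1.0.0"
config-version: 2
profile: my_profile
model-paths: ["models"]

models:
  my_project:            # must match the project name above
    staging:             # applies to everything under models/staging
      +materialized: view
    marts:               # applies to everything under models/marts
      +materialized: table
      sales:             # applies to models/marts/sales only
        +tags: ["sales"]
YAML

# Show the result.
cat demo_config/dbt_project.yml
```

The nesting under `models:` mirrors the folder nesting under `models/`, which is why a clear directory layout makes the config easy to read.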
Result
Your project config becomes cleaner and consistent across related models.
Knowing folder-level configs helps maintain standards and reduces errors.
6. Advanced: Handling dependencies across directories
Before reading on: do you think models in different folders can depend on each other without extra setup? Commit to yes or no.
Concept: Explain how dbt resolves dependencies between models in different folders.
Models can reference each other across folders using the {{ ref('model_name') }} function. dbt builds a dependency graph regardless of folder location, ensuring models run in the right order.
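To make the cross-folder reference concrete, the sketch below writes two staging models and one mart model that joins them. The scratch directory `demo_refs`, the `raw.*` tables, the column names, and the dependency of `sales_summary` on both staging models are all invented for illustration.

```shell
# Two staging models and one mart model in a scratch project.
# Table names (raw.*) and columns are invented for illustration.
mkdir -p demo_refs/models/staging demo_refs/models/marts/sales

cat > demo_refs/models/staging/customers.sql <<'SQL'
select id as customer_id, first_name from raw.customers
SQL

cat > demo_refs/models/staging/orders.sql <<'SQL'
select id as order_id, customer_id, amount from raw.orders
SQL

# ref() takes the model name only, never the folder path.
cat > demo_refs/models/marts/sales/sales_summary.sql <<'SQL'
select
    c.customer_id,
    c.first_name,
    sum(o.amount) as total_amount
from {{ ref('customers') }} as c
join {{ ref('orders') }} as o
    on o.customer_id = c.customer_id
group by c.customer_id, c.first_name
SQL

# List the references the mart model declares.
grep -o "ref('[^']*')" demo_refs/models/marts/sales/sales_summary.sql
```

Because `ref()` uses only the model name, moving `customers.sql` to another folder would not require any change to `sales_summary.sql`.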
Result
You can organize models freely without breaking dependencies.
Understanding dependency resolution prevents confusion about model execution order.
7. Expert: Advanced directory patterns for large projects
Before reading on: do you think deeply nested directories always improve clarity? Commit to yes or no.
Concept: Discuss best practices and tradeoffs for deeply nested directories and modular design.
In very large projects, teams create multi-level folders by domain, function, or team ownership. While this improves clarity, too much nesting can make navigation complex. Experts balance depth and simplicity, sometimes using naming conventions or dbt packages to modularize.
Result
You learn how to scale organization without losing usability.
Knowing when to stop nesting and when to modularize helps maintain project health and team productivity.
Under the Hood
dbt scans the 'models' directory and all its subdirectories recursively to find SQL files. Each file is parsed and compiled into a SQL query. dbt builds a directed acyclic graph (DAG) of model dependencies using the {{ ref() }} function. This graph determines the order of execution. Folder structure does not affect dependency resolution but helps humans navigate the project.
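The idea can be imitated with standard shell tools. This is a rough approximation and not how dbt is implemented: dbt renders Jinja, whereas the sketch below just greps for single-quoted `ref('...')` calls and hands the resulting pairs to `tsort` for a topological sort. The scratch directory `demo_dag` and the three toy models are hypothetical.

```shell
# Build a tiny project where a nested mart depends on two staging models.
mkdir -p demo_dag/models/staging demo_dag/models/marts/sales
echo "select 1 as customer_id" > demo_dag/models/staging/customers.sql
echo "select 1 as order_id, 1 as customer_id" > demo_dag/models/staging/orders.sql
cat > demo_dag/models/marts/sales/sales_summary.sql <<'SQL'
select c.customer_id, count(o.order_id) as order_count
from {{ ref('customers') }} as c
left join {{ ref('orders') }} as o on o.customer_id = c.customer_id
group by c.customer_id
SQL

# Walk models/ recursively. For each file emit "parent child" pairs,
# one per ref() call, then let tsort produce a valid run order.
find demo_dag/models -type f -name '*.sql' | while read -r file; do
    child=$(basename "$file" .sql)
    # A pair of identical names registers the model as a node with no edge.
    echo "$child $child"
    # Only single-quoted ref('...') calls are recognized by this sketch.
    grep -o "ref('[^']*')" "$file" | sed "s/ref('\(.*\)')/\1/" |
        while read -r parent; do
            echo "$parent $child"
        done
done | tsort | tee demo_dag/run_order.txt
```

The folder path is discarded on the `basename` line and never reaches `tsort`, so the resulting order depends only on the `ref()` calls. Here `sales_summary` comes last because it references both staging models.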
Why designed this way?
dbt was designed to be flexible and simple. Automatically discovering models in subfolders avoids extra configuration, lowering the barrier to organizing projects. The separation of physical file structure and logical dependencies allows teams to organize code for readability without affecting execution logic.
dbt_project/
├── models/
│   ├── staging/
│   │   ├── customers.sql
│   │   └── orders.sql
│   ├── marts/
│   │   ├── sales/
│   │   │   └── sales_summary.sql
│   │   └── finance/
│   │       └── revenue.sql

Dependency Graph:
customers.sql ──┐
                ├─> orders.sql ──> sales_summary.sql
revenue.sql ────┘
Myth Busters - 4 Common Misconceptions
Quick: Does placing models in different folders change their execution order automatically? Commit yes or no.
Common Belief: If I put models in different folders, dbt will run them in folder order.
Reality: dbt runs models based on their dependencies, not folder location. Folder structure is only for organization.
Why it matters: Relying on folder order can cause unexpected results or failures if dependencies are not respected.
Quick: Can I only configure models individually, not by folder? Commit yes or no.
Common Belief: I must set configs like materialization inside each model file; folders don't help.
Reality: dbt_project.yml allows folder-level configs that apply to all models inside, reducing repetition.
Why it matters: Ignoring folder configs leads to duplicated code and inconsistent settings.
Quick: Does deeper folder nesting always improve project clarity? Commit yes or no.
Common Belief: More folders and subfolders always make the project easier to understand.
Reality: Too many nested folders can confuse developers and slow navigation.
Why it matters: Over-nesting can reduce productivity and increase onboarding time.
Quick: Does dbt require extra config to find models in subfolders? Commit yes or no.
Common Belief: I need to tell dbt where each subfolder is in the config file.
Reality: dbt automatically finds all models in subfolders under 'models' without extra config.
Why it matters: Unnecessary config adds complexity and risk of errors.
Expert Zone
1. Folder names can be used as namespaces in model selectors, enabling precise targeting in commands.
2. A config set in a model file overrides a folder-level config in dbt_project.yml, and a nested folder's config overrides its parent's, so specificity matters.
3. Combining directory organization with dbt packages allows modular, reusable components across projects.
When NOT to use
Avoid deeply nested directories when your project is small or when team members are new to dbt. Instead, keep a flat structure for simplicity. For very large projects, consider splitting into multiple dbt packages or repositories to manage complexity.
Production Patterns
Teams often organize models by data source (staging), business domain (marts), and function (analytics). Folder-level configs enforce standards like materialization types. Selectors based on folders speed up CI/CD pipelines by running only the relevant part of the project.
Connections
Software Project Structure
Organizing code files into folders is a shared pattern between dbt models and software projects.
Understanding folder organization in software development helps grasp why dbt projects benefit from similar structure for maintainability.
Dependency Graphs
dbt uses dependency graphs to order model execution, similar to task scheduling in project management.
Knowing how dependency graphs work in other fields clarifies why folder order does not control execution order in dbt.
Library Classification
Just like books are organized by topic and genre in a library, dbt models are organized by function and domain.
This cross-domain connection shows how organizing information for easy retrieval is a universal challenge.
Common Pitfalls
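#1 Assuming folder order controls model run order
Wrong approach: Running dbt run --select models/staging models/marts and expecting staging to finish first because it is listed first. The command is valid, but the order of its arguments has no effect on run order.
Correct approach: Declare each dependency with {{ ref() }} and let dbt order the run. To build staging plus everything downstream of it, use dbt run --select staging+
Root cause: Misunderstanding that dbt runs models by dependency, not folder listing order.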
#2 Duplicating configs in every model file instead of using folder configs
Wrong approach: In every model SQL file: {{ config(materialized='table') }}
Correct approach: In dbt_project.yml:
models:
  my_project:
    marts:
      +materialized: table
Root cause: Not knowing folder-level config options in dbt_project.yml.
#3 Creating too many nested folders, making navigation hard
Wrong approach: models/domain1/subdomainA/typeX/categoryY/model.sql
Correct approach: models/domain1/typeX/model.sql
Root cause: Believing deeper nesting always improves clarity without considering usability.
Key Takeaways
Organizing dbt models in directories groups related SQL files, making projects easier to navigate and maintain.
dbt automatically discovers models in subfolders, so you can organize freely without extra configuration.
Folder structure does not affect model execution order; dependencies defined by references control that.
Using folder-level configurations in dbt_project.yml reduces repetition and enforces consistency.
Balancing folder depth is key: too flat is messy, too deep is confusing; modularization can help scale.