
One model per source table rule in dbt - Deep Dive

Overview - One model per source table rule
What is it?
The 'One model per source table' rule means creating a separate data model in dbt for each raw table you get from your data source. Instead of mixing data from many tables into one model, you keep them separate at first. This helps keep your data organized and easier to understand. Each model represents a clean, transformed version of one source table.
Why it matters
Without this rule, your data models can become confusing and hard to maintain because they mix many sources together. It becomes difficult to track where data comes from or fix errors. By having one model per source table, you get clear, simple building blocks that make your data pipeline easier to debug, update, and trust. This leads to better data quality and faster development.
Where it fits
Before learning this, you should understand basic dbt concepts like models, sources, and transformations. After mastering this rule, you can learn about combining models, building complex transformations, and optimizing your dbt project structure.
Mental Model
Core Idea
Each source table gets its own dedicated model to keep data transformations clear, modular, and easy to manage.
Think of it like...
It's like organizing your kitchen: you keep fruits in one basket, vegetables in another, and spices in a separate jar. This way, you always know where to find what you need and can prepare meals without mixing everything up.
┌────────────────┐   ┌────────────────┐
│ Source Table 1 │──▶│ Model for Tab1 │
└────────────────┘   └────────────────┘

┌────────────────┐   ┌────────────────┐
│ Source Table 2 │──▶│ Model for Tab2 │
└────────────────┘   └────────────────┘

┌────────────────┐   ┌────────────────┐
│ Source Table 3 │──▶│ Model for Tab3 │
└────────────────┘   └────────────────┘
Build-Up - 7 Steps
1
Foundation: Understanding Source Tables in dbt
Concept: Learn what source tables are and how dbt connects to raw data.
Source tables are the raw data tables in your database that you want to analyze. In dbt, you declare these tables as 'sources' so you can reference them safely in your models. This helps track where data comes from and ensures your models use the right inputs.
Result
You can now reference raw tables in your dbt project with clear source definitions.
Knowing what source tables are is the first step to organizing your data transformations clearly.
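Sources are declared in a YAML file inside your models directory. A minimal sketch, assuming a warehouse schema named raw containing orders and customers tables (all names here are illustrative):

```yaml
# models/staging/sources.yml -- path and names are illustrative
version: 2

sources:
  - name: raw          # the name you use in source() calls
    schema: raw        # the actual schema in your warehouse
    tables:
      - name: orders
      - name: customers
```

Models can then reference these tables with {{ source('raw', 'orders') }} instead of hard-coding schema and table names, which is what lets dbt track lineage back to the raw data.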
2
Foundation: What is a dbt Model?
Concept: A model is a SQL file in dbt that transforms data and creates a new table or view.
In dbt, each model is a SQL query saved as a file. When you run dbt, it runs these queries and creates tables or views in your database. Models let you clean, filter, and combine data step-by-step.
Result
You can create your first model that transforms data from a source table.
Understanding models as building blocks helps you see how data flows and changes in your project.
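A model is nothing more than a SELECT statement saved in a .sql file. A minimal sketch of a first model, assuming the raw source declared earlier (column names are hypothetical):

```sql
-- models/staging/stg_orders.sql -- the filename becomes the model name
-- dbt wraps this SELECT in a CREATE TABLE/VIEW statement when it runs.
select
    id         as order_id,
    customer_id,
    order_date,
    status
from {{ source('raw', 'orders') }}
```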
3
Intermediate: Why One Model per Source Table?
🤔 Before reading on: Do you think combining multiple source tables into one model is easier or harder to maintain? Commit to your answer.
Concept: Separating each source table into its own model keeps transformations simple and traceable.
When you create one model per source table, each model focuses on cleaning and preparing just one table. This makes it easier to find errors, update logic, and understand data lineage. It also allows other models to build on these clean tables without confusion.
Result
Your dbt project becomes modular, with clear dependencies and easier debugging.
Understanding modularity in data models prevents tangled code and improves collaboration.
4
Intermediate: Implementing One Model per Source Table
Concept: How to create separate models for each source table in dbt.
For each source table, create a SQL file in your models folder; the filename becomes the model name. Reference the raw table with the source() function, write SQL to clean or rename columns as needed, then run dbt to build each model as its own table or view.
Result
You get multiple clean models, each representing one source table's data.
Knowing how to implement this rule in practice helps maintain a clean project structure.
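Applying the rule means one such file per source table. Continuing the illustrative example, the customers table gets its own model alongside stg_orders:

```sql
-- models/staging/stg_customers.sql -- one staging model per source table
select
    id           as customer_id,
    lower(email) as email,      -- light cleaning belongs in the base model
    created_at
from {{ source('raw', 'customers') }}
```

Renaming and type-casting here, and nowhere else, gives every downstream model a single consistent version of each table.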
5
Intermediate: Building Complex Models from Base Models
🤔 Before reading on: Do you think complex models should directly query raw tables or build on base models? Commit to your answer.
Concept: Use base models (one per source table) as building blocks for more complex transformations.
After creating base models, you can write new models that join or aggregate these base models. This keeps complex logic separate from raw data cleaning. It also makes your project easier to test and maintain.
Result
Complex models become easier to understand and update because they rely on clean, simple base models.
Recognizing the layered approach to modeling improves project scalability and clarity.
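Downstream models reference the base models with ref() rather than touching raw tables. A sketch joining the two illustrative staging models from earlier steps:

```sql
-- models/marts/customer_orders.sql -- builds only on base models
select
    c.customer_id,
    c.email,
    count(o.order_id) as order_count
from {{ ref('stg_customers') }} c
left join {{ ref('stg_orders') }} o
    on o.customer_id = c.customer_id
group by 1, 2
```

ref() also records the dependency, so dbt always builds stg_customers and stg_orders before this model.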
6
Advanced: Managing Dependencies and Performance
🤔 Before reading on: Does having many small models slow down dbt runs or speed them up? Commit to your answer.
Concept: Understand how dbt manages model dependencies and how one model per source table affects performance.
dbt builds models in dependency order. Having one model per source table creates many small models, which can increase the number of steps but improves clarity. You can use incremental models or materializations to optimize performance. Also, dbt's DAG visualization helps track dependencies.
Result
You can balance clarity and performance by managing model materializations and dependencies.
Knowing how dbt executes models helps you design efficient and maintainable pipelines.
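One way to keep many small models fast is the incremental materialization, which only processes new rows on each run. A sketch, assuming an events source table with an occurred_at timestamp (names are illustrative):

```sql
-- models/staging/stg_events.sql -- incremental base model
{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    event_type,
    occurred_at
from {{ source('raw', 'events') }}

{% if is_incremental() %}
  -- on incremental runs, only pick up rows newer than what this model holds
  where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
```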
7
Expert: Handling Exceptions and Complex Sources
🤔 Before reading on: Should you always strictly follow one model per source table, even for very small or complex tables? Commit to your answer.
Concept: Learn when and how to deviate from the rule for practical reasons.
Sometimes, very small or tightly related source tables can be combined into one model to reduce complexity. Also, some sources may require multiple models for different purposes (e.g., raw staging vs. cleaned data). Experts balance strict rules with project needs, documenting exceptions clearly.
Result
You gain flexibility to adapt the rule without losing project clarity.
Understanding when to bend rules prevents over-engineering and keeps projects practical.
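A documented exception might look like this: two tiny lookup tables that are always queried together, folded into a single base model (table names are hypothetical):

```sql
-- models/staging/stg_country_codes.sql
-- Exception to the one-model-per-source-table rule:
-- these two lookup tables are trivial and always used together.
select code, country_name, 'iso2' as code_type
from {{ source('raw', 'iso2_codes') }}
union all
select code, country_name, 'iso3' as code_type
from {{ source('raw', 'iso3_codes') }}
```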
Under the Hood
dbt reads your project files and builds a Directed Acyclic Graph (DAG) of models based on dependencies. Each model runs its SQL query to create or update tables/views in the database. The 'one model per source table' rule means each source table is wrapped in a single model, which becomes a node in the DAG. This modularity allows dbt to track lineage, run models in order, and rebuild only what changed.
Why designed this way?
This rule was created to improve clarity and maintainability in data projects. Early data pipelines often mixed many sources in one model, causing confusion and bugs. Separating models by source table aligns with software engineering principles like modularity and single responsibility, making data projects easier to test, debug, and evolve.
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Source Table 1 │──▶│ Model for Tab1 │──▶│ Downstream Mod │
└────────────────┘   └────────────────┘   └────────────────┘
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Source Table 2 │──▶│ Model for Tab2 │──▶│ Downstream Mod │
└────────────────┘   └────────────────┘   └────────────────┘
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Source Table 3 │──▶│ Model for Tab3 │──▶│ Downstream Mod │
└────────────────┘   └────────────────┘   └────────────────┘
Myth Busters - 3 Common Misconceptions
Quick: Do you think combining multiple source tables into one model is better for clarity? Commit yes or no.
Common Belief: Combining many source tables into one model is simpler and easier to manage.
Reality: Mixing multiple source tables in one model makes it harder to track data origins and debug issues.
Why it matters: This leads to tangled code and longer debugging times, reducing trust in data.
Quick: Do you think having many small models slows down dbt runs significantly? Commit yes or no.
Common Belief: More models always mean slower dbt runs and worse performance.
Reality: While more models add steps, dbt runs them in dependency order, parallelizes where it can, and incremental builds keep runtimes manageable.
Why it matters: Avoiding modular models for fear of slowness leads to messy, unmaintainable projects.
Quick: Is it always wrong to combine small related source tables into one model? Commit yes or no.
Common Belief: You must never combine source tables; one model per source table is absolute.
Reality: Sometimes combining small or tightly related tables improves simplicity without losing clarity.
Why it matters: Rigidly following the rule without context can cause unnecessary complexity.
Expert Zone
1
Some source tables may require multiple models for different stages like raw staging and cleaned data, which still respects modularity.
2
Materialization strategies (table, view, incremental) affect how one model per source table impacts performance and storage.
3
Documenting exceptions to the rule clearly is crucial for team understanding and project longevity.
When NOT to use
Avoid strict one model per source table when source tables are extremely small and always used together; instead, combine them for simplicity. Also, for very complex transformations, build intermediate models that combine base models rather than mixing raw sources directly.
Production Patterns
In production, teams create base models per source table as staging layers, then build marts or business logic models on top. They use dbt's DAG visualization to manage dependencies and incremental models to optimize performance.
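A common way to encode this layering is per-folder materialization defaults in dbt_project.yml. A sketch, assuming folders named staging and marts (project and folder names are illustrative):

```yaml
# dbt_project.yml (excerpt)
models:
  my_project:
    staging:
      +materialized: view    # cheap base models, one per source table
    marts:
      +materialized: table   # heavier business-logic models built on top
```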
Connections
Modular Programming
The one model per source table rule applies the modular programming principle to data modeling.
Understanding modular programming helps grasp why separating data transformations into small, focused models improves maintainability.
Data Lineage
One model per source table clarifies data lineage by creating clear, traceable steps from raw data to final outputs.
Knowing this connection helps you appreciate how modular models make auditing and debugging data easier.
Supply Chain Management
Just like supply chains track parts from raw materials to finished products, one model per source table tracks data from raw tables to final reports.
Seeing data pipelines as supply chains highlights the importance of clear, modular steps for quality and traceability.
Common Pitfalls
#1 Mixing multiple source tables in one model for convenience.
Wrong approach: SELECT * FROM source_table1 JOIN source_table2 ON ... -- all in one model
Correct approach: Give each source table its own staging model (a file containing just a SELECT from that one source), then join those models with ref() in a separate downstream model.
Root cause: Misunderstanding modularity and trying to shortcut by combining sources early.
#2 Ignoring model dependencies and running models in the wrong order.
Wrong approach: Running downstream models before base models are built, causing errors.
Correct approach: Use dbt run, which resolves the DAG and executes models in dependency order automatically.
Root cause: Not understanding dbt's DAG and dependency management.
#3 Over-applying the rule and creating too many tiny models unnecessarily.
Wrong approach: Creating separate models for trivial tables that are always used together, causing clutter.
Correct approach: Combine very small, tightly related tables into one model when it simplifies the project.
Root cause: Rigidly following the rule without considering project context.
Key Takeaways
One model per source table keeps data transformations clear, modular, and easier to maintain.
This rule helps track data lineage and debug issues faster by isolating each source's logic.
Building complex models should happen on top of these base models, not by mixing raw sources directly.
Balancing strict adherence with practical exceptions leads to better project design.
Understanding dbt's dependency management and materializations is key to optimizing performance with this rule.