
Model naming conventions in dbt - Deep Dive

Overview - Model naming conventions
What is it?
Model naming conventions are rules and patterns used to name data models consistently in dbt projects. They help organize and identify models clearly by their purpose, source, or transformation stage. This makes it easier for teams to understand and maintain the data pipeline. Without clear naming, models can become confusing and hard to manage.
Why it matters
Consistent model names prevent confusion and errors in data projects. They make collaboration smoother because everyone understands what each model does just by its name. Without naming conventions, teams waste time guessing model roles, leading to mistakes and slower development. Good naming saves time and improves data quality.
Where it fits
Before learning model naming conventions, you should understand basic dbt concepts like models, sources, and transformations. After mastering naming, you can learn advanced dbt topics like model dependencies, testing, and documentation. Naming conventions are a foundation for clean, scalable dbt projects.
Mental Model
Core Idea
Model naming conventions are a shared language that makes data models easy to find, understand, and trust.
Think of it like...
It's like organizing books in a library by genre, author, and series so anyone can find the right book quickly without confusion.
┌──────────────────────────────────┐
│           Model Naming           │
├────────────────┬─────────────────┤
│ Prefix         │ Purpose         │
│ (e.g., stg)    │ (e.g., staging) │
├────────────────┼─────────────────┤
│ Core name      │ Source or data  │
│ (e.g., orders) │ (e.g., orders)  │
├────────────────┼─────────────────┤
│ Suffix         │ Detail or       │
│ (e.g., _agg)   │ transformation  │
└────────────────┴─────────────────┘
Build-Up - 7 Steps
1. Foundation: What is a dbt model name?
Concept: Understanding what a model name represents in dbt projects.
In dbt, a model is a SQL file containing a SELECT statement, which dbt materializes as a table or view in your data warehouse. The model name is the filename without the .sql extension. This name is how dbt identifies and runs the model, and it is the name other models use to refer to it with ref(). For example, a file named 'stg_customers.sql' creates a model called 'stg_customers'.
Result
You know that the model name is the key identifier for your SQL transformations in dbt.
Knowing that model names directly map to database objects helps you see why naming them clearly is important for understanding your data pipeline.
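The filename-to-model-name rule can be illustrated with a few lines of Python. This is only a sketch of the rule described in this step, not how dbt itself is implemented; the function name `model_name_from_path` and the example path are made up for illustration.

```python
from pathlib import PurePosixPath


def model_name_from_path(path: str) -> str:
    """Return the model name for a .sql file: the filename minus its extension."""
    file = PurePosixPath(path)
    if file.suffix != ".sql":
        raise ValueError(f"not a SQL model file: {path}")
    return file.stem


# The folder a file lives in does not affect the model name.
print(model_name_from_path("models/staging/stg_customers.sql"))  # stg_customers
```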
2. Foundation: Why naming consistency matters
Concept: The importance of using consistent patterns for model names.
When many people work on a dbt project, inconsistent names cause confusion. For example, if one person names a model 'orders_staging' and another 'stg_orders', it’s hard to know if they serve the same purpose. Consistent naming makes it easy to find models and understand their role without opening the files.
Result
You realize that consistent names reduce mistakes and speed up teamwork.
Understanding that naming is a communication tool helps you appreciate its role beyond just labels.
3. Intermediate: Common naming prefixes explained
Before reading on: do you think prefixes like 'stg' or 'int' mean the same thing or different stages? Commit to your answer.
Concept: Prefixes indicate the stage or type of data in the model.
Common prefixes include:
- 'stg_' for staging models that select from raw source data and lightly clean or rename it
- 'int_' for intermediate models that further clean or join data
- 'fct_' for fact models used in analysis
- 'dim_' for dimension models that describe entities
Using these prefixes helps quickly identify a model’s role.
Result
You can tell a model’s purpose just by its prefix.
Knowing prefixes map to data pipeline stages helps you organize models logically and predict their content.
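The prefix list in this step can be read as a lookup table. The sketch below assumes exactly the four prefixes named here; `LAYER_BY_PREFIX` and `layer_for_model` are hypothetical names, and a real project may use a different set of prefixes.

```python
from typing import Optional

# Prefix -> pipeline layer, using only the four prefixes named in this step.
LAYER_BY_PREFIX = {
    "stg_": "staging",
    "int_": "intermediate",
    "fct_": "fact",
    "dim_": "dimension",
}


def layer_for_model(model_name: str) -> Optional[str]:
    """Return the layer implied by the model's prefix, or None if no prefix matches."""
    for prefix, layer in LAYER_BY_PREFIX.items():
        if model_name.startswith(prefix):
            return layer
    return None


print(layer_for_model("stg_orders"))      # staging
print(layer_for_model("orders_staging"))  # None
```

The second call returns None because the function only looks at the start of the name, which is exactly why 'orders_staging' is harder to classify than 'stg_orders'.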
4. Intermediate: Using suffixes for clarity
Before reading on: do you think suffixes add important details or are just decorative? Commit to your answer.
Concept: Suffixes add extra information about the model’s content or transformation type.
Suffixes like '_agg' for aggregated data or '_hist' for historical snapshots clarify what the model contains. For example, 'fct_sales_agg' suggests a fact model with aggregated sales data. This detail helps users pick the right model for their analysis.
Result
You can distinguish models with similar names by their suffixes.
Understanding suffixes improves precision in model identification, reducing errors in data use.
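As a small illustration, a suffix can be split off a name and translated into its meaning. The two suffixes and their descriptions come from this step; the names `KNOWN_SUFFIXES` and `split_suffix` are assumptions made for the sketch.

```python
from typing import Optional, Tuple

# Suffix -> meaning, limited to the two suffixes mentioned in this step.
KNOWN_SUFFIXES = {
    "_agg": "aggregated data",
    "_hist": "historical snapshot",
}


def split_suffix(model_name: str) -> Tuple[str, Optional[str]]:
    """Split a model name into (base name, suffix meaning); meaning is None if no known suffix."""
    for suffix, meaning in KNOWN_SUFFIXES.items():
        if model_name.endswith(suffix):
            return model_name[: -len(suffix)], meaning
    return model_name, None


print(split_suffix("fct_sales_agg"))  # ('fct_sales', 'aggregated data')
print(split_suffix("fct_sales"))      # ('fct_sales', None)
```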
5. Intermediate: Naming models by source and purpose
Concept: Combining source names and purpose in model names for clarity.
Including the data source or subject in the model name helps locate data origins. For example, 'stg_stripe_payments' shows this staging model loads Stripe payment data. Pairing source and purpose in names creates a clear, searchable structure.
Result
You can quickly find models related to specific data sources or business areas.
Knowing to include source names prevents confusion when multiple data sources exist.
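One way to see the benefit of source names is to parse them back out. The sketch below assumes a `stg_<source>_<entity>` pattern with a single underscore, as in the example in this step, and a hypothetical list of known sources; because a single underscore is ambiguous for multi-word names, the list is what tells the parser where the source ends.

```python
from typing import Optional, Tuple

# Hypothetical list of source systems for this illustration.
KNOWN_SOURCES = ("stripe", "paypal")


def parse_staging_name(model_name: str) -> Tuple[Optional[str], str]:
    """Split 'stg_<source>_<entity>' into (source, entity); source is None if unrecognized."""
    prefix = "stg_"
    if not model_name.startswith(prefix):
        raise ValueError(f"not a staging model name: {model_name}")
    rest = model_name[len(prefix):]
    for source in KNOWN_SOURCES:
        if rest.startswith(source + "_"):
            return source, rest[len(source) + 1:]
    return None, rest


print(parse_staging_name("stg_stripe_payments"))  # ('stripe', 'payments')
print(parse_staging_name("stg_payments"))         # (None, 'payments')
```

A name like 'stg_payments' parses without a source, which is the ambiguity the convention is meant to remove.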
6. Advanced: Handling exceptions and special cases
Before reading on: do you think all models should strictly follow naming rules, or are exceptions allowed? Commit to your answer.
Concept: Sometimes models don’t fit standard patterns and need special naming.
For example, models that combine multiple sources or perform complex calculations might use names like 'int_combined_orders' or 'fct_customer_lifetime_value'. Documenting exceptions ensures everyone understands why these names differ.
Result
You can handle complex projects without losing naming clarity.
Recognizing when to break rules thoughtfully keeps your naming system flexible and practical.
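Documented exceptions can live next to the rule itself. The following sketch assumes a simple prefix-based pattern and a hand-maintained allowlist; the exception entry and its reason are invented purely for illustration.

```python
import re

# Assumed standard pattern: a known prefix followed by lowercase snake_case.
STANDARD_NAME = re.compile(r"^(stg|int|fct|dim)_[a-z0-9_]+$")

# Hypothetical allowlist: each exception carries the reason it was granted.
DOCUMENTED_EXCEPTIONS = {
    "customer_lifetime_value_report": "legacy name kept for a downstream dashboard",
}


def check_name(model_name: str) -> str:
    """Classify a model name as 'ok', a documented exception, or a violation."""
    if STANDARD_NAME.match(model_name):
        return "ok"
    if model_name in DOCUMENTED_EXCEPTIONS:
        return "exception: " + DOCUMENTED_EXCEPTIONS[model_name]
    return "violation"


print(check_name("int_combined_orders"))             # ok
print(check_name("customer_lifetime_value_report"))  # exception: legacy name kept for a downstream dashboard
print(check_name("Orders"))                          # violation
```

Keeping the reason in the allowlist means the exception documents itself wherever the check runs.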
7. Expert: Automating naming with dbt macros
Before reading on: do you think naming can be automated in dbt or must always be manual? Commit to your answer.
Concept: Using dbt macros to enforce or generate model names automatically.
dbt macros are reusable snippets of Jinja code. You can write macros that check that model names follow your conventions, or that generate names from metadata. This reduces human error and keeps large projects consistent without manual effort.
Result
Your project maintains naming standards automatically, saving time and avoiding mistakes.
Understanding automation possibilities elevates your project’s reliability and scalability.
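The idea of generating names from metadata can be sketched outside dbt. The example below is plain Python rather than a Jinja macro, so it shows the logic only; `build_model_name` and its parameters are hypothetical, and a real project would express something similar in its own macro or tooling.

```python
from typing import Optional

# Layers allowed as prefixes, matching the prefixes used earlier in this guide.
VALID_LAYERS = ("stg", "int", "fct", "dim")


def build_model_name(
    layer: str,
    entity: str,
    source: Optional[str] = None,
    suffix: Optional[str] = None,
) -> str:
    """Assemble '<layer>_[<source>_]<entity>[_<suffix>]' from metadata parts."""
    if layer not in VALID_LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    parts = [layer]
    if source:
        parts.append(source)
    parts.append(entity)
    if suffix:
        parts.append(suffix)
    # Normalize each part so callers can pass 'agg' or '_agg' interchangeably.
    return "_".join(part.strip("_").lower() for part in parts)


print(build_model_name("stg", "payments", source="stripe"))  # stg_stripe_payments
print(build_model_name("fct", "sales", suffix="_agg"))       # fct_sales_agg
```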
Under the Hood
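dbt treats each model name as a unique identifier within a project. By default, that name also becomes the name of the database object (table or view) the model builds, although an alias configuration can override it. When you run dbt, it compiles each SQL file into an executable query and creates the resulting object in the warehouse. Dependencies are tracked through ref() calls that mention other models by name, so a clear naming convention helps users follow dependencies and keeps models logically organized in the warehouse.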
Why designed this way?
dbt was designed to be simple and transparent. Using model filenames as identifiers avoids extra configuration. Naming conventions emerged as best practices to manage complexity as projects grow, enabling teams to collaborate effectively without confusion.
┌───────────────┐
│ Model File    │
│ stg_orders.sql│
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ dbt Model     │
│ stg_orders    │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Database Table│
│ stg_orders    │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think model names can be anything without affecting project clarity? Commit yes or no.
Common Belief: Model names are just labels and don’t impact how the project works.
Reality: Model names are critical for understanding, maintaining, and running dbt projects correctly.
Why it matters: Ignoring naming leads to confusion, errors in dependencies, and wasted time searching for models.
Quick: Do you think using very long descriptive names is always better? Commit yes or no.
Common Belief: Long, detailed names are best because they explain everything.
Reality: Overly long names become hard to read and type, causing frustration and mistakes.
Why it matters: Balancing clarity and brevity keeps names useful and manageable.
Quick: Do you think prefixes like 'stg' and 'int' mean the same thing? Commit yes or no.
Common Belief: All prefixes are interchangeable and don’t matter much.
Reality: Each prefix signals a specific stage or role in the data pipeline, so mixing them causes confusion.
Why it matters: Misusing prefixes breaks the mental model teams rely on to understand data flow.
Quick: Do you think automation can fully replace human judgment in naming? Commit yes or no.
Common Belief: Automated naming removes the need for human decisions.
Reality: Automation helps but can’t capture all context; human review is still needed.
Why it matters: Relying solely on automation risks inappropriate names that confuse users.
Expert Zone
1. Some teams use environment or project-specific prefixes to separate models by deployment stage or business unit, which adds clarity in complex organizations.
2. Model names appear directly in dbt's generated documentation and lineage graphs, so thoughtful names make the automated docs easier to read and debugging faster.
3. In multi-source projects, consistent source naming within model names prevents accidental data mixing and simplifies troubleshooting.
When NOT to use
Strict naming conventions may be less useful in very small projects or prototypes where speed matters more than clarity. In those cases, quick descriptive names or ad-hoc naming might be better. Also, if using automated tools that generate models, manual naming might be impractical.
Production Patterns
In production, teams often enforce naming conventions via code reviews and CI checks. They combine prefixes, source names, and suffixes to create a predictable structure. Some use dbt macros to validate names or generate them dynamically based on metadata. This ensures consistency across large, evolving projects.
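A CI check of this kind can be as small as a script that walks the models folder. The sketch below assumes a prefix-based pattern and a folder named `models`; both are assumptions for illustration, and the pattern would need to match whatever convention a team actually agrees on.

```python
import re
from pathlib import Path
from typing import List

# Assumed convention: a known prefix followed by lowercase snake_case.
NAME_PATTERN = re.compile(r"^(stg|int|fct|dim)_[a-z0-9_]+$")


def find_badly_named_models(models_dir: str) -> List[str]:
    """Return paths of .sql files under models_dir whose names break the pattern."""
    bad = []
    for sql_file in sorted(Path(models_dir).rglob("*.sql")):
        if not NAME_PATTERN.match(sql_file.stem):
            bad.append(str(sql_file))
    return bad


def main(models_dir: str = "models") -> int:
    """Print offenders and return an exit code: 1 if any were found, else 0."""
    offenders = find_badly_named_models(models_dir)
    for path in offenders:
        print(f"bad model name: {path}")
    return 1 if offenders else 0
```

In a CI job, the return value of `main()` could be passed to `sys.exit` so that a non-zero exit code fails the build when a badly named model is added.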
Connections
Software naming conventions
Similar pattern of using consistent names to improve code readability and maintenance.
Understanding software naming helps grasp why data model names must be clear and consistent for teamwork and debugging.
Library classification systems
Both organize complex collections using structured naming or labeling for easy retrieval.
Seeing model naming like library classification reveals the importance of hierarchy and categories in managing large data sets.
Linguistics - Naming and categorization
Both involve creating shared labels that convey meaning and relationships clearly within a community.
Knowing how language shapes understanding helps appreciate naming conventions as a social contract in data teams.
Common Pitfalls
#1 Using inconsistent prefixes for similar model types.
Wrong approach: Model files named 'orders_staging.sql' and 'int_orders.sql' for similar staging models.
Correct approach: Model files named consistently as 'stg_orders.sql' for all staging models.
Root cause: Lack of agreed-upon naming rules or ignoring them leads to confusion about model roles.
#2 Overly long model names that are hard to read or type.
Wrong approach: Model file named 'fct_customer_monthly_revenue_aggregated_view.sql'.
Correct approach: Model file named 'fct_cust_month_rev_agg.sql'.
Root cause: Trying to include too much detail without balancing readability.
#3 Ignoring source names in model names when multiple data sources exist.
Wrong approach: Models named 'stg_payments.sql' for both Stripe and PayPal data.
Correct approach: Models named 'stg_stripe_payments.sql' and 'stg_paypal_payments.sql'.
Root cause: Not considering data source clarity causes data mixing and errors.
Key Takeaways
Model naming conventions in dbt create a shared language that makes data models easy to find and understand.
Using consistent prefixes and suffixes signals the model’s role and content clearly to all team members.
Including source names in model names prevents confusion when working with multiple data origins.
Balancing clarity and brevity in names improves usability and reduces errors in data projects.
Automating naming checks with dbt macros can maintain consistency but human judgment remains essential.