Overview - Cross-team model sharing

What is it?

Cross-team model sharing in dbt means that different teams within an organization can use and build upon each other's data models. Instead of each team creating their own separate data transformations, they share reusable models to maintain consistency and save time. This approach helps teams collaborate better and ensures everyone works with the same trusted data. It is like sharing building blocks to create bigger, more complex data structures together.

Why it matters

Without cross-team model sharing, teams often duplicate work, create conflicting data definitions, and waste time fixing inconsistencies. This leads to confusion and mistrust in data across the company. Sharing models helps everyone speak the same data language, speeds up analytics, and improves decision-making. It turns data work from isolated silos into a collaborative effort that benefits the whole organization.

Where it fits

Before learning cross-team model sharing, you should understand basic dbt concepts like models, sources, and dependencies. After mastering sharing, you can explore advanced topics like dbt packages, version control integration, and automated testing. This topic sits in the middle of the dbt learning path, bridging individual model creation and large-scale data collaboration.

Mental Model

Core Idea

Cross-team model sharing is about building a shared library of trusted data transformations that multiple teams can use and extend to work together efficiently.

Think of it like...

Imagine a group of chefs in a kitchen sharing a common set of recipes. Instead of each chef inventing their own dish from scratch, they use and improve shared recipes to create consistent meals faster and with less waste.

┌───────────────────────────────┐
│     Shared Model Library      │
├───────────────┬───────────────┤
│ Team A        │ Team B        │
│ Uses models   │ Uses models   │
│ from library  │ from library  │
├───────────────┴───────────────┤
│ Teams collaborate by sharing  │
│ and reusing data models       │
└───────────────────────────────┘

Build-Up - 6 Steps

1

Foundation: Understanding dbt Models Basics

Concept: Learn what dbt models are and how they transform raw data into usable tables.

In dbt, a model is a SQL file containing a SELECT statement that defines a transformation on your raw data. When you run dbt run, dbt compiles each model and materializes the result as a table or view in your data warehouse. Models are the building blocks of your data pipeline.

Result

You can create simple tables in your warehouse by writing SQL in dbt models and running dbt commands.

Understanding models is essential because they are the units you will share across teams.
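To make the idea of a model concrete, here is a minimal Python sketch that mimics what a view materialization does: it takes a SELECT statement and wraps it in DDL. It uses the standard-library sqlite3 module as a stand-in warehouse. The table raw_customers, the model name customers, and the helper materialize_as_view are all hypothetical, and this is a simplified illustration rather than how dbt itself is implemented.

```python
import sqlite3

# Hypothetical model: the SELECT a file such as models/customers.sql might contain.
MODEL_NAME = "customers"
MODEL_SQL = "SELECT id, UPPER(name) AS name FROM raw_customers"


def materialize_as_view(conn, name, select_sql):
    """Wrap a model's SELECT in DDL, roughly what a 'view' materialization does."""
    conn.execute(f"DROP VIEW IF EXISTS {name}")
    conn.execute(f"CREATE VIEW {name} AS {select_sql}")


# In-memory database standing in for the warehouse (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO raw_customers VALUES (?, ?)",
    [(1, "ada"), (2, "grace")],
)

materialize_as_view(conn, MODEL_NAME, MODEL_SQL)

# Downstream consumers query the materialized object, not the raw table.
rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(rows)  # [(1, 'ADA'), (2, 'GRACE')]
```

The point of the sketch is that the model author only writes the SELECT; creating the table or view around it is handled by the tool.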

2

Foundation: How Dependencies Link Models

3

Intermediate: Introducing Cross-team Model Sharing

4

Intermediate: Using dbt Packages for Sharing

5

Advanced: Managing Versioning and Compatibility

6

Expert: Optimizing Cross-team Collaboration Workflows

Under the Hood

dbt compiles SQL models into executable queries and builds a dependency graph (a DAG) from the ref() calls between them. When models are shared, a dbt package bundles them with project metadata and a version. Running dbt deps resolves the packages listed in packages.yml and installs the requested versions into the consuming project. During a run, dbt executes models in dependency order, so shared models are built before the models that depend on them.
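The dependency-ordering idea can be sketched in a few lines of Python. This is not dbt's actual implementation: it simply scans hypothetical model SQL for single-argument ref('...') calls with a regular expression, builds a graph, and asks the standard-library graphlib module (Python 3.9+) for a valid build order. The model names and SQL below are invented for illustration.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: model name -> SQL text, as a project plus a shared package might define.
MODELS = {
    "stg_orders": "SELECT * FROM raw.orders",
    "stg_customers": "SELECT * FROM raw.customers",
    "customer_orders": (
        "SELECT * FROM {{ ref('stg_orders') }} o "
        "JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.id"
    ),
    "revenue_report": "SELECT * FROM {{ ref('customer_orders') }}",
}

# Matches only the simple single-argument form {{ ref('model_name') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def build_graph(models):
    """Map each model to the set of models it references."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}


def execution_order(models):
    """Return one valid build order: every model appears after its dependencies."""
    return list(TopologicalSorter(build_graph(models)).static_order())


order = execution_order(MODELS)
print(order)
```

Whatever order is printed, the staging models always come before customer_orders, which in turn comes before revenue_report. A circular reference would make TopologicalSorter raise graphlib.CycleError.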

Why designed this way?

dbt was designed to treat data transformations as code, enabling software engineering best practices like modularity, version control, and testing. Sharing models as packages follows software library patterns, making data pipelines more maintainable and collaborative. Alternatives like copying SQL code were error-prone and hard to maintain, so dbt's package system was created to solve these issues.

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Team A Model  │──────▶│ Package Repo  │──────▶│ Team B Project│
│ (customer)    │       │ (versioned)   │       │ (uses model)  │
└───────────────┘       └───────────────┘       └───────────────┘
        │                       ▲                       │
        │                       │                       │
        └───────────────────────┴───────────────────────┘
                 dbt package sharing and installation

Myth Busters - 3 Common Misconceptions

Quick: Do you think copying SQL code between teams is better than referencing shared models? Commit yes or no.

Common Belief: Copying SQL code between teams is fine and ensures independence.

Quick: Do you think updating a shared model always breaks all dependent projects? Commit yes or no.

Common Belief: Any change to a shared model will break all teams using it.

Quick: Do you think cross-team sharing is only about code and not about communication? Commit yes or no.

Common Belief: Sharing models is just about sharing SQL files or packages.

Expert Zone

1

Shared models often require clear ownership and SLAs to ensure timely updates and support.

2

Semantic versioning in dbt packages is critical but often misunderstood, leading to accidental breaking changes.

3

Testing shared models with both unit and integration tests prevents subtle bugs that affect multiple teams.

When NOT to use

Cross-team model sharing is not ideal for highly experimental or rapidly changing models where stability is low. In such cases, teams should keep models isolated until they mature. Also, for very small teams or solo projects, the overhead of sharing may not be worth it.

Production Patterns

Organizations use dbt packages to create centralized data marts that multiple analytics teams consume. They implement CI/CD pipelines that run tests on shared models before publishing new versions. Governance teams manage package releases and documentation to ensure data quality and trust.
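As a rough sketch of the "run tests before publishing" gate, the Python below expresses two data tests as queries that return failing rows, so a test passes when its query returns nothing. This mirrors the general shape of not-null and uniqueness checks. The customers table, the test names, and the run_tests helper are hypothetical, and sqlite3 stands in for the warehouse.

```python
import sqlite3


def not_null_test(table, column):
    """Query returning every row where the column is NULL."""
    return f"SELECT * FROM {table} WHERE {column} IS NULL"


def unique_test(table, column):
    """Query returning every non-NULL value that appears more than once."""
    return (
        f"SELECT {column}, COUNT(*) AS n FROM {table} "
        f"WHERE {column} IS NOT NULL GROUP BY {column} HAVING COUNT(*) > 1"
    )


def run_tests(conn, tests):
    """Run each test query and keep only the tests that returned failing rows."""
    results = {name: conn.execute(sql).fetchall() for name, sql in tests.items()}
    return {name: rows for name, rows in results.items() if rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)

tests = {
    "not_null_customer_id": not_null_test("customers", "customer_id"),
    "unique_customer_id": unique_test("customers", "customer_id"),
    "not_null_email": not_null_test("customers", "email"),
}

failures = run_tests(conn, tests)
print("publish" if not failures else f"blocked by: {sorted(failures)}")
# blocked by: ['not_null_email', 'unique_customer_id']
```

In a CI pipeline, a non-empty failures result would stop the release of a new package version until the shared model is fixed.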

Connections

Software Package Management

Cross-team model sharing in dbt uses package management concepts similar to software libraries.

Understanding software package management helps grasp how dbt packages handle versioning, dependencies, and distribution of data models.

Modular Programming

Sharing models is like modular programming where code is split into reusable components.

Knowing modular programming principles clarifies why breaking data transformations into shared models improves maintainability and collaboration.

Collaborative Knowledge Sharing

Cross-team sharing parallels how teams share knowledge and best practices in organizations.

Recognizing this connection highlights that sharing models is not just technical but also cultural, requiring communication and governance.

Common Pitfalls

#1 Duplicating SQL code instead of referencing shared models.

Wrong approach: SELECT * FROM raw_sales -- hard-coded table name, copied and modified in multiple projects without ref()

Correct approach: SELECT * FROM {{ ref('raw_sales') }} -- references the shared model; dbt model files omit the trailing semicolon

Root cause: Misunderstanding that sharing means copying code rather than referencing reusable models.

#2 Not pinning package versions, causing unexpected breaks.

Wrong approach:
packages:
  - package: 'company/shared_models'
    version: '*'

Correct approach:
packages:
  - package: 'company/shared_models'
    version: '1.2.3'

Root cause: Ignoring semantic versioning and trusting latest versions blindly.
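To see why an unbounded version spec is risky, the Python sketch below picks the newest release that satisfies a list of constraints. It is a simplified illustration, not dbt's resolver: parse_version, satisfies, and newest_allowed are hypothetical helpers, the release list is invented, and pre-release tags are not handled.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraints):
    """Return True when the version meets every constraint, e.g. '>=1.2.0'."""
    for constraint in constraints:
        # Check two-character operators first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if constraint.startswith(symbol):
                bound = parse_version(constraint[len(symbol):])
                if not OPS[symbol](parse_version(version), bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint: {constraint}")
    return True


def newest_allowed(available, constraints):
    """Pick the highest available version that satisfies all constraints."""
    allowed = [v for v in available if satisfies(v, constraints)]
    return max(allowed, key=parse_version) if allowed else None


AVAILABLE = ["1.2.3", "1.3.0", "2.0.0"]  # hypothetical published releases

print(newest_allowed(AVAILABLE, [">=1.2.3"]))             # 2.0.0 (major bump slips in)
print(newest_allowed(AVAILABLE, [">=1.2.3", "<2.0.0"]))   # 1.3.0 (bounded range)
print(newest_allowed(AVAILABLE, ["==1.2.3"]))             # 1.2.3 (exact pin)
```

An exact pin gives the most predictable builds, while a bounded range accepts compatible updates but still excludes the next major version, where breaking changes are expected under semantic versioning.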

#3 Ignoring communication and governance when sharing models.

Wrong approach: Teams publish shared models without documentation or coordination.

Correct approach: Teams establish ownership, document models, and coordinate changes via meetings or tools.

Root cause: Assuming sharing is only a technical problem, not a social one.

Key Takeaways

Cross-team model sharing in dbt enables teams to reuse and build upon each other's data transformations, improving consistency and efficiency.

Referencing shared models with ref() and distributing them as dbt packages, rather than copying SQL, avoids duplication and helps maintain a single source of truth.

Proper versioning and testing of shared models prevent breaking changes and ensure stable data pipelines.

Effective sharing requires not only technical tools but also communication, governance, and collaboration processes.

Understanding software package management and modular programming concepts deepens the grasp of cross-team model sharing.