
Why sources define raw data contracts in dbt

Introduction

Raw data contracts help teams agree on what source data looks like before anyone builds on it, which avoids confusion and errors later. They are most useful in these situations:

When multiple teams share the same data source and need clear expectations.
When you want to catch data problems early before analysis.
When building automated data pipelines that depend on consistent data formats.
When onboarding new team members who need to understand data structure quickly.
When tracking changes in data sources over time to avoid breaking reports.
Syntax
dbt
sources:
  - name: raw_data_source
    tables:
      - name: raw_table
        freshness:
          warn_after: {count: 24, period: hour}
          error_after: {count: 48, period: hour}
        loaded_at_field: updated_at
        description: "Raw data contract for source table"

The sources block defines where raw data comes from.

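Freshness settings tell dbt how stale a table may become. When you run dbt source freshness, dbt compares the latest value in loaded_at_field with the current time, then warns or errors once the configured age is exceeded.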
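
Freshness rules do not have to be repeated for every table. The sketch below assumes a dbt version that accepts freshness and loaded_at_field directly on the source and on its tables (newer releases may prefer these keys under config). The tables fast_table and static_lookup are hypothetical names added for illustration.

```yaml
version: 2

sources:
  - name: raw_data_source
    # Defaults declared on the source apply to every table listed below.
    loaded_at_field: updated_at
    freshness:
      warn_after: {count: 24, period: hour}
      error_after: {count: 48, period: hour}
    tables:
      - name: raw_table        # inherits the 24h / 48h thresholds
      - name: fast_table       # hypothetical table with a tighter window
        freshness:
          warn_after: {count: 1, period: hour}
          error_after: {count: 3, period: hour}
      - name: static_lookup    # hypothetical table that is never reloaded
        freshness: null        # opt this table out of freshness checks
```

Setting defaults once on the source keeps the contract short, while per-table overrides document the exceptions explicitly.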

Examples
Defines a source named sales_db with a table orders.
dbt
sources:
  - name: sales_db
    tables:
      - name: orders
        description: "Contract for raw orders data"
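
The source name is a label inside the dbt project, so it does not have to match the physical location in the warehouse. The following sketch extends the sales_db example with location keys. The database, schema, and identifier values are assumptions chosen for illustration, not names from a real warehouse.

```yaml
version: 2

sources:
  - name: sales_db
    database: raw              # assumed warehouse database
    schema: sales_prod         # assumed schema; defaults to the source name if omitted
    tables:
      - name: orders
        identifier: orders_v2  # assumed physical table name; defaults to the table name
        description: "Contract for raw orders data"
```

Models would still refer to this table as {{ source('sales_db', 'orders') }}, so a change in the physical location only needs to be made in this one file.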
Sets freshness rules and a timestamp field to track data loading time.
dbt
sources:
  - name: marketing_data
    tables:
      - name: leads
        freshness:
          warn_after: {count: 12, period: hour}
          error_after: {count: 24, period: hour}
        loaded_at_field: load_time
Sample Program

This example shows a raw data contract for an ecommerce customers table. It sets rules to warn if data is older than 6 hours and error if older than 12 hours.

dbt
version: 2

sources:
  - name: ecommerce_raw
    description: "Raw data contract for ecommerce source"
    tables:
      - name: customers
        description: "Customer raw data with expected fields"
        freshness:
          warn_after: {count: 6, period: hour}
          error_after: {count: 12, period: hour}
        loaded_at_field: last_updated

# This YAML defines a raw data contract in dbt for the ecommerce customers table.
# It helps ensure data freshness and documents expectations.
Important Notes

Raw data contracts are written in YAML files inside the dbt project, usually alongside the models in the models directory.

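They help automate data quality checks and documentation: dbt can evaluate the freshness thresholds on a schedule and include the descriptions in the generated project docs.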
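
Data quality checks can also be attached to individual columns of a source table. This is a minimal sketch for the ecommerce customers table: the column names customer_id and email are assumptions (the original example names only last_updated), and the tests key is assumed here, although recent dbt versions also accept data_tests for the same purpose.

```yaml
version: 2

sources:
  - name: ecommerce_raw
    tables:
      - name: customers
        description: "Customer raw data with expected fields"
        columns:
          - name: customer_id      # assumed column name
            description: "Unique identifier for each customer"
            tests:
              - unique
              - not_null
          - name: email            # assumed column name
            description: "Contact email address"
            tests:
              - not_null
          - name: last_updated     # timestamp used by the freshness check
            description: "Time the row was last loaded"
            tests:
              - not_null
```

Running dbt test --select source:ecommerce_raw would then check these expectations against the raw table before any model depends on it.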

Summary

Raw data contracts define clear expectations for source data.

They help catch data issues early and keep teams aligned.

In dbt, raw data contracts are defined using sources in YAML files.