How to Configure dbt for Redshift: Step-by-Step Guide

To configure dbt for Amazon Redshift, add a profile to profiles.yml that contains your cluster's connection details: host, user, password, port, and database. Set type: redshift to select the Redshift adapter, then set schema and threads so dbt knows where to build models and how many to run in parallel.

Syntax

The profiles.yml file defines how dbt connects to Redshift. It includes:

  • target: The name of the output that dbt uses by default (here, dev).
  • outputs: One or more named targets, each holding connection details for Redshift.
  • type: Must be redshift to use the Redshift adapter.
  • host: The Redshift cluster endpoint.
  • user: Your Redshift username.
  • password: Your Redshift password.
  • port: Usually 5439 for Redshift.
  • dbname: The Redshift database name.
  • schema: The schema where dbt will build models.
  • threads: Number of parallel threads dbt can use.
  • keepalives_idle: Optional. Seconds a connection can sit idle before a keepalive is sent.
yaml
my-redshift-profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: your-redshift-cluster.us-east-1.redshift.amazonaws.com
      user: your_username
      password: your_password
      port: 5439
      dbname: your_database
      schema: your_schema
      threads: 4
      keepalives_idle: 240
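
dbt selects a profile by name, so the top-level key in profiles.yml has to match the profile setting in the project's dbt_project.yml. A minimal sketch, assuming a hypothetical project called my_project (the project name is illustrative and not part of this guide):

```yaml
# dbt_project.yml (my_project is a hypothetical placeholder name)
name: my_project
version: "1.0.0"

# Must match the top-level key in profiles.yml
profile: my-redshift-profile
```

With both files in place, running dbt debug from the project directory is a quick way to check that dbt can load the profile and reach the cluster.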

Example

This example shows a complete profiles.yml configuration that connects dbt to a Redshift cluster. Replace the placeholder values with your own Redshift details. With this setup, dbt builds models in analytics_schema and runs up to 4 of them in parallel.

yaml
my-redshift-profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: example-cluster.abc123xyz789.us-west-2.redshift.amazonaws.com
      user: dbt_user
      password: secure_password123
      port: 5439
      dbname: analytics_db
      schema: analytics_schema
      threads: 4
      keepalives_idle: 240
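
Keeping a real password in plain text in profiles.yml is risky if the file is ever shared or committed. One option is dbt's env_var function, which reads the value from an environment variable when the profile is loaded. The sketch below assumes a hypothetical variable named DBT_REDSHIFT_PASSWORD; the other values match the example above:

```yaml
my-redshift-profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: example-cluster.abc123xyz789.us-west-2.redshift.amazonaws.com
      user: dbt_user
      # DBT_REDSHIFT_PASSWORD is an assumed variable name; export it in your shell first
      password: "{{ env_var('DBT_REDSHIFT_PASSWORD') }}"
      port: 5439
      dbname: analytics_db
      schema: analytics_schema
      threads: 4
```

The quotes around the Jinja expression matter: without them, YAML tries to parse the curly braces as a mapping.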

Common Pitfalls

Common mistakes when configuring dbt for Redshift include:

  • Using the wrong host endpoint (must be the Redshift cluster endpoint, not a general AWS endpoint).
  • Incorrect port number; Redshift usually uses 5439.
  • Omitting type: redshift, so dbt cannot tell which adapter to use and the connection fails.
  • Forgetting to set the correct schema where models will be built.
  • Setting threads too low, which limits how many models run in parallel, or too high for the cluster's concurrency limits.

Example of a wrong and right configuration snippet:

yaml
# Wrong: missing type, placeholder host, and wrong port
my-profile:
  target: dev
  outputs:
    dev:
      host: wrong-host
      user: user
      password: pass
      port: 1234
      dbname: db
      schema: public

# Right: type set, cluster endpoint as host, default port
my-profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: correct-host.redshift.amazonaws.com
      user: user
      password: pass
      port: 5439
      dbname: db
      schema: public
      threads: 2

Quick Reference

Keep these tips in mind when configuring dbt for Redshift:

  • Always use type: redshift in your profile.
  • Use your exact Redshift cluster endpoint for host.
  • Default port is 5439.
  • Set schema to control where dbt builds models.
  • Adjust threads based on your workload and Redshift concurrency limits.
  • Use keepalives_idle to maintain stable connections.
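
Because outputs can hold more than one named target, schema and threads can differ per environment. A rough sketch, assuming a hypothetical prod target and placeholder schema names that are not part of the original example:

```yaml
my-redshift-profile:
  target: dev                # default target when none is chosen on the command line
  outputs:
    dev:
      type: redshift
      host: your-redshift-cluster.us-east-1.redshift.amazonaws.com
      user: your_username
      password: your_password
      port: 5439
      dbname: your_database
      schema: dev_schema     # hypothetical schema name
      threads: 2
    prod:                    # hypothetical second target
      type: redshift
      host: your-redshift-cluster.us-east-1.redshift.amazonaws.com
      user: your_username
      password: your_password
      port: 5439
      dbname: your_database
      schema: prod_schema    # hypothetical schema name
      threads: 4
```

Running dbt run --target prod would then build models with the prod settings, while a plain dbt run keeps using dev.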

Key Takeaways

  • Configure your profiles.yml with type: redshift and the correct connection details.
  • Use your Redshift cluster endpoint as the host and port 5439.
  • Set the schema where dbt will create models to keep your data organized.
  • Adjust threads for parallel execution, but stay within Redshift's concurrency limits.
  • Avoid common mistakes such as a missing adapter type or a wrong port so that dbt can connect.