Understanding Why Models Are the Core of dbt
📖 Scenario: Imagine you work in a company that collects sales data every day. You want to organize this data to answer questions like "Which products sell best?" or "How much revenue did we make last month?". dbt helps you do this by letting you create models that transform raw data into useful tables.
🎯 Goal: Build a small example that shows why models are the heart of dbt. The exercise uses a Python dictionary as a stand-in for a raw table: you will create the sales data, set a filter condition, write a "model" that selects the rows you care about, and display the result.
📋 What You'll Learn
Create a dictionary with sales data
Add a filter threshold for minimum sales
Write a model to select products with sales at or above the threshold
Print the filtered sales data
💡 Why This Matters
🌍 Real World
In real companies, raw data is messy and large. Models help clean and organize data so teams can answer important questions quickly.
💼 Career
Data analysts and engineers use dbt models daily to build reliable data pipelines that power dashboards and reports.
1
Create the sales data dictionary
Create a dictionary called sales_data with these exact entries: 'apple': 100, 'banana': 50, 'cherry': 75, 'date': 30, 'elderberry': 90.
Hint
Use curly braces {} to create a dictionary. Separate keys and values with colons, and pairs with commas.
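As a minimal sketch of this step, the dictionary below maps each product name to its sales figure, using the exact entries listed above.

```python
# Raw sales data: product name -> sales figure (values taken from the step above)
sales_data = {
    'apple': 100,
    'banana': 50,
    'cherry': 75,
    'date': 30,
    'elderberry': 90,
}
```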
2
Set the minimum sales threshold
Create a variable called min_sales and set it to 60.
Hint
Just assign the number 60 to the variable min_sales.
3
Write the model to filter sales data
Create a new dictionary called filtered_sales using a dictionary comprehension. Include only items from sales_data where the sales value is greater than or equal to min_sales. Use product and sales as the loop variables.
Hint
Use a dictionary comprehension with {key: value for key, value in dict.items() if condition}.
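One way the comprehension could look, with the data and threshold from steps 1 and 2 repeated so the snippet runs on its own:

```python
sales_data = {'apple': 100, 'banana': 50, 'cherry': 75, 'date': 30, 'elderberry': 90}
min_sales = 60

# Keep only the products whose sales meet or exceed the threshold
filtered_sales = {
    product: sales
    for product, sales in sales_data.items()
    if sales >= min_sales
}
```

With these values, apple, cherry, and elderberry pass the filter, while banana and date are dropped.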
4
Print the filtered sales data
Write a print statement to display the filtered_sales dictionary.
Hint
Use print(filtered_sales) to show the result.
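The dictionary exercise is an analogy: in a dbt project the same filter would be a SELECT statement in a .sql file. The sketch below runs such a statement against an in-memory SQLite database from Python, so it can be tried without a warehouse. The table name sales_data, its column names, and the use of SQLite are assumptions made for illustration and are not part of dbt.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (assumption for illustration)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales_data (product TEXT, sales INTEGER)')
conn.executemany(
    'INSERT INTO sales_data VALUES (?, ?)',
    [('apple', 100), ('banana', 50), ('cherry', 75), ('date', 30), ('elderberry', 90)],
)

# The kind of SELECT a model file might contain; 60 matches min_sales from step 2
model_sql = 'SELECT product, sales FROM sales_data WHERE sales >= 60 ORDER BY product'
filtered_rows = conn.execute(model_sql).fetchall()
print(filtered_rows)  # [('apple', 100), ('cherry', 75), ('elderberry', 90)]
conn.close()
```

The WHERE clause plays the role of the `if` condition in the comprehension, and the selected rows correspond to the filtered_sales dictionary.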
Practice
1. What is the main role of models in dbt?
easy
A. To transform raw data into useful tables or views
B. To store raw data without changes
C. To create visual dashboards
D. To manage user permissions
Solution
Step 1: Understand the purpose of models in dbt
Models are SQL files that define how raw data is transformed into clean, organized tables or views.
Step 2: Identify the correct role from options
Only option A describes transforming raw data into useful tables or views, which is the core function of models. Options B, C, and D describe storage, visualization, and access control, none of which is what a model does.
Final Answer:
To transform raw data into useful tables or views -> Option A
Quick Check:
Models transform data [OK]
Hint: Models transform raw data into tables/views [OK]
Common Mistakes:
Confusing models with dashboards
Thinking models store raw data unchanged
Assuming models manage permissions
2. Which of the following is the correct way to define a model in dbt?
easy
A. models/my_model.yaml containing configuration only
B. models/my_model.py containing Python code
C. models/my_model.txt containing raw data
D. models/my_model.sql containing a SELECT statement
Solution
Step 1: Recall dbt model file requirements
A dbt SQL model is a .sql file in the models/ directory that contains a single SELECT statement. dbt wraps that statement in the DDL needed to build a table or view.
Step 2: Match file type and content
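Only option D pairs a .sql file with a SELECT statement, which is the standard way to define a dbt model. (dbt 1.3 and later also support Python models on some adapters, but those must define a model(dbt, session) function rather than contain arbitrary Python code.)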
Final Answer:
models/my_model.sql containing a SELECT statement -> Option D
Quick Check:
Model = SQL file with SELECT [OK]
Hint: Models are SQL files with SELECT statements [OK]
Common Mistakes:
Using Python or text files for models
Confusing config files with models
Not including a SELECT statement in model files
3. Given this dbt model SQL code:
SELECT user_id, COUNT(*) AS orders_count FROM raw.orders GROUP BY user_id
What will this model produce when run?
medium
A. A table or view with user_id and their total order counts
B. A list of all orders without grouping
C. An error because COUNT(*) is invalid
D. A table with only user_id and no counts
Solution
Step 1: Analyze the SQL query in the model
The query groups the rows of raw.orders by user_id and uses COUNT(*) to count the rows in each group, giving one row per user.
Step 2: Determine the output of the model
The model will create a table or view showing each user_id with their total number of orders.
Final Answer:
A table or view with user_id and their total order counts -> Option A
Quick Check:
GROUP BY user_id with COUNT(*) = aggregated counts [OK]
Hint: GROUP BY with COUNT(*) gives totals per group [OK]
Common Mistakes:
Ignoring GROUP BY and expecting raw data
Thinking COUNT(*) causes errors
Assuming counts are missing
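To see the aggregation from question 3 in action, the sketch below runs the same query from Python against an in-memory SQLite database. The sample rows, the order_id column, and the use of SQLite with an attached database named raw are assumptions made for illustration. In a dbt project the query would run in your warehouse instead.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Attach a second in-memory database named "raw" so that raw.orders resolves
conn.execute("ATTACH DATABASE ':memory:' AS raw")
conn.execute('CREATE TABLE raw.orders (order_id INTEGER, user_id INTEGER)')
conn.executemany(
    'INSERT INTO raw.orders VALUES (?, ?)',
    [(1, 10), (2, 10), (3, 20), (4, 10), (5, 30)],  # hypothetical sample orders
)

query = 'SELECT user_id, COUNT(*) AS orders_count FROM raw.orders GROUP BY user_id'
# Sort in Python because the query itself has no ORDER BY
orders_per_user = sorted(conn.execute(query).fetchall())
print(orders_per_user)  # [(10, 3), (20, 1), (30, 1)]
conn.close()
```

Five input rows collapse into three output rows, one per user_id, each carrying its order count.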
4. You wrote this dbt model SQL:
SELECT customer_id, date, SUM(amount) AS total FROM sales GROUP BY customer_id
But the dbt run fails with a database error. What is the likely problem?
medium
A. SUM(amount) cannot be used with GROUP BY
B. The SELECT includes date but GROUP BY does not, causing mismatch
C. customer_id should be aggregated with SUM()
D. Missing WHERE clause causes error
Solution
Step 1: Check SELECT and GROUP BY columns
SELECT has customer_id, date, and SUM(amount), but GROUP BY includes only customer_id.
Step 2: Identify mismatch causing error
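In standard SQL, every non-aggregated column in SELECT must also appear in GROUP BY. Here date is selected but not grouped, so most warehouses reject the query. Adding date to GROUP BY, or removing it from SELECT, fixes the error.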
Final Answer:
The SELECT includes date but GROUP BY does not, causing mismatch -> Option B
Quick Check:
GROUP BY columns must match SELECT non-aggregates [OK]
Hint: SELECT non-aggregates must match GROUP BY columns [OK]
Common Mistakes:
Ignoring GROUP BY and SELECT column mismatch
Thinking SUM() can't be used with GROUP BY
Assuming WHERE clause is mandatory
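A corrected version of the query from question 4 can be sketched as follows, again from Python with an in-memory SQLite database. The sample rows and column types are hypothetical. Note that SQLite happens to tolerate the original query with the ungrouped date column, so it cannot reproduce the error. The sketch only demonstrates the corrected form that stricter warehouses require.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales (customer_id INTEGER, date TEXT, amount INTEGER)')
conn.executemany(
    'INSERT INTO sales VALUES (?, ?, ?)',
    [
        (1, '2024-01-05', 40),
        (1, '2024-01-05', 60),
        (1, '2024-01-06', 25),
        (2, '2024-01-05', 10),
    ],  # hypothetical sample rows
)

# Corrected query: every non-aggregated SELECT column also appears in GROUP BY
fixed_query = (
    'SELECT customer_id, date, SUM(amount) AS total '
    'FROM sales GROUP BY customer_id, date '
    'ORDER BY customer_id, date'
)
daily_totals = conn.execute(fixed_query).fetchall()
print(daily_totals)
# [(1, '2024-01-05', 100), (1, '2024-01-06', 25), (2, '2024-01-05', 10)]
conn.close()
```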
5. You want to create a dbt model that builds a monthly sales summary table. Which approach best uses models as the core of dbt?
hard
A. Create a YAML file listing monthly sales without SQL
B. Manually export raw sales data and summarize in Excel
C. Write a SQL model that selects sales data, groups by month, and calculates totals
D. Use a Python script outside dbt to summarize sales
Solution
Step 1: Identify how models transform data in dbt
Models are SQL files that transform raw data into organized tables, like monthly summaries.
Step 2: Choose the option that uses dbt models correctly
Option C uses a SQL model to group sales by month and calculate totals, which is exactly the kind of transformation dbt models exist for. The other options either contain no transformation logic or do the work outside dbt.
Final Answer:
Write a SQL model that selects sales data, groups by month, and calculates totals -> Option C
Quick Check:
Models transform data with SQL for summaries [OK]
Hint: Use SQL models to transform and summarize data [OK]
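The monthly summary from question 5 could look roughly like the query in the sketch below, run here from Python against an in-memory SQLite database. The table layout, the column names sale_date and amount, and the sample rows are hypothetical. strftime is SQLite's way of extracting the month, and a real warehouse would likely use its own date function instead.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales (sale_date TEXT, amount INTEGER)')
conn.executemany(
    'INSERT INTO sales VALUES (?, ?)',
    [('2024-01-03', 100), ('2024-01-20', 50), ('2024-02-11', 75)],  # hypothetical rows
)

# Group sales by calendar month and total the amounts for each month
monthly_query = (
    "SELECT strftime('%Y-%m', sale_date) AS sale_month, SUM(amount) AS total_sales "
    "FROM sales GROUP BY strftime('%Y-%m', sale_date) "
    'ORDER BY sale_month'
)
monthly_summary = conn.execute(monthly_query).fetchall()
print(monthly_summary)  # [('2024-01', 150), ('2024-02', 75)]
conn.close()
```

In a dbt project only the SELECT statement would live in the model file, and dbt would materialize its result as a table or view.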