Overview - HAVING clause

What is it?

The HAVING clause is used in SQL to filter groups of rows created by GROUP BY. It lets you specify conditions on aggregated data, like sums or counts, to keep only groups that meet those conditions. Unlike WHERE, which filters individual rows before grouping, HAVING filters after grouping. This helps analyze summarized data effectively.

Why it matters

Without HAVING, you couldn't easily filter groups based on their aggregated values, like finding customers with total purchases above a certain amount. This would make data analysis and reporting much harder and less precise. HAVING makes it simple to ask questions about groups, not just single rows.

Where it fits

Before learning HAVING, you should understand basic SELECT queries, filtering with WHERE, and how GROUP BY works to group rows. After HAVING, you can explore advanced aggregation functions, window functions, and complex reporting queries.

Mental Model

Core Idea

HAVING filters groups after aggregation, while WHERE filters rows before grouping.

Think of it like...

Imagine sorting a basket of fruits by type (grouping), then deciding which fruit types to keep based on how many of each you have (HAVING). WHERE is like picking out individual fruits before sorting; HAVING is like choosing whole groups after sorting.

SELECT columns
  FROM table
  WHERE row_conditions
  GROUP BY grouping_columns
  HAVING group_conditions

Flow:
[Rows] --WHERE--> [Filtered Rows] --GROUP BY--> [Groups] --HAVING--> [Filtered Groups]
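
As a minimal sketch of this flow, the snippet below runs one query that uses both filters against an in-memory SQLite database. The `orders` table, its columns, and the sample values are assumptions made up for illustration; they do not come from any real schema.

```python
import sqlite3

# Hypothetical orders table; names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 600, "completed"),
        (1, 500, "completed"),
        (2, 900, "completed"),
        (2, 400, "cancelled"),
        (3, 1500, "cancelled"),
        (4, 700, "completed"),
        (4, 800, "completed"),
    ],
)

big_spenders = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    WHERE status = 'completed'      -- row filter: drops cancelled orders
    GROUP BY customer_id            -- one group per remaining customer
    HAVING SUM(amount) > 1000       -- group filter: drops customer 2 (total 900)
    ORDER BY customer_id
    """
).fetchall()

print(big_spenders)  # [(1, 1100), (4, 1500)]
```

Customer 3 disappears at the WHERE stage (only cancelled orders), while customer 2 survives WHERE but is removed by HAVING.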

Build-Up - 6 Steps

1

Foundation: Understanding GROUP BY basics

Concept: Learn how GROUP BY collects rows into groups based on column values.

GROUP BY takes rows that share the same value in specified columns and bundles them together. For example, grouping sales by product ID collects all sales of each product into one group.

Result

Rows are organized into groups, enabling aggregate calculations like SUM or COUNT per group.

Understanding grouping is essential because HAVING works on these groups, not individual rows.
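
The grouping step can be sketched with a small in-memory SQLite table. The `sales` table and its values are hypothetical, chosen only to show one output row per group.

```python
import sqlite3

# Hypothetical sales table; the names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 3), (3, 4), (3, 4), (3, 4)],
)

# One output row per product_id: the group's row count and total quantity.
per_product = conn.execute(
    """
    SELECT product_id, COUNT(*) AS sale_count, SUM(quantity) AS total_quantity
    FROM sales
    GROUP BY product_id
    ORDER BY product_id
    """
).fetchall()

print(per_product)  # [(1, 2, 12), (2, 1, 3), (3, 3, 12)]
```

Six input rows collapse into three groups, and these per-group values are what a HAVING condition would later be evaluated against.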

2

Foundation: Difference between WHERE and HAVING

3

Intermediate: Using HAVING with aggregate functions

4

Intermediate: HAVING without GROUP BY usage

5

Advanced: Combining WHERE and HAVING effectively

6

Expert: HAVING clause performance considerations

Under the Hood

When a SQL query with HAVING runs, the database logically applies WHERE first to filter rows. Then it groups the remaining rows by the GROUP BY columns. Next, it calculates aggregate functions for each group. Finally, it applies the HAVING condition to these aggregated groups, keeping only those that satisfy the condition. An optimizer may reorder the physical work, but the result must match this logical sequence, which ensures HAVING works on summarized data, not raw rows.

Why designed this way?

SQL separates row-level filtering (WHERE) from group-level filtering (HAVING) to keep the query structure clear and logical. Because WHERE is evaluated before grouping, it cannot see aggregate values. HAVING fills this gap: it allows conditions on aggregates directly in the grouped query, without wrapping that query in a subquery.
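
To make the subquery comparison concrete, the sketch below writes the same group filter twice: once with HAVING, and once by wrapping the grouped query in a subquery and filtering it with an outer WHERE. The `sales` table and its data are assumptions for illustration.

```python
import sqlite3

# Hypothetical sales table; the names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 3), (3, 4), (3, 4), (3, 4)],
)

# Group filter expressed with HAVING.
with_having = conn.execute(
    """
    SELECT product_id, SUM(quantity) AS total
    FROM sales
    GROUP BY product_id
    HAVING SUM(quantity) > 10
    ORDER BY product_id
    """
).fetchall()

# Same filter without HAVING: aggregate in a subquery, then filter its rows.
with_subquery = conn.execute(
    """
    SELECT product_id, total
    FROM (
        SELECT product_id, SUM(quantity) AS total
        FROM sales
        GROUP BY product_id
    ) AS per_product
    WHERE total > 10
    ORDER BY product_id
    """
).fetchall()

print(with_having)                   # [(1, 12), (3, 12)]
print(with_having == with_subquery)  # True
```

Both forms return the same rows here; HAVING simply avoids the extra nesting level.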

Query Execution Flow:

[Input Rows]
    │
    ▼
[Apply WHERE filter]
    │
    ▼
[Group rows by GROUP BY]
    │
    ▼
[Calculate aggregates per group]
    │
    ▼
[Apply HAVING filter on groups]
    │
    ▼
[Return filtered groups]
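
The four stages in the diagram can be imitated in plain Python and compared with the SQL result. This is only a sketch of the logical order under made-up data; a real database engine is free to execute the work differently as long as the result is the same.

```python
import sqlite3
from collections import defaultdict

# Hypothetical rows: (customer_id, amount, status). Values are illustrative.
orders = [
    (1, 600, "completed"),
    (1, 500, "completed"),
    (2, 900, "completed"),
    (2, 400, "cancelled"),
    (3, 1500, "cancelled"),
    (4, 700, "completed"),
    (4, 800, "completed"),
]

# Stage 1 (WHERE): keep only completed orders.
filtered_rows = [row for row in orders if row[2] == "completed"]

# Stage 2 (GROUP BY): bucket the surviving amounts by customer_id.
groups = defaultdict(list)
for customer_id, amount, _status in filtered_rows:
    groups[customer_id].append(amount)

# Stage 3 (aggregate): one SUM per group.
totals = {customer_id: sum(amounts) for customer_id, amounts in groups.items()}

# Stage 4 (HAVING): keep only groups whose total exceeds 1000.
kept_groups = sorted(
    (customer_id, total) for customer_id, total in totals.items() if total > 1000
)

# The same logic as a single SQL statement, for comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
sql_result = conn.execute(
    """
    SELECT customer_id, SUM(amount)
    FROM orders
    WHERE status = 'completed'
    GROUP BY customer_id
    HAVING SUM(amount) > 1000
    ORDER BY customer_id
    """
).fetchall()

print(kept_groups)                # [(1, 1100), (4, 1500)]
print(kept_groups == sql_result)  # True
```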

Myth Busters - 4 Common Misconceptions

Quick: Does HAVING filter individual rows before grouping? Commit yes or no.

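Common Belief: HAVING filters individual rows just like WHERE does.

Reality: No. HAVING is applied after grouping and aggregation, so it keeps or drops whole groups; filtering individual rows before grouping is the job of WHERE.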

Quick: Can HAVING be used without GROUP BY? Commit yes or no.

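Common Belief: HAVING always requires GROUP BY to work.

Reality: No. Without GROUP BY, HAVING treats the whole result set as one group and keeps or drops it based on the aggregate condition.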
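
A small sketch of HAVING without GROUP BY, using a hypothetical `orders` table. Support for this form is an assumption about the engine: some databases or versions may insist on a GROUP BY clause, so the example catches that error instead of assuming the query is accepted.

```python
import sqlite3

# Hypothetical orders table; names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 600), (2, 900), (3, 1500)])


def total_if_above(threshold):
    """Return the grand total as a one-row result, or no rows if HAVING drops it."""
    return conn.execute(
        "SELECT SUM(amount) FROM orders HAVING SUM(amount) > ?", (threshold,)
    ).fetchall()


try:
    above_2000 = total_if_above(2000)  # total is 3000, so the single group is kept
    above_5000 = total_if_above(5000)  # the single group is dropped: no rows at all
except sqlite3.OperationalError:
    # Assumption: this engine version requires GROUP BY before HAVING.
    above_2000 = above_5000 = None

print(above_2000, above_5000)
```

Where the query is accepted, the whole table behaves as one group: the result is either a single row or empty.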

Quick: Does WHERE filter rows after aggregation? Commit yes or no.

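Common Belief: WHERE can filter aggregated results like HAVING.

Reality: No. WHERE runs before grouping, so aggregate values do not exist yet when it is evaluated; conditions on aggregates belong in HAVING.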

Quick: Is HAVING always slower than WHERE? Commit yes or no.

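Common Belief: HAVING is always slower than WHERE and should be avoided.

Reality: No. The two clauses do different jobs: a condition on an aggregate can only go in HAVING. Extra cost arises mainly when a row-level condition is placed in HAVING instead of WHERE, because the rows are grouped before being discarded, and some databases push such conditions down to WHERE on their own.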

Expert Zone

1

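In some databases (for example MySQL and SQLite), HAVING can reference aliases defined in SELECT. Standard SQL and databases such as PostgreSQL and SQL Server do not allow this, so repeating the aggregate expression in HAVING is the portable form. WHERE cannot reference aliases of aggregate expressions in any case, because it runs before aggregation.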

2

Some databases optimize HAVING by pushing conditions down to WHERE when possible, but this is not guaranteed.

3

Referencing non-aggregated columns in HAVING without GROUP BY can cause errors or unexpected results, depending on the database and, in MySQL, on the SQL mode.
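
The alias point can be tried out in SQLite, which accepts a SELECT alias inside HAVING. Whether another engine accepts it is an assumption to check per database; the second query repeats the aggregate and is the form that does not rely on alias support. The `sales` table is hypothetical.

```python
import sqlite3

# Hypothetical sales table; the names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 3), (3, 4), (3, 4), (3, 4)],
)

# Alias form: works in SQLite, but not every engine resolves `total` here.
alias_form = conn.execute(
    """
    SELECT product_id, SUM(quantity) AS total
    FROM sales
    GROUP BY product_id
    HAVING total > 10
    ORDER BY product_id
    """
).fetchall()

# Portable form: repeat the aggregate expression in HAVING.
portable_form = conn.execute(
    """
    SELECT product_id, SUM(quantity) AS total
    FROM sales
    GROUP BY product_id
    HAVING SUM(quantity) > 10
    ORDER BY product_id
    """
).fetchall()

print(alias_form == portable_form)  # True
```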

When NOT to use

Avoid HAVING when filtering individual rows; use WHERE instead for better performance. For complex filtering on aggregates, consider window functions or subqueries as alternatives.

Production Patterns

In production, HAVING is often combined with WHERE to optimize queries. It's used in reports to filter summarized data, like top customers by sales or products with low stock. Indexing grouped columns and filtering early with WHERE improves performance.
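
A report of this kind might look like the following sketch: filter rows early with WHERE, filter groups with HAVING, then rank and truncate. The table, the index name, the threshold, and the data are all assumptions for illustration, and the index is shown only as an example of the idea; whether it helps depends on the engine and the data.

```python
import sqlite3

# Hypothetical orders table; names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER, status TEXT)")
# Hypothetical index covering the row filter and the grouping column.
conn.execute("CREATE INDEX idx_orders_status_customer ON orders (status, customer_id)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 600, "completed"),
        (1, 500, "completed"),
        (2, 900, "completed"),
        (2, 400, "cancelled"),
        (3, 1500, "cancelled"),
        (4, 700, "completed"),
        (4, 800, "completed"),
        (5, 2000, "completed"),
    ],
)

# Top two customers by completed sales, among those with at least 1000 in total.
top_customers = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    WHERE status = 'completed'
    GROUP BY customer_id
    HAVING SUM(amount) >= 1000
    ORDER BY total DESC
    LIMIT 2
    """
).fetchall()

print(top_customers)  # [(5, 2000), (4, 1500)]
```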

Connections

Window functions

Builds-on

Understanding HAVING helps you grasp window functions, which also compute aggregates over sets of rows but keep every row instead of collapsing each group into one, which allows more flexible filtering and ranking.

MapReduce programming

Similar pattern

HAVING's filter-after-grouping is like the Reduce phase in MapReduce, where data is aggregated then filtered, showing a common pattern in data processing.

Statistical hypothesis testing

Conceptual analogy

Filtering groups by aggregated values in HAVING is like deciding which data groups show significant effects in statistics, connecting data summarization and decision-making.

Common Pitfalls

#1 Using HAVING to filter rows instead of groups.

Wrong approach: SELECT product_id, price FROM products HAVING price > 100;

Correct approach: SELECT product_id, price FROM products WHERE price > 100;

Root cause: Treating HAVING as a row filter instead of a group filter.

#2 Using aggregate functions in the WHERE clause.

Wrong approach: SELECT product_id FROM sales WHERE SUM(quantity) > 10 GROUP BY product_id;

Correct approach: SELECT product_id FROM sales GROUP BY product_id HAVING SUM(quantity) > 10;

Root cause: Not realizing that WHERE runs before aggregation and therefore cannot use aggregate functions.
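
A quick way to see this pitfall is to run both statements against SQLite. The `sales` data is hypothetical, and the exact error message varies between engines, so the sketch only records that an error was raised.

```python
import sqlite3

# Hypothetical sales table; the names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 3), (3, 4), (3, 4), (3, 4)],
)

# Aggregate in WHERE: rejected, because no groups exist when WHERE is evaluated.
try:
    conn.execute(
        "SELECT product_id FROM sales WHERE SUM(quantity) > 10 GROUP BY product_id"
    )
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

# Aggregate in HAVING: evaluated once per group, after aggregation.
having_rows = conn.execute(
    """
    SELECT product_id
    FROM sales
    GROUP BY product_id
    HAVING SUM(quantity) > 10
    ORDER BY product_id
    """
).fetchall()

print(where_error is not None)  # True
print(having_rows)              # [(1,), (3,)]
```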

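#3 Not using WHERE to reduce rows before HAVING.

Wrong approach: SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id HAVING SUM(amount) > 1000;

Correct approach: SELECT customer_id, SUM(amount) FROM orders WHERE status = 'completed' GROUP BY customer_id HAVING SUM(amount) > 1000;

Root cause: Leaving out a row-level condition that belongs in WHERE. Note that the two queries are not interchangeable: the first sums every order, the second only completed ones. When the report should cover completed orders only, the status condition must be applied in WHERE, which also means fewer rows are grouped.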
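
Running the two queries from this pitfall on a small, made-up `orders` table shows how much the early row filter matters. The data is an assumption for illustration; with it, the query without WHERE also counts cancelled orders.

```python
import sqlite3

# Hypothetical orders table; names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 600, "completed"),
        (1, 500, "completed"),
        (2, 900, "completed"),
        (2, 400, "cancelled"),
        (3, 1500, "cancelled"),
        (4, 700, "completed"),
        (4, 800, "completed"),
    ],
)

# No WHERE: every order, including cancelled ones, feeds the sums.
without_where = conn.execute(
    """
    SELECT customer_id, SUM(amount)
    FROM orders
    GROUP BY customer_id
    HAVING SUM(amount) > 1000
    ORDER BY customer_id
    """
).fetchall()

# With WHERE: cancelled orders are removed before grouping.
with_where = conn.execute(
    """
    SELECT customer_id, SUM(amount)
    FROM orders
    WHERE status = 'completed'
    GROUP BY customer_id
    HAVING SUM(amount) > 1000
    ORDER BY customer_id
    """
).fetchall()

print(without_where)  # [(1, 1100), (2, 1300), (3, 1500), (4, 1500)]
print(with_where)     # [(1, 1100), (4, 1500)]
```

Customers 2 and 3 pass the threshold only because of cancelled orders, so the WHERE condition changes the answer as well as the amount of work.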

Key Takeaways

HAVING filters groups after aggregation, unlike WHERE which filters rows before grouping.

Use HAVING with aggregate functions to filter summarized data effectively.

HAVING can be used without GROUP BY; the whole result is then treated as a single group, which the condition keeps or discards.

Combining WHERE and HAVING properly improves query performance and clarity.

Understanding HAVING's role in query execution helps write efficient and correct SQL queries.