0
0
PostgreSQLquery~15 mins

INTERSECT and EXCEPT in PostgreSQL - Deep Dive

Choose your learning style9 modes available
Overview - INTERSECT and EXCEPT
What is it?
INTERSECT and EXCEPT are SQL commands used to compare two sets of data from different queries. INTERSECT returns only the rows that appear in both query results. EXCEPT returns rows from the first query that do not appear in the second query. These commands help find common or unique data between tables or query results.
Why it matters
Without INTERSECT and EXCEPT, finding shared or differing data between two lists would require complex and error-prone manual comparisons. These commands simplify data analysis, reporting, and cleaning by quickly showing overlaps or differences. This saves time and reduces mistakes in real-world tasks like checking duplicates or missing records.
Where it fits
Before learning INTERSECT and EXCEPT, you should understand basic SQL SELECT queries and how to combine results using UNION. After mastering these, you can explore more advanced set operations, joins, and subqueries to manipulate and analyze data efficiently.
Mental Model
Core Idea
INTERSECT finds what two data sets share, while EXCEPT finds what the first data set has that the second does not.
Think of it like...
Imagine two friends each have a collection of books. INTERSECT is like finding the books both friends own. EXCEPT is like finding the books the first friend has that the second friend doesn’t.
┌───────────────┐   ┌───────────────┐
│   Query A     │   │   Query B     │
│ {1,2,3,4}    │   │ {3,4,5,6}    │
└──────┬────────┘   └──────┬────────┘
       │                   │
       │ INTERSECT         │ EXCEPT
       ▼                   ▼
   {3,4}               {1,2}
Build-Up - 7 Steps
1
FoundationUnderstanding Basic Set Operations
🤔
Concept: Learn what sets are and how SQL can treat query results as sets of rows.
In SQL, the results of queries can be thought of as sets of rows, similar to lists of items. Set operations let you compare these lists to find common or unique items. INTERSECT and EXCEPT are two such operations that work on these sets.
Result
You understand that query results can be compared like sets of items.
Understanding that query results behave like sets is key to grasping how INTERSECT and EXCEPT work.
2
FoundationBasic SELECT Queries Refresher
🤔
Concept: Review how to write simple SELECT queries to retrieve data from tables.
A SELECT query asks the database to give you rows from a table. For example, SELECT name FROM students; returns all student names. Knowing how to write these queries is essential before combining or comparing results.
Result
You can write simple queries to get data from tables.
Mastering SELECT queries is the foundation for using INTERSECT and EXCEPT effectively.
3
IntermediateUsing INTERSECT to Find Common Rows
🤔Before reading on: do you think INTERSECT returns all rows from both queries or only those present in both? Commit to your answer.
Concept: INTERSECT returns only rows that appear in both query results.
If you have two queries, INTERSECT returns the rows that both queries share. For example, if Query A returns {1,2,3} and Query B returns {2,3,4}, then Query A INTERSECT Query B returns {2,3}.
Result
The output shows only the rows common to both queries.
Knowing INTERSECT returns shared rows helps you quickly find overlaps between data sets.
4
IntermediateUsing EXCEPT to Find Differences
🤔Before reading on: does EXCEPT return rows from the first query, the second, or both? Commit to your answer.
Concept: EXCEPT returns rows from the first query that do not appear in the second query.
If Query A returns {1,2,3} and Query B returns {2,3,4}, then Query A EXCEPT Query B returns {1}. It shows what is unique to the first query compared to the second.
Result
The output shows rows only in the first query, excluding those in the second.
Understanding EXCEPT helps you find what data is missing or unique in one set compared to another.
5
IntermediateRules for Using INTERSECT and EXCEPT
🤔
Concept: Learn the syntax rules and requirements for these commands to work correctly.
Both queries must have the same number of columns and compatible data types. The columns are compared in order. Also, duplicates are removed from the results, so the output is a set of unique rows.
Result
You know how to write syntactically correct INTERSECT and EXCEPT queries.
Knowing these rules prevents errors and unexpected results when using these commands.
6
AdvancedCombining INTERSECT and EXCEPT in Queries
🤔Before reading on: do you think combining INTERSECT and EXCEPT in one query is straightforward or can cause surprises? Commit to your answer.
Concept: You can combine INTERSECT and EXCEPT to perform complex data comparisons.
For example, you can find rows common to two queries but exclude those that appear in a third query by using INTERSECT and EXCEPT together. This allows precise filtering of data sets.
Result
You can write queries that find nuanced overlaps and differences in data.
Combining these operations unlocks powerful data analysis capabilities beyond simple comparisons.
7
ExpertPerformance and Optimization Considerations
🤔Before reading on: do you think INTERSECT and EXCEPT always perform efficiently on large data? Commit to your answer.
Concept: Understanding how these operations execute helps optimize queries for large data sets.
INTERSECT and EXCEPT often require sorting or hashing to find matches or differences, which can be costly on big tables. Using indexes, limiting data early, or rewriting queries with joins can improve performance.
Result
You can write efficient queries and avoid slowdowns in production.
Knowing the internal cost of these operations helps you write scalable and performant SQL.
Under the Hood
INTERSECT and EXCEPT work by comparing rows from two query results. Internally, the database sorts or hashes rows to identify matches or differences. INTERSECT returns rows found in both sets, while EXCEPT returns rows unique to the first set. Duplicate rows are removed to produce a set-like result.
Why designed this way?
These commands were designed to simplify set operations in SQL, reflecting mathematical set theory. Sorting or hashing ensures accurate and efficient comparison. Alternatives like manual joins were more complex and error-prone, so these commands provide a clear, declarative way to express set logic.
┌───────────────┐       ┌───────────────┐
│   Query A     │       │   Query B     │
│  Rows sorted  │       │  Rows sorted  │
└──────┬────────┘       └──────┬────────┘
       │                       │
       │  Compare rows by hash or sort
       ▼                       ▼
  ┌───────────────┐     ┌───────────────┐
  │ INTERSECT     │     │ EXCEPT        │
  │ Rows in both  │     │ Rows in A not │
  │ sets          │     │ in B          │
  └───────────────┘     └───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does INTERSECT return duplicate rows if they appear multiple times in both queries? Commit to yes or no.
Common Belief:INTERSECT returns all duplicate rows that appear in both queries.
Tap to reveal reality
Reality:INTERSECT removes duplicates and returns only unique rows present in both queries.
Why it matters:Expecting duplicates can lead to wrong assumptions about data counts and cause errors in reports or analyses.
Quick: Does EXCEPT return rows from both queries or only from the first? Commit to your answer.
Common Belief:EXCEPT returns rows from both queries that are different.
Tap to reveal reality
Reality:EXCEPT returns only rows from the first query that are not in the second query.
Why it matters:Misunderstanding this causes incorrect query results and confusion about what data is being compared.
Quick: Can INTERSECT and EXCEPT be used with queries having different numbers of columns? Commit to yes or no.
Common Belief:You can use INTERSECT and EXCEPT with queries having different column counts.
Tap to reveal reality
Reality:Both queries must have the same number of columns with compatible data types for these operations to work.
Why it matters:Ignoring this causes syntax errors and wasted time debugging.
Quick: Does the order of columns affect the result of INTERSECT and EXCEPT? Commit to yes or no.
Common Belief:Column order does not matter in INTERSECT and EXCEPT comparisons.
Tap to reveal reality
Reality:Column order matters; rows are compared column by column in order.
Why it matters:Wrong column order leads to unexpected results or empty outputs.
Expert Zone
1
INTERSECT and EXCEPT always remove duplicates, unlike UNION ALL which keeps them; this subtlety affects result counts.
2
The order of columns and their data types must match exactly; implicit type conversions can cause errors or unexpected results.
3
Using EXCEPT with large datasets can be less efficient than LEFT JOIN with IS NULL filters, depending on indexes and data distribution.
When NOT to use
Avoid INTERSECT and EXCEPT when you need to preserve duplicates or when queries have different column structures. Instead, use JOINs or EXISTS subqueries for more flexible comparisons or when performance is critical on large datasets.
Production Patterns
In real systems, INTERSECT is often used to find common customers between campaigns, while EXCEPT helps identify missing or unmatched records during data reconciliation. Combining these with filters and indexes optimizes reporting and data quality checks.
Connections
Set Theory
INTERSECT and EXCEPT directly implement set intersection and difference operations from set theory.
Understanding mathematical sets clarifies why these SQL commands behave as they do and helps predict their results.
JOIN Operations in SQL
INTERSECT and EXCEPT can sometimes be rewritten using INNER JOIN and LEFT JOIN with filters.
Knowing JOINs helps understand alternative ways to achieve similar results and optimize queries.
Data Cleaning and Deduplication
These commands help identify duplicates and missing data, key tasks in data cleaning.
Mastering INTERSECT and EXCEPT improves your ability to prepare clean, accurate data for analysis.
Common Pitfalls
#1Using INTERSECT or EXCEPT with queries having different column counts.
Wrong approach:SELECT id, name FROM table1 INTERSECT SELECT id FROM table2;
Correct approach:SELECT id FROM table1 INTERSECT SELECT id FROM table2;
Root cause:Misunderstanding that both queries must have the same number of columns.
#2Expecting EXCEPT to return rows from the second query that are not in the first.
Wrong approach:SELECT id FROM table1 EXCEPT SELECT id FROM table2; -- expecting rows unique to table2
Correct approach:SELECT id FROM table2 EXCEPT SELECT id FROM table1; -- to get rows unique to table2
Root cause:Confusing the order of queries in EXCEPT operation.
#3Assuming INTERSECT returns duplicate rows if they appear multiple times.
Wrong approach:SELECT id FROM table1 INTERSECT SELECT id FROM table2; -- expecting duplicates
Correct approach:SELECT DISTINCT id FROM (SELECT id FROM table1 INTERSECT SELECT id FROM table2) AS sub;
Root cause:Not knowing INTERSECT removes duplicates automatically.
Key Takeaways
INTERSECT returns only the rows common to both query results, removing duplicates.
EXCEPT returns rows from the first query that do not appear in the second query, also removing duplicates.
Both commands require queries to have the same number of columns with compatible data types, compared in order.
These set operations simplify finding overlaps and differences in data, saving time and reducing errors.
Understanding their internal workings and limitations helps write efficient, correct SQL for real-world data tasks.