0
0
PostgreSQLquery~15 mins

LATERAL join for correlated subqueries in PostgreSQL - Deep Dive

Choose your learning style9 modes available
Overview - LATERAL join for correlated subqueries
What is it?
A LATERAL join in PostgreSQL lets you run a subquery for each row of a main query, using values from that row inside the subquery. This means the subquery can refer to columns from the outer query, making it 'correlated'. It helps combine data in ways that normal joins or subqueries can't easily do. It is like running a small query repeatedly, once per row, and joining the results.
Why it matters
Without LATERAL joins, you would struggle to write queries that need to use data from one row to find related data in another table dynamically. This limits how you can combine and analyze data efficiently. LATERAL joins solve this by allowing flexible, row-by-row subqueries, making complex data retrieval simpler and faster. Without it, many tasks would require multiple queries or complicated workarounds.
Where it fits
Before learning LATERAL joins, you should understand basic SQL joins, subqueries, and correlated subqueries. After mastering LATERAL joins, you can explore advanced SQL features like window functions, recursive queries, and set-returning functions to handle even more complex data relationships.
Mental Model
Core Idea
A LATERAL join runs a subquery for each row of the main query, letting the subquery use that row's data to find matching results.
Think of it like...
Imagine you have a list of friends, and for each friend, you ask them to pick their favorite books from a library. You ask each friend separately, using their name to find their choices. The LATERAL join is like asking each friend individually and collecting their answers together.
Main Table Rows
  │
  ├─> For each row, run Subquery using that row's data
  │       └─> Subquery returns matching rows
  │
  └─> Combine main row with subquery results (like a join)

Result: One combined table with main rows and their related subquery rows
Build-Up - 7 Steps
1
FoundationUnderstanding basic SQL joins
🤔
Concept: Learn how SQL joins combine rows from two tables based on related columns.
In SQL, a join combines rows from two tables when they share a common value. For example, joining customers with their orders by matching customer IDs. This creates a bigger table showing customers alongside their orders.
Result
You get a combined table with rows from both tables matched by a key.
Knowing how joins work is essential because LATERAL joins build on this idea but add more flexibility by running subqueries per row.
2
FoundationBasics of subqueries and correlation
🤔
Concept: Understand subqueries that run inside another query and how they can refer to outer query data.
A subquery is a query inside another query. A correlated subquery uses values from the outer query to filter its results. For example, finding orders where the order date matches the customer's signup date by referring to the outer customer's date inside the subquery.
Result
Subqueries can dynamically change their results based on each outer row.
This shows how queries can be nested and interact, which is the foundation for LATERAL joins.
3
IntermediateIntroducing LATERAL join syntax
🤔
Concept: Learn how to write a LATERAL join and how it differs from normal joins.
In PostgreSQL, you write a LATERAL join like this: SELECT a.*, b.* FROM table_a a JOIN LATERAL ( SELECT * FROM table_b WHERE table_b.ref_id = a.id ) b ON true; The subquery after LATERAL can use columns from 'a'. The ON true means we join all matching rows from the subquery.
Result
The query returns rows from 'a' combined with matching rows from the subquery that uses 'a's data.
LATERAL lets the subquery see each row from the main table, enabling dynamic filtering per row.
4
IntermediateUsing LATERAL for correlated subqueries
🤔Before reading on: Do you think a LATERAL join runs the subquery once for all rows or once per each row? Commit to your answer.
Concept: Understand that LATERAL runs the subquery once per row, making it truly correlated.
Unlike normal joins, LATERAL executes the subquery for every row in the main query. This means the subquery can use that row's values to find related data. For example, finding the latest order per customer by running a subquery that looks up orders for each customer individually.
Result
You get a result where each main row is combined with subquery results tailored to that row.
Knowing that the subquery runs per row explains why LATERAL is powerful for dynamic, row-specific data retrieval.
5
IntermediateComparing LATERAL with correlated subqueries
🤔Before reading on: Is a LATERAL join just a fancy syntax for correlated subqueries, or does it offer something more? Commit to your answer.
Concept: See how LATERAL joins can replace correlated subqueries and offer more flexibility in combining results.
Correlated subqueries often appear in WHERE or SELECT clauses and return single values. LATERAL joins let you return multiple rows per outer row and join them like a normal table. This makes it easier to combine complex data sets and use the results in further joins or filters.
Result
You can write clearer, more powerful queries that handle multiple related rows per main row.
Understanding this difference helps you choose the right tool for complex queries and improves readability and performance.
6
AdvancedUsing LATERAL with set-returning functions
🤔Before reading on: Can LATERAL joins work with functions that return multiple rows, or only with subqueries? Commit to your answer.
Concept: Learn that LATERAL can join with functions that return sets of rows, not just subqueries.
PostgreSQL allows joining with functions that return multiple rows, like generate_series(), using LATERAL. For example: SELECT a.id, gs.num FROM table_a a JOIN LATERAL generate_series(1, a.count) AS gs(num) ON true; This runs generate_series for each row in 'a', producing multiple rows per 'a' row.
Result
You get expanded rows per main row based on the function's output.
Knowing this expands your ability to generate dynamic row sets tied to each main row, useful for tasks like creating sequences or splitting data.
7
ExpertPerformance considerations and pitfalls
🤔Before reading on: Do you think LATERAL joins always perform better than correlated subqueries? Commit to your answer.
Concept: Understand when LATERAL joins can impact performance and how to optimize them.
LATERAL joins run subqueries per row, which can be expensive if the main table is large. However, PostgreSQL can optimize some LATERAL joins with indexes or by rewriting queries. Sometimes, rewriting a LATERAL join as a window function or a join with aggregates can be faster. Also, using LATERAL with large functions or complex subqueries can slow queries down.
Result
Knowing these tradeoffs helps you write efficient queries and avoid slowdowns.
Understanding performance helps you decide when to use LATERAL and when to seek alternatives for scalability.
Under the Hood
When PostgreSQL executes a LATERAL join, it processes the main query row by row. For each row, it runs the subquery or function in the LATERAL clause, passing the current row's values as parameters. This means the subquery is re-evaluated for every row, allowing it to access outer query columns. The results are then combined with the main row to form the final output. Internally, this involves nested loops and parameter passing between query execution contexts.
Why designed this way?
LATERAL joins were introduced to overcome limitations of traditional joins and correlated subqueries, which either couldn't access outer rows easily or returned only single values. The design allows flexible, row-wise subqueries that can return multiple rows, enabling richer data combinations. This approach balances expressiveness with SQL's declarative style, avoiding complex procedural code or multiple queries.
┌───────────────┐
│ Main Query    │
│ (table_a)     │
└──────┬────────┘
       │ For each row
       ▼
┌───────────────┐
│ LATERAL Subq. │
│ (uses outer   │
│  row values)  │
└──────┬────────┘
       │ Returns rows
       ▼
┌───────────────┐
│ Combine rows  │
│ from main and │
│ subquery      │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does a LATERAL join run its subquery only once for the entire main query? Commit yes or no.
Common Belief:LATERAL joins run the subquery only once, like normal joins.
Tap to reveal reality
Reality:LATERAL joins run the subquery separately for each row of the main query, allowing it to use that row's data.
Why it matters:Thinking it runs once leads to wrong assumptions about performance and query results, causing bugs or inefficient queries.
Quick: Can you use LATERAL joins in any SQL database? Commit yes or no.
Common Belief:LATERAL joins are standard and available in all SQL databases.
Tap to reveal reality
Reality:LATERAL joins are supported in PostgreSQL and some others, but not all SQL databases support them or have different syntax.
Why it matters:Assuming universal support can cause portability issues and errors when moving queries between systems.
Quick: Does using LATERAL always improve query performance? Commit yes or no.
Common Belief:LATERAL joins always make queries faster than correlated subqueries or other methods.
Tap to reveal reality
Reality:LATERAL joins can be slower if the subquery is expensive and the main table is large, because the subquery runs per row.
Why it matters:Overusing LATERAL without considering cost can lead to slow queries and resource waste.
Quick: Is a LATERAL join just syntactic sugar for a correlated subquery? Commit yes or no.
Common Belief:LATERAL joins are just a nicer way to write correlated subqueries with no added power.
Tap to reveal reality
Reality:LATERAL joins can return multiple rows per outer row and be combined like tables, offering more flexibility than correlated subqueries that return single values.
Why it matters:Underestimating LATERAL's power limits your ability to write complex, efficient queries.
Expert Zone
1
LATERAL joins can be combined with DISTINCT ON to efficiently pick top related rows per main row.
2
Using LATERAL with set-returning functions can generate dynamic row expansions that are hard to replicate with standard joins.
3
PostgreSQL's planner can sometimes inline LATERAL subqueries, optimizing performance, but this depends on query complexity and statistics.
When NOT to use
Avoid LATERAL joins when the main table is huge and the subquery is complex or slow, as it can cause performance issues. Instead, consider window functions, pre-aggregated tables, or materialized views for better efficiency.
Production Patterns
In real systems, LATERAL joins are used to fetch the latest or top-N related records per main row, to split JSON arrays into rows, or to call set-returning functions dynamically. They often appear in reporting queries, data transformations, and API backends requiring flexible data shapes.
Connections
Correlated Subqueries
LATERAL joins build on and extend correlated subqueries by allowing multiple rows and join-like combinations.
Understanding correlated subqueries helps grasp why LATERAL joins are more powerful and flexible for row-wise data retrieval.
Functional Programming Closures
Both LATERAL joins and closures capture external context to run code with access to outer variables.
Recognizing this similarity clarifies how LATERAL subqueries access outer query data dynamically, like closures capture variables.
Nested Loops in Algorithms
LATERAL join execution resembles nested loops where for each outer element, an inner process runs.
This analogy helps understand performance implications and why LATERAL can be costly with large datasets.
Common Pitfalls
#1Using LATERAL without ON condition or ON true, causing syntax errors.
Wrong approach:SELECT a.*, b.* FROM table_a a JOIN LATERAL (SELECT * FROM table_b WHERE table_b.ref_id = a.id);
Correct approach:SELECT a.*, b.* FROM table_a a JOIN LATERAL (SELECT * FROM table_b WHERE table_b.ref_id = a.id) b ON true;
Root cause:LATERAL joins require an explicit ON condition; forgetting it leads to syntax errors.
#2Expecting LATERAL subquery to run once instead of per row, leading to unexpected results.
Wrong approach:Using LATERAL join but assuming the subquery returns the same rows for all main rows.
Correct approach:Design queries knowing the subquery runs per row and uses that row's data for filtering.
Root cause:Misunderstanding the execution model of LATERAL joins causes wrong assumptions about results.
#3Using LATERAL join on very large tables with expensive subqueries without indexes.
Wrong approach:SELECT a.*, b.* FROM large_table a JOIN LATERAL (SELECT * FROM big_table WHERE big_table.ref = a.id) b ON true;
Correct approach:Add indexes on big_table.ref or rewrite query using window functions or pre-aggregated data.
Root cause:Ignoring performance implications of per-row subqueries leads to slow queries.
Key Takeaways
LATERAL joins let you run a subquery for each row of a main query, using that row's data inside the subquery.
They are more powerful than correlated subqueries because they can return multiple rows and be joined like tables.
LATERAL joins enable flexible, dynamic data retrieval but can be costly if used on large datasets without care.
Understanding how LATERAL works helps you write clearer, more efficient queries for complex data relationships.
Knowing when and how to use LATERAL joins is key to mastering advanced SQL querying in PostgreSQL.