PostgreSQLquery~7 mins

Correlated subqueries execution model in PostgreSQL

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Introduction

Correlated subqueries let you compare each row of one table with rows in another table. They help answer questions where you need to check related data row by row.

When you want to find rows in one table that depend on values in another table for each row.

When filtering data based on a condition that involves matching or comparing with another table's data.

When calculating values that depend on each row's related data, like checking if a customer has orders above a certain amount.

When you need to write queries that can't be done easily with simple joins or aggregates.

When you want to check existence or conditions for each row individually using another table.

Syntax

PostgreSQL

SELECT column1, column2
FROM table1
WHERE columnX operator (
    SELECT columnY
    FROM table2
    WHERE table2.columnZ = table1.columnA
);

The subquery refers to a column from the outer query, making it 'correlated'.

The subquery runs once for each row of the outer query, which can affect performance.

Examples

This finds employees whose salary is above the average salary of their own department.

PostgreSQL

SELECT e.name
FROM employees e
WHERE e.salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department = e.department
);

This finds customers who have at least one order with amount greater than 100.

PostgreSQL

SELECT c.customer_id
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
    AND o.amount > 100
);

Sample Program

This query lists employees who earn more than the average salary in their department. The subquery calculates the average salary for the department of each employee row.

PostgreSQL

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary INT
);

INSERT INTO employees (name, department, salary) VALUES
('Alice', 'Sales', 5000),
('Bob', 'Sales', 4000),
('Carol', 'HR', 4500),
('Dave', 'HR', 5500);

SELECT name, salary, department
FROM employees e
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department = e.department
);

OutputSuccess

Important Notes

Correlated subqueries can be slower because the inner query runs for each outer row.

Sometimes you can rewrite correlated subqueries using JOINs for better performance.

Understanding how the subquery depends on the outer query helps avoid mistakes.

Summary

Correlated subqueries run the inner query once per outer row, using values from that row.

They are useful for row-by-row comparisons or checks involving related tables.

Be mindful of performance and consider alternatives like JOINs when possible.