0
0
PostgreSQLquery~5 mins

Covering indexes with INCLUDE in PostgreSQL - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: Covering indexes with INCLUDE
O(log n + k)
Understanding Time Complexity

We want to understand how using covering indexes with INCLUDE affects query speed as data grows.

How does the query time change when the index holds extra columns to avoid looking up the main table?

Scenario Under Consideration

Analyze the time complexity of this index and query.


CREATE INDEX idx_orders_customer_date ON orders (customer_id) INCLUDE (order_date);

SELECT order_date FROM orders WHERE customer_id = 123;
    

This creates an index on customer_id and stores order_date inside it to speed up queries that select order_date by customer_id.

Identify Repeating Operations

Look at what repeats when the query runs.

  • Primary operation: Searching the index tree for matching customer_id entries.
  • How many times: Once per query, but may scan multiple matching entries depending on how many orders a customer has.
How Execution Grows With Input

As the number of orders grows, the index helps find matching rows faster without scanning the whole table.

Input Size (n)Approx. Operations
10About 3-4 steps to find matches
100About 5-6 steps to find matches
1000About 7-8 steps to find matches

Pattern observation: The number of steps grows slowly, not directly with total rows, because the index tree is balanced.

Final Time Complexity

Time Complexity: O(log n + k)

This means the query time grows slowly with total rows (log n) plus the number of matching rows (k) to read.

Common Mistake

[X] Wrong: "Adding INCLUDE columns makes the index size and search time grow a lot."

[OK] Correct: INCLUDE columns add data only to leaf nodes, so the search steps stay about the same; it just avoids extra table lookups.

Interview Connect

Understanding how covering indexes affect query speed shows you know how databases keep queries fast as data grows. This skill helps you design efficient data access.

Self-Check

What if we removed the INCLUDE clause and selected order_date in the query? How would the time complexity change?