Bird
Raised Fist0
DBMS Theoryknowledge~5 mins

Query optimization strategies in DBMS Theory - Time & Space Complexity

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Time Complexity: Query optimization strategies
O(n)
Understanding Time Complexity

When running database queries, the time it takes can change a lot depending on how the query is written and how the database handles it.

We want to understand how the work done grows as the data size grows and how optimization helps.

Scenario Under Consideration

Analyze the time complexity of the following SQL query with and without optimization.


SELECT employee.name, department.name
FROM employee
JOIN department ON employee.dept_id = department.id
WHERE employee.salary > 50000;
    

This query joins two tables and filters employees with salary over 50000.

Identify Repeating Operations

Look at what repeats when the query runs.

  • Primary operation: Scanning rows in employee and department tables.
  • How many times: Each row in employee is checked, and for each matching employee, the department is looked up.
How Execution Grows With Input

As the number of employees and departments grows, the work grows too.

Input Size (n employees)Approx. Operations
10About 10 employee checks + department lookups
100About 100 employee checks + department lookups
1000About 1000 employee checks + department lookups

Pattern observation: The work grows roughly in direct proportion to the number of employees.

Final Time Complexity

Time Complexity: O(n)

This means the time to run the query grows roughly in a straight line as the number of employees increases.

Common Mistake

[X] Wrong: "Adding indexes always makes queries run instantly regardless of data size."

[OK] Correct: Indexes help, but the query still needs to check matching rows, so time still grows with data size, just more slowly.

Interview Connect

Understanding how query time grows helps you write better queries and explain your thinking clearly in interviews.

Self-Check

"What if we add an index on employee.salary? How would the time complexity change?"

Practice

(1/5)
1. What is the main goal of query optimization in a database?
easy
A. To make data retrieval faster and more efficient
B. To increase the size of the database
C. To delete unnecessary data automatically
D. To encrypt data for security

Solution

  1. Step 1: Understand the purpose of query optimization

    Query optimization aims to improve how quickly and efficiently data is retrieved from a database.
  2. Step 2: Compare options with the goal

    Only To make data retrieval faster and more efficient matches this goal; others describe unrelated tasks.
  3. Final Answer:

    To make data retrieval faster and more efficient -> Option A
  4. Quick Check:

    Query optimization = faster data retrieval [OK]
Hint: Focus on speed and efficiency of data retrieval [OK]
Common Mistakes:
  • Confusing optimization with data deletion
  • Thinking optimization increases database size
  • Mixing security tasks with optimization
2. Which of the following is a correct SQL syntax to create an index on the column employee_id in the table employees?
easy
A. CREATE employees INDEX idx_emp_id(employee_id);
B. MAKE INDEX idx_emp_id IN employees(employee_id);
C. INDEX CREATE idx_emp_id FOR employees(employee_id);
D. CREATE INDEX idx_emp_id ON employees(employee_id);

Solution

  1. Step 1: Recall correct SQL syntax for creating an index

    The standard syntax is CREATE INDEX index_name ON table_name(column_name);
  2. Step 2: Match options with correct syntax

    Only CREATE INDEX idx_emp_id ON employees(employee_id); matches the correct syntax exactly.
  3. Final Answer:

    CREATE INDEX idx_emp_id ON employees(employee_id); -> Option D
  4. Quick Check:

    CREATE INDEX ... ON ... (column) [OK]
Hint: Remember: CREATE INDEX index_name ON table(column) [OK]
Common Mistakes:
  • Using wrong keywords like MAKE or FOR
  • Placing table name before INDEX keyword
  • Incorrect order of clauses
3. Consider the SQL query:
SELECT * FROM orders WHERE customer_id = 123;
If there is an index on customer_id, what is the expected effect on query performance?
medium
A. The query will return incorrect results
B. The query will run slower because indexes add overhead
C. The query will run faster by quickly locating matching rows
D. The query will ignore the index and scan the whole table

Solution

  1. Step 1: Understand the role of indexes in queries

    Indexes help the database find rows matching conditions faster without scanning the entire table.
  2. Step 2: Analyze the effect of an index on customer_id

    Since the query filters by customer_id, the index speeds up locating those rows.
  3. Final Answer:

    The query will run faster by quickly locating matching rows -> Option C
  4. Quick Check:

    Index on filter column = faster query [OK]
Hint: Index on filter column speeds up data retrieval [OK]
Common Mistakes:
  • Thinking indexes slow down SELECT queries
  • Assuming indexes cause wrong results
  • Believing indexes are always ignored
4. A developer wrote this SQL query:
SELECT * FROM products WHERE price > 100 AND price < 50;
What is the main issue affecting query optimization here?
medium
A. The SELECT * syntax is invalid
B. The WHERE clause has conflicting conditions making the query return no rows
C. The query is missing an index on the price column
D. The table name is misspelled

Solution

  1. Step 1: Analyze the WHERE clause conditions

    The conditions price > 100 and price < 50 cannot be true at the same time.
  2. Step 2: Understand impact on query results and optimization

    This conflict means no rows will match, so the query returns empty results, wasting resources.
  3. Final Answer:

    The WHERE clause has conflicting conditions making the query return no rows -> Option B
  4. Quick Check:

    Conflicting WHERE conditions = no results [OK]
Hint: Check WHERE conditions for logical conflicts [OK]
Common Mistakes:
  • Assuming SELECT * is wrong syntax
  • Ignoring logical conflicts in conditions
  • Thinking missing index causes no results
5. You have a large table sales with columns sale_date, region, and amount. You want to optimize this query:
SELECT region, SUM(amount) FROM sales WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31' GROUP BY region;
Which strategy will best improve performance?
hard
A. Create a composite index on (sale_date, region)
B. Create an index only on amount
C. Remove the GROUP BY clause
D. Use SELECT * instead of specific columns

Solution

  1. Step 1: Identify columns used in WHERE and GROUP BY clauses

    The query filters by sale_date and groups by region.
  2. Step 2: Choose an index that supports both filtering and grouping

    A composite index on (sale_date, region) helps quickly find rows in the date range and group them efficiently.
  3. Final Answer:

    Create a composite index on (sale_date, region) -> Option A
  4. Quick Check:

    Composite index on filter and group columns = better performance [OK]
Hint: Index columns used in WHERE and GROUP BY together [OK]
Common Mistakes:
  • Indexing only the aggregated column
  • Removing GROUP BY loses needed grouping
  • Using SELECT * increases data load unnecessarily