Bird
Raised Fist0
DBMS Theoryknowledge~10 mins

Cost-based optimization in DBMS Theory - Step-by-Step Execution

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Concept Flow - Cost-based optimization
Receive SQL Query
Parse Query
Generate Possible Execution Plans
Estimate Cost for Each Plan
Compare Costs
Choose Lowest Cost Plan
Execute Chosen Plan
The database receives a query, creates many ways to run it, estimates the cost for each, and picks the cheapest plan to run.
Execution Sample
DBMS Theory
SELECT * FROM Orders WHERE CustomerID = 5;
The database decides how to get orders for customer 5 with the least cost.
Analysis Table
StepActionDetailsCost EstimateDecision
1Parse QueryIdentify tables and conditions-Proceed
2Generate Plan AFull table scan of OrdersCost = 100Consider
3Generate Plan BUse index on CustomerIDCost = 10Consider
4Compare CostsPlan A vs Plan B-Plan B cheaper
5Choose PlanSelect Plan B-Execute Plan B
6Execute PlanRetrieve matching rows using index-Done
💡 Execution stops after choosing and running the lowest cost plan (Plan B).
State Tracker
VariableStartAfter Step 2After Step 3After Step 4Final
QuerySELECT * FROM Orders WHERE CustomerID = 5;ParsedParsedParsedParsed
Plan A CostN/A100100100100
Plan B CostN/AN/A101010
Chosen PlanNoneNoneNonePlan BPlan B
Key Insights - 3 Insights
Why does the optimizer consider multiple plans instead of just one?
Because different plans can have very different costs. The optimizer compares them all (see execution_table steps 2 and 3) to pick the cheapest.
What does 'cost' mean in cost-based optimization?
Cost is an estimate of resources like time and CPU needed to run a plan. Lower cost means faster or cheaper execution (see execution_table cost estimates).
Why might a full table scan have a higher cost than using an index?
A full scan reads every row, which is slow for large tables. Using an index targets only needed rows, so it costs less (see Plan A vs Plan B costs).
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, what is the cost estimate for Plan B?
A100
B10
C50
D0
💡 Hint
Check the 'Cost Estimate' column at Step 3 in the execution_table.
At which step does the optimizer decide which plan to execute?
AStep 5
BStep 2
CStep 4
DStep 6
💡 Hint
Look for the step labeled 'Choose Plan' in the execution_table.
If the index on CustomerID was missing, which plan would likely be chosen?
AA new plan with cost 5
BPlan B with cost 10
CPlan A with cost 100
DNo plan would be chosen
💡 Hint
Refer to the variable_tracker where Plan B cost is only available if index exists.
Concept Snapshot
Cost-based optimization chooses the best way to run a query by:
- Parsing the query
- Generating multiple execution plans
- Estimating the cost of each plan
- Selecting the plan with the lowest cost
- Executing that plan
Cost estimates reflect resource use like time and CPU.
Full Transcript
Cost-based optimization is a process used by databases to run queries efficiently. When a query is received, the database parses it to understand what data is needed. Then, it creates different possible ways to get that data, called execution plans. Each plan is given a cost estimate based on how much time and resources it might take. The optimizer compares these costs and picks the plan with the lowest cost to run. For example, it might choose to use an index instead of scanning the whole table because the index plan costs less. This process helps queries run faster and saves resources.

Practice

(1/5)
1. What is the main goal of cost-based optimization in a database system?
easy
A. To find the most efficient way to execute a query
B. To store data in the smallest space possible
C. To encrypt data for security
D. To backup the database automatically

Solution

  1. Step 1: Understand the purpose of cost-based optimization

    Cost-based optimization evaluates different ways to run a query and estimates their costs.
  2. Step 2: Identify the main goal

    The goal is to pick the plan with the lowest cost, meaning the fastest or least resource-heavy execution.
  3. Final Answer:

    To find the most efficient way to execute a query -> Option A
  4. Quick Check:

    Cost-based optimization = efficient query execution [OK]
Hint: Focus on efficiency and speed of query execution [OK]
Common Mistakes:
  • Confusing optimization with data storage
  • Thinking it handles security tasks
  • Assuming it manages backups
2. Which of the following is a key input used by cost-based optimizers to estimate query costs?
easy
A. User login credentials
B. Data statistics like table size and index availability
C. Network bandwidth
D. Database backup schedules

Solution

  1. Step 1: Identify what cost-based optimizers use

    They rely on data statistics such as table size, number of rows, and indexes to estimate costs.
  2. Step 2: Match the correct input

    Data statistics directly affect the cost estimation, unlike user credentials or backup schedules.
  3. Final Answer:

    Data statistics like table size and index availability -> Option B
  4. Quick Check:

    Cost estimation uses data statistics [OK]
Hint: Remember: statistics guide cost estimates, not user info [OK]
Common Mistakes:
  • Confusing user data with statistics
  • Thinking network or backups affect cost estimation
  • Ignoring the role of indexes
3. Consider a database query optimizer that chooses between two plans: Plan A costs 50 units, Plan B costs 80 units. Which plan will the optimizer select?
medium
A. Neither plan because cost is ignored
B. Plan B because it has a higher cost
C. Both plans equally because cost does not matter
D. Plan A because it has a lower cost

Solution

  1. Step 1: Understand cost comparison

    The optimizer picks the plan with the lowest estimated cost to improve performance.
  2. Step 2: Compare given costs

    Plan A costs 50 units, which is less than Plan B's 80 units, so Plan A is preferred.
  3. Final Answer:

    Plan A because it has a lower cost -> Option D
  4. Quick Check:

    Lower cost plan chosen = Plan A [OK]
Hint: Choose the plan with the smallest cost number [OK]
Common Mistakes:
  • Picking higher cost plan mistakenly
  • Ignoring cost values
  • Assuming cost is irrelevant
4. A cost-based optimizer is not choosing the fastest query plan. What could be a likely reason?
medium
A. The data statistics are outdated or inaccurate
B. The database server is turned off
C. The query syntax is incorrect
D. The user has no permissions

Solution

  1. Step 1: Identify factors affecting optimizer decisions

    The optimizer depends on accurate data statistics to estimate costs correctly.
  2. Step 2: Analyze the problem cause

    If statistics are outdated, the optimizer may pick a suboptimal plan, causing slower queries.
  3. Final Answer:

    The data statistics are outdated or inaccurate -> Option A
  4. Quick Check:

    Outdated stats cause wrong plan choice [OK]
Hint: Check if statistics are current to fix optimizer issues [OK]
Common Mistakes:
  • Blaming server status without checking stats
  • Confusing syntax errors with optimization issues
  • Assuming permissions affect plan choice
5. A database has two indexes on a table: one on column A and another on column B. A query filters on both columns. How does cost-based optimization decide which index to use?
hard
A. It ignores indexes and does a full table scan
B. It always uses the index on column A by default
C. It estimates the cost of using each index and picks the cheaper one
D. It uses both indexes simultaneously without cost estimation

Solution

  1. Step 1: Understand index selection by cost-based optimizer

    The optimizer calculates the cost of using each index based on statistics like selectivity and size.
  2. Step 2: Apply cost comparison to index choice

    It chooses the index that results in the lowest estimated cost for the query execution.
  3. Final Answer:

    It estimates the cost of using each index and picks the cheaper one -> Option C
  4. Quick Check:

    Index choice based on cost estimation [OK]
Hint: Optimizer picks index with lowest estimated cost [OK]
Common Mistakes:
  • Assuming fixed index usage without cost check
  • Thinking optimizer ignores indexes
  • Believing it uses multiple indexes without cost analysis