⭐ QUERY OPTIMIZATION AND EVALUATION PLANS
Query processing in a DBMS consists of two major parts:
- Query Optimization – choosing the best execution strategy
- Query Evaluation Plan – the actual sequence of operations to execute the query
Both are essential for fast and efficient database performance.
⭐ 1. QUERY OPTIMIZATION
Query Optimization is the process of selecting the most efficient execution plan from many logically equivalent alternatives.
Even a simple SQL statement can have dozens or hundreds of possible plans.
✔ Why Optimization is Needed?
- Reduce execution time
- Reduce CPU and I/O usage
- Minimize disk access (most expensive)
- Efficient use of indexes
- Select optimal join strategies
- Improve DB throughput for multi-user environments
DBMS chooses plans based on a cost model (I/O cost, CPU cost, memory usage).
⭐ Types of Query Optimization
A. Rule-Based Optimization (RBO)
Uses predefined heuristics/rules, such as:
- Push σ (select) operations as close as possible to the base tables
- Push π (project) early to reduce number of columns
- Replace Cartesian product + selection with JOIN
- Reorder joins for efficiency
- Use indexes wherever selective predicates exist
These transformations produce better logical query plans.
B. Cost-Based Optimization (CBO)
Most modern DBMS use CBO.
DBMS estimates cost based on:
✔ Table statistics
✔ Number of tuples
✔ Index availability
✔ Histogram values
✔ Selectivity of predicates
✔ Join cardinality
✔ Disk I/O blocks
Then chooses the execution plan with lowest estimated cost.
Example:
To satisfy WHERE Salary > 50000, DBMS can choose:
- Full table scan
- Index scan
- Bitmap index scan
- Hash join
- Merge join
CBO selects the best based on cost models.
⭐ Steps in Query Optimization
1. SQL Parsing
- Check syntax and semantics
- Generate parse tree
2. Query Rewrite (Logical Optimization)
Rewrite rules include:
- Predicate pushdown
- Projection pushdown
- Flatten nested subqueries
- Remove redundant joins
- Reorder joins based on selectivity
Output: Logical Query Plan (LQP) expressed in Relational Algebra.
3. Plan Enumeration
DBMS generates several possible physical plans using:
- Different join orders
- Different join algorithms
- Index vs table scan
- Filter pushdown variations
Join order problem alone is NP-hard → billions of possibilities for large queries.
4. Cost Estimation
Each plan is assigned a cost value using a cost model:
Cost =
- I/O cost +
- CPU cost +
- Network cost (for distributed DBs)
5. Selection of Best Plan
The plan with lowest estimated cost becomes the final physical execution plan.
⭐ 2. QUERY EVALUATION PLANS
A Query Evaluation Plan (QEP) or Execution Plan is a tree of physical operators that describes how the query will run.
Example physical operators:
- TableScan
- IndexScan
- Filter (σ)
- Projection (π)
- Hash Join
- Sort Merge Join
- Nested Loop Join
- Group Aggregate
- Sort
⭐ Components of a Query Evaluation Plan
✔ 1. Logical Plan
Uses relational algebra:
π Name ( σ Salary > 50000 (EMP) )
✔ 2. Physical Plan
Specifies exactly how it will be executed:
Index Scan on EMP using idx_salary
Filter: Salary > 50000
Project: Name
✔ 3. Cost Value
Each operator has cost:
- Index scan: low
- Table scan: high
- Hash join: medium
- Merge join: efficient for sorted data
⭐ Example of Query Optimization + Evaluation
SQL:
SELECT E.Name
FROM EMP E
JOIN DEPT D ON E.DeptID = D.DeptID
WHERE D.Location = 'Delhi' AND E.Salary > 50000;
Step 1: Logical Rewrite
RA expression:
π E.Name (
σ D.Location='Delhi' AND E.Salary>50000
( EMP ⋈ DEPT )
)
Step 2: Logical Optimization
- Push selections down:
π Name (
(σ Salary>50000 (EMP))
⋈
(σ Location='Delhi' (DEPT))
)
Step 3: Physical Plan Choices
Optimizer can choose:
- Index Scan on DEPT (Location)
- Index Scan on EMP (Salary)
- Hash Join or Merge Join
Step 4: Cost Based Decision
DBMS estimates:
Option A: Hash Join (EMP filtered)
Option B: Nested Loop Join (DEPT filtered)
Selects the lowest cost.
Step 5: Final QEP
Hash Join
├── Index Scan EMP (Salary > 50000)
└── Index Scan DEPT (Location='Delhi')
Projection: Name
⭐ Join Algorithms in Evaluation Plan (MCA Important)
1. Nested Loop Join
- Good for small tables
- Expensive for large tables
2. Hash Join
- Best for large unsorted tables
- Builds hash table → matches quickly
3. Sort Merge Join
- Best when inputs are already sorted
- Very efficient for large datasets
These choices form the core of execution plans.
⭐ Physical Operators in Evaluation Plan
| Operator | Use |
|---|---|
| Seq Scan | Full table scan |
| Index Scan | Uses index for filtering |
| Bitmap Scan | Efficient for OR predicates |
| Filter | Applies WHERE conditions |
| Join Operators | Combine tables |
| Aggregation | SUM, AVG, COUNT |
| Sorting | ORDER BY |
| Limit | Restrict rows |
⭐ Difference: Optimization vs Evaluation Plan
| Query Optimization | Query Evaluation Plan |
|---|---|
| Decides best plan | Executes the chosen plan |
| Cost-based decision making | Physical operations execution |
| Rule-based + cost-based | Actual scanning, joining, sorting |
| Generates multiple possibilities | Only one final plan used |
| Produces logical + physical plans | Produces actual output |
⭐ Perfect 5–6 Mark Summary
Query Optimization selects the most efficient plan to execute an SQL query. It involves parsing, logical rewriting, generating alternative plans, estimating cost, and selecting the lowest cost plan.
Query Evaluation Plan is a tree of physical operators like table scan, index scan, join algorithms, sorting, etc., that describes exactly how the DBMS executes the query. Together, they ensure fast and efficient query execution.
