Skip to content
Home » Relational Algebra

Relational Algebra


RELATIONAL ALGEBRA — DETAILED DISCUSSION

Relational Algebra (RA) is the mathematical foundation of the relational database model.
It defines a set of operations that take one or more relations as input and produce a new relation as output.

It is a procedural query language, meaning it tells HOW to retrieve the required data.


⭐ 1. Need & Importance of Relational Algebra

Relational Algebra is crucial because:

✔ Acts as theoretical basis of SQL

SQL’s SELECT–FROM–WHERE is directly derived from σ (selection), π (projection), ⋈ (join).

✔ Used by Query Optimizer

DBMS converts SQL into RA expression → optimizes → executes.

✔ Ensures correctness and guarantees

RA operations are mathematically defined → ensures reliable results.

✔ Allows non-ambiguous data manipulation

Every operation produces a relation → closure property.

✔ Foundation for Relational Calculus

Both together define what queries a DBMS can express.


⭐ 2. Properties of Relational Algebra

  1. Closure Property:
    Output of any RA operation is also a relation → can be fed to another operator.
  2. Set-Based:
    Operates on sets of tuples, not individual tuples.
  3. High-Level & Mathematical:
    Based on set theory and predicate logic.
  4. Optimizable:
    RA expressions can be rewritten for performance (e.g., push selection close to base table).

⭐ 3. Categories of Relational Algebra Operators

RA operators fall into 3 major classes:

  1. Unary Operators (work on one relation)
    • SELECT (σ)
    • PROJECT (π)
    • RENAME (ρ)
  2. Binary Operators (work on two relations)
    • UNION (∪)
    • SET DIFFERENCE (−)
    • INTERSECTION (∩)
    • CARTESIAN PRODUCT (×)
    • JOINS
  3. Advanced/Derived Operators
    • DIVISION (÷)
    • OUTER JOINS
    • SEMI-JOIN & ANTI-JOIN
    • AGGREGATION (extended RA)
    • Grouping & sorting (extended RA)

⭐ 4. UNARY OPERATORS (MCA LEVEL)

4.1 SELECT (σ) — ROW Selection

Extracts tuples that satisfy a condition.

Formal Notation:
σcondition(R)

Example:
Employees with salary > 50000
σSalary > 50000(EMP)

Multiple Conditions:

  • AND → σA AND B(R)
  • OR → σA OR B(R)
  • NOT → σNOT A(R)

Properties (important for exams):

  • Idempotent: σ(σ(R)) = σ(R)
  • Commutative: σA(σB(R)) = σB(σA(R))
  • Pushable: Can be pushed down for optimization.

4.2 PROJECT (π) — COLUMN Selection

Extracts specific columns and removes duplicates (set semantics).

Formal Notation:
πattributes(R)

Example:
πName, Salary(EMP)

Properties:

  • Idempotent: πA(πA(R)) = πA(R)
  • Projection reduces arity (fewer columns).

4.3 RENAME (ρ)

Renames attributes or a relation.

Useful to avoid confusion during JOIN.

Notation:
ρ NewName(R)


⭐ 5. SET OPERATIONS

Must satisfy union compatibility:
✔ Same number of attributes
✔ Corresponding attributes have compatible domains


5.1 UNION (∪)

R ∪ S returns all tuples in either R or S (duplicates removed).


5.2 INTERSECTION (∩)

R ∩ S returns tuples common to both.


5.3 SET DIFFERENCE (−)

R − S returns tuples in R but not in S.


⭐ 6. BINARY OPERATORS — CARTESIAN PRODUCT & JOINS

6.1 CARTESIAN PRODUCT (×)

Multiplies each tuple of R with each tuple of S.

R × S
Used internally to define JOIN.


⭐ 6.2 TYPES OF JOINS (EXTREMELY IMPORTANT FOR MCA)**

✔ 1. Theta Join (⋈θ)

General join with condition θ.

Example:
EMP ⋈EMP.DeptID = DEPT.DeptID DEPT


✔ 2. Equijoin

Theta join where θ is =.


✔ 3. Natural Join (⋈)

  • Joins on common attributes automatically
  • Removes duplicate columns

Example:
EMP ⋈ DEPT


✔ 4. Outer Joins (Extended RA)

Preserve unmatched tuples.

  • Left Outer Join
  • Right Outer Join
  • Full Outer Join

Used when missing relationships must be preserved.


✔ 5. Semi-Join (⋉)

Returns rows from the left relation only.

Useful in distributed databases.


✔ 6. Anti-Join

Returns rows not matching join condition.

Used in NOT EXISTS queries.


⭐ 7. DIVISION (÷) — ADVANCED RA OPERATOR

Division answers questions like:

“Find X related to all Y.”

Example:

SUPPLY(Supplier, Part)
PART(Part)

Query: Find suppliers who supply all parts.

SUPPLY ÷ PART = suppliers who supply every part.

Division works by grouping & matching all required values.


⭐ 8. EXTENDED RELATIONAL ALGEBRA (Used by SQL/DBMS)

Modern RA includes:

✔ Aggregation

SUM, COUNT, AVG, MIN, MAX
γgrouping attributes; aggregates(R)

Example:
γDeptID; AVG(Salary)(EMP)

✔ Sorting

τSalary DESC(EMP)

✔ Assignment

Useful internally in DBMS.


⭐ 9. EXAMPLE RELATIONAL ALGEBRA QUERY (MCA Level)

Query:

Find names of employees working in “Finance” department.

Relations:
EMP(Eid, Name, DeptID)
DEPT(DeptID, DeptName)

Step 1: Select DeptID of Finance

A = σDeptName = ‘Finance'(DEPT)

Step 2: Natural Join with EMP

B = EMP ⋈ A

Step 3: Project employee names

Result = πName(B)


⭐ 10. Relational Algebra vs SQL

Relational AlgebraSQL
ProceduralNon-procedural
Step-by-stepDeclarative
Uses symbolsUses English-like syntax
TheoreticalPractical

⭐ 11. Optimization Using Relational Algebra (very useful)

DBMS rewrites RA expressions:

✔ Push σ (selection) down early
✔ Push π (projection) early
✔ Replace Cartesian + selection → JOIN
✔ Use associative, commutative properties of joins

This speeds up query execution.


Perfect 5-mark Short Answer

Relational Algebra is a procedural query language used to manipulate relations in the relational model.
It includes operators like SELECT (σ), PROJECT (π), UNION (∪), SET DIFFERENCE (−), CARTESIAN PRODUCT (×), JOIN (⋈), and DIVISION (÷).
Each operation takes a relation as input and produces another relation.
Relational Algebra forms the basis of SQL and helps DBMS in query optimization.