Below is an exam-focused, easy-to-understand explanation of normalization and its normal forms, suitable for 5, 10, or 15-mark questions in DBMS.
⭐ Normalization and Its Various Forms
Normalization is a systematic process of organizing data in a database to:
✔ Reduce data redundancy
✔ Eliminate update anomalies
✔ Ensure data integrity
✔ Improve storage efficiency
Normalization was introduced by E.F. Codd.
A table is normalized by applying a series of normal forms, each removing specific problems.
⭐ Why Is Normalization Needed?
Without normalization, a database may suffer from:
- Insertion anomalies (can’t add data because other data is missing)
- Update anomalies (multiple copies of the same data → inconsistent updates)
- Deletion anomalies (loss of useful data due to deletion)
- Redundancy (duplicate data stored at many places)
Normalization organizes tables to avoid these problems.
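The anomalies listed above can be reproduced on a small in-memory table. This is only a sketch: the table, column names, and values are hypothetical and chosen for illustration.

```python
# Hypothetical unnormalized table: the department name is repeated for every student.
students = [
    {"student_id": 1, "name": "Asha", "dept_id": "D1", "dept_name": "Physics"},
    {"student_id": 2, "name": "Ravi", "dept_id": "D1", "dept_name": "Physics"},
]

# Update anomaly: renaming the department in only one row leaves the copies inconsistent.
students[0]["dept_name"] = "Applied Physics"
names_for_d1 = {row["dept_name"] for row in students if row["dept_id"] == "D1"}
inconsistent = len(names_for_d1) > 1  # two different names now exist for D1

# Deletion anomaly: removing every student of D1 also loses the fact that D1 exists.
remaining = [row for row in students if row["dept_id"] != "D1"]
dept_still_known = any(row["dept_id"] == "D1" for row in remaining)
```

Storing department details once, in a table of their own, would avoid both problems.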
⭐ THE NORMAL FORMS
Normalization proceeds in stages known as Normal Forms (NF).
Each higher normal form is stronger and removes more redundancy.
We cover:
1️⃣ 1NF (First Normal Form)
2️⃣ 2NF (Second Normal Form)
3️⃣ 3NF (Third Normal Form)
4️⃣ BCNF (Boyce-Codd Normal Form)
5️⃣ 4NF (Fourth Normal Form)
6️⃣ 5NF (Fifth Normal Form)
7️⃣ 6NF (rare, used in temporal databases)
⭐ 1. First Normal Form (1NF)
A relation is in 1NF if:
✔ All values are atomic (no repeating groups / multivalued attributes).
✔ No composite values or arrays.
Example (NOT in 1NF):
| Student | Phones |
|---|---|
| A | 9876, 7654 |
Convert to 1NF:
| Student | Phone |
|---|---|
| A | 9876 |
| A | 7654 |
📌 1NF removes multivalued and composite attributes.
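The conversion above can be sketched in a few lines. The function name `to_1nf` and the comma-separated storage of the phone numbers are assumptions made for this illustration.

```python
# Unnormalized rows: the second field holds several phone numbers in one cell.
unnormalized = [("A", "9876, 7654")]

def to_1nf(rows):
    """Return one (student, phone) row per phone number, so every value is atomic."""
    result = []
    for student, phones in rows:
        for phone in phones.split(","):
            result.append((student, phone.strip()))
    return result

first_nf = to_1nf(unnormalized)
```

Each resulting row holds exactly one phone number, matching the 1NF table shown above.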
⭐ 2. Second Normal Form (2NF)
A relation is in 2NF if:
✔ It is in 1NF
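✔ No partial dependency exists
(i.e., no non-key attribute depends on only part of a composite candidate key)
Partial dependencies can arise only when a key is composite, so a 1NF relation whose keys are all single attributes is automatically in 2NF.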
Example of partial dependency:
Key: (StudentID, CourseID)
Non-key: StudentName
StudentName depends only on StudentID → partial dependency.
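Solution: Split into two tables, one keyed by StudentID that holds StudentName, and one keyed by (StudentID, CourseID) for the enrollment itself.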
📌 2NF removes partial dependencies.
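The split can be sketched as follows. The sample rows and the extra `grade` column are assumptions added so that the enrollment table has an attribute depending on the full key.

```python
# Hypothetical rows: (student_id, course_id, student_name, grade).
enrollment = [
    (1, "C1", "Asha", "A"),
    (1, "C2", "Asha", "B"),
    (2, "C1", "Ravi", "A"),
]

# StudentName depends only on StudentID, so it moves to a table keyed by StudentID.
student = {sid: name for sid, _, name, _ in enrollment}

# Grade depends on the full key (StudentID, CourseID), so it stays with that key.
enrolled = {(sid, cid): grade for sid, cid, _, grade in enrollment}

# Joining the two tables on StudentID reproduces the original rows (lossless split).
rejoined = sorted(
    (sid, cid, student[sid], grade) for (sid, cid), grade in enrolled.items()
)
```

After the split, each student name is stored once instead of once per course.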
⭐ 3. Third Normal Form (3NF)
A relation is in 3NF if:
✔ It is in 2NF
✔ No non-key attribute is transitively dependent on the key (Key → B → C, where B is not a key)
Example of transitive dependency:
StudentID → DepartmentID
DepartmentID → DepartmentName
DepartmentName depends on StudentID only through DepartmentID → move it to a separate Department table keyed by DepartmentID.
📌 3NF removes transitive dependencies.
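A minimal sketch of removing this transitive dependency is shown below. The sample data and the helper `department_name_of` are hypothetical.

```python
# Hypothetical rows: (student_id, department_id, department_name).
student_rows = [
    (1, "D1", "Physics"),
    (2, "D1", "Physics"),
    (3, "D2", "Chemistry"),
]

# Student table: StudentID -> DepartmentID.
student = {sid: dept_id for sid, dept_id, _ in student_rows}

# Department table: DepartmentID -> DepartmentName, stored once per department.
department = {dept_id: dept_name for _, dept_id, dept_name in student_rows}

def department_name_of(student_id):
    """Look up a student's department name by following StudentID -> DepartmentID."""
    return department[student[student_id]]
```

The department name is now recorded in one place, so renaming a department touches a single row.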
⭐ 4. Boyce–Codd Normal Form (BCNF)
A stronger version of 3NF.
A relation is in BCNF if:
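✔ For every non-trivial functional dependency X → Y,
X must be a superkey.
BCNF goes beyond 3NF when:
- A table has two or more overlapping composite candidate keys
- An attribute that is not a superkey determines part of a candidate key
Example:
Relation: (Student, Subject, Teacher)
(Student, Subject) → Teacher
Teacher → Subject
(Each teacher teaches exactly one subject, and a student has one teacher per subject)
Teacher is not a superkey, so Teacher → Subject violates BCNF → decompose into (Teacher, Subject) and (Student, Teacher).
📌 BCNF removes anomalies caused by determinants that are not superkeys, which typically occur with overlapping candidate keys.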
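The superkey condition can be tested mechanically with an attribute-closure computation. The sketch below assumes a hypothetical relation Enrollment(Student, Subject, Teacher) with the two functional dependencies listed in the code. The function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List the non-trivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != all_attrs
    ]

# Assumed relation and dependencies (hypothetical example).
attrs = frozenset({"Student", "Subject", "Teacher"})
fds = [
    (frozenset({"Student", "Subject"}), frozenset({"Teacher"})),
    (frozenset({"Teacher"}), frozenset({"Subject"})),
]
violations = bcnf_violations(attrs, fds)
```

Here only Teacher → Subject is reported, because the closure of {Teacher} does not cover all attributes, so Teacher is not a superkey.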
⭐ 5. Fourth Normal Form (4NF)
A relation is in 4NF if:
✔ It is in BCNF
✔ It has no non-trivial multi-valued dependency X →→ Y unless X is a superkey
Used when an entity has two or more independent multi-valued attributes.
Example:
Student →→ PhoneNumbers
Student →→ Languages
These must be stored in separate tables, one for (Student, PhoneNumber) and one for (Student, Language).
📌 4NF eliminates multi-valued dependency anomalies.
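The redundancy caused by two independent multi-valued facts, and the effect of splitting them, can be sketched as below. The sample phone numbers and languages are hypothetical.

```python
from itertools import product

# Hypothetical independent facts about student "A".
phones = {"A": ["9876", "7654"]}
languages = {"A": ["English", "Hindi"]}

# One combined table must pair every phone with every language: 2 x 2 = 4 rows.
combined = [
    (s, p, l) for s in phones for p, l in product(phones[s], languages[s])
]

# 4NF decomposition: one table per independent multi-valued fact, 2 rows each.
student_phone = [(s, p) for s, ps in phones.items() for p in ps]
student_language = [(s, l) for s, ls in languages.items() for l in ls]

# Joining the two tables on the student reproduces the combined table.
rejoined = sorted(
    (s, p, l) for s, p in student_phone for s2, l in student_language if s == s2
)
```

The combined table grows multiplicatively with each new phone or language, while the two separate tables grow only additively.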
⭐ 6. Fifth Normal Form (5NF) / Project-Join Normal Form (PJNF)
A relation is in 5NF if:
✔ It is in 4NF
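✔ Every non-trivial join dependency is implied by its candidate keys
(i.e., the table cannot be losslessly split into smaller tables, except by splits based on its keys)
Used in highly complex many-to-many situations involving three or more entities.
📌 5NF eliminates redundancy caused by join dependencies.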
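A join dependency can be illustrated by projecting a table onto pairs of attributes and joining the projections back. The sketch below assumes a hypothetical (supplier, part, project) relation with made-up rows.

```python
# Hypothetical relation of (supplier, part, project) facts.
r = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
    ("s1", "p1", "j1"),
}

# Project onto each pair of attributes.
sp = {(s, p) for s, p, j in r}
pj = {(p, j) for s, p, j in r}
js = {(j, s) for s, p, j in r}

# Joining only two projections on the part produces a spurious row.
two_way = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}

# Filtering by the third projection completes the three-way join.
three_way = {(s, p, j) for s, p, j in two_way if (j, s) in js}
```

For this sample data, the two-way join contains an extra row that was never in the original, while the three-way join returns exactly the original rows. That is the pattern a join dependency describes: the table can be rebuilt from three projections but not from any two.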
⭐ 7. Sixth Normal Form (6NF) (rarely used)
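A relation is in 6NF if it satisfies no non-trivial join dependencies at all, so each table holds a key plus at most one other attribute.
Used mainly in temporal databases and some data warehouses, where each table records the changes of a single attribute over time.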
⭐ SUMMARY TABLE
| Normal Form | Removes | Condition |
|---|---|---|
| 1NF | Repeating groups, multi-valued attributes | Atomic values only |
| 2NF | Partial dependencies | 1NF + no non-key attribute depends on part of a composite key |
| 3NF | Transitive dependencies | 2NF + no non-key determines another non-key |
| BCNF | Anomalies from determinants that are not superkeys | LHS of every non-trivial FD must be a superkey |
| 4NF | Non-trivial multi-valued dependencies | BCNF + every non-trivial MVD has a superkey on its LHS |
| 5NF | Join dependencies not implied by keys | 4NF + every join dependency is implied by candidate keys |
| 6NF | All non-trivial join dependencies | 5NF + each table has a key plus at most one other attribute |
⭐ Perfect 5-Mark Answer (Short)
Normalization is a process of organizing data to reduce redundancy and eliminate anomalies.
The major normal forms are:
- 1NF: Removes repeating groups; makes data atomic.
- 2NF: Removes partial dependencies.
- 3NF: Removes transitive dependencies.
- BCNF: Stronger 3NF; LHS of every non-trivial FD must be a superkey.
- 4NF: Eliminates non-trivial multi-valued dependencies.
- 5NF: Eliminates join dependencies not implied by candidate keys.
Normalization improves integrity, consistency, and storage efficiency.
