
Normalization and its various forms

Below is a complete, exam-focused, and easy-to-understand explanation of Normalization and its various Normal Forms — perfect for 5, 10, or 15-mark questions in DBMS.



Normalization is a systematic process of organizing data in a database to:

✔ Reduce data redundancy
✔ Eliminate update anomalies
✔ Ensure data integrity
✔ Improve storage efficiency

Normalization was introduced by E.F. Codd.

A table is normalized by applying a series of normal forms, each removing specific problems.


Why Is Normalization Needed?

Without normalization, a database may suffer from:

  • Insertion anomalies (can’t add data because other data is missing)
  • Update anomalies (multiple copies of the same data → inconsistent updates)
  • Deletion anomalies (loss of useful data due to deletion)
  • Redundancy (duplicate data stored at many places)

Normalization organizes tables to avoid these problems.
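To make the anomalies concrete, here is a minimal sketch. The table, student names, course codes, and instructor names are all hypothetical and exist only to illustrate the problem.

```python
# Hypothetical un-normalized table: the instructor is repeated on every enrollment row.
enrollments = [
    {"student": "Asha", "course": "DB101", "instructor": "Rao"},
    {"student": "Ravi", "course": "DB101", "instructor": "Rao"},
    {"student": "Ravi", "course": "OS201", "instructor": "Sen"},
]

# Update anomaly: only one of the two DB101 copies gets changed.
enrollments[0]["instructor"] = "Iyer"
db101_instructors = {r["instructor"] for r in enrollments if r["course"] == "DB101"}
print(len(db101_instructors) > 1)  # True: the table now disagrees with itself

# Deletion anomaly: dropping the only OS201 enrollment also drops its instructor.
enrollments = [r for r in enrollments if r["course"] != "OS201"]
print(any(r["instructor"] == "Sen" for r in enrollments))  # False: "Sen" is gone
```

If the instructor were stored once per course in its own table, neither problem could occur.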


THE NORMAL FORMS

Normalization proceeds in stages known as Normal Forms (NF).
Each higher normal form is stronger and removes more redundancy.

We cover:

1️⃣ 1NF (First Normal Form)
2️⃣ 2NF (Second Normal Form)
3️⃣ 3NF (Third Normal Form)
4️⃣ BCNF (Boyce-Codd Normal Form)
5️⃣ 4NF (Fourth Normal Form)
6️⃣ 5NF (Fifth Normal Form)
7️⃣ 6NF (rare, used in temporal databases)


1. First Normal Form (1NF)

A relation is in 1NF if:

✔ All values are atomic (no repeating groups / multivalued attributes).
✔ No composite values or arrays.

Example (NOT in 1NF):

Student | Phones
A       | 9876, 7654

Convert to 1NF:

Student | Phone
A       | 9876
A       | 7654

📌 1NF removes multivalued and composite attributes.
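The conversion above can be sketched in a few lines. This is only an illustration: the function name `to_1nf` and the second student "B" are assumptions, not part of the original example.

```python
# Hypothetical un-normalized rows: several phone numbers packed into one field.
raw_rows = [("A", "9876, 7654"), ("B", "1234")]

def to_1nf(rows):
    """Return one (student, phone) row per phone number, so every value is atomic."""
    atomic = []
    for student, phones in rows:
        for phone in phones.split(","):
            atomic.append((student, phone.strip()))
    return atomic

student_phone = to_1nf(raw_rows)
print(student_phone)
# [('A', '9876'), ('A', '7654'), ('B', '1234')]
```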


2. Second Normal Form (2NF)

A relation is in 2NF if:

✔ It is in 1NF
✔ No partial dependency exists
(i.e., no non-key attribute depends on only part of a composite candidate key)

A violation is possible only when a candidate key is composite; a 1NF relation whose keys are all single attributes is automatically in 2NF.

Example of partial dependency:

Key: (StudentID, CourseID)
Non-key: StudentName

StudentName depends only on StudentID → partial dependency.

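Solution: Split into two tables, one keyed by StudentID that holds StudentName, and one keyed by (StudentID, CourseID) that holds only the attributes needing the full key.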

📌 2NF removes partial dependencies.
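A minimal sketch of that split follows. The sample rows and the Grade column are assumptions added so that the second table has a non-key attribute depending on the whole key.

```python
# Hypothetical rows keyed by (StudentID, CourseID); StudentName repeats per course.
enrollment = [
    (1, "DB101", "Asha", "A"),
    (1, "OS201", "Asha", "B"),
    (2, "DB101", "Ravi", "A"),
]

# Student(StudentID, StudentName): the partially dependent attribute moves here.
student = sorted({(sid, name) for sid, _cid, name, _grade in enrollment})

# Enrollment(StudentID, CourseID, Grade): only attributes that need the full key.
enrollment_2nf = [(sid, cid, grade) for sid, cid, _name, grade in enrollment]

print(student)         # [(1, 'Asha'), (2, 'Ravi')]
print(enrollment_2nf)  # [(1, 'DB101', 'A'), (1, 'OS201', 'B'), (2, 'DB101', 'A')]
```

Each student name is now stored once, so renaming a student touches a single row.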


3. Third Normal Form (3NF)

A relation is in 3NF if:

✔ It is in 2NF
✔ No non-key attribute depends transitively on the key (A → B → C, where B is not a key)

Example of transitive dependency:

StudentID → DepartmentID
DepartmentID → DepartmentName

DepartmentName depends on StudentID only through DepartmentID → move it to a separate Department table keyed by DepartmentID.

📌 3NF removes transitive dependencies.
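The decomposition can be sketched as below, assuming hypothetical department IDs and names. The final join shows that no information is lost.

```python
# Hypothetical rows: (StudentID, DepartmentID, DepartmentName).
students = [
    (1, "D10", "Physics"),
    (2, "D10", "Physics"),
    (3, "D20", "Maths"),
]

# Student(StudentID, DepartmentID)
student_3nf = [(sid, dept_id) for sid, dept_id, _name in students]

# Department(DepartmentID, DepartmentName): each name is stored once.
department = sorted({(dept_id, name) for _sid, dept_id, name in students})

# Rejoin on DepartmentID to recover the original rows.
dept_name = dict(department)
rejoined = [(sid, dept_id, dept_name[dept_id]) for sid, dept_id in student_3nf]

print(department)            # [('D10', 'Physics'), ('D20', 'Maths')]
print(rejoined == students)  # True
```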


4. Boyce–Codd Normal Form (BCNF)

A stronger version of 3NF.

A relation is in BCNF if:

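✔ For every non-trivial functional dependency X → Y,
X must be a superkey.

BCNF differs from 3NF only when:

  • A table has two or more overlapping composite candidate keys
  • A determinant that is not a superkey determines part of a key

Example: relation (Student, Subject, Teacher)

(Student, Subject) → Teacher
Teacher → Subject

(Each teacher teaches only one subject; a student has one teacher per subject)

Candidate keys are (Student, Subject) and (Student, Teacher). Teacher is not a superkey, so Teacher → Subject violates BCNF even though the relation is in 3NF → decompose into (Teacher, Subject) and (Student, Teacher).

Note: if Teacher → Subject and Subject → Teacher both hold in a two-attribute table (Teacher, Subject), both attributes are keys and the table is already in BCNF.

📌 BCNF removes anomalies caused by determinants that are not superkeys.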
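The superkey test can be checked mechanically with attribute closure. The sketch below is only an illustration; it assumes a hypothetical relation (Student, Subject, Teacher) in which each teacher teaches one subject, and the helper names `closure` and `bcnf_violations` are made up for this example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """List the non-trivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]

# Assumed example relation and functional dependencies.
relation = {"Student", "Subject", "Teacher"}
fds = [
    (frozenset({"Student", "Subject"}), frozenset({"Teacher"})),
    (frozenset({"Teacher"}), frozenset({"Subject"})),
]

violations = bcnf_violations(relation, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
# ['Teacher'] -> ['Subject']
```

Only Teacher → Subject is reported, because the closure of {Teacher} does not cover the whole relation.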


5. Fourth Normal Form (4NF)

A relation is in 4NF if:

✔ It is in BCNF
✔ It has no non-trivial multi-valued dependency X →→ Y unless X is a superkey

Used when an entity has two or more independent multi-valued attributes.

Example:

Student →→ PhoneNumbers  
Student →→ Languages

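Phone numbers and languages are independent of each other, so a single table would have to record every phone/language combination. Store them separately as (Student, PhoneNumber) and (Student, Language).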

📌 4NF eliminates multi-valued dependency anomalies.
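A small sketch shows why the two independent facts should not share a table. The student ID, phone numbers, and languages are hypothetical.

```python
from itertools import product

# Hypothetical independent multi-valued facts about one student.
phones = {"S1": ["9876", "7654"]}
languages = {"S1": ["English", "Hindi", "Tamil"]}

# One combined table must hold every phone/language pairing.
combined = [
    (s, p, lang)
    for s in phones
    for p, lang in product(phones[s], languages[s])
]

# 4NF design: one table per independent multi-valued fact.
student_phone = [(s, p) for s in phones for p in phones[s]]
student_language = [(s, lang) for s in languages for lang in languages[s]]

print(len(combined))                                # 6
print(len(student_phone) + len(student_language))   # 5
```

The gap widens quickly: with m phones and n languages the combined table needs m × n rows per student, while the two separate tables need m + n.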


6. Fifth Normal Form (5NF) / Project-Join Normal Form (PJNF)

A relation is in 5NF if:

✔ It is in 4NF
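✔ Every non-trivial join dependency is implied by its candidate keys
✔ Informally: it cannot be split into smaller tables and rejoined without loss, except in ways its keys already guarantee

Used in highly complex many-to-many situations, typically three-way relationships.

📌 5NF removes redundancy caused by join dependencies that are not implied by keys.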
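A join dependency can be illustrated on sample data by projecting a three-column table onto its three column pairs and joining the projections back. The sketch below assumes hypothetical supplier/part/project rows and a made-up helper `satisfies_3way_jd`. Note that it only tests one table instance, whereas a real join dependency is a rule that must hold for every legal instance.

```python
def project(rows, idx):
    """Project rows onto the column positions in idx, dropping duplicates."""
    return {tuple(r[i] for i in idx) for r in rows}

def satisfies_3way_jd(rows):
    """True if rows equal the natural join of their three binary projections."""
    rows = set(rows)
    ab = project(rows, (0, 1))
    bc = project(rows, (1, 2))
    ac = project(rows, (0, 2))
    joined = {
        (a, b, c)
        for a, b in ab
        for b2, c in bc
        if b == b2 and (a, c) in ac
    }
    return joined == rows

# Hypothetical (Supplier, Part, Project) rows.
spj = [
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
]

print(satisfies_3way_jd(spj))      # True: safe to store as three pair tables
print(satisfies_3way_jd(spj[:3]))  # False: the join would add a spurious row
```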


7. Sixth Normal Form (6NF) (rarely used)

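A relation is in 6NF if it satisfies no non-trivial join dependencies at all, so each table holds a key plus at most one other attribute.
Used mainly in temporal databases and some data-warehouse designs, where the history of each attribute is tracked independently.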
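As a rough sketch, a 6NF-style layout stores each non-key attribute in its own table alongside the key. The employee rows and the helper `to_6nf` are hypothetical, and the time-period columns a temporal design would add are left out for brevity.

```python
# Hypothetical wide table with key "id".
employees = [
    {"id": 1, "name": "Asha", "salary": 50000},
    {"id": 2, "name": "Ravi", "salary": 42000},
]

def to_6nf(rows, key):
    """Split rows into one (key, value) table per non-key attribute."""
    tables = {}
    for row in rows:
        for attr, value in row.items():
            if attr != key:
                tables.setdefault(attr, []).append((row[key], value))
    return tables

tables = to_6nf(employees, "id")
print(tables["name"])    # [(1, 'Asha'), (2, 'Ravi')]
print(tables["salary"])  # [(1, 50000), (2, 42000)]
```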


SUMMARY TABLE

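Normal Form | Removes | Condition
1NF  | Repeating groups, multi-valued attributes | Atomic values only
2NF  | Partial dependencies | 1NF + no non-key attribute depends on part of a composite candidate key
3NF  | Transitive dependencies | 2NF + no non-key attribute determines another non-key attribute
BCNF | Anomalies from non-superkey determinants | LHS of every non-trivial FD must be a superkey
4NF  | Multi-valued dependencies | BCNF + LHS of every non-trivial MVD must be a superkey
5NF  | Join dependencies | 4NF + every non-trivial JD is implied by the candidate keys
6NF  | All remaining join dependencies (temporal data) | 5NF + no non-trivial JDs at all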

Perfect 5-Mark Answer (Short)

Normalization is a process of organizing data to reduce redundancy and eliminate anomalies.
The major normal forms are:

  • 1NF: Removes repeating groups; makes data atomic.
  • 2NF: Removes partial dependencies.
  • 3NF: Removes transitive dependencies.
  • BCNF: Stronger 3NF; LHS of FD must be a superkey.
  • 4NF: Eliminates non-trivial multi-valued dependencies.
  • 5NF: Eliminates join dependencies not implied by candidate keys.

Normalization improves integrity, consistency, and storage efficiency.