Let’s treat this as the “deep theory” part of normalization:
Functional Dependencies → Multivalued Dependencies → Join Dependencies
Each one is a stronger / more general kind of dependency and is tied to a specific normal form (3NF/BCNF, 4NF, 5NF).
1. Functional Dependencies (FDs)
1.1 Intuition
A Functional Dependency says:
If two rows (tuples) agree on attributes X, they must agree on attributes Y.
We write this as:
X → Y (X functionally determines Y)
So X is like a “cause” or “identifier”, and Y is determined by it.
1.2 Formal Definition
Let R(A₁, A₂, …, Aₙ) be a relation schema.
Let X and Y be subsets of attributes of R.
We say X → Y holds in R if:
For any two tuples t₁ and t₂ in any valid instance of R,
if t₁[X] = t₂[X], then t₁[Y] = t₂[Y].
In words: if X values are the same, Y values must be the same.
1.3 Examples of FDs
Consider a relation:
STUDENT(RollNo, Name, Dept, DeptPhone)
Possible FDs:
- RollNo → Name, Dept, DeptPhone
(RollNo is unique, so it determines everything about the student.) - Dept → DeptPhone
(All students in the same department share the same dept phone.)
Here, RollNo is a key because it functionally determines all attributes.
1.4 Types of Functional Dependencies
- Trivial FD
X → Y is trivial if Y ⊆ X.
Example: {RollNo, Name} → Name
Always true, no real constraint. - Non-Trivial FD
X → Y is non-trivial if Y is not a subset of X.
Example: RollNo → Name - Completely (Fully) Functional Dependency
Y is fully dependent on composite key X if Y depends on whole X, not on any proper subset of X. Example:ENROLL(StudentId, CourseId, Grade)
(StudentId, CourseId) → Grade (full FD)
If Grade depended only on StudentId, that would be partial. - Partial Dependency
Y depends on only a part of a composite key.
Example: In ENROLL, if we also store StudentName:
(StudentId, CourseId) → StudentName
But actually: StudentId → StudentName
This is partial (cause of 2NF violation). - Transitive Dependency
X → Y and Y → Z, then X → Z (indirect).
Example:
StudentId → DeptID
DeptID → DeptName
⇒ StudentId → DeptName (transitive; cause of 3NF violation).
1.5 FDs and Normalization
- 2NF: Remove partial dependencies (non-key attribute depends on part of composite key).
- 3NF: Remove transitive dependencies (non-key attribute depends on another non-key).
- BCNF: For every FD X → Y, X must be a superkey.
So FDs are the core of 2NF, 3NF, BCNF.
2. Multivalued Dependencies (MVDs)
2.1 Intuition
FDs talk about “one unique value” being determined.
Multivalued Dependency (MVD) talks about independent multi-valued attributes for the same key.
Notation: X →→ Y
For each value of X, there is a set of values of Y, and this set is independent of other attributes.
So X →→ Y means “for each X, there can be many Y values, and they don’t depend on the rest of the attributes”.
2.2 Classic Example
Relation:
STUDENT(StudId, Phone, Language)
Assume:
- Each student can have multiple phone numbers.
- Each student can know multiple languages.
- Phone numbers and languages are independent of each other.
This implies:
- StudId →→ Phone
- StudId →→ Language
For a student S1 with:
- Phones: P1, P2
- Languages: L1, L2
The table must contain all combinations:
| StudId | Phone | Language |
|---|---|---|
| S1 | P1 | L1 |
| S1 | P1 | L2 |
| S1 | P2 | L1 |
| S1 | P2 | L2 |
That’s the essence of an MVD: independent multi-valued sets generate combinations.
2.3 Formal Definition (Conceptual)
In relation R with attributes, Y is multivalued dependent on X (X →→ Y) if:
For any pair of tuples t₁ and t₂ in R with same X,
we can form tuples by swapping their Y components
and those tuples must also be valid in R.
Intuitively: once X is fixed, the combinations of Y with the rest of attributes are all possible.
2.4 Trivial Multivalued Dependencies
An MVD X →→ Y is trivial if:
- Y ⊆ X, or
- X ∪ Y = all attributes of relation
Trivial ones don’t add extra constraints.
2.5 Relationship Between FDs and MVDs
- Every FD X → Y is also an MVD: X →→ Y.
(If X determines a single value of Y, it also determines the set with that single value.)
But not every MVD is an FD, because MVDs allow multiple Y values for one X.
2.6 MVDs and 4NF
Fourth Normal Form (4NF) is defined using MVDs.
A relation R is in 4NF if:
For every non-trivial MVD X →→ Y in R,
X is a superkey of R.
Reason: If X is not a key, independent multi-valued attributes lead to huge redundancy.
Decomposition for 4NF (Student example):
Original: STUDENT(StudId, Phone, Language) with
StudId →→ Phone and StudId →→ Language
Decompose into:
STUDENT_PHONE(StudId, Phone)STUDENT_LANGUAGE(StudId, Language)
Now no unnecessary combinations → redundancy removed → 4NF achieved.
3. Join Dependencies (JDs)
3.1 Intuition
A Join Dependency (JD) is a more general constraint.
It says:
Relation R must be exactly equal to the join of its projections on some set of attribute subsets.
Notation: JD(R₁, R₂, …, Rₙ)
Where each Rᵢ is a subset of attributes of the main relation R.
If JD holds, any valid instance of R must be reconstructible (losslessly) by joining its projections on R₁, R₂, …, Rₙ.
MVD is a special case of JD (with n = 2).
3.2 Example of Join Dependency
Consider relation:
SPJ(Supplier, Part, Project)
Assume:
- The relationships between Supplier–Part, Supplier–Project, Part–Project are such that SPJ can always be reconstructed from:
SP(Supplier, Part)SJ(Supplier, Project)PJ(Part, Project)
Then we can say:
JD({Supplier, Part}, {Supplier, Project}, {Part, Project})
holds on SPJ.
This means SPJ is always equal to:
SPJ = SP ⋈ SJ ⋈ PJ
If this is always true, then the dependency is a Join Dependency.
3.3 JD and 5NF (Fifth Normal Form)
Fifth Normal Form (5NF), also called Project-Join Normal Form (PJNF), is defined using JDs.
A relation R is in 5NF if:
Every join dependency on R is implied by the candidate keys of R.
In simple language:
- You cannot decompose R into more tables (via projections)
without either:- losing information, or
- creating spurious tuples (extra, invalid combinations).
So, after 5NF, no further useful decomposition is possible.
Why JDs matter?
Some complex many-to-many relationships across 3 or more tables can cause redundancy that is not captured by FDs or MVDs alone. JD considers the effect of joining multiple projections.
3.4 Relationship: MVD vs JD
- An MVD X →→ Y corresponds to a JD of the form:
JD( XY, XZ ) where XYZ is the full schema.
So:
- 4NF: handles redundancy due to MVDs (special 2-way JDs).
- 5NF: handles redundancy due to more general JDs (3-way, n-way).
4. Putting It All Together
4.1 Summary of Each Dependency
- FD (X → Y)
X uniquely determines Y.
Used in: 2NF, 3NF, BCNF - MVD (X →→ Y)
For each X, there is an independent set of values of Y.
Used in: 4NF - JD (JD(R₁, R₂, …, Rₙ))
R must always be equal to join of projections on R₁, R₂, …, Rₙ.
Used in: 5NF
4.2 Dependency vs Normal Form Mapping
| Dependency Type | Related Normal Form | Goal |
|---|---|---|
| Functional Dependency (FD) | 2NF, 3NF, BCNF | Remove partial & transitive redundancy |
| Multivalued Dependency | 4NF | Remove redundancy from independent multi-valued attributes |
| Join Dependency | 5NF (PJNF) | Ensure no redundancy from complex joins |
