Functional Dependencies, Multivalued Dependencies, Join Dependencies

Let’s treat this as the “deep theory” part of normalization:
Functional Dependencies → Multivalued Dependencies → Join Dependencies
Each one is a stronger / more general kind of dependency and is tied to a specific normal form (3NF/BCNF, 4NF, 5NF).

1. Functional Dependencies (FDs)

1.1 Intuition

A Functional Dependency says:

If two rows (tuples) agree on attributes X, they must agree on attributes Y.

We write this as:
X → Y (X functionally determines Y)

So X is like a “cause” or “identifier”, and Y is determined by it.

1.2 Formal Definition

Let R(A₁, A₂, …, Aₙ) be a relation schema.
Let X and Y be subsets of attributes of R.

We say X → Y holds in R if:

For any two tuples t₁ and t₂ in any valid instance of R,
if t₁[X] = t₂[X], then t₁[Y] = t₂[Y].

In words: if X values are the same, Y values must be the same.

1.3 Examples of FDs

Consider a relation:

STUDENT(RollNo, Name, Dept, DeptPhone)

Possible FDs:

RollNo → Name, Dept, DeptPhone
(RollNo is unique, so it determines everything about the student.)
Dept → DeptPhone
(All students in the same department share the same dept phone.)

Here, RollNo is a key because it functionally determines all attributes.

1.4 Types of Functional Dependencies

Trivial FD
X → Y is trivial if Y ⊆ X.
Example: {RollNo, Name} → Name
Always true, no real constraint.
Non-Trivial FD
X → Y is non-trivial if Y is not a subset of X.
Example: RollNo → Name
Completely (Fully) Functional Dependency
Y is fully dependent on composite key X if Y depends on whole X, not on any proper subset of X. Example:
ENROLL(StudentId, CourseId, Grade)
(StudentId, CourseId) → Grade (full FD)
If Grade depended only on StudentId, that would be partial.
Partial Dependency
Y depends on only a part of a composite key.
Example: In ENROLL, if we also store StudentName:
(StudentId, CourseId) → StudentName
But actually: StudentId → StudentName
This is partial (cause of 2NF violation).
Transitive Dependency
X → Y and Y → Z, then X → Z (indirect).
Example:
StudentId → DeptID
DeptID → DeptName
⇒ StudentId → DeptName (transitive; cause of 3NF violation).

1.5 FDs and Normalization

2NF: Remove partial dependencies (non-key attribute depends on part of composite key).
3NF: Remove transitive dependencies (non-key attribute depends on another non-key).
BCNF: For every FD X → Y, X must be a superkey.

So FDs are the core of 2NF, 3NF, BCNF.

2. Multivalued Dependencies (MVDs)

2.1 Intuition

FDs talk about “one unique value” being determined.

Multivalued Dependency (MVD) talks about independent multi-valued attributes for the same key.

Notation: X →→ Y

For each value of X, there is a set of values of Y, and this set is independent of other attributes.

So X →→ Y means “for each X, there can be many Y values, and they don’t depend on the rest of the attributes”.

2.2 Classic Example

Relation:

STUDENT(StudId, Phone, Language)

Assume:

Each student can have multiple phone numbers.
Each student can know multiple languages.
Phone numbers and languages are independent of each other.

This implies:

StudId →→ Phone
StudId →→ Language

For a student S1 with:

Phones: P1, P2
Languages: L1, L2

The table must contain all combinations:

StudId	Phone	Language
S1	P1	L1
S1	P1	L2
S1	P2	L1
S1	P2	L2

That’s the essence of an MVD: independent multi-valued sets generate combinations.

2.3 Formal Definition (Conceptual)

In relation R with attributes, Y is multivalued dependent on X (X →→ Y) if:

For any pair of tuples t₁ and t₂ in R with same X,
we can form tuples by swapping their Y components
and those tuples must also be valid in R.

Intuitively: once X is fixed, the combinations of Y with the rest of attributes are all possible.

2.4 Trivial Multivalued Dependencies

An MVD X →→ Y is trivial if:

Y ⊆ X, or
X ∪ Y = all attributes of relation

Trivial ones don’t add extra constraints.

2.5 Relationship Between FDs and MVDs

Every FD X → Y is also an MVD: X →→ Y.
(If X determines a single value of Y, it also determines the set with that single value.)

But not every MVD is an FD, because MVDs allow multiple Y values for one X.

2.6 MVDs and 4NF

Fourth Normal Form (4NF) is defined using MVDs.

A relation R is in 4NF if:

For every non-trivial MVD X →→ Y in R,
X is a superkey of R.

Reason: If X is not a key, independent multi-valued attributes lead to huge redundancy.

Decomposition for 4NF (Student example):

Original: STUDENT(StudId, Phone, Language) with
StudId →→ Phone and StudId →→ Language

Decompose into:

STUDENT_PHONE(StudId, Phone)
STUDENT_LANGUAGE(StudId, Language)

Now no unnecessary combinations → redundancy removed → 4NF achieved.

3. Join Dependencies (JDs)

3.1 Intuition

A Join Dependency (JD) is a more general constraint.
It says:

Relation R must be exactly equal to the join of its projections on some set of attribute subsets.

Notation: JD(R₁, R₂, …, Rₙ)

Where each Rᵢ is a subset of attributes of the main relation R.

If JD holds, any valid instance of R must be reconstructible (losslessly) by joining its projections on R₁, R₂, …, Rₙ.

MVD is a special case of JD (with n = 2).

3.2 Example of Join Dependency

Consider relation:

SPJ(Supplier, Part, Project)

Assume:

The relationships between Supplier–Part, Supplier–Project, Part–Project are such that SPJ can always be reconstructed from:
- SP(Supplier, Part)
- SJ(Supplier, Project)
- PJ(Part, Project)

Then we can say:

JD({Supplier, Part}, {Supplier, Project}, {Part, Project})

holds on SPJ.

This means SPJ is always equal to:

SPJ = SP ⋈ SJ ⋈ PJ

If this is always true, then the dependency is a Join Dependency.

3.3 JD and 5NF (Fifth Normal Form)

Fifth Normal Form (5NF), also called Project-Join Normal Form (PJNF), is defined using JDs.

A relation R is in 5NF if:

Every join dependency on R is implied by the candidate keys of R.

In simple language:

You cannot decompose R into more tables (via projections)
without either:
- losing information, or
- creating spurious tuples (extra, invalid combinations).

So, after 5NF, no further useful decomposition is possible.

Why JDs matter?

Some complex many-to-many relationships across 3 or more tables can cause redundancy that is not captured by FDs or MVDs alone. JD considers the effect of joining multiple projections.

3.4 Relationship: MVD vs JD

An MVD X →→ Y corresponds to a JD of the form:
JD( XY, XZ ) where XYZ is the full schema.

So:

4NF: handles redundancy due to MVDs (special 2-way JDs).
5NF: handles redundancy due to more general JDs (3-way, n-way).

4. Putting It All Together

4.1 Summary of Each Dependency

FD (X → Y)
X uniquely determines Y.
Used in: 2NF, 3NF, BCNF
MVD (X →→ Y)
For each X, there is an independent set of values of Y.
Used in: 4NF
JD (JD(R₁, R₂, …, Rₙ))
R must always be equal to join of projections on R₁, R₂, …, Rₙ.
Used in: 5NF

4.2 Dependency vs Normal Form Mapping

Dependency Type	Related Normal Form	Goal
Functional Dependency (FD)	2NF, 3NF, BCNF	Remove partial & transitive redundancy
Multivalued Dependency	4NF	Remove redundancy from independent multi-valued attributes
Join Dependency	5NF (PJNF)	Ensure no redundancy from complex joins