📘 Open Addressing (Collision Resolution Technique)

Open Addressing is a collision-handling technique used in hash tables in which all key-value pairs are stored directly inside the hash table (i.e., no linked lists, no external structure).

When a collision occurs (two keys hash to the same index), the hash table looks for another empty slot inside the table according to a probing sequence.

📌 Key Idea

Store every element inside the table (size = m slots).
If the desired slot is occupied → probe (search) to find the next empty slot.
Searching, inserting, deleting → O(1) average, O(n) worst-case.

📘 Core Formula

For a key k and a probing number i:

\[
h(k, i) = (h(k) + f(i)) \bmod m
\]

Where:

h(k) = primary hash function
f(i) = probing function
i = probe number (0 for the first attempt, then 1, 2, …, m − 1 after each collision), with f(0) = 0
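
The formula can be sketched as a small generator. This is a minimal illustration, assuming integer keys and the primary hash h(k) = k mod m; the name `probe_sequence` is hypothetical and not part of the notes.

```python
from typing import Callable, Iterator

def probe_sequence(key: int, m: int, f: Callable[[int], int]) -> Iterator[int]:
    """Yield the slots h(k, i) = (h(k) + f(i)) mod m for i = 0, 1, ..., m - 1."""
    home = key % m  # assumed primary hash: h(k) = k mod m
    for i in range(m):
        yield (home + f(i)) % m

# Linear probing corresponds to f(i) = i, quadratic probing to f(i) = i * i.
linear = list(probe_sequence(17, 7, lambda i: i))
quadratic = list(probe_sequence(17, 7, lambda i: i * i))
print(linear)     # [3, 4, 5, 6, 0, 1, 2]
print(quadratic)  # [3, 4, 0, 5, 5, 0, 4]
```

Only the probing function f changes between the two calls; the primary hash and the wrap-around by mod m stay the same.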

📌 Requirements of Open Addressing

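Load factor must be < 1 (n = number of stored keys, m = number of slots)
\[
\alpha = \frac{n}{m} < 1
\]
Table must have at least one empty slot, so that a search for an absent key can terminate.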
Deletions require special handling (“lazy delete” markers).

📘 Types of Open Addressing

There are three main types:

1. Linear Probing

\[
h(k, i) = (h(k) + i) \bmod m
\]

Sequence:
h(k), h(k)+1, h(k)+2, … (each taken mod m, so the sequence wraps around to slot 0)

✔ Advantages

Simple
Cache-friendly (array locality)

✘ Disadvantages

Primary clustering
- Long blocks of filled cells form
- Increases search/insert time
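
A minimal sketch of linear-probing insert and search, which also shows how a cluster forms. It assumes integer keys, h(k) = k mod m, and `None` as the empty marker; the function names `lp_insert` and `lp_search` are hypothetical.

```python
from typing import List, Optional

def lp_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key with linear probing and return the slot it ends up in."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")

def lp_search(table: List[Optional[int]], key: int) -> Optional[int]:
    """Return the slot holding key, or None if the key is absent."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:  # an empty slot ends the probe chain
            return None
        if table[slot] == key:
            return slot
    return None

table: List[Optional[int]] = [None] * 7
slots = [lp_insert(table, k) for k in (3, 10, 17, 4)]
print(slots)  # [3, 4, 5, 6]
```

Keys 3, 10, and 17 all hash to slot 3 and fill slots 3 to 5. Key 4 hashes to a different slot (4), but it still has to walk through the filled block and lands in slot 6, which is the primary clustering effect.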

2. Quadratic Probing

\[
h(k, i) = (h(k) + i^2) \bmod m
\]

Sequence:
h(k), h(k)+1², h(k)+2², h(k)+3², … (each taken mod m)

✔ Advantages

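Reduces primary clustering: the step size grows, so probes leave a filled block quickly
Usually needs fewer probes than linear probing at the same load factor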

✘ Disadvantages

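Secondary clustering
- Keys with the same h(k) follow the same probe path
May not visit every slot: with f(i) = i² and prime m, only the first ⌈m/2⌉ probes are guaranteed to hit distinct slots, so an insert is guaranteed to succeed only while fewer than half the slots are occupied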
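
The coverage concern can be checked directly. The sketch below lists the slots visited by f(i) = i² for m = 7, assuming a home slot of 3; the helper name `quadratic_slots` is hypothetical.

```python
from typing import List

def quadratic_slots(home: int, m: int) -> List[int]:
    """Slots visited by (home + i^2) mod m for i = 0, 1, ..., m - 1."""
    return [(home + i * i) % m for i in range(m)]

slots = quadratic_slots(3, 7)
reached = sorted(set(slots))
print(slots)    # [3, 4, 0, 5, 5, 0, 4]
print(reached)  # [0, 3, 4, 5]
```

With m = 7, only 4 of the 7 slots are ever reached from home slot 3, and slots 1, 2, and 6 are never probed. An insert could therefore fail even though the table still has empty slots.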

3. Double Hashing (Best Technique)

\[
h(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod m
\]

Where:

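h1(k) is the primary hash (the starting slot)
h2(k) is the secondary hash (the step size)
h2(k) ≠ 0 (a zero step would probe the same slot forever), and h2(k) should be relatively prime to m (for example, m prime) so that every slot is reachable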

✔ Advantages

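Almost eliminates clustering: keys with the same h1(k) usually get different step sizes
Spreads probe sequences most evenly of the three methods
Fewest probes on average of the three at a given load factor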

✘ Disadvantages

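Slightly more complex (two hash computations per key)
Requires a good second hash function
Less cache-friendly than linear probing, because successive probes jump around the table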
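
A minimal sketch of double hashing for integer keys. The two hash functions are assumptions chosen for illustration: h1(k) = k mod m and h2(k) = 1 + (k mod (m − 1)), a common textbook choice that never yields 0. The function names are hypothetical.

```python
from typing import List

def h1(key: int, m: int) -> int:
    return key % m  # assumed primary hash

def h2(key: int, m: int) -> int:
    return 1 + (key % (m - 1))  # assumed secondary hash, always in 1..m-1

def dh_probe_sequence(key: int, m: int) -> List[int]:
    """Slots visited by (h1(k) + i * h2(k)) mod m for i = 0, 1, ..., m - 1."""
    return [(h1(key, m) + i * h2(key, m)) % m for i in range(m)]

seq_10 = dh_probe_sequence(10, 7)
seq_17 = dh_probe_sequence(17, 7)
print(seq_10)  # [3, 1, 6, 4, 2, 0, 5]
print(seq_17)  # [3, 2, 1, 0, 6, 5, 4]
```

Keys 10 and 17 both start at slot 3, but their step sizes differ (5 and 6), so their paths diverge after the first probe. Because m = 7 is prime, each sequence visits every slot exactly once.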

📘 Example of Probing (Linear Probing)

Table size = 7
Hash function: \( h(k) = k \bmod 7 \)

Insert keys: 10, 20, 30

10 → 10 % 7 = 3 → slot 3
20 → 20 % 7 = 6 → slot 6
30 → 30 % 7 = 2 → slot 2
Now insert 17:
17 % 7 = 3 → occupied
Try 4 → empty
→ Insert at slot 4
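
The walkthrough above can be reproduced with a short trace. This is a sketch under the same assumptions (table size 7, h(k) = k mod 7, linear probing); `insert_with_trace` is a hypothetical helper that records every slot it probes.

```python
from typing import Dict, List, Optional

def insert_with_trace(table: List[Optional[int]], key: int) -> List[int]:
    """Insert key with linear probing and return the list of slots probed."""
    m = len(table)
    probes = []
    for i in range(m):
        slot = (key % m + i) % m
        probes.append(slot)
        if table[slot] is None:
            table[slot] = key
            return probes
    raise RuntimeError("table is full")

table: List[Optional[int]] = [None] * 7
traces: Dict[int, List[int]] = {k: insert_with_trace(table, k) for k in (10, 20, 30, 17)}
print(traces)  # {10: [3], 20: [6], 30: [2], 17: [3, 4]}
print(table)   # [None, None, 30, 10, 17, None, 20]
```

Only key 17 needs a second probe: slot 3 is taken by 10, so it moves on to slot 4.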

📘 Deletion in Open Addressing

Simply emptying a slot breaks probe chains: a later search stops at the emptied slot and never reaches keys that were inserted past it.

Solution → Lazy deletion

Mark the cell as “deleted” (a tombstone) instead of emptying it
Future searches treat a deleted cell as occupied and continue probing
Insertions may reuse deleted slots
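
Lazy deletion can be sketched with a tombstone sentinel. The class below is a minimal illustration, assuming integer keys, h(k) = k mod m, and linear probing; the names `LinearProbingSet`, `EMPTY`, and `DELETED` are hypothetical.

```python
EMPTY = None
DELETED = object()  # tombstone: "a key used to live here, keep probing"

class LinearProbingSet:
    def __init__(self, m: int) -> None:
        self.slots = [EMPTY] * m

    def _probe(self, key: int):
        m = len(self.slots)
        for i in range(m):
            yield (key % m + i) % m

    def add(self, key: int) -> None:
        first_free = None  # first tombstone or empty slot seen on the path
        for slot in self._probe(key):
            cur = self.slots[slot]
            if cur is EMPTY:
                if first_free is None:
                    first_free = slot
                break  # the key cannot be further along the chain
            if cur is DELETED:
                if first_free is None:
                    first_free = slot
            elif cur == key:
                return  # already present
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = key

    def __contains__(self, key: int) -> bool:
        for slot in self._probe(key):
            cur = self.slots[slot]
            if cur is EMPTY:
                return False
            if cur is not DELETED and cur == key:
                return True
        return False

    def remove(self, key: int) -> None:
        for slot in self._probe(key):
            cur = self.slots[slot]
            if cur is EMPTY:
                return  # key was never stored
            if cur is not DELETED and cur == key:
                self.slots[slot] = DELETED
                return

s = LinearProbingSet(7)
s.add(10)   # slot 3
s.add(17)   # collides at slot 3, stored in slot 4
s.remove(10)
found_17 = 17 in s  # True: the tombstone in slot 3 keeps the search going
s.add(24)           # hashes to slot 3 and reuses the tombstone
```

Note that `add` keeps probing past a tombstone before reusing it, so it can still detect a key that already sits further along the chain.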

📘 Performance of Open Addressing

Metric	Average	Worst Case
Search	O(1)	O(n)
Insert	O(1)	O(n)
Delete	O(1)	O(n)

Performance depends on:

Load factor (α)
Probing technique
Quality of hash function
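
To make the dependence on the load factor concrete, the sketch below evaluates standard textbook approximations (attributed to Knuth) for the expected number of probes. These formulas are not derived in these notes; the uniform-hashing model is commonly used as an idealized stand-in for double hashing, and the function names are hypothetical.

```python
import math
from typing import Tuple

def linear_probing_probes(alpha: float) -> Tuple[float, float]:
    """Approximate expected probes for linear probing: (successful, unsuccessful)."""
    hit = 0.5 * (1 + 1 / (1 - alpha))
    miss = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return hit, miss

def uniform_hashing_probes(alpha: float) -> Tuple[float, float]:
    """Approximate expected probes under uniform hashing: (successful, unsuccessful)."""
    hit = (1 / alpha) * math.log(1 / (1 - alpha))
    miss = 1 / (1 - alpha)
    return hit, miss

for alpha in (0.5, 0.7, 0.9):
    lp_hit, lp_miss = linear_probing_probes(alpha)
    uh_hit, uh_miss = uniform_hashing_probes(alpha)
    print(f"alpha={alpha}: linear {lp_hit:.2f}/{lp_miss:.2f}, uniform {uh_hit:.2f}/{uh_miss:.2f}")
```

At α = 0.5 an unsuccessful search under linear probing costs about 2.5 probes, while at α = 0.9 the same approximation gives about 50.5, which is why the load factor is kept well below 1.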

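To maintain O(1) average efficiency, keep the load factor below a threshold, typically somewhere between 0.5 and 0.7:
\[
\alpha \le \alpha_{\max}, \quad \alpha_{\max} \approx 0.5 \text{ to } 0.7
\]

When α exceeds the threshold → rehashing needed (allocate a larger table and reinsert every key).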
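
Rehashing can be sketched as follows. This is a minimal illustration, assuming integer keys without duplicates, linear probing, a threshold of 0.7, and a growth rule of 2m + 1; `MAX_LOAD`, `place`, and `insert` are hypothetical names, and the key count is recomputed by scanning only for brevity (a real table would keep a counter).

```python
from typing import List, Optional

MAX_LOAD = 0.7  # assumed threshold, within the 0.5 to 0.7 range mentioned above

def place(table: List[Optional[int]], key: int) -> None:
    """Store key in the first empty slot on its linear-probe path."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return
    raise RuntimeError("table is full")

def insert(table: List[Optional[int]], key: int) -> List[Optional[int]]:
    """Insert key, growing and rehashing first if the load factor would exceed MAX_LOAD."""
    n = sum(1 for x in table if x is not None)
    if (n + 1) / len(table) > MAX_LOAD:
        bigger: List[Optional[int]] = [None] * (2 * len(table) + 1)
        for old in table:
            if old is not None:
                place(bigger, old)  # slots change because m changed
        table = bigger
    place(table, key)
    return table

table: List[Optional[int]] = [None] * 7
for k in (10, 20, 30, 17, 24):
    table = insert(table, k)
print(len(table))  # 15
```

The fifth insert would push the load factor to 5/7 ≈ 0.71, so the table grows from 7 to 15 slots and every key is reinserted under the new modulus.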

📘 Comparison of Probing Methods

Method	Primary Clustering	Secondary Clustering	Probe Cost at High Load
Linear Probing	High ❌	Medium	High
Quadratic Probing	Medium	High ❌	Medium
Double Hashing	Very Low ✔	Very Low ✔	Lowest ✔

Double hashing is generally considered the best of the three at avoiding clustering, at the cost of the cache locality that linear probing offers.

📘 Advantages of Open Addressing

No extra memory for linked lists
Cache-friendly (array-based)
Good average-case performance
Simple structure
Used in many modern hash tables (for example, CPython's dict and Rust's HashMap)

📘 Disadvantages

Performance drops sharply at high load factors
Deletion is complicated (requires lazy-delete markers)
Cannot store more than m keys without resizing the table
Probing sequences can become long (clustering)

📘 Summary (Exam Notes)

Open addressing stores all keys inside the hash table
Uses probing to find empty slots
Types:
✔ Linear Probing
✔ Quadratic Probing
✔ Double Hashing (best)
Needs load factor α < 1
Uses lazy deletion
Average O(1), worst O(n)
Preferred method: Double Hashing