Hashing is a technique used to map data to a unique value called a hash code or hash key, which corresponds to a specific index in a hash table. The primary goal of hashing is to enable efficient data retrieval in O(1)O(1)O(1) time on average.
Key Components of Hashing
- Hash Function:
- Converts the key into a hash code (index).
- Example: hash_index=key%table_sizehash\_index = key \% table\_sizehash_index=key%table_size.
- Hash Table:
- An array-like data structure where data is stored at an index generated by the hash function.
- Collision:
- Occurs when multiple keys map to the same index in the hash table.
- Collision Resolution Techniques:
- Open Addressing:
- Linear Probing
- Quadratic Probing
- Double Hashing
- Chaining:
- Uses linked lists to store multiple values at the same index.
- Open Addressing:
Types of Hashing Techniques
1. Direct Hashing
- Directly maps the key to a table index.
- No hash function is used.
- Suitable for small datasets with continuous keys.
2. Modular Arithmetic Hashing
- Uses a modulo operation to calculate the index: index=key%table_sizeindex = key \% table\_sizeindex=key%table_size.
3. Folding Hashing
- Splits the key into parts, performs an operation (like summation), and applies the modulo operator.
Collision Resolution Techniques in C
1. Chaining
- Each index of the hash table contains a pointer to a linked list.
- Handles collisions by adding collided keys to the list.
C Code Example:
#include <stdio.h>
#include <stdlib.h>
#define TABLE_SIZE 10
typedef struct Node {
int key;
struct Node* next;
} Node;
Node* hashTable[TABLE_SIZE];
// Hash Function
int hashFunction(int key) {
return key % TABLE_SIZE;
}
// Insert into Hash Table
void insert(int key) {
int index = hashFunction(key);
Node* newNode = (Node*)malloc(sizeof(Node));
newNode->key = key;
newNode->next = hashTable[index];
hashTable[index] = newNode;
}
// Search in Hash Table
Node* search(int key) {
int index = hashFunction(key);
Node* temp = hashTable[index];
while (temp) {
if (temp->key == key) return temp;
temp = temp->next;
}
return NULL;
}
// Display Hash Table
void display() {
for (int i = 0; i < TABLE_SIZE; i++) {
printf(“Index %d: “, i);
Node* temp = hashTable[i];
while (temp) {
printf(“%d -> “, temp->key);
temp = temp->next;
}
printf(“NULL\n”);
}
}
int main() {
insert(10);
insert(20);
insert(30);
insert(15);
display();
int key = 20;
Node* result = search(key);
if (result) printf(“Key %d found.\n”, key);
else printf(“Key %d not found.\n”, key);
return 0;
}
2. Linear Probing
- Resolves collisions by checking the next available index sequentially.
C Code Example:
#include <stdio.h>
#define TABLE_SIZE 10
int hashTable[TABLE_SIZE];
// Hash Function
int hashFunction(int key) {
return key % TABLE_SIZE;
}
// Insert into Hash Table
void insert(int key) {
int index = hashFunction(key);
while (hashTable[index] != -1) {
index = (index + 1) % TABLE_SIZE;
}
hashTable[index] = key;
}
// Search in Hash Table
int search(int key) {
int index = hashFunction(key);
while (hashTable[index] != -1) {
if (hashTable[index] == key) return index;
index = (index + 1) % TABLE_SIZE;
}
return -1;
}
// Display Hash Table
void display() {
for (int i = 0; i < TABLE_SIZE; i++) {
if (hashTable[i] != -1) printf(“Index %d: %d\n”, i, hashTable[i]);
else printf(“Index %d: NULL\n”, i);
}
}
int main() {
for (int i = 0; i < TABLE_SIZE; i++) hashTable[i] = -1;
insert(10);
insert(20);
insert(30);
insert(15);
display();
int key = 20;
int index = search(key);
if (index != -1) printf(“Key %d found at index %d.\n”, key, index);
else printf(“Key %d not found.\n”, key);
return 0;
}
3. Quadratic Probing
- Resolves collisions by checking indices in a quadratic sequence:
index=(hash+i2)%table_sizeindex = (hash + i^2) \% table\_sizeindex=(hash+i2)%table_size.
4. Double Hashing
- Uses two hash functions: index=(hash1(key)+i⋅hash2(key))%table_sizeindex = (hash1(key) + i \cdot hash2(key)) \% table\_sizeindex=(hash1(key)+i⋅hash2(key))%table_size.
Advantages of Hashing
- Fast Access: Average time complexity for insertion, deletion, and searching is O(1)O(1)O(1).
- Efficient for Large Data: Handles large datasets effectively.
Disadvantages of Hashing
- Collisions: Requires additional handling, which can degrade performance.
- Space Utilization: May need extra space for linked lists or probing.
Applications of Hashing
- Databases: Indexing for quick data retrieval.
- Compilers: Symbol table implementation.
- Cryptography: Hash functions for secure data encoding.
- File Systems: File allocation tables.
- Caches: Fast lookup in LRU and LFU caches.
Hashing is a cornerstone of efficient data storage and retrieval, making it invaluable in various applications across computer science and engineering.