Hashing – Dr. Balvinder Taneja

Hashing is a technique used to map data to a unique value called a hash code or hash key, which corresponds to a specific index in a hash table. The primary goal of hashing is to enable efficient data retrieval in O(1)O(1)O(1) time on average.

Key Components of Hashing

Hash Function:
1. Converts the key into a hash code (index).
1. Example: hash_index=key%table_sizehash\_index = key \% table\_sizehash_index=key%table_size.
Hash Table:
1. An array-like data structure where data is stored at an index generated by the hash function.
Collision:
1. Occurs when multiple keys map to the same index in the hash table.
Collision Resolution Techniques:
1. Open Addressing:
  1. Linear Probing
  1. Quadratic Probing
  1. Double Hashing
1. Chaining:
  1. Uses linked lists to store multiple values at the same index.

Types of Hashing Techniques

1. Direct Hashing

Directly maps the key to a table index.
No hash function is used.
Suitable for small datasets with continuous keys.

2. Modular Arithmetic Hashing

Uses a modulo operation to calculate the index: index=key%table_sizeindex = key \% table\_sizeindex=key%table_size.

3. Folding Hashing

Splits the key into parts, performs an operation (like summation), and applies the modulo operator.

Collision Resolution Techniques in C

1. Chaining

Each index of the hash table contains a pointer to a linked list.
Handles collisions by adding collided keys to the list.

C Code Example:

#include <stdio.h>

#include <stdlib.h>

#define TABLE_SIZE 10

typedef struct Node {

int key;

struct Node* next;

} Node;

Node* hashTable[TABLE_SIZE];

// Hash Function

int hashFunction(int key) {

return key % TABLE_SIZE;

}

// Insert into Hash Table

void insert(int key) {

int index = hashFunction(key);

Node* newNode = (Node*)malloc(sizeof(Node));

newNode->key = key;

newNode->next = hashTable[index];

hashTable[index] = newNode;

}

// Search in Hash Table

Node* search(int key) {

int index = hashFunction(key);

Node* temp = hashTable[index];

while (temp) {

if (temp->key == key) return temp;

temp = temp->next;

}

return NULL;

}

// Display Hash Table

void display() {

for (int i = 0; i < TABLE_SIZE; i++) {

printf(“Index %d: “, i);

Node* temp = hashTable[i];

while (temp) {

printf(“%d -> “, temp->key);

temp = temp->next;

}

printf(“NULL\n”);

}

int main() {

insert(10);

insert(20);

insert(30);

insert(15);

display();

int key = 20;

Node* result = search(key);

if (result) printf(“Key %d found.\n”, key);

else printf(“Key %d not found.\n”, key);

return 0;

}

2. Linear Probing

Resolves collisions by checking the next available index sequentially.

C Code Example:

#include <stdio.h>

#define TABLE_SIZE 10

int hashTable[TABLE_SIZE];

// Hash Function

int hashFunction(int key) {

return key % TABLE_SIZE;

}

// Insert into Hash Table

void insert(int key) {

int index = hashFunction(key);

while (hashTable[index] != -1) {

index = (index + 1) % TABLE_SIZE;

}

hashTable[index] = key;

}

// Search in Hash Table

int search(int key) {

int index = hashFunction(key);

while (hashTable[index] != -1) {

if (hashTable[index] == key) return index;

index = (index + 1) % TABLE_SIZE;

}

return -1;

}

// Display Hash Table

void display() {

for (int i = 0; i < TABLE_SIZE; i++) {

if (hashTable[i] != -1) printf(“Index %d: %d\n”, i, hashTable[i]);

else printf(“Index %d: NULL\n”, i);

}

int main() {

for (int i = 0; i < TABLE_SIZE; i++) hashTable[i] = -1;

insert(10);

insert(20);

insert(30);

insert(15);

display();

int key = 20;

int index = search(key);

if (index != -1) printf(“Key %d found at index %d.\n”, key, index);

else printf(“Key %d not found.\n”, key);

return 0;

}

3. Quadratic Probing

Resolves collisions by checking indices in a quadratic sequence:
index=(hash+i2)%table_sizeindex = (hash + i^2) \% table\_sizeindex=(hash+i2)%table_size.

4. Double Hashing

Uses two hash functions: index=(hash1(key)+i⋅hash2(key))%table_sizeindex = (hash1(key) + i \cdot hash2(key)) \% table\_sizeindex=(hash1(key)+i⋅hash2(key))%table_size.

Advantages of Hashing

Fast Access: Average time complexity for insertion, deletion, and searching is O(1)O(1)O(1).
Efficient for Large Data: Handles large datasets effectively.

Disadvantages of Hashing

Collisions: Requires additional handling, which can degrade performance.
Space Utilization: May need extra space for linked lists or probing.

Applications of Hashing

Databases: Indexing for quick data retrieval.
Compilers: Symbol table implementation.
Cryptography: Hash functions for secure data encoding.
File Systems: File allocation tables.
Caches: Fast lookup in LRU and LFU caches.

Hashing is a cornerstone of efficient data storage and retrieval, making it invaluable in various applications across computer science and engineering.