Skip to content

Hashing

Hashing is a technique used to map data to a unique value called a hash code or hash key, which corresponds to a specific index in a hash table. The primary goal of hashing is to enable efficient data retrieval in O(1)O(1)O(1) time on average.


Key Components of Hashing

  1. Hash Function:
    1. Converts the key into a hash code (index).
    1. Example: hash_index=key%table_sizehash\_index = key \% table\_sizehash_index=key%table_size.
  2. Hash Table:
    1. An array-like data structure where data is stored at an index generated by the hash function.
  3. Collision:
    1. Occurs when multiple keys map to the same index in the hash table.
  4. Collision Resolution Techniques:
    1. Open Addressing:
      1. Linear Probing
      1. Quadratic Probing
      1. Double Hashing
    1. Chaining:
      1. Uses linked lists to store multiple values at the same index.

Types of Hashing Techniques

1. Direct Hashing

  • Directly maps the key to a table index.
  • No hash function is used.
  • Suitable for small datasets with continuous keys.

2. Modular Arithmetic Hashing

  • Uses a modulo operation to calculate the index: index=key%table_sizeindex = key \% table\_sizeindex=key%table_size.

3. Folding Hashing

  • Splits the key into parts, performs an operation (like summation), and applies the modulo operator.

Collision Resolution Techniques in C

1. Chaining

  • Each index of the hash table contains a pointer to a linked list.
  • Handles collisions by adding collided keys to the list.

C Code Example:

#include <stdio.h>

#include <stdlib.h>

#define TABLE_SIZE 10

typedef struct Node {

    int key;

    struct Node* next;

} Node;

Node* hashTable[TABLE_SIZE];

// Hash Function

int hashFunction(int key) {

    return key % TABLE_SIZE;

}

// Insert into Hash Table

void insert(int key) {

    int index = hashFunction(key);

    Node* newNode = (Node*)malloc(sizeof(Node));

    newNode->key = key;

    newNode->next = hashTable[index];

    hashTable[index] = newNode;

}

// Search in Hash Table

Node* search(int key) {

    int index = hashFunction(key);

    Node* temp = hashTable[index];

    while (temp) {

        if (temp->key == key) return temp;

        temp = temp->next;

    }

    return NULL;

}

// Display Hash Table

void display() {

    for (int i = 0; i < TABLE_SIZE; i++) {

        printf(“Index %d: “, i);

        Node* temp = hashTable[i];

        while (temp) {

            printf(“%d -> “, temp->key);

            temp = temp->next;

        }

        printf(“NULL\n”);

    }

}

int main() {

    insert(10);

    insert(20);

    insert(30);

    insert(15);

    display();

    int key = 20;

    Node* result = search(key);

    if (result) printf(“Key %d found.\n”, key);

    else printf(“Key %d not found.\n”, key);

    return 0;

}


2. Linear Probing

  • Resolves collisions by checking the next available index sequentially.

C Code Example:

#include <stdio.h>

#define TABLE_SIZE 10

int hashTable[TABLE_SIZE];

// Hash Function

int hashFunction(int key) {

    return key % TABLE_SIZE;

}

// Insert into Hash Table

void insert(int key) {

    int index = hashFunction(key);

    while (hashTable[index] != -1) {

        index = (index + 1) % TABLE_SIZE;

    }

    hashTable[index] = key;

}

// Search in Hash Table

int search(int key) {

    int index = hashFunction(key);

    while (hashTable[index] != -1) {

        if (hashTable[index] == key) return index;

        index = (index + 1) % TABLE_SIZE;

    }

    return -1;

}

// Display Hash Table

void display() {

    for (int i = 0; i < TABLE_SIZE; i++) {

        if (hashTable[i] != -1) printf(“Index %d: %d\n”, i, hashTable[i]);

        else printf(“Index %d: NULL\n”, i);

    }

}

int main() {

    for (int i = 0; i < TABLE_SIZE; i++) hashTable[i] = -1;

    insert(10);

    insert(20);

    insert(30);

    insert(15);

    display();

    int key = 20;

    int index = search(key);

    if (index != -1) printf(“Key %d found at index %d.\n”, key, index);

    else printf(“Key %d not found.\n”, key);

    return 0;

}


3. Quadratic Probing

  • Resolves collisions by checking indices in a quadratic sequence:
    index=(hash+i2)%table_sizeindex = (hash + i^2) \% table\_sizeindex=(hash+i2)%table_size.

4. Double Hashing

  • Uses two hash functions: index=(hash1(key)+i⋅hash2(key))%table_sizeindex = (hash1(key) + i \cdot hash2(key)) \% table\_sizeindex=(hash1(key)+i⋅hash2(key))%table_size.

Advantages of Hashing

  1. Fast Access: Average time complexity for insertion, deletion, and searching is O(1)O(1)O(1).
  2. Efficient for Large Data: Handles large datasets effectively.

Disadvantages of Hashing

  1. Collisions: Requires additional handling, which can degrade performance.
  2. Space Utilization: May need extra space for linked lists or probing.

Applications of Hashing

  1. Databases: Indexing for quick data retrieval.
  2. Compilers: Symbol table implementation.
  3. Cryptography: Hash functions for secure data encoding.
  4. File Systems: File allocation tables.
  5. Caches: Fast lookup in LRU and LFU caches.

Hashing is a cornerstone of efficient data storage and retrieval, making it invaluable in various applications across computer science and engineering.