Skip to content
Home ยป Hash Functions

Hash Functions


๐Ÿ“˜ Hash Functions (under Hashing)

A Hash Function is a mathematical function that converts a key into a fixed-size integer (hash value or index) used to store the key in a hash table.

[
h(k) = \text{index in hash table}
]

The goal of a hash function is to map keys uniformly across the table so that collisions are minimized.


๐Ÿ“Œ Why Hash Functions Are Important?

A good hash function ensures:

  • Uniform distribution
  • Low collision rate
  • Fast computation
  • Efficient search, insert, delete
  • Consistent O(1) average performance

๐Ÿ“˜ Properties of a Good Hash Function

1. Deterministic

Same key โ†’ same hash value every time.

2. Uniform Distribution

Keys should spread evenly across the table.

3. Fast to compute

Must run in constant time O(1).

4. Minimize Collisions

Different keys should rarely map to same index.

5. Use full range of hash table

Avoid clustering.


๐Ÿ“˜ Types of Hash Functions

1. Division (Mod) Method

[
h(k) = k \bmod m
]

Where:

  • k = key
  • m = table size (preferably prime)

Example:

h(100) = 100 % 7 = 2

Pros: Simple, fast
Cons: Poor if m has patterns with key distribution.


2. Multiplication Method (Knuth)

[
h(k) = \lfloor m (kA – \lfloor kA \rfloor ) \rfloor
]

Where:

  • A is a constant, usually 0.6180339887 (fraction of golden ratio)

Pros: Works well for any table size
Cons: Slightly more expensive to compute


3. Mid-Square Method

  1. Square the key.
  2. Extract middle digits as hash value.

Example:

Key = 123
123ยฒ = 15129
Middle digits = 512  โ†’ h(k) = 512

Pros: Very uniform
Cons: Expensive for large keys


4. Folding Method

Break key into equal parts and add them.

Example:

Key = 123456 โ†’ (123 + 456) = 579

Pros: Works for large integers
Cons: Still may collide


5. Universal Hashing

Use a family of hash functions, randomly choose one at runtime.

[
h_{a,b}(k) = ((ak + b) \bmod p) \bmod m
]

Where:

  • p is a large prime
  • a โ‰  0, b < p

Pros: Minimizes worst-case collisions
Used in cryptography & randomized algorithms


6. Cryptographic Hash Functions (not used for hash tables)

Examples:

  • MD5
  • SHA-1
  • SHA-256

Properties: irreversible, secure
Usage: password hashing, digital signatures
(Not used for normal hash tables because more expensive)


๐Ÿ“˜ Hashing for Strings

Common technique:

Polynomial Rolling Hash

[
h(s) = (s_0 a^0 + s_1 a^1 + \dots + s_{n-1} a^{n-1}) \bmod m
]

Where:

  • characters โ†’ integers
  • a = constant like 31 or 37
  • m = large prime

Used by Java, Python, C++ strings internally.


๐Ÿ“˜ Hash Function Performance

A good hash function should:

  • Avoid clustering
  • Reduce collisions
  • Spread keys uniformly
  • Work fast in O(1) time

Bad hash functions cause:

  • Long chains (in chaining)
  • Long probe sequences (in open addressing)
    โ†’ Result: performance degrades to O(n)

๐Ÿ“˜ Python Example: Simple Hash Functions

def division_hash(key, table_size):
    return key % table_size

def multiplication_hash(key, table_size):
    A = 0.6180339887  # suggested constant
    return int(table_size * ((key * A) % 1))

def mid_square_hash(key, table_size):
    square = key * key
    mid = (square // 10) % 1000  # extract middle digits
    return mid % table_size

def string_hash(s, table_size):
    h = 0
    a = 31  # multiplier
    for char in s:
        h = (h * a + ord(char)) % table_size
    return h

๐Ÿ“˜ Summary (Exam Notes)

  • Hash function maps a key โ†’ index
  • Must be fast, uniform, and minimize collisions
  • Techniques:
    • Division method
    • Multiplication method
    • Mid-square
    • Folding
    • Universal hashing
    • Polynomial rolling hash for strings
  • Good hash function ensures O(1) average performance
  • Bad hash function causes clustering and O(n) performance