๐ Multi-Way Merge Sort
Multi-way merge sort is a generalization of merge sort where instead of splitting the array into 2 parts, we split it into k parts and perform a k-way merge.
It is especially useful when dealing with:
- Large datasets
- External sorting (files too large to fit in RAM)
- Database systems
- Merge operations on many sorted lists
๐ Why Multi-Way?
Standard merge sort:
- Splits into 2 subarrays
- Merges them using 2-way merging
Multi-way merge sort:
- Splits into k subarrays
- Uses k-way merging
- Better for scenarios involving large chunks of pre-sorted data
๐ Steps of Multi-Way Merge Sort
- Divide array into k equal parts
- Recursively apply multi-way merge sort on each part
- Merge all k sorted lists using a min-heap (priority queue)
- Produce the final sorted list
๐ Complexity Analysis
Let:
- n = total elements
- k = number of ways (number of splits)
Time Complexity
- Divide step โ O(1)
- Sort k partitions recursively
- k-way merge using a heap takes โ O(n log k)
So total complexity:
[
T(n) = kT(n/k) + O(n\log k)
]
Solving gives:
[
T(n) = O(n\log_k n + n\log k)
]
Since (\log_k n = \frac{\log n}{\log k}):
[
\boxed{ T(n) = O(n \log n) }
]
(Same as normal merge sort, but more efficient for external memory.)
Space Complexity
- O(n) for output
- O(k) for heap
๐ Why Use a Min-Heap?
During merging:
- You always need the smallest element among k sorted lists.
- Min-heap gives this in O(log k) time.
- Efficient for multi-way merging.
๐ Python Code โ Multi-Way Merge Sort Using Min-Heap
This example uses k-way split and merges using heapq.
โ Python Implementation
import heapq
def multiway_merge_sort(arr, k):
n = len(arr)
if n <= 1:
return arr
# Step 1: Divide into k equal buckets
size = (n + k - 1) // k # Ceiling division
parts = [arr[i:i + size] for i in range(0, n, size)]
# Step 2: Recursively sort each part
sorted_parts = [multiway_merge_sort(part, k) for part in parts]
# Step 3: K-way merge using min-heap
return k_way_merge(sorted_parts)
def k_way_merge(sorted_lists):
min_heap = []
result = []
# Initialize heap with first element of each list
for list_index, lst in enumerate(sorted_lists):
if lst: # non-empty list
heapq.heappush(min_heap, (lst[0], list_index, 0))
# Extract-min until heap is empty
while min_heap:
value, list_index, element_index = heapq.heappop(min_heap)
result.append(value)
# Push next element from the same list
if element_index + 1 < len(sorted_lists[list_index]):
next_value = sorted_lists[list_index][element_index + 1]
heapq.heappush(min_heap, (next_value, list_index, element_index + 1))
return result
# Example usage:
arr = [12, 3, 5, 7, 19, 1, 4, 9]
k = 3 # 3-way merge sort
sorted_arr = multiway_merge_sort(arr, k)
print("Sorted array:", sorted_arr)
๐ Example Output
Sorted array: [1, 3, 4, 5, 7, 9, 12, 19]
๐ Summary (Exam Notes)
- Multi-way merge sort splits into k subarrays
- Uses k-way merge with a min-heap (priority queue)
- More efficient for external sorting & large datasets
- Time Complexity = O(n log n)
- Merge complexity = O(n log k)
- Space Complexity = O(n + k)
- Practical when merging many sorted files or lists
