Parallel Databases : Database System Architectures

Below is a complete, MCA-level, exam-oriented explanation of Database System Architectures in Parallel Databases — perfect for 8–15 mark answers.


PARALLEL DATABASES: DATABASE SYSTEM ARCHITECTURES

A Parallel Database System uses multiple processors, disks, and memory units working simultaneously to improve:

  • Performance
  • Throughput
  • Scalability
  • Fault tolerance

Parallelism greatly speeds up:

  • Query processing
  • Transaction execution
  • Large data analytics

Parallel DBMS architecture defines how processors and storage units are organized and how they coordinate tasks.


WHY PARALLEL DATABASE ARCHITECTURE?

✔ To handle huge datasets (GB → TB → PB)
✔ To process complex queries quickly
✔ To achieve high availability
✔ To support large-scale OLTP & OLAP workloads
✔ To reduce response time by dividing tasks across CPUs


TYPES OF DATABASE SYSTEM ARCHITECTURES (Parallel DBMS)

Parallel database architecture is mainly classified into three categories:

  1. Shared-Memory Architecture
  2. Shared-Disk Architecture
  3. Shared-Nothing Architecture

Additionally, we study hybrid and massively parallel (MPP) systems.


1. SHARED MEMORY ARCHITECTURE

(Also called SMP – Symmetric Multiprocessor System)

All processors:

  • Share the same memory
  • Share the same disk
  • Share the same operating system

✔ Architecture Diagram (Text Form)

        +-------------+
CPU1 -- |             |
CPU2 -- |  Shared     |
CPU3 -- |  Memory     |
        +-------------+
             |
        +-------------+
        | Shared Disk |
        +-------------+

✔ Features

  • Many CPUs connected to a single memory
  • All processors can access shared data directly
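
As a loose analogy, threads inside one process behave like CPUs on a shared-memory machine: every worker reads and writes the same in-memory structure directly. The sketch below assumes that analogy; the names `shared_counts`, `worker`, and `run` are made up for illustration, and the lock stands in for the coordination that hardware and the DBMS would provide.

```python
import threading

# One structure in shared memory, visible to every worker thread.
shared_counts = {"rows_processed": 0}
lock = threading.Lock()  # guards the shared structure


def worker(rows):
    """Process a batch of rows and update the shared counter in place."""
    for _ in rows:
        with lock:  # every worker contends for this single lock
            shared_counts["rows_processed"] += 1


def run(num_workers=4, rows_per_worker=1000):
    """Start the workers, wait for them, and return the shared total."""
    shared_counts["rows_processed"] = 0
    threads = [
        threading.Thread(target=worker, args=(range(rows_per_worker),))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_counts["rows_processed"]


if __name__ == "__main__":
    print(run())  # 4000
```

No data is copied between workers, which is why communication is cheap. The single lock also hints at the scalability limit: the more workers there are, the longer each one waits for the shared resource.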

✔ Advantages

  • Very easy to program
  • Simple architecture
  • Efficient for OLTP workloads
  • Fast communication (shared memory = low latency)

✔ Disadvantages

  • Limited scalability (memory bus becomes bottleneck)
  • More CPUs → more contention
  • Traditionally limited to a few dozen processors (classic SMP designs supported roughly 8–32)

✔ Examples

  • Oracle Database running as a single instance on an SMP server (Oracle RAC itself is a shared-disk system)
  • SQL Server SMP systems
  • MySQL multi-threaded servers

2. SHARED DISK ARCHITECTURE

Each processor has:

  • Its own private memory
  • Access to the same shared disk(s) as every other processor

Because several nodes can cache the same page at once, the DBMS must keep those caches coherent.

✔ Architecture Diagram

CPU1 + Memory1  \
CPU2 + Memory2   ---> Shared Disk Storage
CPU3 + Memory3  /

✔ Features

  • Processors can work independently
  • Common disk enables data sharing

✔ Advantages

  • Good for high availability (failover easy)
  • More scalable than shared-memory
  • Supports cluster environments

✔ Disadvantages

  • Disk becomes bottleneck
  • Requires cache coordination between nodes
  • Expensive infrastructure
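
The cache coordination requirement can be sketched with a toy model: each node keeps a private cache, all nodes read and write one shared disk, and a write invalidates stale copies on the other nodes. This assumes a simple write-through, invalidate-on-write scheme chosen for illustration; the classes `SharedDisk` and `Node` are hypothetical, and real shared-disk systems use far more elaborate lock and cache managers.

```python
class SharedDisk:
    """The single storage layer that every node reads and writes."""

    def __init__(self):
        self.pages = {}


class Node:
    """A processor with private memory, modelled as a local page cache."""

    def __init__(self, name, disk, cluster):
        self.name = name
        self.disk = disk
        self.cache = {}
        self.cluster = cluster
        cluster.append(self)

    def read(self, page_id):
        if page_id not in self.cache:      # cache miss: fetch from shared disk
            self.cache[page_id] = self.disk.pages[page_id]
        return self.cache[page_id]

    def write(self, page_id, value):
        self.disk.pages[page_id] = value   # write through to the shared disk
        self.cache[page_id] = value
        for other in self.cluster:         # invalidate stale copies on peers
            if other is not self:
                other.cache.pop(page_id, None)


disk = SharedDisk()
cluster = []
n1 = Node("n1", disk, cluster)
n2 = Node("n2", disk, cluster)

n1.write("p1", "balance=100")
print(n2.read("p1"))  # balance=100
n1.write("p1", "balance=80")
print(n2.read("p1"))  # balance=80 (old copy was invalidated, so n2 re-reads)
```

Without the invalidation loop in `write`, `n2` would keep returning `balance=100` from its private cache. Every write also generates messages to the other nodes, which is the coordination overhead listed above.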

✔ Examples

  • Oracle Parallel Server / Oracle RAC
  • IBM Db2 pureScale (note: the older DB2 Parallel Edition was a shared-nothing system)

3. SHARED NOTHING ARCHITECTURE

(Best suited to massive parallelism)
This is the architecture on which MPP (Massively Parallel Processing) systems are built.

Every processor has:

  • Its own private memory
  • Its own private disk
  • A high-speed network link to the other nodes (the only way nodes communicate)

✔ Architecture Diagram

CPU1 + Memory1 + Disk1  \
CPU2 + Memory2 + Disk2   ---- interconnection network
CPU3 + Memory3 + Disk3  /

✔ Features

  • No shared memory or disk → no contention for shared hardware
  • Best scalability
  • Nodes operate independently

✔ Advantages

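  • High performance on large workloads that partition well
  • Near-linear scalability (add more nodes → get proportionally more capacity)
  • No central bottleneck
  • Fault isolation (one node fails → the others keep running, although the failed node's data is unavailable unless it is replicated)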

✔ Disadvantages

  • Complex to implement
  • Data partitioning required
  • Network communication overhead
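
Since every row must live on exactly one node, the system needs a rule for placing rows. The three common rules (hash, range, round robin) can be sketched as below. This is a minimal illustration under assumptions: the function names and sample rows are invented, keys are integers so that modulo can act as the hash function, and each inner list stands for one node's private disk.

```python
import bisect


def hash_partition(rows, key, n):
    """Place each row on node (key mod n); modulo acts as the hash here."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def range_partition(rows, key, split_points):
    """split_points is sorted; node i holds keys below split_points[i],
    and the last node holds everything at or above the final split point."""
    parts = [[] for _ in range(len(split_points) + 1)]
    for row in rows:
        parts[bisect.bisect_right(split_points, row[key])].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out to the nodes in turn, ignoring their contents."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def ids(parts):
    """Helper for display: keep only the id of each row."""
    return [[r["id"] for r in p] for p in parts]


rows = [{"id": i, "amount": i * 10} for i in range(1, 7)]

print(ids(hash_partition(rows, "id", 3)))        # [[3, 6], [1, 4], [2, 5]]
print(ids(range_partition(rows, "id", [3, 5])))  # [[1, 2], [3, 4], [5, 6]]
print(ids(round_robin_partition(rows, 3)))       # [[1, 4], [2, 5], [3, 6]]
```

Hash partitioning suits lookups and joins on the key, range partitioning suits range queries, and round robin gives the most even spread but offers no way to find a row without asking every node.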

✔ Examples (Modern Big Data Systems)

  • Google BigQuery (MPP query engine, although its storage is separated from compute)
  • Amazon Redshift
  • Teradata
  • Apache Cassandra
  • Hadoop/Hive (parallel processing)
  • Greenplum

COMPARISON OF PARALLEL ARCHITECTURES

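Feature         | Shared Memory          | Shared Disk                | Shared Nothing
----------------+------------------------+----------------------------+----------------------
Scalability     | Low                    | Medium                     | High
Data Sharing    | Easy                   | Easy                       | Difficult
Fault Tolerance | Medium                 | High                       | High
Cost            | Low/Medium             | High                       | Medium
Best Use        | Small-medium workloads | High availability clusters | Big data, MPP queries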

4. HYBRID PARALLEL ARCHITECTURE

Combines elements of shared-disk and shared-nothing architectures.

Examples:

  • Oracle Exadata
  • SAP HANA
  • IBM PureData

Provides a balance between scalability and ease of management.


5. MPP – MASSIVELY PARALLEL PROCESSING

Most modern parallel data warehouses use an MPP design, which is essentially a shared-nothing architecture scaled out to a very large number of nodes.

Features:

✔ Hundreds or thousands of nodes
✔ Parallel query execution
✔ Distributed storage
✔ Fault tolerance using replication
✔ High-speed interconnect (e.g., InfiniBand)
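
Parallel query execution over distributed storage usually follows a scatter-gather pattern: a coordinator sends the same operation to every node, each node works only on its own data, and the coordinator combines the partial results. A minimal sketch, assuming threads stand in for nodes and plain lists stand in for each node's local storage (`node_data`, `local_aggregate`, and `parallel_avg` are illustrative names):

```python
from concurrent.futures import ThreadPoolExecutor

# Each "node" owns a private slice of a sales table.
node_data = [
    [120, 80, 45],     # node 0
    [200, 25],         # node 1
    [60, 60, 30, 10],  # node 2
]


def local_aggregate(rows):
    """Runs on one node: partial SUM and COUNT over its own rows only."""
    return sum(rows), len(rows)


def parallel_avg(partitions):
    """Coordinator: scatter the work, gather the partials, combine them."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


print(parallel_avg(node_data))  # 70.0
```

The nodes return a sum and a count rather than a local average, because averaging the per-node averages would give the wrong answer when nodes hold different numbers of rows.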

Application Areas:

  • Data warehousing
  • OLAP
  • Machine learning on big data
  • Real-time analytics

HOW PARALLELISM IS ACHIEVED? (Very important for exams)

Three types of parallelism:

✔ 1. Inter-Query Parallelism

Multiple independent queries run in parallel, each on its own processor.
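
A rough sketch of inter-query parallelism, assuming each "query" is a plain Python function over an in-memory table and a thread pool plays the role of the processors (all names below are invented for illustration). In CPython, threads show the structure of the idea rather than a real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor

orders = [
    {"id": 1, "region": "north", "amount": 50},
    {"id": 2, "region": "south", "amount": 70},
    {"id": 3, "region": "north", "amount": 30},
]


def total_amount(table):
    return sum(r["amount"] for r in table)


def count_north(table):
    return sum(1 for r in table if r["region"] == "north")


def max_order(table):
    return max(r["amount"] for r in table)


def run_queries(queries, table):
    """Submit every query at once; each one runs on its own worker."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        futures = [pool.submit(q, table) for q in queries]
        return [f.result() for f in futures]


print(run_queries([total_amount, count_north, max_order], orders))  # [150, 2, 70]
```

No single query finishes sooner here; the gain is in throughput, because the three queries do not wait for one another.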

✔ 2. Intra-Query Parallelism

A single query is executed by multiple processors at the same time.

✔ 3. Intra-Operation Parallelism

A single operation (e.g., join, sort, scan) is executed in parallel over partitions of its input. This is a finer-grained form of intra-query parallelism.
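
Taking sort as the operation, intra-operation parallelism can be sketched as: split the input into chunks, sort each chunk on a separate worker, then merge the sorted runs. The function name `parallel_sort` and the use of a thread pool are assumptions made for this illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=3):
    """Split the input, sort each chunk on its own worker, then merge."""
    chunk = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, chunks))  # local sorts in parallel
    return list(heapq.merge(*sorted_runs))            # final k-way merge


print(parallel_sort([9, 4, 7, 1, 8, 2, 6]))  # [1, 2, 4, 6, 7, 8, 9]
```

Only the local sorts run in parallel; the final merge is a single sequential step, which is one reason the speedup of a parallel operation is rarely perfectly linear.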


Parallel Query Execution Techniques

  1. Partitioned scanning
  2. Parallel sorting
  3. Parallel join algorithms:
    • Hash join
    • Merge join
    • Partitioned join
  4. Parallel aggregation
  5. Pipeline parallelism
  6. Data partitioning (Hash/Range/Round Robin)
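
Several of these techniques combine in a partitioned parallel hash join: both tables are hash-partitioned on the join key so that matching rows land in the same partition, and each partition pair is then joined independently with an ordinary build/probe hash join. The sketch below is a simplified illustration; the table contents, the `cust_id` column, and the function names are assumptions, and threads stand in for processors.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, key, n):
    """Hash-partition rows on an integer key (modulo acts as the hash)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def local_hash_join(pair):
    """Build/probe hash join on one pair of matching partitions."""
    left, right = pair
    table = defaultdict(list)
    for l in left:                       # build phase
        table[l["cust_id"]].append(l)
    return [                             # probe phase
        {**l, **r} for r in right for l in table.get(r["cust_id"], [])
    ]


def partitioned_join(customers, orders, n=2):
    """Partition both inputs the same way, then join the pairs in parallel."""
    c_parts = partition(customers, "cust_id", n)
    o_parts = partition(orders, "cust_id", n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(local_hash_join, zip(c_parts, o_parts)))
    return [row for part in results for row in part]


customers = [
    {"cust_id": 1, "name": "Asha"},
    {"cust_id": 2, "name": "Ravi"},
    {"cust_id": 3, "name": "Meera"},
]
orders = [
    {"cust_id": 2, "order_id": 10},
    {"cust_id": 1, "order_id": 11},
    {"cust_id": 2, "order_id": 12},
    {"cust_id": 4, "order_id": 13},  # no matching customer
]

for row in partitioned_join(customers, orders):
    print(row)
```

Because both tables use the same partitioning function, a customer and that customer's orders always end up in the same partition, so no worker needs data held by another worker during the join.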

Perfect 5–6 Mark Summary

Parallel database architectures describe how processors and storage units are arranged for parallel execution.
Three major architectures:

  1. Shared Memory: Multiple CPUs share memory and disk—simple but limited scalability.
  2. Shared Disk: CPUs have private memory but share a common disk—good for high availability.
  3. Shared Nothing: Each node has its own memory and disk—highest scalability and used in MPP systems.

Parallel architectures help execute queries faster, support large datasets, and improve availability and performance.