Below is a complete, MCA-level, exam-oriented explanation of Database System Architectures in Parallel Databases — perfect for 8–15 mark answers.
⭐ PARALLEL DATABASES: DATABASE SYSTEM ARCHITECTURES
A Parallel Database System uses multiple processors, disks, and memory units working simultaneously to improve:
- Performance
- Throughput
- Scalability
- Fault tolerance
Parallelism greatly speeds up:
- Query processing
- Transaction execution
- Large data analytics
Parallel DBMS architecture defines how processors and storage units are organized and how they coordinate tasks.
⭐ WHY PARALLEL DATABASE ARCHITECTURE?
✔ To handle huge datasets (GB → TB → PB)
✔ To process complex queries quickly
✔ To achieve high availability
✔ To support large-scale OLTP & OLAP workloads
✔ To reduce response time by dividing tasks across CPUs
⭐ TYPES OF DATABASE SYSTEM ARCHITECTURES (Parallel DBMS)
Parallel database architecture is mainly classified into three categories:
- Shared-Memory Architecture
- Shared-Disk Architecture
- Shared-Nothing Architecture
Additionally, we study hybrid and massively parallel (MPP) systems.
⭐ 1. SHARED MEMORY ARCHITECTURE
(Also called SMP – Symmetric Multiprocessor System)
All processors:
- Share the same memory
- Share the same disk
- Share the same operating system
✔ Architecture Diagram (Text Form)
          +-------------+
CPU1 ---- |             |
CPU2 ---- |   Shared    |
CPU3 ---- |   Memory    |
          +-------------+
                 |
          +-------------+
          | Shared Disk |
          +-------------+
✔ Features
- Many CPUs connected to a single memory
- All processors can access shared data directly
✔ Advantages
- Very easy to program
- Simple architecture
- Efficient for OLTP workloads
- Fast communication (shared memory = low latency)
✔ Disadvantages
- Limited scalability (memory bus becomes bottleneck)
- More CPUs → more contention
- Typically supports only 8–32 processors
✔ Examples
- Single-instance Oracle Database on a multiprocessor server (Oracle RAC itself is a shared-disk cluster)
- SQL Server SMP systems
- MySQL multi-threaded servers
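The shared-memory model can be illustrated with a minimal Python sketch, in which threads stand in for CPUs and one in-process list stands in for the shared memory. The function name `parallel_sum` and the worker count are assumptions made for illustration; CPython threads show the sharing model, not a real multi-CPU speedup.

```python
import threading

def parallel_sum(data, num_workers=4):
    """Sum `data` with several threads that all read the same in-memory list."""
    total = 0
    lock = threading.Lock()  # protects the shared `total`

    def worker(start, end):
        nonlocal total
        local = sum(data[start:end])  # every thread reads the shared data directly
        with lock:                    # contention point: one writer at a time
            total += local

    # Split the index range into roughly equal chunks, one per worker.
    chunk = (len(data) + num_workers - 1) // num_workers
    threads = [
        threading.Thread(
            target=worker,
            args=(i * chunk, min((i + 1) * chunk, len(data))),
        )
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

result = parallel_sum(list(range(1, 101)))  # 1 + 2 + ... + 100 = 5050
```

The lock mirrors the disadvantage listed above: the more workers update shared state, the more time they spend waiting for each other.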
⭐ 2. SHARED DISK ARCHITECTURE
All processors:
- Have their own private memory
- Share the same disk
- Rely on the DBMS to keep their private caches coherent
✔ Architecture Diagram
CPU1 + Memory1 \
CPU2 + Memory2 ---> Shared Disk Storage
CPU3 + Memory3 /
✔ Features
- Processors can work independently
- Common disk enables data sharing
✔ Advantages
- Good for high availability (failover easy)
- More scalable than shared-memory
- Supports cluster environments
✔ Disadvantages
- The shared disk subsystem and its interconnect become a bottleneck as nodes are added
- Requires cache coordination between nodes
- Expensive infrastructure
✔ Examples
- Oracle Parallel Server / Oracle RAC
- IBM Db2 pureScale (note: the older DB2 Parallel Edition was a shared-nothing system)
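The cache-coordination requirement can be sketched in a few lines of Python. This is a toy model under stated assumptions: write-through to the shared disk plus invalidation of other nodes' cached copies. The class names `SharedDisk` and `Node` are hypothetical, and real shared-disk systems use far more elaborate locking protocols.

```python
class SharedDisk:
    """Stands in for the common storage that every node can read and write."""
    def __init__(self):
        self.pages = {}

class Node:
    """A processor with private memory (its cache) attached to the shared disk."""
    def __init__(self, name, disk, cluster):
        self.name = name
        self.disk = disk
        self.cluster = cluster  # list of all nodes, used for invalidation
        self.cache = {}         # private buffer cache
        cluster.append(self)

    def read(self, page_id):
        if page_id not in self.cache:  # cache miss: fetch from the shared disk
            self.cache[page_id] = self.disk.pages[page_id]
        return self.cache[page_id]

    def write(self, page_id, value):
        self.disk.pages[page_id] = value  # write through to the shared disk
        self.cache[page_id] = value
        for other in self.cluster:        # invalidate stale copies on other nodes
            if other is not self:
                other.cache.pop(page_id, None)

disk, cluster = SharedDisk(), []
n1, n2 = Node("n1", disk, cluster), Node("n2", disk, cluster)
n1.write("p1", "v1")
first = n2.read("p1")   # n2 caches "v1"
n1.write("p1", "v2")    # n2's cached copy is invalidated
second = n2.read("p1")  # n2 re-reads from disk and sees "v2"
```

Without the invalidation loop, `n2` would keep returning the stale value from its private memory, which is exactly the problem cache coherence solves.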
⭐ 3. SHARED NOTHING ARCHITECTURE
(Most efficient for massive parallelism)
Forms the basis of MPP – Massively Parallel Processing Systems
Every processor (node):
- Has its own private memory
- Has its own private disk
- Communicates with others using a high-speed network
✔ Architecture Diagram
CPU1 + Memory1 + Disk1
CPU2 + Memory2 + Disk2 ---- interconnected via network ----
CPU3 + Memory3 + Disk3
✔ Features
- No shared memory or disk → no contention for shared resources
- Best scalability
- Nodes operate independently
✔ Advantages
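- High performance on large workloads that partition well
- Near-linear scalability (add more nodes → proportionally more capacity)
- No central bottleneck
- Fault isolation (one node fails → others keep running; the failed node's data stays available only if it is replicated elsewhere)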
✔ Disadvantages
- Complex to implement
- Data partitioning required
- Network communication overhead
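The scalability claim above can be stated more precisely with the usual speedup and scaleup measures. The symbols below are introduced here for illustration: $T_S$ is the running time on the smaller system, $T_L$ the running time on a system $N$ times larger.

```latex
% Speedup: the SAME task is run on a system N times larger.
\text{speedup} = \frac{T_S}{T_L}, \qquad \text{linear speedup} \iff \frac{T_S}{T_L} = N

% Scaleup: a task N times LARGER is run on a system N times larger.
\text{scaleup} = \frac{T_S}{T_L}, \qquad \text{linear scaleup} \iff \frac{T_S}{T_L} = 1
```

In practice, network communication overhead and uneven data partitioning (skew) usually keep both measures below the linear ideal.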
✔ Examples (Modern Big Data Systems)
- Google BigQuery
- Amazon Redshift
- Teradata
- Apache Cassandra
- Hadoop/Hive (parallel processing)
- Greenplum
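A minimal sketch of how a shared-nothing system might answer `SELECT key, SUM(value) ... GROUP BY key`: rows are hash-partitioned across nodes, each node aggregates only its own rows, and a coordinator merges the partial results. The function names are hypothetical, and the "nodes" here are plain lists processed one after another; in a real system each partition would live on a separate machine and the local step would run simultaneously.

```python
from collections import defaultdict

def hash_partition(rows, num_nodes):
    """Assign each (key, value) row to exactly one node by hashing its key."""
    nodes = [[] for _ in range(num_nodes)]
    for key, value in rows:
        nodes[hash(key) % num_nodes].append((key, value))
    return nodes

def local_aggregate(rows):
    """Work done independently on one node, using only its private data."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return dict(sums)

def merge(partials):
    """Coordinator step: combine the per-node partial sums."""
    total = defaultdict(int)
    for partial in partials:
        for key, s in partial.items():
            total[key] += s
    return dict(total)

rows = [(1, 10), (2, 5), (1, 7), (3, 2), (2, 1)]
partitions = hash_partition(rows, num_nodes=3)
totals = merge(local_aggregate(p) for p in partitions)
# totals per key: 1 -> 17, 2 -> 6, 3 -> 2
```

Because rows with the same key always land on the same node, each node's partial sum for a key is already complete, and the only data crossing the network is the small per-node result.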
⭐ COMPARISON OF PARALLEL ARCHITECTURES
| Feature | Shared Memory | Shared Disk | Shared Nothing |
|---|---|---|---|
| Scalability | Low | Medium | High |
| Data Sharing | Easy | Easy | Difficult |
| Fault Tolerance | Low (single machine) | High | High (with replication) |
| Cost | Low/Medium | High | Medium |
| Best Use | Small-medium workloads | High availability clusters | Big data, MPP queries |
⭐ 4. HYBRID PARALLEL ARCHITECTURE
Combines elements of shared-disk and shared-nothing architectures.
Examples:
- Oracle Exadata
- SAP HANA
- IBM PureData
Provides a balance between scalability and ease of management.
⭐ 5. MPP – MASSIVELY PARALLEL PROCESSING
Most modern parallel databases use MPP architecture.
Features:
✔ Hundreds or thousands of nodes
✔ Parallel query execution
✔ Distributed storage
✔ Fault tolerance using replication
✔ High-speed interconnect (e.g., InfiniBand)
Application Areas:
- Data warehousing
- OLAP
- Machine learning on big data
- Real-time analytics
⭐ HOW PARALLELISM IS ACHIEVED? (Very important for exams)
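Three forms of parallelism are commonly distinguished:
✔ 1. Inter-Query Parallelism
Multiple independent queries or transactions run in parallel on different processors. This raises throughput but does not shorten any single query.
✔ 2. Intra-Query Parallelism
A single query is executed by multiple processors at once, which shortens its response time.
✔ 3. Intra-Operation Parallelism
A single operation (e.g., join, sort, scan) is split across processors, each working on one partition of the data. This is one way of achieving intra-query parallelism; the other is inter-operation parallelism, where different operations of the same query run concurrently (for example, in a pipeline).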
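The difference between inter-query and intra-operation parallelism can be sketched with Python's `concurrent.futures`. The table, the predicates, and the four-way split are assumptions made for illustration; a thread pool stands in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor

TABLE = list(range(1, 1001))  # hypothetical single-column table: values 1..1000

def count_where(rows, predicate):
    """A tiny 'scan + filter + count' operator."""
    return sum(1 for r in rows if predicate(r))

def is_even(r):
    return r % 2 == 0

def is_big(r):
    return r > 900

with ThreadPoolExecutor(max_workers=4) as pool:
    # Inter-query parallelism: two independent queries run side by side.
    f1 = pool.submit(count_where, TABLE, is_even)
    f2 = pool.submit(count_where, TABLE, is_big)
    inter_results = (f1.result(), f2.result())

    # Intra-operation parallelism: ONE scan is split into 4 partitions,
    # each partition is counted separately, and the partial counts are added.
    parts = [TABLE[i::4] for i in range(4)]
    partial_counts = list(pool.map(count_where, parts, [is_even] * 4))
    intra_result = sum(partial_counts)
```

Both approaches give the same answer for the even-number count; the first keeps more queries in flight, while the second shortens one query by dividing its work.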
⭐ Parallel Query Execution Techniques
- Partitioned scanning
- Parallel sorting
- Parallel join algorithms:
  - Hash join
  - Merge join
  - Partitioned join
- Parallel aggregation
- Pipeline parallelism
- Data partitioning (Hash/Range/Round Robin)
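The three partitioning strategies and a partitioned hash join can be sketched as follows. All function names are hypothetical, rows are plain tuples whose first field is the join key, and the per-partition joins run in a simple loop here, whereas a parallel DBMS would run each one on its own processor.

```python
from bisect import bisect_right

def round_robin(rows, n):
    """Row i goes to partition i mod n: even spread, no knowledge of keys."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_part(rows, n, key):
    """Rows with equal keys always land in the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts

def range_part(rows, boundaries, key):
    """boundaries [10, 20] gives partitions: key < 10, 10 <= key < 20, key >= 20."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, key(row))].append(row)
    return parts

def hash_join(left, right):
    """Ordinary single-node hash join on the first field of each row."""
    table = {}
    for k, v in left:                     # build phase
        table.setdefault(k, []).append(v)
    return [(k, lv, rv) for k, rv in right for lv in table.get(k, [])]  # probe

def partitioned_join(left, right, n):
    """Hash-partition both inputs on the join key, then join matching pairs."""
    first = lambda row: row[0]
    lparts = hash_part(left, n, first)
    rparts = hash_part(right, n, first)
    out = []
    for lp, rp in zip(lparts, rparts):    # each pair could run on its own node
        out.extend(hash_join(lp, rp))
    return out

customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
orders = [(1, 100), (3, 300), (3, 301), (4, 400)]
joined = sorted(partitioned_join(customers, orders, n=2))
# joined == [(1, "Ann", 100), (3, "Cy", 300), (3, "Cy", 301)]
```

Hash partitioning suits joins and point lookups, range partitioning suits range queries, and round robin suits full scans where an even load matters more than key locality.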
⭐ Perfect 5–6 Mark Summary
Parallel database architectures describe how processors and storage units are arranged for parallel execution.
Three major architectures:
- Shared Memory: Multiple CPUs share memory and disk—simple but limited scalability.
- Shared Disk: CPUs have private memory but share a common disk—good for high availability.
- Shared Nothing: Each node has its own memory and disk—highest scalability and used in MPP systems.
Parallel architectures help execute queries faster, support large datasets, and improve availability and performance.
