Skip to content
Home » Distributed Transactions

Distributed Transactions


DISTRIBUTED TRANSACTIONS (Detailed Discussion)

A Distributed Transaction is a database transaction that accesses data stored at multiple sites (servers) in a distributed database system.
Even though data is located at different physical locations, the transaction must behave as one unified, atomic unit of work.

Example:
A banking transaction transferring money between accounts stored at different branches (each branch = a site).

A distributed transaction must preserve ACID properties across all sites.


WHY DISTRIBUTED TRANSACTIONS?

✔ Organizations operate in many branches/locations
✔ Data is fragmented or replicated across sites
✔ One transaction may require data from multiple sites

Examples:

  • Airline booking (seat data at different servers)
  • Online shopping (inventory at warehouse + payment server)
  • Banking (accounts at different branches)

ACID PROPERTIES IN DISTRIBUTED TRANSACTIONS

Distributed transactions must maintain:

✔ Atomicity

All sub-transactions across sites commit or abort together.

✔ Consistency

Data should remain valid across all sites after execution.

✔ Isolation

Concurrent distributed transactions must not interfere.

✔ Durability

Committed changes survive failures at any site.

Guaranteeing ACID across multiple sites is complex and requires special protocols.


HOW DISTRIBUTED TRANSACTIONS WORK?

A distributed transaction involves:

  1. Global Transaction Manager (Coordinator)
  2. Local Transaction Managers at each site (Participants)
  3. Communication network
  4. Commit protocol (2PC or 3PC)

COMPONENTS OF DISTRIBUTED TRANSACTIONS

✔ 1. Coordinator

  • Manages entire distributed transaction
  • Initiates the commit process
  • Communicates with all participant sites

✔ 2. Participants (Subordinates)

Each site involved has its own local manager that:

  • Executes part of the transaction
  • Reports status to coordinator
  • Commits/aborts based on coordinator’s instructions

STEPS IN A DISTRIBUTED TRANSACTION

  1. Start – Coordinator begins the transaction
  2. Execute – Operations distributed to relevant sites
  3. Prepare to Commit – Each local site checks if it can commit
  4. Commit/Abort Decision – Coordinator decides and informs all sites
  5. Completion – All sites commit or abort together

FAILURE TYPES IN DISTRIBUTED TRANSACTIONS

Distributed environments face additional failure types:

✔ Site failures

One site crashes.

✔ Communication failures

Messages lost, network partition.

✔ Transaction failures

Local constraint violations.

✔ Coordinator failure

Coordinator crashes before final decision.


SOLUTION: COMMIT PROTOCOLS

To maintain atomicity across sites, we use atomic commit protocols.

The main protocols are:

  1. Two-Phase Commit (2PC)
  2. Three-Phase Commit (3PC)

1. TWO-PHASE COMMIT PROTOCOL (2PC) (Very important)

Ensures all sites either commit or abort together.

Phase 1: PREPARE Phase

Coordinator → “Prepare to commit?”

Each participant:

  • Writes local logs
  • Replies YES or NO

Phase 2: COMMIT Phase

If ALL sites replied YES → Coordinator sends COMMIT
If ANY site replied NO → Coordinator sends ABORT

✔ Guarantees Atomicity

But if the coordinator crashes → participants may block (wait forever).

2PC is blocking.


2. THREE-PHASE COMMIT PROTOCOL (3PC)

Designed to overcome blocking in 2PC.

Phases:

  1. CanCommit?
  2. PreCommit
  3. Commit

✔ Non-blocking

If coordinator fails → participants can still reach a decision.

✔ More messages → More overhead

Rarely implemented in practice.


DISTRIBUTED CONCURRENCY CONTROL

For correctness, distributed transactions need concurrency control mechanisms:

✔ Distributed Locking (Global Lock Manager)

  • Each site uses local locks
  • Coordinator resolves conflicts

✔ Distributed Timestamp Ordering

  • Global timestamps ensure serializability

✔ Distributed Deadlock Detection

  • Using wait-for graphs across sites
  • Edge-chasing (probe messages)

DISTRIBUTED DEADLOCKS

Occurs when:

  • T1 waits for T2 at Site A
  • T2 waits for T1 at Site B

Detection methods:

  • Centralized deadlock detector
  • Distributed wait-for graph
  • Edge-chasing algorithm

Resolution:

Abort one of the transactions.


DISTRIBUTED RECOVERY

If failure occurs:

✔ Use distributed logs

✔ Use WRITE-AHEAD logging (WAL)

✔ Recovery completed using 2PC logs

✔ Checkpointing used to reduce recovery overhead

If a site crashes:

  • Participants that have committed must redo
  • Participants that have not committed must undo

ADVANTAGES OF DISTRIBUTED TRANSACTIONS

✔ Data consistency across sites
✔ Supports global applications
✔ Increases reliability
✔ Data can be processed closer to where it resides
✔ Reduces communication cost for local operations


DISADVANTAGES

✘ High communication overhead
✘ Concurrency control more complex
✘ Distributed deadlocks
✘ Complex recovery
✘ Slow commit due to network delays
✘ 2PC is blocking


EXAMPLE (MCA STYLE)

A transaction transfers money between:

  • Account A stored at Bank Server 1
  • Account B stored at Bank Server 2

Steps:

  1. Coordinator sends debit instruction to Site 1
  2. Sends credit instruction to Site 2
  3. Both sites prepare
  4. Coordinator decides commit/abort
  5. Both sites execute final decision

Ensures the transfer is either fully completed or not done at all.


Perfect 5–6 Mark Short Answer

A distributed transaction is a transaction that accesses data stored at multiple sites in a distributed database system. It must maintain ACID properties across all nodes. Distributed transactions are managed using a coordinator and participant sites. Commit protocols like Two-Phase Commit (2PC) and Three-Phase Commit (3PC) ensure atomicity, while distributed concurrency control, deadlock detection, and failure recovery ensure correctness. These transactions enable consistent updates across geographically dispersed databases.