⚡ Distributed Systems · Expert · Week 3

Amazon DynamoDB Architecture

The Dynamo paper that changed distributed databases forever

Amazon · LinkedIn · Cassandra (inspired)

Key Insight

The CAP theorem isn't binary: systems like DynamoDB let you tune, per operation, where on the consistency-availability spectrum to operate.

Request Journey

Client sends write with partition key → Router hashes key via MD5 to find ring position → Request lands on responsible virtual node → Node writes to WAL and memtable → Replicates to two replicas with quorum W=2 → Acknowledges write to client

How It Works

1. Client sends write with partition key
2. Router hashes key via MD5 to find ring position
3. Request lands on responsible virtual node
4. Node writes to WAL and memtable
5. Replicates to two replicas with quorum W=2
6. Acknowledges write to client
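The coordinator's side of steps 4–6 can be sketched in a few lines. This is a minimal illustration with in-memory stand-ins for the WAL, memtable, and replicas; the `Replica` class and `handle_write` name are assumptions for the sketch, not Dynamo's actual API:

```python
W = 2  # write quorum: acknowledgments required before success

class Replica:
    """Stub replica that always accepts writes (illustrative only)."""
    def __init__(self):
        self.store = {}

    def write(self, key, value):
        self.store[key] = value
        return True

def handle_write(key, value, replicas, wal, memtable):
    wal.append((key, value))      # step 4: append to the write-ahead log first
    memtable[key] = value         # step 4: then update the in-memory table
    acks = 1                      # the coordinator's own write counts toward W
    for replica in replicas:      # step 5: replicate to peer replicas
        if acks >= W:
            break
        if replica.write(key, value):
            acks += 1
    return "OK" if acks >= W else "FAILED"   # step 6: ack to the client
```

Note that the coordinator can acknowledge as soon as W responses arrive; remaining replicas catch up asynchronously.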

⚠️ The Problem

Amazon's 2004 holiday shopping season exposed a critical weakness: their relational databases couldn't scale to handle peak cart and checkout traffic without costly vertical scaling. The system needed to be always-writable, because an unavailable shopping cart directly costs revenue. Traditional RDBMS systems sacrifice availability during network partitions, which is unacceptable for Amazon's core commerce operations.

✓ The Solution

Amazon built Dynamo, a fully distributed key-value store using consistent hashing with virtual nodes for data partitioning, sloppy quorum for tunable consistency, vector clocks for conflict detection, gossip protocol for failure detection, and Merkle trees for anti-entropy repair. The 2007 Dynamo paper became one of the most influential distributed systems papers ever, directly inspiring Cassandra, Riak, and Voldemort.

📊 Scale at a Glance

  • Requests/sec: Millions
  • Availability target: 99.995%
  • Latency (p99): <10ms
  • Hash partitions: 16,384 slots

🔬 Deep Dive

1. Consistent Hashing with Virtual Nodes

Dynamo maps keys to nodes using consistent hashing on a virtual ring. Each physical node owns multiple positions on the ring (virtual nodes, or vnodes), typically 150–256 per node. When a node joins or leaves, only keys in adjacent ring positions are affected (roughly 1/N of all keys) rather than the complete redistribution required by simple modulo hashing. Virtual nodes also solve the hotspot problem: a more powerful machine gets more vnodes, receiving proportionally more data. The ring is the foundational abstraction that makes Dynamo horizontally scalable.
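The ring lookup can be sketched in a few lines of Python. The `HashRing` class name is illustrative, and `vnodes=150` just follows the typical count mentioned above:

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=150):
        # Each physical node claims `vnodes` positions on the ring.
        entries = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in entries]
        self._owners = [owner for _, owner in entries]

    @staticmethod
    def _hash(s: str) -> int:
        # Dynamo places keys on the ring by MD5-hashing them.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def lookup(self, key: str) -> str:
        # The first vnode clockwise from the key's position owns the key.
        idx = bisect_right(self._positions, self._hash(key)) % len(self._positions)
        return self._owners[idx]
```

Adding a node only claims the vnode arcs it lands on; every other key keeps its current owner, which is the "minimal data movement" property.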

2. Sloppy Quorum and Hinted Handoff

Dynamo uses a sloppy quorum system with parameters (N, R, W): each key is replicated to N nodes, reads require R responses, and writes require W acknowledgments. The consistency level is tunable per operation: setting W=1 gives maximum write availability, while R+W>N provides strong consistency. During node failures, hinted handoff allows writes to be temporarily stored on a healthy neighbor node with a "hint" indicating the intended recipient. When the failed node recovers, hints are replayed to restore the correct replica placement.
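Under the stated (N, R, W) rules, the quorum-overlap check and the handoff bookkeeping might look like this. The function names and the membership representation are assumptions made for the sketch:

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """R + W > N forces every read quorum to share at least one node with
    every write quorum, so a read always contacts an up-to-date replica."""
    return r + w > n

def sloppy_quorum_write(preference_list, alive, n=3):
    """Pick n healthy storage targets. If an intended replica is down, the
    next healthy node stores the data plus a hint naming the intended owner."""
    intended = preference_list[:n]
    targets, hints = [], {}
    for node in intended:
        if node in alive:
            targets.append(node)
        else:
            # Walk further down the preference list for a healthy fallback.
            for fallback in preference_list:
                if fallback in alive and fallback not in targets and fallback not in intended:
                    targets.append(fallback)
                    hints[fallback] = node  # replay to `node` when it recovers
                    break
    return targets, hints
```

When the failed node comes back, each fallback replays its hinted writes to the intended owner and deletes the hint.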

3. Vector Clocks for Conflict Resolution

Since Dynamo allows concurrent writes to succeed (availability over consistency), conflicts are inevitable. Vector clocks track the causal history of each write: each node increments its own counter on every update, producing a version vector like [(A,1), (B,2), (C,1)]. If two versions have incomparable vector clocks (neither dominates the other), both are preserved and returned to the application on the next read. The application then resolves the conflict using domain-specific logic; Amazon's shopping cart uses union merge so items are never silently lost.
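A minimal comparison of two version vectors, represented here as dicts mapping node id to counter (missing entries count as zero); the union-merge resolver mirrors the shopping-cart example above:

```python
def compare(vc_a: dict, vc_b: dict) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # vc_a is a causal ancestor of vc_b
    if b_le_a:
        return "after"
    return "concurrent"      # neither dominates: keep both versions

def merge_carts(cart_a, cart_b):
    """Application-level resolution: union merge never silently drops items."""
    return sorted(set(cart_a) | set(cart_b))
```

Only the "concurrent" case reaches the application; causally ordered versions let the newer one simply supersede the older.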

4. Gossip Protocol for Cluster Membership

Dynamo nodes discover each other and detect failures using a gossip protocol. Every second, each node picks a random peer and exchanges its membership list, a table of (node_id, heartbeat_counter, timestamp). If a node's heartbeat hasn't been updated within a timeout, it's marked as suspected failed. Gossip is eventually consistent: membership changes propagate across the cluster in O(log N) rounds. This decentralized approach avoids the single point of failure that a centralized coordinator like ZooKeeper would introduce.
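One round of the exchange described above, with the membership table as a dict of node id → (heartbeat counter, last-seen timestamp); the table shape and the timeout value are illustrative assumptions:

```python
def gossip_merge(local: dict, remote: dict) -> dict:
    """Merge a peer's membership table into ours, keeping the entry with
    the higher heartbeat counter for each node."""
    for node_id, (heartbeat, seen_at) in remote.items():
        if node_id not in local or heartbeat > local[node_id][0]:
            local[node_id] = (heartbeat, seen_at)
    return local

def suspected_failed(membership: dict, now: float, timeout: float = 10.0):
    """Nodes whose heartbeat hasn't advanced within the timeout window."""
    return sorted(n for n, (_, seen_at) in membership.items()
                  if now - seen_at > timeout)
```

Because every node gossips with a random peer each round, fresh heartbeats spread roughly doubling per round, which is where the O(log N) propagation bound comes from.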

5. Merkle Trees for Anti-Entropy Repair

Over time, replicas can diverge due to missed hinted handoffs or partial failures. Dynamo uses Merkle trees (hash trees) per key range to efficiently detect inconsistencies. Each leaf is a hash of a key-value pair; internal nodes are hashes of their children. Two nodes compare root hashes: if they match, all data is consistent. If they differ, they traverse the tree to find exactly which key ranges diverge, minimizing the data transferred for synchronization. This makes full-cluster repair feasible even at Amazon's scale.
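A toy version of the comparison, hashing with SHA-256 for brevity. In the real system the tree is built per key range and traversed level by level; here a flat leaf scan stands in for the descent once the roots disagree:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaf_hashes):
    """Hash pairs bottom-up until one root remains (duplicating the last
    hash on odd-sized levels)."""
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def diverged_leaves(leaves_a, leaves_b):
    """Compare roots first; inspect leaves only on a root mismatch."""
    if merkle_root(leaves_a) == merkle_root(leaves_b):
        return []   # matching roots: replicas in sync, nothing to transfer
    return [i for i, (x, y) in enumerate(zip(leaves_a, leaves_b)) if x != y]
```

The payoff is that synced replicas exchange a single hash, while diverged ones transfer only the key ranges that actually differ.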

⬡ Architecture Diagram

Amazon DynamoDB Architecture (simplified architecture overview)

✦ Core Concepts

  • 🔑 Consistent Hashing
  • ⚙️ Virtual Nodes
  • ⚙️ Gossip Protocol
  • 📐 Vector Clocks
  • ⚙️ Quorum Reads/Writes
  • ⚙️ Merkle Trees
  • ⚙️ Tunable Consistency

⚖️ Tradeoffs & Design Decisions

Every architectural decision is a tradeoff. Here's what you gain and what you give up.

✓ Strengths

  • Always-writable design ensures shopping cart availability even during partitions
  • Tunable consistency (N, R, W) lets each operation choose its consistency-availability trade-off
  • Consistent hashing with vnodes enables seamless horizontal scaling with minimal data movement
  • Fully decentralized gossip protocol eliminates single-point-of-failure coordinators

✗ Weaknesses

  • Vector clock conflict resolution pushes complexity to the application layer
  • Eventually consistent reads can return stale data, so it is not suitable for all use cases
  • Merkle tree maintenance and anti-entropy repair consume background CPU and network bandwidth
  • Sloppy quorum with hinted handoff can temporarily reduce the effective replication factor during cascading failures

🎯 FAANG Interview Questions

💡 These questions appear in real system design rounds at Google, Meta, Amazon, Apple, Netflix, and Microsoft. Focus on tradeoffs, not just what the system does, and study the architecture above before attempting.

  1. Explain consistent hashing. If you add a new node to a 10-node Dynamo cluster, what percentage of keys need to move?

  2. What are vector clocks and how do they differ from Lamport timestamps? When would you use each?

  3. Design a key-value store with tunable consistency. How do the parameters N, R, and W interact?

  4. A node in your Dynamo cluster has been down for 3 hours and comes back online. How does it catch up on missed writes?

  5. Amazon's shopping cart uses union merge for conflict resolution. What would happen if they used last-write-wins instead?

Research Papers & Further Reading

  • DeCandia, G. et al. (2007). Dynamo: Amazon's Highly Available Key-value Store. SOSP 2007. (Amazon)

