Amazon DynamoDB Architecture
The Dynamo paper that changed distributed databases forever
Key Insight
The CAP theorem isn't binary systems like DynamoDB let you tune where on the consistency-availability spectrum to operate per operation.
Request Journey
How It Works
โ Client sends write with partition key
โก Router hashes key via MD5 to find ring position
โข Request lands on responsible virtual node
โฃ Node writes to WAL and memtable
โค Replicates to two replicas with quorum W=2
โฅ Acknowledges write to client
โ The Problem
Amazon's holiday shopping season 2004exposed a critical weakness: their relational databases couldn't scale to handle peak cart and checkout traffic without costly vertical scaling. The system needed to be always-writable โ an unavailable shopping cart directly costs revenue. Traditional RDBMS systems sacrifice availability during network partitions, which is unacceptable for Amazon's core commerce operations.
โThe Solution
Amazon built Dynamo, a fully distributed key-value store using consistent hashing with virtual nodes for data partitioning, sloppy quorum for tunable consistency, vector clocks for conflict detection, gossip protocol for failure detection, and Merkle trees for anti-entropy repair. The 2007 Dynamo paper became one of the most influential distributed systems papers ever, directly inspiring Cassandra, Riak, and Voldemort.
๐Scale at a Glance
Millions
Requests/sec
99.995%
Availability Target
<10ms
Latency (p99)
16,384 slots
Hash Partitions
๐ฌDeep Dive
Consistent Hashing with Virtual Nodes
Dynamo maps keys to nodes using consistent hashing on a virtual ring. Each physical node owns multiple positions on the ring (virtual nodes or vnodes), typically 150โ256 per node. When a node joins or leaves, only keys in adjacent ring positions are affected โ roughly 1/N of all keys rather than the complete redistribution required by simple modulo hashing. Virtual nodes also solve the hotspot problem: a powerful machine gets more vnodes, receiving proportionally more data. The ring is the foundational abstraction that makes Dynamo horizontally scalable.
Sloppy Quorum and Hinted Handoff
Dynamo uses a sloppy quorum system with parameters (N, R, W): each key is replicated to N nodes, reads require R responses, and writes require W acknowledgments. The consistency level is tunable per operation โ setting W=1 gives maximum write availability, while R+W>N provides strong consistency. During node failures, hinted handoff allows writes to be temporarily stored on a healthy neighbor node with a 'hint' indicating the intended recipient. When the failed node recovers, hints are replayed to restore the correct replica placement.
Vector Clocks for Conflict Resolution
Since Dynamo allows concurrent writes to succeed (availability over consistency), conflicts are inevitable. Vector clocks track the causal history of each write โ each node increments its own counter on every update, producing a version vector like [(A,1), (B,2), (C,1)]. If two versions have incomparable vector clocks (neither dominates the other), both are preserved and returned to the application on the next read. The application then resolves the conflict using domain-specific logic โ Amazon's shopping cart uses union merge so items are never silently lost.
Gossip Protocol for Cluster Membership
Dynamo nodes discover each other and detect failures using a gossip protocol. Every second, each node picks a random peer and exchanges its membership list โ a table of (node_id, heartbeat_counter, timestamp). If a node's heartbeat hasn't been updated within a timeout, it's marked as suspected failed. Gossip is eventually consistent โ membership changes propagate across the cluster in O(log N) rounds. This decentralized approach avoids a single point of failure that a centralized coordinator like ZooKeeper would introduce.
Merkle Trees for Anti-Entropy Repair
Over time, replicas can diverge due to missed hinted handoffs or partial failures. Dynamo uses Merkle trees (hash trees) per key range to efficiently detect inconsistencies. Each leaf is a hash of a key-value pair; internal nodes are hashes of their children. Two nodes compare root hashes โ if they match, all data is consistent. If they differ, they traverse the tree to find exactly which key ranges diverge, minimizing the data transferred for synchronization. This makes full-cluster repair feasible even at Amazon's scale.
โฌกArchitecture Diagram
Amazon DynamoDB Architecture โ simplified architecture overview
โฆCore Concepts
Consistent Hashing
Virtual Nodes
Gossip Protocol
Vector Clocks
Quorum Reads/Writes
Merkle Trees
Tunable Consistency
โTradeoffs & Design Decisions
Every architectural decision is a tradeoff. Here's what you gain and what you give up.
โ Strengths
- โAlways-writable design ensures shopping cart availability even during partitions
- โTunable consistency (N,R,W) lets each operation choose its consistency-availability trade-off
- โConsistent hashing with vnodes enables seamless horizontal scaling with minimal data movement
- โFully decentralized gossip protocol eliminates single-point-of-failure coordinators
โ Weaknesses
- โVector clock conflict resolution pushes complexity to the application layer
- โEventually consistent reads can return stale data โ not suitable for all use cases
- โMerkle tree maintenance and anti-entropy repair consume background CPU and network bandwidth
- โSloppy quorum with hinted handoff can temporarily reduce effective replication factor during cascading failures
๐ฏFAANG Interview Questions
Interview Prep๐ก These questions appear in FAANG system design rounds. Focus on tradeoffs, not just what the system does.
These are real system design interview questions asked at Google, Meta, Amazon, Apple, Netflix, and Microsoft. Study the architecture above before attempting.
- Q1
Explain consistent hashing. If you add a new node to a 10-node Dynamo cluster, what percentage of keys need to move?
- Q2
What are vector clocks and how do they differ from Lamport timestamps? When would you use each?
- Q3
Design a key-value store with tunable consistency. How do the parameters N, R, and W interact?
- Q4
A node in your Dynamo cluster has been down for 3 hours and comes back online. How does it catch up on missed writes?
- Q5
Amazon's shopping cart uses union merge for conflict resolution. What would happen if they used last-write-wins instead?
Research Papers & Further Reading
Dynamo: Amazon's Highly Available Key-value Store
DeCandia, G. et al. (Amazon)
Listen to the Podcast Episode
Alex & Sam break it down
Listen to a conversational deep-dive on this architecture โ real trade-offs, production context, and student-friendly explanations. Free, no login required.
Listen to EpisodeFree ยท No account required ยท Listen in browser
More Distributed Systems
View allNetflix Content Delivery Architecture
How Netflix streams to 260M users without a single datacenter
Netflix ยท Disney+ ยท Hulu
Twitter Fan-Out & Timeline Architecture
The push vs pull dilemma at 500M tweets/day
X (Twitter) ยท Instagram ยท LinkedIn
Uber Surge Pricing & Geospatial Architecture
H3 hexagonal indexing, real-time dispatch, and dynamic pricing
Uber ยท Lyft ยท DoorDash
Listen to more architecture deep-dives
30 free podcast episodes โ Alex & Sam break down every architecture in this library. Listen in your browser, no account needed.
All architecture articles are free ยท No account needed