Twitter Fan-Out & Timeline Architecture
The push vs pull dilemma at 500M tweets/day
Key Insight
The celebrity problem: uniform push fan-out breaks for accounts with extreme follower counts. Hybrid models win.
Request Journey
How It Works
1. User posts a tweet
2. Write to the Tweets table and emit a Kafka event
3. Fan-out service reads the follower list
4. Normal users: push the tweet ID to each follower's Redis timeline
5. Celebrities: skip the push, pull at read time
6. Client reads its timeline from a Redis sorted set
The Problem
When a user opens their home timeline, they expect a real-time feed of tweets from everyone they follow, sorted chronologically. The naive approach of querying all followed users' tweets at read time requires joining millions of rows per request, which at 300K+ timeline reads per second brings any database to its knees. The challenge is compounded by celebrity accounts like Katy Perry with 100M+ followers: a single tweet must appear in 100M timelines within seconds.
The Solution
Twitter uses a hybrid fan-out model. For regular users (fewer than ~500K followers), tweets are pushed at write time into each follower's precomputed timeline stored in Redis sorted sets. For celebrity accounts, tweets are fetched at read time and merged with the precomputed timeline on the fly. This hybrid approach bounds worst-case write amplification while keeping reads consistently fast at sub-5ms.
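The write-path routing decision described above can be sketched in a few lines. The threshold constant and function name are illustrative, not Twitter's actual code:

```python
# Hypothetical sketch of the hybrid fan-out routing decision:
# small accounts are pushed to every follower's cached timeline,
# large accounts are left for read-time pull.

CELEBRITY_THRESHOLD = 500_000  # approximate cutoff described above


def route_tweet(author_follower_count: int) -> str:
    """Return which fan-out strategy applies to a new tweet."""
    if author_follower_count < CELEBRITY_THRESHOLD:
        return "push"  # write tweet ID into each follower's timeline cache
    return "pull"      # fetch at read time and merge into the timeline
```

The point of the threshold is bounding worst-case write amplification: no single tweet ever triggers more than ~500K timeline writes.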
Scale at a Glance
- 500M+ tweets/day
- 300K+ timeline reads/sec
- ~5M fan-out writes/sec
- ~800 tweet IDs cached per user timeline
Deep Dive
Fan-Out on Write: The Push Model
When a regular user tweets, a fan-out service takes the tweet ID and pushes it into each follower's home timeline, a Redis sorted set keyed by user ID and scored by Snowflake timestamp. For a user with 10K followers, this means 10K Redis ZADD operations per tweet. Redis sorted sets keep the timeline naturally ordered, so reads are a simple ZREVRANGE call returning the latest N tweet IDs. This approach trades significant write amplification for constant-time, sub-5ms reads on the hot path.
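A minimal in-memory sketch of write-time fan-out, with a plain dict standing in for a Redis sorted set and the ZADD/ZREVRANGE calls it would receive. All names are illustrative:

```python
from collections import defaultdict

# Stand-in for Redis sorted sets: one per follower, keyed by user ID,
# mapping tweet ID -> score. A real deployment issues ZADD against a
# Redis cluster; this just models the semantics.
timelines: dict[int, dict[int, int]] = defaultdict(dict)


def zadd(user_id: int, score: int, tweet_id: int) -> None:
    timelines[user_id][tweet_id] = score


def zrevrange(user_id: int, start: int, stop: int) -> list[int]:
    # Highest score first, like Redis ZREVRANGE (stop is inclusive).
    scores = timelines[user_id]
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered[start:stop + 1]


def fan_out_on_write(tweet_id: int, followers: list[int]) -> None:
    # One ZADD per follower: 10K followers means 10K writes per tweet.
    for follower_id in followers:
        # The time-sortable Snowflake ID doubles as the score.
        zadd(follower_id, tweet_id, tweet_id)
```

Because the score is the time-sortable ID itself, every follower's timeline stays chronologically ordered with no extra sort step.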
Fan-Out on Read: The Celebrity Problem
When Katy Perry tweets to 100M followers, pushing to 100M Redis sorted sets would take minutes and overwhelm the entire cache cluster. Instead, tweets from accounts exceeding a follower threshold (~500K) are excluded from write-time fan-out. At read time, the user's precomputed timeline is merged with fresh tweets from any followed celebrity accounts fetched on demand. This mixed approach keeps write latency bounded at the cost of slightly more complex read-path logic and marginally higher tail latency.
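The read-time merge can be sketched as a streaming merge of two newest-first ID lists; this assumes both inputs are already sorted descending by time-sortable ID, which Snowflake guarantees:

```python
import heapq
import itertools

def merged_timeline(cached_ids: list[int],
                    celebrity_ids: list[int],
                    limit: int = 20) -> list[int]:
    """Merge the precomputed timeline with fresh celebrity tweets.

    Both inputs are newest-first (descending time-sortable IDs), so a
    streaming heap merge avoids re-sorting anything; we only pull the
    first `limit` items off the merged stream.
    """
    merged = heapq.merge(cached_ids, celebrity_ids, reverse=True)
    return list(itertools.islice(merged, limit))
```

The extra celebrity fetch and merge is exactly the "slightly more complex read-path logic" the hybrid model pays for bounded write latency.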
Snowflake IDs: Time-Sortable Distributed Identifiers
Twitter's Snowflake generates 64-bit unique IDs composed of 41 bits for timestamp, 10 bits for machine ID, and 12 bits for sequence number. These IDs are sortable by creation time without any database lookup, which means Redis sorted sets can use the raw ID as the score for chronological ordering. Each Snowflake worker generates up to 4,096 IDs per millisecond. This eliminates the need for a centralized auto-increment counter, which would be a critical single point of failure at Twitter's write volume.
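A minimal generator under the 41/10/12-bit layout described above. The epoch constant is Twitter's published custom epoch; everything else is an illustrative sketch, not Twitter's original Scala implementation:

```python
import threading
import time

EPOCH_MS = 1288834974657  # Twitter's custom epoch (2010-11-04)


class Snowflake:
    """41-bit ms timestamp | 10-bit machine ID | 12-bit sequence."""

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024  # must fit in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit wrap
                if self.sequence == 0:
                    # Exhausted 4,096 IDs this millisecond; wait for the next.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence
```

Because the timestamp occupies the high bits, plain integer comparison orders IDs by creation time, which is exactly what lets Redis use the raw ID as a sort score.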
Redis as the Timeline Store
Each user's home timeline is a Redis sorted set capped at roughly 800 tweet IDs. At read time, the client fetches the latest 20-50 tweet IDs via ZREVRANGE, then batch-fetches actual tweet content from a separate tweet object cache in a single multi-get. Keeping timelines in-memory means the common-case read latency is under 5ms. The trade-off is memory: 800 IDs × hundreds of millions of active users requires a massive Redis fleet, and Twitter operated one of the largest Redis deployments in the world.
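The capped-timeline read path might look like this sketch, with a newest-first Python list standing in for the sorted set and a dict for the tweet object cache:

```python
TIMELINE_CAP = 800  # approximate cap described above


def push_and_trim(timeline: list[int], tweet_id: int) -> None:
    # Newest-first insert plus trim, like ZADD followed by
    # ZREMRANGEBYRANK to keep the set bounded.
    timeline.insert(0, tweet_id)
    del timeline[TIMELINE_CAP:]


def read_timeline(timeline: list[int],
                  tweet_store: dict[int, str],
                  n: int = 20) -> list[str]:
    ids = timeline[:n]                    # ZREVRANGE 0..n-1
    return [tweet_store[i] for i in ids]  # one batched multi-get in practice
```

Trimming on every write is what makes the memory cost predictable: each user costs a fixed ~800 IDs regardless of how many accounts they follow.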
FlockDB: The Social Graph Store
Twitter built FlockDB, a distributed graph database optimized for adjacency-list queries like "who follows this user." It stores edges (follower -> followee) sharded by source node ID across MySQL backends, with a graph-aware query layer supporting set operations like intersection (mutual followers) and difference. When a tweet triggers fan-out, the fan-out service queries FlockDB to retrieve the full follower list. FlockDB is optimized for high read throughput on large adjacency lists, which is critical when a single user can have millions of followers.
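An illustrative sketch of FlockDB-style edge sharding: each edge is keyed so that one user's full follower list lives on a single shard and can be read in one shard-local query. The shard count and all names here are hypothetical:

```python
NUM_SHARDS = 4  # real deployments spread edges over many MySQL shards

# Each shard maps followee ID -> set of follower IDs (an adjacency list).
shards: list[dict[int, set[int]]] = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user_id: int) -> dict[int, set[int]]:
    return shards[user_id % NUM_SHARDS]


def follow(follower_id: int, followee_id: int) -> None:
    # Keyed by followee, so "who follows X" never crosses shards.
    shard_for(followee_id).setdefault(followee_id, set()).add(follower_id)


def followers_of(user_id: int) -> set[int]:
    return shard_for(user_id).get(user_id, set())


def mutual_followers(a: int, b: int) -> set[int]:
    # FlockDB's query layer exposes set operations like this intersection.
    return followers_of(a) & followers_of(b)
```

Co-locating a whole adjacency list on one shard is what lets fan-out stream even a multi-million-row follower list without a scatter-gather across the fleet.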
Architecture Diagram
Twitter Fan-Out & Timeline Architecture (simplified architecture overview)
Core Concepts
Fan-out on Write
Fan-out on Read
Redis Sorted Sets
Finagle RPC
FlockDB
Snowflake IDs
Tradeoffs & Design Decisions
Every architectural decision is a tradeoff. Here's what you gain and what you give up.
Strengths
- Sub-5ms timeline reads for all users via precomputed Redis sorted sets
- Snowflake IDs eliminate the centralized ID generation bottleneck entirely
- Hybrid fan-out model bounds worst-case write amplification for celebrity tweets
- Redis sorted sets provide natural chronological ordering without additional sorting
Weaknesses
- Write amplification: a tweet from a user with 100K followers generates 100K Redis writes
- Celebrity tweets have higher read latency due to the on-demand merge at read time
- Redis memory cost is enormous: 800 tweet IDs × hundreds of millions of users
- Cache invalidation for deleted or protected tweets must propagate across millions of timelines
FAANG Interview Questions
These are real system design interview questions asked at Google, Meta, Amazon, Apple, Netflix, and Microsoft. Focus on tradeoffs, not just what the system does, and study the architecture above before attempting.
- Q1: Design a news feed system. When would you choose fan-out on write vs fan-out on read?
- Q2: A user with 50M followers posts a tweet. Walk through exactly what happens in the system end to end.
- Q3: How would you handle tweet deletions in a fan-out-on-write architecture where the tweet ID exists in millions of timelines?
- Q4: Twitter's timeline uses Redis sorted sets. Why sorted sets instead of lists? What are the complexity trade-offs?
- Q5: Design Snowflake: a distributed ID generator that produces time-sortable, globally unique 64-bit IDs without coordination between nodes.
Research Papers & Further Reading
- "Scaling Twitter's Ad Targeting Platform" (Twitter Engineering)