Stripe Payment Processing Architecture
Idempotency, event sourcing, and double-entry ledgers
Key Insight
Idempotency is non-negotiable in payments: network retries happen, and you must guarantee that each operation takes effect exactly once.
Request Journey
How It Works
① Merchant sends charge with idempotency key
② Stripe checks key to prevent duplicate charges
③ PAN tokenized and stored in vault
④ Request routed to card network (Visa/Mastercard)
⑤ Issuing bank authorizes via fraud score
⑥ Settlement batch runs at end of day
The Problem
Payment processing demands some of the strongest correctness guarantees in software engineering. Charging a credit card twice is catastrophic: it erodes user trust and creates regulatory liability. Network failures, retries, and timeouts are inevitable, yet every payment must take effect exactly once. Multi-step payment flows (authorize → capture → settle → payout) span multiple external systems (card networks, banks, fraud services) with different failure modes and latencies.
The Solution
Stripe uses idempotency keys (client-generated UUIDs attached to every mutating API request) to guarantee exactly-once semantics regardless of retries. An immutable event log (event sourcing) records every state transition for auditability. Double-entry bookkeeping ensures every credit has a matching debit. Multi-step payment flows use distributed sagas with compensating transactions instead of traditional distributed transactions, enabling graceful partial failure recovery.
Scale at a Glance
API Requests/Day: Billions
Payment Volume/Year: $1T+
Uptime SLA: 99.999%
Fraud Detection Latency: <100ms
Deep Dive
Idempotency Keys: Exactly-Once Payment Semantics
Every mutating Stripe API request includes an idempotency key, a client-generated UUID. The server stores the key and its associated response in a durable idempotency store. If the same key is sent again (due to a network retry, client timeout, or duplicate webhook), the server returns the stored response without re-executing the operation. This is critical for payments: if a charge request times out and the client retries, the card is still charged only once. The idempotency store uses a compound key of (API key + idempotency key), and entries expire after 24 hours.
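The lookup-then-store behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not Stripe's implementation: the class name `IdempotencyStore`, the `execute` method, and the in-memory dictionary are all assumptions made for the example, and a production store would need to be durable and handle concurrent requests that arrive with the same key.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 24 * 60 * 60  # entries expire after 24 hours, as stated in the text


@dataclass
class IdempotencyStore:
    """Hypothetical in-memory stand-in for a durable idempotency store."""

    clock: Callable[[], float] = time.time
    _entries: Dict[Tuple[str, str], Tuple[float, Any]] = field(default_factory=dict)

    def execute(self, api_key: str, idem_key: str, operation: Callable[[], Any]) -> Any:
        compound = (api_key, idem_key)  # compound key: API key + idempotency key
        now = self.clock()
        cached = self._entries.get(compound)
        if cached is not None and now - cached[0] < TTL_SECONDS:
            return cached[1]  # replay the stored response, do not re-execute
        # Not seen before (or expired): run the operation and remember its response.
        # Assumption: single-threaded; a real store needs an atomic check-and-insert.
        response = operation()
        self._entries[compound] = (now, response)
        return response


# Usage: a retried charge with the same key runs the operation only once.
charges = []


def charge_card() -> dict:
    charges.append(100)
    return {"status": "succeeded", "charge_number": len(charges)}


store = IdempotencyStore()
first = store.execute("merchant_api_key", "key-123", charge_card)
retry = store.execute("merchant_api_key", "key-123", charge_card)
print(first == retry, len(charges))
```

The stored response is returned verbatim on the retry, so the client cannot tell a replay from the original call, which is the property that makes blind retries safe.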
Event Sourcing: Immutable Audit Trail
Stripe records every payment state transition as an immutable event in an append-only log: payment_created → payment_authorized → payment_captured → payment_settled. The current state of any payment is derived by replaying its event history. This provides a complete, tamper-proof audit trail that supports compliance requirements such as PCI DSS and SOX. Event sourcing also enables temporal queries ('what was the state of this payment at 3:47 PM?') and makes debugging production issues more straightforward, because you can replay the exact sequence of events that led to any state.
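To make the replay idea concrete, here is a small Python sketch of an append-only log with a temporal query. The event names come from the sequence above; the `Event` and `EventLog` classes, the `state_at` method, and the assumption that events are appended in chronological order are illustrative choices, not a description of Stripe's internals.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Legal transitions, taken from the event sequence in the text.
NEXT_EVENT = {
    None: "payment_created",
    "payment_created": "payment_authorized",
    "payment_authorized": "payment_captured",
    "payment_captured": "payment_settled",
}


@dataclass(frozen=True)  # frozen: an event cannot be mutated after creation
class Event:
    payment_id: str
    kind: str
    at: datetime


class EventLog:
    """Hypothetical append-only log; state is derived by replaying events."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        current = self.state_at(event.payment_id)
        if NEXT_EVENT.get(current) != event.kind:
            raise ValueError(f"illegal transition {current} -> {event.kind}")
        self._events.append(event)  # events are only ever added, never edited

    def state_at(self, payment_id: str, as_of: Optional[datetime] = None) -> Optional[str]:
        """Replay the payment's history, optionally stopping at a point in time."""
        state = None
        for event in self._events:
            if event.payment_id == payment_id and (as_of is None or event.at <= as_of):
                state = event.kind
        return state


log = EventLog()
log.append(Event("pay_1", "payment_created", datetime(2024, 1, 1, 15, 0)))
log.append(Event("pay_1", "payment_authorized", datetime(2024, 1, 1, 15, 30)))
log.append(Event("pay_1", "payment_captured", datetime(2024, 1, 1, 16, 0)))

# Temporal query: what was the state of this payment at 3:47 PM?
print(log.state_at("pay_1", datetime(2024, 1, 1, 15, 47)))
```

Replaying the full history on every read is fine for a sketch; at scale, systems of this kind usually keep snapshots or read-optimized projections, which is where the CQRS concept listed later comes in.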
Double-Entry Bookkeeping: Financial Integrity
Stripe's ledger uses double-entry bookkeeping: every financial transaction creates two entries, a debit and a credit, that must sum to zero. When a customer pays $100, the ledger debits the customer's payment account and credits the merchant's pending balance. When settlement occurs, the pending balance is debited and the merchant's bank account is credited. This centuries-old accounting principle ensures that money never appears or disappears: the total across all accounts always balances. Any imbalance triggers an immediate alert and investigation.
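The $100 example above can be worked through with a toy ledger. The sketch below assumes a simplified sign convention (debits negative, credits positive) and hypothetical account names; real accounting systems assign debit and credit signs by account type, so treat this only as an illustration of the sum-to-zero invariant.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List, Tuple

# (account, signed amount). Simplifying assumption: debit < 0, credit > 0.
Entry = Tuple[str, Decimal]


class Ledger:
    """Hypothetical double-entry ledger that rejects unbalanced transactions."""

    def __init__(self) -> None:
        self._journal: List[List[Entry]] = []

    def post(self, entries: List[Entry]) -> None:
        if not entries or sum(amount for _, amount in entries) != 0:
            raise ValueError("transaction entries must sum to zero")
        self._journal.append(list(entries))

    def balances(self) -> Dict[str, Decimal]:
        totals: Dict[str, Decimal] = defaultdict(Decimal)
        for transaction in self._journal:
            for account, amount in transaction:
                totals[account] += amount
        return dict(totals)


ledger = Ledger()

# Customer pays $100: debit the customer's payment account, credit merchant pending.
ledger.post([("customer_payment", Decimal("-100")), ("merchant_pending", Decimal("100"))])

# Settlement: debit the pending balance, credit the merchant's bank account.
ledger.post([("merchant_pending", Decimal("-100")), ("merchant_bank", Decimal("100"))])

balances = ledger.balances()
print(balances)
print(sum(balances.values()))  # the total across all accounts is zero
```

Amounts use `Decimal` rather than floats so that sums are exact; many payment systems go further and store integer minor units (cents).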
Distributed Sagas for Multi-Step Payments
A payment flow involves multiple external systems: fraud check → card network authorization → capture → settlement → merchant payout. Traditional distributed transactions (2PC) are impractical across third-party systems. Instead, Stripe uses the saga pattern: each step is an independent transaction with a compensating action. If authorization succeeds but capture fails, the system automatically executes a void (the compensating transaction). A saga orchestrator tracks the current step and ensures either all steps complete or all completed steps are compensated. Each step is idempotent, making retries safe.
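The authorize-then-capture failure case above can be sketched as a tiny orchestrator. The `Step` dataclass and `run_saga` function are hypothetical names for this example; the sketch runs in one process and assumes compensations always succeed, whereas a real orchestrator would persist its progress and retry or escalate failed compensations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]      # the forward transaction
    compensate: Callable[[], None]  # undoes the action if a later step fails


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


# Usage: authorization succeeds, capture fails, so the authorization is voided.
trace: List[str] = []


def failing_capture() -> None:
    raise RuntimeError("capture failed")


steps = [
    Step("authorize", lambda: trace.append("authorize"), lambda: trace.append("void")),
    Step("capture", failing_capture, lambda: trace.append("refund")),
]

succeeded = run_saga(steps)
print(succeeded, trace)  # False ['authorize', 'void']
```

Only completed steps are compensated: the failed capture never took effect, so its own compensation is not invoked.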
Fraud Detection: ML Scoring in the Hot Path
Stripe Radar evaluates every transaction with an ML model that scores fraud probability in under 100ms, because scoring must not add perceptible latency to checkout. The model uses hundreds of signals: card fingerprint, IP geolocation, device attributes, transaction velocity, behavioral biometrics, and merchant risk profile. Features are served from a low-latency feature store. The model is trained on billions of historical transactions across Stripe's entire network; a single merchant's system would not have enough data. High-risk transactions are automatically blocked or stepped up to 3D Secure authentication.
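Two pieces of this pipeline are simple enough to sketch: one of the named signals (transaction velocity) and the final mapping from a score to an action. Everything specific here is an assumption for illustration: the threshold values, the `VelocityCounter` class, and the `decide` function are hypothetical and do not reflect how Radar computes features or picks cutoffs.

```python
from collections import defaultdict, deque
from typing import Deque, Dict

BLOCK_THRESHOLD = 0.9    # hypothetical cutoff, chosen only for this example
STEP_UP_THRESHOLD = 0.6  # hypothetical cutoff, chosen only for this example


def decide(score: float) -> str:
    """Map a fraud probability to an action: block, 3D Secure step-up, or allow."""
    if score >= BLOCK_THRESHOLD:
        return "block"
    if score >= STEP_UP_THRESHOLD:
        return "3ds_step_up"
    return "allow"


class VelocityCounter:
    """Counts recent transactions per card fingerprint in a sliding time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._seen: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, card_fingerprint: str, now: float) -> int:
        times = self._seen[card_fingerprint]
        times.append(now)
        # Drop timestamps that have fallen out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times)  # velocity feature: transactions within the window


velocity = VelocityCounter(window_seconds=60)
velocity.record("card_abc", now=0.0)
count = velocity.record("card_abc", now=30.0)
print(count, decide(0.72))
```

In a real system the velocity count would be one of many features read from the feature store and passed to the model; the model's output, not a hand-written rule, would produce the score given to `decide`.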
Architecture Diagram
Stripe Payment Processing Architecture: simplified architecture overview
Core Concepts
Idempotency Keys
Event Sourcing
CQRS
Distributed Sagas
Double-Entry Ledger
Webhook Fan-out
Tradeoffs & Design Decisions
Every architectural decision is a tradeoff. Here's what you gain and what you give up.
Strengths
- Idempotency keys guarantee exactly-once semantics regardless of network retries
- Event sourcing provides a tamper-proof audit trail required by financial regulators
- Double-entry bookkeeping makes financial inconsistencies immediately detectable, because every transaction must sum to zero
- Network-level fraud model trained on billions of transactions outperforms merchant-level models
Weaknesses
- Idempotency store must be highly available and durable; if it fails, duplicate charges become possible
- Event sourcing generates massive storage volume and requires careful event schema evolution
- Saga compensating transactions can fail, requiring human intervention for stuck payment flows
- Fraud model false positives block legitimate transactions, directly causing merchant revenue loss
FAANG Interview Questions
Interview Prep: These questions appear in FAANG system design rounds. Focus on tradeoffs, not just what the system does.
These are real system design interview questions asked at Google, Meta, Amazon, Apple, Netflix, and Microsoft. Study the architecture above before attempting.
- Q1: Design a payment processing system with exactly-once semantics. How do you handle network timeouts on charge requests?
- Q2: Explain the saga pattern. A payment is authorized but capture fails. Walk through the compensating transaction flow.
- Q3: Why does Stripe use event sourcing instead of a traditional mutable database? What are the trade-offs for a financial system?
- Q4: How would you design an idempotency layer? What happens if the idempotency store itself has a partial failure?
- Q5: Design a real-time fraud detection system that must score every transaction in under 100ms. What features would you use?