System Design Article
Design an E-Commerce Platform (Amazon)
Difficulty: Medium
Design an Amazon-scale e-commerce platform that lets 200M monthly users browse 100M SKUs, add items to a cart, check out, and have orders fulfilled from regional warehouses. The interview centerpiece is the order lifecycle: how to reserve inventory atomically while a customer is on the checkout page, how to chain cart-to-payment-to-fulfillment as a saga with compensating actions, and how to make checkout idempotent so a flaky network never charges a customer twice. We also cover catalog browse at scale, multi-warehouse fulfillment routing, and the asymmetric read/write workload that makes aggressive catalog caching the right call.
Requirements
Functional Requirements
- Browse and search: customers browse a catalog of 100M SKUs with category filters, faceted search, and free-text search.
- Product detail: photos, variants (size, color), price, availability, reviews, recommendations.
- Cart: add, update quantity, remove items; cart persists across sessions and devices.
- Checkout: select shipping address, payment method, delivery option; place the order.
- Order tracking: customers see status (placed, paid, shipped, delivered) and tracking number.
- Inventory accuracy: never sell a unit that does not exist; if the last one is taken, the next customer sees out of stock within seconds.
- Returns and refunds: customers can request a return; once received, payment is refunded.
Out of Scope (state explicitly)
- Seller onboarding and seller-side dashboards.
- Detailed recommendation algorithms (we mention the surface, not the model).
- Warehouse-level robotics and physical fulfillment.
- Tax calculation engine (treated as an external service call).
Non-Functional Requirements
- Browse latency: p99 < 300 ms for product detail and search.
- Checkout latency: p99 < 2 s end to end (network + reservation + payment).
- Availability: 99.99% for browse; 99.95% for checkout. Browse outage costs revenue but checkout failures cost trust.
- Inventory consistency: strong; no overselling, especially during Black Friday flash demand.
- Scale: 200M MAU, 50M DAU, 1.5B page views/day, 5M orders/day average, 5K orders/sec peak.
- Multi-region: customers in NA, EU, APAC each served from a regional stack with cross-region inventory awareness.
Back-of-the-Envelope Estimation
Traffic
---------- Traffic mix ----------
MAU: 200M
DAU: 50M
Page views per DAU: ~30 (browse heavy)
Total page views/day: ~1.5B
Product detail QPS avg: ~17K
Product detail QPS peak: ~80K
Search QPS avg: ~5K
Cart writes/day: ~50M
Orders/day: ~5M (5K/sec at Black Friday peak, 60/sec average)
Read/write ratio: ~300:1
The 300:1 read/write ratio is the architectural cue: optimize aggressively for reads.
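The headline numbers above follow from simple arithmetic. A minimal sketch that recomputes them from the stated inputs (50M DAU, ~30 page views per DAU, 5M orders/day); it treats every page view as a read and every order as a write, which is the same simplification the estimate uses:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

dau = 50_000_000
page_views_per_dau = 30
orders_per_day = 5_000_000

page_views_per_day = dau * page_views_per_dau             # ~1.5B
avg_page_view_qps = page_views_per_day / SECONDS_PER_DAY  # ~17K
avg_order_qps = orders_per_day / SECONDS_PER_DAY          # roughly 60/sec
read_write_ratio = page_views_per_day / orders_per_day    # ~300:1

print(f"page views/day:    {page_views_per_day:,}")
print(f"avg page-view QPS: {avg_page_view_qps:,.0f}")
print(f"avg orders/sec:    {avg_order_qps:,.0f}")
print(f"read/write ratio:  {read_write_ratio:.0f}:1")
```

Peak figures (80K product-detail QPS, 5K orders/sec) are stated targets rather than derived values, so they are not recomputed here.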
Storage
---------- Storage estimation ----------
Products (100M SKUs * ~10 KB each): ~1 TB (sharded Postgres)
Product images (100M * 5 photos * 200 KB): ~100 TB (S3 + CDN)
Reviews (1B * ~500 B): ~500 GB
Orders (5M/day * 5 yr * ~5 KB): ~45 TB (sharded by customer)
Cart state (50M active * ~2 KB): ~100 GB (Redis hot, Postgres cold)
Inventory (100M SKUs * ~32 B): ~3 GB (Redis hot, Postgres cold)
Bandwidth
---------- Bandwidth ----------
Product images served via CDN: ~80% offload from origin
Origin egress for images: ~5 PB/month after CDN
API egress (JSON for product/cart/order): ~50 TB/month
High-Level Design
---------- Service map ----------
[ Client ] -> [ CDN / Edge ]  (also serves images/static)
                     |
                     v
              [ API Gateway ]  (auth, rate limit)
                     |
    +----------+-----+-----+-----------+-----------+
    v          v           v           v           v
+-------+  +--------+  +--------+  +--------+  +--------+
|Catalog|  | Search |  |  Cart  |  | Order  |  |Reviews |
+-------+  +--------+  +--------+  +--------+  +--------+
    |          |           |           |           |
    v          v           v           v           v
+-------+  +--------+  +--------+  +--------+  +--------+
|Postgrs|  |Elastic-|  | Redis +|  |Postgres|  |Postgres|
|catalog|  |search  |  |Postgres|  |Order DB|  |/ Mongo |
+-------+  +--------+  +--------+  +--------+  +--------+
                                       |
                +----------------------+----------------------+
                v                      v                      v
       +------------------+   +------------------+   +------------------+
       | Inventory Svc    |   | Payment Svc      |   | Fulfillment Svc  |
       +------------------+   +------------------+   +------------------+
                |                      |                      |
                v                      v                      v
       +------------------+   +------------------+   +------------------+
       | Inventory Store  |   | Stripe / PSP     |   | Warehouse        |
       | (Postgres+Redis) |   |                  |   | Routing Service  |
       +------------------+   +------------------+   +------------------+
Key APIs
GET /api/v1/products/:id // product detail
GET /api/v1/search?q=...&filters=... // faceted search
GET /api/v1/cart // current customer cart
POST /api/v1/cart/items // add or update cart item
POST /api/v1/checkout // start checkout (Idempotency-Key header required)
header: Idempotency-Key: <client-generated UUID>
body: { cart, addressId, paymentMethodId }
GET /api/v1/orders/:id // order detail and status
POST /api/v1/orders/:id/return // initiate return
Read Path (Browse)
- Client requests product detail.
- CDN serves images and any cached HTML/JSON shells.
- API gateway forwards to Catalog Service.
- Catalog Service hits a Redis cache; on miss, reads Postgres replica and warms cache (TTL 5 min).
- Inventory snippet (in stock / out of stock) read from Inventory Service Redis store.
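The Catalog Service step above is a cache-aside read. A minimal sketch, assuming an in-memory `TTLCache` as a stand-in for Redis and a caller-supplied `load_from_replica` function; both names and the `product:<sku_id>` key format are hypothetical, chosen for illustration:

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a Redis GET / SET-with-expiry pair (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, dict]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


PRODUCT_TTL_SECONDS = 5 * 60  # the 5 min TTL from the read path


def get_product(sku_id: int, cache: TTLCache,
                load_from_replica: Callable[[int], dict]) -> dict:
    """Cache-aside read: try the cache, fall back to the replica, then warm the cache."""
    key = f"product:{sku_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    product = load_from_replica(sku_id)
    cache.set(key, product, PRODUCT_TTL_SECONDS)
    return product
```

With a 5-minute TTL, a price change can stay stale in this cache for up to 5 minutes unless the write path also deletes or overwrites the key.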
Write Path (Checkout)
See the Detailed Design section. The interesting part is the saga.
Detailed Design
The two interesting components are the inventory reservation and the checkout saga.
Inventory Reservation
The naive approach: on order placement, decrement inventory in the database. Problem: between the customer clicking 'Buy' and payment completing, the unit is still 'available' to other customers; two people can both proceed to pay for the last unit.
The correct approach: a TTL hold. When the customer enters checkout, reserve N units for a short window (e.g., 10 minutes). If checkout completes, the hold becomes a permanent decrement; if it expires, units return to the pool.
---------- Inventory state per SKU ----------
on_hand         int    physical units in warehouse
reserved_holds  list   active short-lived reservations [{order_id, qty, expires_at}]
available       view   on_hand - sum(active holds)
The atomic operation is 'reserve N units IFF available >= N', implemented with Postgres row-level locking or a Redis Lua script.
-- Postgres version (per SKU row, pessimistic lock; OK at moderate scale)
BEGIN;
SELECT on_hand, reserved_holds
  FROM inventory
 WHERE sku_id = $1
   FOR UPDATE; -- row lock until COMMIT
-- in app: compute available = on_hand - sum(active holds); abort if available < $3
UPDATE inventory
   SET reserved_holds = reserved_holds || jsonb_build_object(
         'order_id', $2, 'qty', $3, 'expires_at', NOW() + interval '10 min'
       )
 WHERE sku_id = $1;
COMMIT;
For flash-sale SKUs (one product, 100K simultaneous reservations), Postgres row locks become a bottleneck. Switch to a Redis Lua script keyed by SKU:
Redis reservation Lua script (KEYS[1] = inventory:sku:<id> hash with fields on_hand and reserved_total; ARGV = qty, order_id, expires_at_ms):
local on_hand = tonumber(redis.call('HGET', KEYS[1], 'on_hand') or '0')
local reserved = tonumber(redis.call('HGET', KEYS[1], 'reserved_total') or '0')
-- refuse the hold unless enough unreserved units remain
if on_hand - reserved < tonumber(ARGV[1]) then
  return 0
end
redis.call('HINCRBY', KEYS[1], 'reserved_total', tonumber(ARGV[1]))
-- holds ZSET: member = order_id:qty, score = expiry time in ms
-- note: on Redis Cluster, pass the holds key via KEYS and use a hash tag so both keys share a slot
redis.call('ZADD', KEYS[1] .. ':holds', tonumber(ARGV[3]), ARGV[2] .. ':' .. ARGV[1])
return 1
A background sweeper drops expired holds from the sorted set and decrements reserved_total.
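The full hold lifecycle (reserve, expire, confirm) can be modeled without Redis to check the bookkeeping. A minimal in-memory sketch; `SkuInventory` and its method names are hypothetical, and treating a repeated `order_id` as a no-op is an added assumption that the Lua script does not implement:

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SkuInventory:
    """In-memory model of one SKU's hot-path state (stand-in for the Redis hash + ZSET)."""
    on_hand: int
    reserved_total: int = 0
    # order_id -> (qty, expires_at_ms); plays the role of the ':holds' sorted set
    holds: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def reserve(self, order_id: str, qty: int, now_ms: int, ttl_ms: int = 600_000) -> bool:
        """Reserve qty units only if on_hand - reserved_total >= qty."""
        if order_id in self.holds:
            return True  # assumption: a retried reservation keeps the existing hold
        if self.on_hand - self.reserved_total < qty:
            return False
        self.reserved_total += qty
        self.holds[order_id] = (qty, now_ms + ttl_ms)
        return True

    def sweep_expired(self, now_ms: int) -> int:
        """Drop holds whose expiry has passed and return their units to the pool."""
        expired = [oid for oid, (_, exp) in self.holds.items() if exp <= now_ms]
        released = 0
        for oid in expired:
            qty, _ = self.holds.pop(oid)
            self.reserved_total -= qty
            released += qty
        return released

    def confirm(self, order_id: str) -> bool:
        """Turn a hold into a permanent decrement after payment succeeds."""
        hold = self.holds.pop(order_id, None)
        if hold is None:
            return False  # hold expired or never existed
        qty, _ = hold
        self.on_hand -= qty
        self.reserved_total -= qty
        return True
```

Note the failure case in `confirm`: if payment takes longer than the hold window, the sweeper may already have released the units, so the saga has to treat a failed confirm as a step failure and compensate.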
The Checkout Saga
Checkout is a multi-step workflow that touches inventory, payment, and order systems. Distributed transactions across them are too slow and brittle; we use a saga with explicit compensating actions.
---------- Checkout saga ----------
1. validate cart         (auth, addresses, totals)
2. reserve inventory     (compensation: release reservation)
3. charge payment        (compensation: refund or void)
4. create order record   (compensation: cancel order)
5. confirm inventory     (turn hold into decrement)
6. dispatch fulfillment  (compensation: cancel shipment)
7. notify customer       (email, push)
If any step fails, the saga walks backward and runs the compensating action for each completed step. The orchestrator (Order Service) is the saga coordinator; it persists saga state so a process restart can resume.
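The backward walk can be sketched in a few lines. This is a minimal in-process illustration, not the Order Service itself: `run_saga` and the step tuples are hypothetical names, and a real orchestrator would persist state after every step instead of keeping it in a local list.

```python
from typing import Callable, List, Optional, Tuple

# (step name, action, compensation or None); all names are illustrative
Step = Tuple[str, Callable[[], None], Optional[Callable[[], None]]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception as exc:
            log.append(f"FAILED {name}: {exc}")
            # Walk backward over the steps that did commit.
            for done_name, _, compensate in reversed(completed):
                if compensate is not None:
                    compensate()
                    log.append(f"COMPENSATED {done_name}")
            return False
        completed.append(step)
        log.append(f"DONE {name}")
    return True
```

Compensations can fail too; in practice they need to be idempotent and retried until they succeed, which this sketch leaves out.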
// Persisted saga state in the order_sagas table
{
  "saga_id": "sa_abc",
  "order_id": "o_456",
  "customer_id": "u_42",
  "step": "PAYMENT_CHARGED",
  "steps_completed": ["INVENTORY_RESERVED", "PAYMENT_CHARGED"],
  "compensations_to_run": ["REFUND_PAYMENT", "RELEASE_INVENTORY"],
  "updated_at": "2026-04-26T10:00:00Z"
}
Idempotency at Checkout
The client sends an Idempotency-Key header on POST /checkout. The Order Service stores (idempotency_key, response) in a 24-hour cache. A retried checkout with the same key returns the original response without re-reserving inventory or re-charging the card.
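On the server, the replay behavior amounts to a keyed lookup in front of the checkout handler. A minimal single-process sketch, assuming an in-memory dict in place of the 24-hour cache; `IdempotentCheckout` and `run_checkout` are hypothetical names:

```python
import time
from typing import Callable, Dict, Tuple

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60  # the 24-hour window from the text


class IdempotentCheckout:
    """Replays the stored response when the same Idempotency-Key is seen again."""

    def __init__(self, run_checkout: Callable[[dict], dict],
                 clock: Callable[[], float] = time.time) -> None:
        self._run_checkout = run_checkout
        self._clock = clock
        # idempotency key -> (expires_at, stored response)
        self._responses: Dict[str, Tuple[float, dict]] = {}

    def handle(self, idempotency_key: str, request: dict) -> dict:
        entry = self._responses.get(idempotency_key)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]  # retry: no new reservation, no new charge
        response = self._run_checkout(request)
        expires_at = self._clock() + IDEMPOTENCY_TTL_SECONDS
        self._responses[idempotency_key] = (expires_at, response)
        return response
```

This sketch does not cover two requests with the same key arriving concurrently. A shared store would need to claim the key atomically before running checkout (for example with Redis `SET ... NX`) and have the second request wait or return a conflict.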
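On the client, generate the key once per checkout intent and reuse it for every retry:
async function placeOrder({ cart, addressId, paymentMethodId }) {
  // One key per checkout intent, reused across every retry.
  const idempotencyKey = crypto.randomUUID();
  const maxAttempts = 3;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let res;
    try {
      res = await fetch('/api/v1/checkout', {
        method: 'POST',
        headers: { 'Idempotency-Key': idempotencyKey, 'Content-Type': 'application/json' },
        body: JSON.stringify({ cart, addressId, paymentMethodId })
      });
    } catch (e) {
      // Network error: retry unless attempts are exhausted.
      if (attempt === maxAttempts) throw e;
      continue;
    }
    if (res.ok) return res.json();
    // 4xx: do not retry. 5xx: retry until attempts run out, then fail loudly.
    if (res.status < 500 || attempt === maxAttempts) {
      throw new Error(await res.text());
    }
  }
}
Cart Service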
Cart is read on every page (mini-cart icon). It must be fast and durable. Two-tier storage:
- Redis holds the live cart (cart:<user_id>) for fast reads / writes.
- Postgres is written asynchronously (every 30 s or on cart update if Redis flush fails) so a Redis crash does not lose carts; at worst it loses the changes made since the last flush.
For anonymous users we key on a session cookie; on login we merge the anonymous cart into the user cart.
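The merge itself is a per-SKU sum of quantities. A minimal sketch; `merge_carts` is a hypothetical helper, and the `max_qty_per_sku` cap is an added assumption, not something the design specifies:

```python
from typing import Dict


def merge_carts(user_cart: Dict[str, int], anonymous_cart: Dict[str, int],
                max_qty_per_sku: int = 99) -> Dict[str, int]:
    """Merge an anonymous cart into a user cart by summing quantities per SKU."""
    merged = dict(user_cart)  # copy so neither input is mutated
    for sku_id, qty in anonymous_cart.items():
        if qty <= 0:
            continue  # ignore empty or invalid lines
        merged[sku_id] = min(merged.get(sku_id, 0) + qty, max_qty_per_sku)
    return merged
```

After a successful merge the anonymous cart should be deleted, otherwise logging in again from the same session would add the same items a second time.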
Catalog and Search
Catalog data lives in Postgres sharded by category or SKU id. Search runs on Elasticsearch with one document per product, indexed asynchronously from the canonical Postgres store via Kafka CDC. Faceted search (price range, brand, color) is what Elasticsearch handles cleanly.
Product images live in S3 and are served via a CDN with multi-resolution variants (thumb-200.jpg, detail-1200.webp).
Recommendations
Personalized recommendations are computed offline (collaborative filtering and content-based signals) and served from a key-value store at request time. We do NOT compute them inline during product detail; that would explode latency.
Data Model
Postgres: products (sharded by sku_id hash)
CREATE TABLE products (
  sku_id       BIGINT PRIMARY KEY,
  title        VARCHAR(512) NOT NULL,
  description  TEXT,
  price_cents  INT NOT NULL,
  currency     CHAR(3) NOT NULL,
  brand_id     BIGINT,
  category_id  BIGINT,
  weight_grams INT,
  attrs        JSONB,
  is_active    BOOLEAN DEFAULT TRUE,
  created_at   TIMESTAMPTZ NOT NULL,
  updated_at   TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_products_category ON products (category_id) WHERE is_active;
CREATE INDEX idx_products_brand ON products (brand_id) WHERE is_active;
Postgres: inventory (sharded by sku_id)
CREATE TABLE inventory (
  sku_id         BIGINT PRIMARY KEY,
  warehouse_id   BIGINT,
  on_hand        INT NOT NULL,
  reserved_holds JSONB NOT NULL DEFAULT '[]',
  version        INT NOT NULL DEFAULT 0,
  updated_at     TIMESTAMPTZ NOT NULL
);
Postgres: orders (sharded by customer_id)
CREATE TABLE orders (
  order_id      BIGINT PRIMARY KEY,
  customer_id   BIGINT NOT NULL,
  status        VARCHAR(32) NOT NULL, -- PLACED, PAID, SHIPPED, DELIVERED, CANCELLED, RETURNED
  total_cents   INT NOT NULL,
  currency      CHAR(3) NOT NULL,
  shipping_addr JSONB NOT NULL,
  payment_id    VARCHAR(64),
  placed_at     TIMESTAMPTZ NOT NULL,
  updated_at    TIMESTAMPTZ NOT NULL
);
CREATE TABLE order_items (
  order_id    BIGINT,
  sku_id      BIGINT,
  qty         INT NOT NULL,
  price_cents INT NOT NULL,
  PRIMARY KEY (order_id, sku_id)
);
Redis: cart, inventory hot path, idempotency
---------- Redis layout ----------
cart:<user_id>                hash {sku_id: qty}                          TTL 30 days
inventory:sku:<sku_id>        hash {on_hand, reserved_total}              no TTL
inventory:sku:<sku_id>:holds  ZSET (member=order_id:qty, score=expires_ms)
idempotency:<key>             JSON serialized response                    TTL 24 h
Elasticsearch: product index
Documents per SKU with text fields (title, description), keyword facets (brand, category, color), numeric fields (price), and a popularity score for ranking.
Scaling and Bottlenecks
Black Friday: 5K orders/sec
- Inventory hot keys: a flash-sale SKU might receive 50K reservation attempts/sec for 100 units. Use the Redis Lua reservation; the script is single-threaded per shard but runs in microseconds. Once on_hand - reserved_total == 0, subsequent calls return 0 immediately.
- Checkout queue: if the Order Service is overloaded, queue checkouts in Kafka and process at capacity; show the customer a 'placing your order' state until the saga completes.
- Payment provider rate limits: Stripe enforces its own rate limit; we partition our API key usage and use idempotency keys so that retries during throttling are safe.
Black Friday: 80K product-detail QPS
- Edge cache product detail responses for 60 s. Even at 80K QPS, 95% hits at the edge means origin sees 4K QPS, well within Postgres replica capacity.
- Inventory snippet on the product page is more dynamic; fetch it via a side request to Inventory Service that returns in stock / out of stock with a 5 s TTL. We deliberately split the cache TTLs: the rest of the product detail (title, description, images) tolerates a 60 s lag, but the in-stock badge would be misleading if it lagged 60 s during a flash sale, so it bypasses the long edge cache.
Multi-Warehouse Routing
When a customer in Seattle orders a Kindle, the order may fulfill from one of several warehouses. Routing service picks based on stock availability + distance + shipping speed promised. If the closest warehouse runs out mid-order, the routing service splits the shipment across two warehouses.
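One way to express this rule is a greedy fill over warehouses ranked by promised speed, then distance. A minimal sketch; the `Warehouse` fields and `route_order` are hypothetical, and a production router would also weigh shipping cost and split-shipment penalties:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Warehouse:
    name: str
    delivery_days: int   # shipping speed this warehouse can promise
    distance_km: float   # distance to the customer
    stock: int           # available units of the SKU


def route_order(qty: int, warehouses: List[Warehouse]) -> Optional[Dict[str, int]]:
    """Fill from the best-ranked warehouse first; split across others if it runs short."""
    ranked = sorted(warehouses, key=lambda w: (w.delivery_days, w.distance_km))
    plan: Dict[str, int] = {}
    remaining = qty
    for w in ranked:
        if remaining == 0:
            break
        take = min(w.stock, remaining)
        if take > 0:
            plan[w.name] = take
            remaining -= take
    return plan if remaining == 0 else None  # None: cannot fulfill in full
```

The returned plan maps warehouse name to units, so a single-warehouse order and a split shipment share the same shape.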
---------- Routing decision ----------
Preferred warehouse:  Sumner, WA (closest, 1-day Prime promise)
Fallback 1:           Reno, NV (2-day)
Fallback 2:           Phoenix, AZ (2-day)
If Sumner is out:     split across Reno + Phoenix
Catalog Index Lag
Elasticsearch indexing from CDC has 10-30 s lag. After a price change, the catalog API shows the new price within about 1 s (Postgres replica lag, provided the cached product entry is invalidated on write) but search may return the old price for ~30 s. Acceptable; we display the canonical price on the product detail page, where the click lands.
Returns and Refunds
Return flow is a separate saga (RMA): customer requests, label issued, package received at warehouse, inspection passes, refund issued via payment provider. Failure modes: package never arrives, inspection fails. Each has a compensating path or escalation to ops.
Fraud and Risk
A risk service screens checkout in under 200 ms (synchronously) using model scores plus deterministic rules. High-risk orders enter a manual review queue rather than immediately charging; low-risk orders proceed.
Trade-offs and Alternatives
Saga vs Two-Phase Commit
A distributed 2PC across inventory, payment, and order would block all participants for the duration of the transaction; under load this is unworkable. A saga gives up atomicity (each step commits independently) in exchange for availability and clear failure handling via compensations. The cost is reasoning about partial-failure states; the saga state machine in a database makes this tractable.
Pessimistic vs Optimistic Locking for Inventory
Postgres row locks are simple and correct for moderate contention. They become a bottleneck for flash-sale SKUs (100K simultaneous reservations on one row). Optimistic locking (version + retry) avoids holding row locks but degrades into retry storms when many writers hit the same row, so it suits the long tail of SKUs; the Redis fast path handles flash demand without serializing on a database row.
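The version + retry pattern looks like this in miniature. A single-threaded sketch that models the conditional UPDATE in memory; `InventoryRow`, `compare_and_set`, and `decrement_with_retry` are hypothetical names, and the `version` field mirrors the column in the inventory table:

```python
from dataclasses import dataclass


@dataclass
class InventoryRow:
    on_hand: int
    version: int = 0


def compare_and_set(row: InventoryRow, expected_version: int, new_on_hand: int) -> bool:
    """Models: UPDATE ... SET on_hand = ?, version = version + 1 WHERE version = ?"""
    if row.version != expected_version:
        return False  # another writer got there first
    row.on_hand = new_on_hand
    row.version += 1
    return True


def decrement_with_retry(row: InventoryRow, qty: int, max_retries: int = 3) -> bool:
    """Read, compute, and write back only if the version is unchanged; retry on conflict."""
    for _ in range(max_retries):
        seen_on_hand, seen_version = row.on_hand, row.version  # snapshot read
        if seen_on_hand < qty:
            return False  # not enough stock
        if compare_and_set(row, seen_version, seen_on_hand - qty):
            return True
    return False  # gave up after repeated conflicts
```

Every conflicting writer repeats its read and write, which is why this approach gets worse, not better, as contention on one row rises.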
Strong vs Eventual Consistency for Inventory
We choose strong inventory consistency. Overselling damages trust and creates expensive customer service work; the latency cost (one Redis hop + Postgres write inside checkout) is acceptable. Search-index lag is eventually consistent because seeing slightly stale prices in search is harmless.
Single Postgres vs Sharded
At 100M SKUs, a single Postgres can serve catalog reads from replicas but writes (price updates, inventory changes) bottleneck. Shard by sku_id hash for products and inventory; shard by customer_id for orders. Choose shard keys per access pattern: customers query their own orders, so customer_id is the natural shard key.
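Shard selection by key hash can be sketched as follows. The shard counts and function names are assumptions for illustration; the point is only that the hash must be stable across processes so every service instance routes the same key to the same shard:

```python
import hashlib


def shard_for(key: int, num_shards: int) -> int:
    """Map a numeric key to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def products_shard(sku_id: int, num_shards: int = 32) -> int:
    """Products and inventory are routed by sku_id."""
    return shard_for(sku_id, num_shards)


def orders_shard(customer_id: int, num_shards: int = 16) -> int:
    """Orders are routed by customer_id so one customer's orders live together."""
    return shard_for(customer_id, num_shards)
```

Plain modulo routing forces a large data move when the shard count changes; a lookup table of virtual shards or consistent hashing are common ways to soften that, at the cost of extra indirection.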
Synchronous vs Asynchronous Fulfillment
Checkout returns success after payment + order creation (synchronous). Fulfillment dispatch (warehouse pick, ship) is asynchronous via a Kafka topic; the warehouse system processes events at its own pace. This decouples the checkout latency from the warehouse system's throughput.
Why Idempotency Keys, Not Server-Generated Order IDs?
The client knows when it intends one checkout; the server only sees a request. If the network drops the response, the client retries and the server cannot tell it is a retry without help. The idempotency key (client-generated UUID) lets the server recognize and replay the original response. Server-generated order IDs are still used internally; the idempotency key is the external retry-safe handle.
Real-World Examples
How real systems implement this in production
Amazon decomposes into hundreds of microservices: catalog, search, cart, inventory, order, payment, recommendations, ratings, prime eligibility, etc. Each owns its data store. It is known for its 'API mandate' (every team exposes its data only via APIs) and for using DynamoDB extensively for high-write workloads like cart and session state.
Trade-off: Amazon's extreme decomposition forces every team to think about service contracts and failure modes, but it makes cross-cutting changes (e.g., adding a new tax field) require coordinated rollouts across many teams. The lesson: service boundaries that align with business domains pay off at scale, but require investment in tooling and culture.
Shopify hosts multi-tenant storefronts for ~2M merchants on a sharded Rails monolith with Pods (each Pod hosts a slice of merchants in its own database). Inventory and checkout follow a similar saga pattern; they famously survive Black Friday/Cyber Monday by isolating Pods so a single merchant's surge does not cascade.
Trade-off: Shopify's monolith plus sharded Pods is simpler than full microservices and avoids cross-service distributed transactions, but each Pod imposes a cap on a single merchant's scale. The lesson: isolating tenants by shard limits blast radius without paying the full microservices coordination tax.
Walmart's e-commerce stack runs as ~2000 microservices and handles peaks where in-store stock and online stock unify (you can buy online and pick up in store). They invest heavily in inventory consistency across physical and digital channels, with a unified inventory service that talks to both warehouse and store-level systems.
Trade-off: Walmart pays significant complexity for omnichannel inventory: the inventory service has to reconcile physical store counts in real time with online reservations. The lesson: unifying inventory across channels is product-defining (BOPIS works) but operationally one of the hardest parts of the stack.
Etsy serves 96M buyers across 7M sellers from a sharded MySQL backend with Memcached for hot reads. Each seller's products and orders are scoped to that seller; checkout aggregates orders from possibly many sellers in one cart. They run much of the platform on a primary/replica MySQL topology with strict read/write splits.
Trade-off: Etsy's per-seller orders make multi-seller carts (split into sub-orders at checkout) more complex than single-tenant marketplaces. The lesson: marketplace mechanics create unique data-modeling challenges; a single 'order' often becomes N orders behind the scenes, one per seller.
Common Interview Questions
Questions you might be asked about this topic
Client POSTs /checkout with cart, address, payment method, and an Idempotency-Key header. API gateway authenticates and rate-limits, then routes to Order Service. Order Service starts a saga, persists initial state, and runs steps in order: validate cart (price match, address valid), reserve inventory (Redis Lua script per SKU, with 10-min TTL hold), charge payment (Stripe API call with idempotency key), create order record (Postgres orders + order_items), confirm inventory (turn holds into permanent decrements), dispatch fulfillment via Kafka event to warehouse system. If any step fails, the saga walks backward and runs compensating actions for completed steps (refund payment, release inventory). On success, return 200 with order_id; on failure, return 4xx/5xx with a human-readable error. The customer sees a result page either way: an order confirmation on success, or 'we could not complete your order, please retry' on failure.
Two-layer defense. (1) Inventory reservation in Redis via a Lua script keyed by SKU: the script atomically checks `on_hand - reserved >= qty`, increments `reserved`, and adds an entry to a sorted set of holds with a 10-min TTL score. The script runs single-threaded per shard, so all 50K attempts serialize cleanly; once the sum reaches 100, every subsequent call returns 0 in microseconds. (2) The customer who got the hold then proceeds to payment; if payment fails or the customer abandons, the hold expires and the unit returns. A background sweeper drops expired holds and decrements `reserved`. The Redis fast path handles 100K reservation attempts/sec on a single shard; persistent state syncs to Postgres asynchronously.
The authoritative cart lives in Redis under `cart:<user_id>` (a hash of sku_id -> qty). Every cart mutation is write-behind: update Redis first (low latency), then asynchronously persist to Postgres for durability. When the customer logs in on a new device, the client fetches GET /cart, which reads Redis (cache hit) or reconstructs from Postgres on miss. Anonymous carts are keyed by session cookie; on login we merge the anonymous cart into the user cart (sum quantities, dedupe SKUs). For 50M active carts averaging 2 KB, total Redis footprint is ~100 GB across a sharded cluster. TTL of 30 days lets dormant carts age out of Redis; they remain in Postgres and reload on next access.
The saga catches it. Saga state in the `order_sagas` table records that PAYMENT_CHARGED completed; the next step, ORDER_CREATED, throws. The orchestrator runs the compensation list in reverse: REFUND_PAYMENT calls the payment provider with the original payment_id and an idempotency key, so the refund itself is safe to retry. RELEASE_INVENTORY removes the hold from Redis. The customer sees 'we could not complete your order'; in reality there was a brief charge that has been voided or refunded, so the bank statement may show a pending charge followed by a reversal (how quickly it clears depends on the card issuer). Saga state is durable, so a process restart in the middle resumes correctly.
Today: 5K orders/sec peak. To 50K, scale each layer. Inventory: Redis Lua throughput per shard is ~100K ops/sec for short scripts; for hyper-popular SKUs (one item, 50K reservations/sec) we'd shard the inventory counter into sub-buckets and reconcile (similar to rate-limiter hot-key pattern). Order Service: stateless; horizontal scale to N replicas behind the gateway. Payment: each Stripe API key has its own rate limit; spread across multiple keys with a key router. Database: orders table is already sharded by customer_id; add shards. Saga state: Postgres for sagas can become a write hotspot; consider DynamoDB or Cassandra. Fulfillment dispatch: Kafka can absorb 50K events/sec trivially. Edge: pre-warm CDN caches for popular product pages before the sale starts. Monitoring: alarm on saga latency p99 and on payment provider error rates.
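The sub-bucket idea for a hyper-popular SKU can be sketched in memory. This is an illustration under assumptions: `BucketedStock` is a hypothetical name, each bucket stands in for a counter on a different Redis shard, and the fallback scan is one simple way to avoid rejecting buyers while other buckets still hold stock.

```python
import random
from typing import List, Optional


class BucketedStock:
    """Splits one SKU's units across several counters to spread hot-key load."""

    def __init__(self, total_units: int, num_buckets: int) -> None:
        base, extra = divmod(total_units, num_buckets)
        # The first `extra` buckets get one additional unit so the sum is exact.
        self.buckets: List[int] = [base + (1 if i < extra else 0) for i in range(num_buckets)]

    def reserve_one(self, rng: random.Random) -> Optional[int]:
        """Try a random bucket first, then the others; return None when sold out."""
        start = rng.randrange(len(self.buckets))
        for offset in range(len(self.buckets)):
            i = (start + offset) % len(self.buckets)
            if self.buckets[i] > 0:
                self.buckets[i] -= 1
                return i  # bucket index that supplied the unit
        return None
```

The total never exceeds the original stock because each unit lives in exactly one bucket; the cost is that a near-sold-out SKU needs several lookups before it can answer "out of stock".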
Interview Tips
How to discuss this topic effectively
Lead with the read/write asymmetry. Saying 'browse is 300x more frequent than orders, so I optimize the read path with edge caching and Postgres replicas' frames the design before diving into checkout.
Bring up the saga and idempotency together. Both protect checkout from partial failure; an interviewer hears one and expects the other.
Use the TTL-hold pattern for inventory, not a naive decrement. Mentioning the race between page-load and payment is the senior signal.
Show you know when to switch from Postgres locks to Redis Lua for hot SKUs. Saying 'pessimistic locks are fine until 1000 simultaneous reservations on one row, then move to a Lua script in Redis' demonstrates production thinking.
Decompose explicitly into Catalog, Cart, Inventory, Order, Payment, Fulfillment. Naming the services first prevents the design from collapsing into one giant monolith on the whiteboard.
Common Mistakes
Pitfalls to avoid in interviews
Decrementing inventory immediately when the customer enters checkout
Use a TTL hold instead. A reservation expires after 10 minutes if the customer abandons; the unit returns to availability automatically. Permanent decrement happens only after payment succeeds.
Designing checkout as one big distributed transaction
Two-phase commit across inventory, payment, and order blocks all three for the duration; under load this fails. Use a saga: each step commits independently with a defined compensating action so partial failures unwind cleanly.
Skipping idempotency keys on checkout
A network blip during checkout will trigger a client retry; without idempotency the customer is charged twice. Require an Idempotency-Key header on POST /checkout and store the result for 24 hours.
Computing recommendations inline during product detail render
Personalized recommendations involve large feature lookups and model inference; doing them inline blows the 300 ms latency budget. Precompute offline and serve from a fast key-value store; the page can stream recs in after the main content.
Using one giant Postgres for everything
Catalog reads, inventory writes, and order writes have different access patterns and load profiles. Shard catalog by sku_id, orders by customer_id, and use a separate Redis hot path for inventory. Co-locating them on one Postgres bottlenecks all three under load.
