Case Study

Design Pastebin

Design a service like Pastebin or GitHub Gist where users dump up to 10 MB of text and share a link. The interview twist over a URL shortener: pastes are big, so you store them in object storage (S3) and only keep metadata in your database. This lesson covers the metadata vs blob split, expiration via S3 lifecycle policies, presigned URLs for direct uploads, syntax highlighting strategy, and how to handle the read pattern when most pastes are read once and never again.

design-pastebin

601

Design Instagram (Photo Sharing)

Design a photo sharing service like Instagram with 500M daily active users uploading 100M photos a day, served as personalized feeds at sub-200 ms p99. The interview centerpiece is the news feed: fan-out on write versus fan-out on read, the celebrity problem, and the hybrid pull-on-read model that real Instagram uses. We also cover photo upload pipelines (presigned URLs, multi-resolution generation, CDN), the metadata data model, and how to scale follow graphs that go from a few friends to hundreds of millions of followers.

design-instagram

797

Design Twitter / X (Social Feed)

Design a microblogging service like Twitter or X with 250M daily active users posting 500M tweets a day, served as a personalized timeline at sub-200 ms p99. The interview centerpiece is the home timeline: hybrid fan-out at the celebrity boundary, write amplification math, and how Twitter built Manhattan and the Timeline Service to make 250M people see fresh tweets within seconds. We also cover trending topics, the search index, retweet semantics, and how Twitter handles 50,000 tweets per second when a major event happens.

design-twitter

1.1k

Design Reddit (Forum / Voting)

Design a community-driven forum like Reddit with 50M daily active users, 500K subreddits, and the famous hot/top/best ranking algorithms that decide which posts you see. The interview centerpiece is the ranking system: how to score posts in real time as votes pour in, how to make the front page personalized without per-user fan-out, and how to render nested comment trees at sub-200 ms when a popular thread has 10,000 nested replies. We also cover voting fraud detection, the difference between hot and Wilson score, and the tiered cache that makes 50K reads per second on the front page survive a viral post.

design-reddit

909

Design YouTube (Video Platform)

Design a video platform like YouTube with 2 billion users, 500 hours of video uploaded every minute, and 1 billion hours watched per day. The interview centerpiece is the video pipeline: chunked uploads, parallel transcoding to 8 resolutions and 3 codecs, HLS/DASH adaptive streaming over a global CDN, and the metadata service that ties it all together. We also cover recommendations (the secondary feed problem), comment scaling, view-counter accuracy, and how YouTube serves 200 Tbps of egress without melting the internet.

design-youtube

adaptive-bitrate-streaming

video-streaming

video-transcoding

recommendation-system

1.1k

System Design

Premium

Design TikTok (Short-Form Video)

Design TikTok with 1.5B monthly active users, 100M short videos uploaded daily, and the For You Page that decides which video plays next for every viewer in under 100 ms. Unlike Instagram and Twitter, TikTok has no follower-driven feed - the For You Page is pure ML recommendation from a global pool. The interview centerpiece is the recommendation system architecture: candidate retrieval, two-tower models, online ranking with engagement signals, and how to keep video pre-loaded so the next swipe is instant. We also cover content moderation at scale, edge caching for the long-form-of-short-form access pattern, and why TikTok's product choice eliminated the celebrity fan-out problem entirely.

design-tiktok

case-study

social-content-platforms

for-you-page

short-form-video

recommendation-system

engagement-signals

content-moderation

edge-caching

video-cdn

video-streaming

social-media

system-design

advanced

premium

669

Hard

System Design

Premium

Design Facebook News Feed

Design Facebook's News Feed for 2 billion daily active users where every feed open reads from a personalized, ML-ranked timeline assembled from thousands of candidate posts in real time. Unlike Instagram's chronological precomputed feed or TikTok's pure recommendation, Facebook blends a friend graph, group memberships, page follows, and ads into one ranked stream via EdgeRank and its successor ranking algorithms. The interview centerpiece is the aggregator pattern: parallel candidate retrieval from many sources, real-time feature lookup, ML scoring, and online filtering, all under a 200 ms p99 budget. We also cover real-time updates (push notifications when a friend posts), edge ranking signals, and how Meta keeps the feed fresh with no precomputed timeline.

design-facebook-newsfeed

case-study

social-content-platforms

feed-ranking

edge-ranking

edgerank

aggregation-service

real-time-updates

fan-out-on-read

recommendation-system

social-media

system-design

advanced

premium

854

Hard

Design a Chat System (WhatsApp)

Design a real-time chat system like WhatsApp serving 2B users sending 100B messages per day with sub-second delivery, presence indicators, and read receipts. The interview centerpiece is the persistent WebSocket connection layer: how many connections per server, how to route a message to a recipient who may be on a different server, and how to guarantee delivery when the recipient is offline. We cover the message delivery state machine (sent, delivered, read), the connection routing layer that maps user_id to a chat server, the message store for offline delivery, and presence/typing indicators that operate at a higher write rate than messages themselves.

design-chat-system

design-notification-service

messaging-communication

664

Design a Notification Service

Design a multi-channel notification service that delivers 10B push, email, and SMS notifications per day across four independent provider networks (APNs, FCM, SendGrid, Twilio) with priority queues, per-user rate limits, and idempotent retries. The interview centerpiece is the fan-out from a single application event to multiple channels and providers, each with its own rate limits, failure modes, and delivery semantics. We cover priority queues for transactional vs marketing traffic, retry policies with exponential backoff, deduplication of repeated triggers, user preference enforcement, and the device token lifecycle that quietly invalidates tens of millions of tokens per day.

messaging-communication

946

Design an Email Service (Gmail)

Design an email service like Gmail handling 1.8B users storing 500EB of email, accepting ~300B inbound messages per day from the public SMTP network while filtering 90%+ as spam, and serving full-text search over a user's entire inbox in sub-200ms. The interview centerpiece is the asymmetric architecture: SMTP is an untrusted public protocol with hostile traffic patterns (spam, phishing, sender forgery) that needs heavy gateway-side filtering, while the user-facing IMAP/web layer needs cheap reads, pagination of huge mailboxes, and per-user inverted indexes for search. We cover the SMTP MX gateway, the spam pipeline (SPF/DKIM/DMARC + ML), the per-user inverted index for search, and how mailboxes scale when one user holds 50GB of email.

design-email-service

messaging-communication

926

System Design

Premium

Design Video Conferencing (Zoom)

Design a real-time video conferencing system like Zoom that supports 1-on-1 calls and meetings of up to 1000 participants with sub-200ms glass-to-glass latency, adapts to user bandwidth, and runs reliably across mobile networks. The interview centerpiece is the choice of media topology: peer-to-peer mesh (small calls), MCU mixing (centralized, expensive), or SFU forwarding (the modern standard). We cover the WebRTC stack (signaling vs media planes, ICE/STUN/TURN), simulcast and SVC for adaptive quality, recording pipelines, and how to keep latency low when participants span multiple continents.

design-video-conferencing

case-study

messaging-communication

video-conferencing

webrtc

sfu

mcu

rtp

simulcast

svc

ice-stun-turn

low-latency-media

real-time

system-design

advanced

premium

547

Hard

System Design

Premium

Design Discord (Real-time Communities)

Design Discord, a real-time community platform with 200M monthly active users, organized into 'guilds' (servers) of up to 500K members each, with persistent text channels storing trillions of messages and live voice channels with sub-100ms latency. The interview centerpiece is the dual architecture: a sharded text-message store (Cassandra/ScyllaDB) with billions of messages per guild and per-channel ordering, plus a real-time voice infrastructure with regional voice servers and custom UDP transport. We cover guild sharding by Snowflake ID, the Elixir/Erlang gateway that holds millions of WebSocket connections, presence at the guild scale, and how Discord migrated from MongoDB to Cassandra to ScyllaDB as message volume crossed trillions.

design-discord

case-study

messaging-communication

discord

guild-architecture

websocket-gateway

cassandra

scylladb

voice-channels

presence-fan-out

elixir

system-design

advanced

premium

944

Hard

Design Typeahead / Autocomplete

Design a typeahead/autocomplete service like Google Search's suggestion bar that returns the top 10 ranked completions for a query prefix in under 100ms p99, scaling to 5B searches per day with a multi-billion-entry suggestion index. The interview centerpiece is the data structure choice (trie vs sorted strings vs ngram index) and the offline pipeline that ranks suggestions by frequency, recency, personalization, and click-through rate. We cover the trie with precomputed top-K per node, edge n-gram indexes for typo tolerance, the MapReduce/Spark batch pipeline that rebuilds suggestions nightly, and the per-region edge cache that absorbs 99% of traffic.
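The "trie with precomputed top-K per node" idea can be sketched in a few lines. This is a minimal illustration, not the lesson's implementation: the names `SuggestionTrie`, `insert`, and `suggest` are hypothetical, the score is assumed to be a single precomputed number per phrase, and each phrase is assumed to be inserted once by an offline build.

```python
from typing import Dict, List, Tuple


class TrieNode:
    __slots__ = ("children", "top")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Best-first list of (score, phrase) for everything below this node.
        self.top: List[Tuple[int, str]] = []


class SuggestionTrie:
    """Hypothetical sketch: every node caches its own top-k completions."""

    def __init__(self, k: int = 10) -> None:
        self.k = k
        self.root = TrieNode()

    def insert(self, phrase: str, score: int) -> None:
        # Assumes each phrase is inserted exactly once (offline build).
        node = self.root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((score, phrase))
            # Highest score first; ties broken alphabetically.
            node.top.sort(key=lambda entry: (-entry[0], entry[1]))
            del node.top[self.k:]

    def suggest(self, prefix: str) -> List[str]:
        # Walk to the prefix node, then return its cached list: no subtree scan.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [phrase for _, phrase in node.top]
```

The trade-off is extra memory and build time in exchange for lookups whose cost depends only on the prefix length, which is why the ranking work fits naturally in a batch pipeline.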

712

Design a Web Crawler

Design a distributed web crawler that fetches 5 billion pages per month from the public web while respecting robots.txt, applying per-host politeness limits, deduplicating URLs and content across a 50PB corpus, and feeding the indexer pipeline downstream. The interview centerpiece is the URL frontier: a priority-aware queue of pending URLs sharded by host so politeness rules can be enforced per domain, plus content deduplication via hashing and shingling. We cover the fetcher worker pool, DNS caching, content extraction, the bloom-filter URL seen set, and how to handle hostile sites (large pages, redirect loops, slow responses, deliberate spam).

626

System Design

Premium

Design a Search Engine

Design a web-scale search engine that indexes 50B documents and serves 100K queries per second with sub-200ms p99 latency, ranking results by relevance (BM25), authority (PageRank), and personalization. The interview centerpiece is the inverted index sharded across thousands of nodes with scatter-gather query execution, plus the multi-stage ranking pipeline (cheap candidate generation, expensive learned-to-rank rerank). We cover document parsing and tokenization, the offline indexing pipeline (MapReduce/Spark), term-partitioned vs document-partitioned sharding, query understanding and expansion, snippet generation, and how to keep the index fresh as the web changes.

design-search-engine

case-study

search-discovery

search-engine

inverted-index

bm25

pagerank

scatter-gather

learned-to-rank

tf-idf

tokenization

near-real-time-indexing

system-design

advanced

premium

516

Hard

Design Nearby / Location Service (Yelp)

Design a 'nearby' service like Yelp that returns the top businesses within a search radius of the user's location, ranking by distance, rating, and category, scaling to 200M monthly users querying 100M businesses. The interview centerpiece is the geospatial index: how to find 'all businesses within 5 km of (lat, lng)' efficiently. We compare bounding-box scans, geohashes, quadtrees, R-trees, and PostGIS GiST indexes; we recommend geohash + secondary index for write-heavy systems and quadtree/R-tree for read-heavy ones. We cover business storage and search, review ranking, the infrequent-update vs frequent-query asymmetry, and how to handle the long tail of remote regions.

design-nearby-service

location-based-services

171

Design a Rate Limiter

Design a distributed rate limiter that protects an API platform from abuse and uneven load while staying fast and accurate at 1B requests per day. The interview centerpiece is choosing among the five canonical algorithms (fixed window, sliding window log, sliding window counter, token bucket, leaky bucket) and explaining how to make the chosen one atomic across a Redis cluster. We cover where to place the limiter (edge, gateway, in-process), per-IP vs per-user vs per-API-key keys, returning 429 with Retry-After, the hot key problem, and fail-open vs fail-closed under cache outages.
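Of the five algorithms listed, token bucket is the easiest to show compactly. The sketch below is single-process only and assumes hypothetical names (`TokenBucket`, `rate`, `capacity`, `clock`); making the refill-and-take step atomic across a Redis cluster is the part the lesson covers and is not shown here.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical in-process token bucket; state is not shared across nodes."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable so tests can control time
        self.tokens = capacity    # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller would typically respond 429 with Retry-After
```

A caller would keep one bucket per key (IP, user, or API key) and reject the request when `allow()` returns `False`.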

design-rate-limiter

ecommerce-marketplace

737

Design an E-Commerce Platform (Amazon)

Design an Amazon-scale e-commerce platform that lets 200M monthly users browse 100M SKUs, add items to a cart, check out, and have orders fulfilled from regional warehouses. The interview centerpiece is the order lifecycle: how to reserve inventory atomically while a customer is on the checkout page, how to chain cart-to-payment-to-fulfillment as a saga with compensating actions, and how to make checkout idempotent so a flaky network never charges a customer twice. We also cover catalog browse at scale, multi-warehouse fulfillment routing, and the asymmetric read/write workload that makes aggressive catalog caching the right call.

design-ecommerce

ecommerce-marketplace

651

Design a Ticketing System (Ticketmaster)

Design a Ticketmaster-style ticketing platform that sells reserved seats for concerts and sports events, with the central challenge being a flash onsale where 1M users compete for 50K seats in five minutes. The interview centerpiece is the seat reservation lock: each unique seat (Section A, Row 12, Seat 7) cannot be split or sub-bucketed like fungible inventory, so contention is unavoidable. We cover seat-level pessimistic holds with TTL, the virtual waiting room that randomizes queue position to absorb flash demand fairly, anti-bot defenses, dynamic pricing tiers, and the read-replica explosion that interactive seat maps cause.

design-ticketing-system

ecommerce-marketplace

998

System Design

Premium

Design a Payment System (Stripe)

Design a Stripe-style payment platform that processes 100M payments per day across 50 currencies and dozens of payment methods, where the central requirement is financial correctness: never charge a customer twice, never lose a payment, always reconcile to the cent. The interview centerpiece is the trio of idempotency keys, the payment intent state machine, and the immutable double-entry ledger - together they make the system safe in the face of network failures, partial outages, and adversarial retries. We also cover webhook delivery with signing and exponential backoff, PCI scope minimization through tokenization, multi-region availability, and the reconciliation jobs that compare our ledger to the bank's settlement files every night.

design-payment-system

case-study

ecommerce-marketplace

stripe

payment-system

idempotency

double-entry-ledger

reconciliation

webhooks

pci

system-design

advanced

premium

Hard

Design a Key-Value Store (DynamoDB)

Design a Dynamo-style distributed key-value store that scales linearly to thousands of nodes, stays available during partitions, and offers tunable consistency through a quorum (N, W, R). The interview centerpiece is the trio that makes this work at scale: consistent hashing with virtual nodes for partitioning, N/W/R quorums for replication and consistency, and vector clocks for resolving concurrent writes. We cover the gossip protocol for membership, Merkle trees for anti-entropy, hinted handoff for transient failures, sloppy quorum for write availability during partitions, and the LSM-tree storage engine that powers each node.
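The tunable-consistency knob comes down to one inequality: a read quorum and a write quorum are guaranteed to share at least one replica when R + W > N. The helper below is a hypothetical illustration of that check only; it does not model sloppy quorum or vector-clock reconciliation, which weaken or complement this guarantee.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    return r + w > n


# Common settings for N = 3 replicas:
print(quorums_overlap(3, 2, 2))  # balanced reads and writes
print(quorums_overlap(3, 1, 1))  # fastest, but a read may miss the latest write
print(quorums_overlap(3, 3, 1))  # cheap reads, writes need every replica
```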

design-key-value-store

infrastructure-storage

457

Design a Distributed Cache (Redis)

Design a Redis-style in-memory distributed cache that serves billions of GET/SET operations per day at sub-millisecond latency, with sharding across hundreds of nodes and explicit eviction when memory fills. The interview centerpiece is the eviction-and-partitioning combination: how LRU and LFU choose what to drop, and how a cluster picks which node owns each key without a central coordinator. We compare client-side hashing, proxy-based partitioning (twemproxy), and Redis Cluster's hash-slot model; we cover cache-aside as the dominant access pattern, replica failover, optional persistence, and the sub-ms latency budget that makes this design fundamentally different from the durable KV store covered in the previous case study.

design-distributed-cache

infrastructure-storage

System Design

Premium

Design Object Storage (S3)

Design an S3-style object storage service that stores trillions of immutable blobs ranging from 1 KB to 5 TB at eleven nines of durability and a fraction of the cost of triple replication. The interview centerpiece is the trio that makes this economical: erasure coding (typically 12 data shards plus 4 parity shards) instead of full replicas; a separate metadata service that maps object keys to chunk locations; and multi-part upload that lets a 5 TB object stream from many sources in parallel. We also cover the bucket/object namespace, lifecycle policies that move cold objects to colder tiers, immutability with versioning, pre-signed URLs for direct client transfer, and the move from eventual to strong read-after-write consistency that AWS shipped in 2020.
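The cost claim is simple arithmetic, worked here for the 12 + 4 layout against triple replication. The function names are hypothetical, and the note about surviving shard loss assumes a maximum-distance-separable code such as Reed-Solomon, which the description does not name.

```python
from fractions import Fraction


def storage_overhead(data_shards: int, parity_shards: int) -> Fraction:
    """Raw bytes stored per byte of user data for a data + parity layout."""
    return Fraction(data_shards + parity_shards, data_shards)


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data with full replicas."""
    return Fraction(copies)


ec = storage_overhead(12, 4)   # 16/12 = 4/3
rep = replication_overhead(3)  # 3
savings = 1 - ec / rep         # share of raw capacity saved vs replication

print(f"erasure coding {float(ec):.2f}x, replication {float(rep):.2f}x, "
      f"saved {float(savings):.0%}")
```

Under the Reed-Solomon assumption, the 12 + 4 layout can rebuild an object after losing any 4 of its 16 shards, while three replicas survive the loss of 2 copies.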

design-object-storage

case-study

infrastructure-storage

object-storage

erasure-coding

metadata-service

multi-part-upload

immutability

system-design

advanced

premium

1.1k

Hard

System Design

Premium

Design a Distributed File System (GFS/HDFS)

Design a Google-File-System or HDFS-style distributed file system that stores petabytes across commodity hardware, optimized for batch analytics workloads where files are large (gigabytes), reads are sequential, and writes are append-mostly. The interview centerpiece is the leader-based architecture: one strongly-consistent master node holds the entire file namespace and chunk locations in memory, while many chunkservers store the actual data in 64-128 MB chunks replicated three times across racks. We cover the lease-based primary-replica protocol that lets the master stay out of the data path, the heartbeat-and-chunk-report mechanism that keeps cluster state fresh, and the federation strategy for scaling beyond a single master's memory.

design-distributed-file-system

case-study

infrastructure-storage

gfs

hdfs

distributed-file-system

chunk-server

namenode

leader-based

system-design

advanced

premium

Hard

Design a Content Delivery Network

Design a Cloudflare/Akamai/Fastly-style content delivery network that offloads 95%+ of static traffic from origin servers, brings latency from hundreds of milliseconds down to single digits, and absorbs DDoS attacks at the edge. The interview centerpiece is the cache hierarchy and routing: hundreds of edge POPs anycast-routed to the user's nearest location, a regional shield layer that consolidates fetches, and the origin only seeing the long tail of misses. We cover cache key design with Vary headers, the TTL lifecycle and purge model, stale-while-revalidate for resilience under origin outages, and the moves CDNs make to keep dynamic content fast (programmable edge functions, smart routing).

design-cdn

infrastructure-storage

stale-while-revalidate

865

Design Uber / Lyft (Ride-Sharing)

Design a ride-sharing service like Uber that matches a rider's request to a nearby driver in under 5 seconds, streams driver locations every 4 seconds, computes ETAs, and applies surge pricing in real time at 1M concurrent active drivers and 100K rides/min globally. The interview centerpiece is the dispatch path: how to find the nearest available driver, hold them briefly, and confirm the match without race conditions. We compare geohash, S2, and H3 for the driver index and recommend H3 hex grid for ride-sharing because hex neighbors are equidistant. We cover the trip state machine, surge multipliers per cell, and how location updates fan out without melting the network.

design-uber

ride-sharing-and-maps

285

System Design

Premium

Design Google Maps

Design Google Maps: a global mapping service that renders the Earth from 256x256 tiles, computes the shortest driving route in under 200 ms, and folds live traffic into routing for 1B users issuing 5B route requests per day. The interview centerpiece is the routing engine: why plain Dijkstra is too slow on a continent-scale graph and how Contraction Hierarchies (CH) precompute shortcuts so a live query explores only a small fraction of the graph. We cover the tile pyramid (zoom 0-20, ~1 trillion possible tiles at zoom 20), how live traffic from 100M Android phones updates edge weights every minute, and how to keep navigation latency under 1 second when re-routing.
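The "~1 trillion tiles at zoom 20" figure follows from the tile pyramid itself, assuming the usual quadtree scheme in which zoom 0 is a single tile and each zoom level splits every tile into four. The function names below are hypothetical.

```python
def tiles_at_zoom(zoom: int) -> int:
    """Tiles in one zoom level: a 2^zoom by 2^zoom grid."""
    return 4 ** zoom


def tiles_through_zoom(max_zoom: int) -> int:
    """Total tiles in the pyramid from zoom 0 up to and including max_zoom."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


print(f"{tiles_at_zoom(20):,}")       # 1,099,511,627,776
print(f"{tiles_through_zoom(20):,}")  # 1,466,015,503,701
```

Zoom 20 alone accounts for about three quarters of the whole pyramid, which is why the deepest levels are usually the ones rendered on demand rather than precomputed.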

design-google-maps

case-study

ride-sharing-and-maps

google-maps

graph-algorithms

dijkstra

a-star

contraction-hierarchies

routing-engine

map-tiles

tile-rendering

real-time-traffic

cdn

geospatial

h3-hex-grid

system-design

advanced

premium

584

Hard

Design Food Delivery (DoorDash)

Design a food delivery service like DoorDash that links three actors (customer, restaurant, courier) with an end-to-end SLA of <40 minutes per order at 10M orders per day across 500K restaurants. The interview centerpiece is the courier dispatch problem, which is fundamentally different from ride-sharing: it is a 3-leg trip (courier-to-restaurant, wait for food, restaurant-to-customer) and the platform routinely batches multiple orders onto one courier to cut cost. We compare Uber's 1:1 matching to DoorDash's many-to-1 batching, design the ETA composition (prep time + assignment time + drive time + handoff), and walk through the order state machine that coordinates three independent humans.

design-food-delivery