System Design Article
WebSockets, Long Polling & SSE
Difficulty: Medium
Standard HTTP is a request-response protocol: the client asks, the server answers. But many modern applications need data delivered in real time, often in both directions: chat messages, live notifications, stock tickers, collaborative editing, and gaming. This lesson covers three techniques for real-time communication: long polling, Server-Sent Events (SSE), and WebSockets. You will learn how each works, their trade-offs, and when to use which in a system design interview.
The Problem: HTTP Is Not Built for Real-Time
Standard HTTP follows a strict request-response pattern: the client sends a request, the server sends a response, and the connection is effectively idle until the next request. The server cannot spontaneously send data to the client.
This creates a problem for real-time applications:
Scenario: Chat Application
Alice sends a message to Bob. The server receives Alice's message, but how does Bob's client know there is a new message?
Option 1: Client polls regularly (Short Polling)
Bob's client: GET /messages?since=last_id (every 2 seconds)
Server: 200 OK {"messages": []} - nothing new
... 2 seconds later ...
Bob's client: GET /messages?since=last_id
Server: 200 OK {"messages": []} - still nothing
... 2 seconds later ...
Bob's client: GET /messages?since=last_id
Server: 200 OK {"messages": [{"from": "Alice", "text": "Hi!"}]} - finally!
Problems with short polling:
- Wasted resources: When messages are infrequent, the vast majority of requests return empty responses.
- Latency: Average delay of half the polling interval (1 second with 2-second polling).
- Server load: 1 million connected users polling every 2 seconds = 500K requests/second of mostly empty responses.
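The latency and load figures above follow from simple arithmetic. The sketch below reproduces them; `shortPollingCost` is a hypothetical helper written for this illustration, and it assumes every user polls at the same fixed interval.

```javascript
// Back-of-the-envelope cost of short polling (illustrative helper, not a library API).
function shortPollingCost(users, intervalSeconds) {
  return {
    // Every connected user sends one request per polling interval.
    requestsPerSecond: users / intervalSeconds,
    // A message arrives at a random point within the interval,
    // so on average it waits half an interval for the next poll.
    averageDelaySeconds: intervalSeconds / 2,
  };
}

const cost = shortPollingCost(1_000_000, 2);
console.log(cost); // { requestsPerSecond: 500000, averageDelaySeconds: 1 }
```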
We need better solutions. Enter long polling, SSE, and WebSockets.
Long Polling
Long polling is an improvement over short polling where the server holds the request open until new data is available (or a timeout expires).
How Long Polling Works
Bob's Client                        Server
|                                     |
| GET /messages?since=last_id         |
|------------------------------------>|
|                                     | (server holds connection open)
|                                     | ... waiting for new data ...
|                                     |
|                                     | Alice sends a message!
|                                     |
| 200 OK {"messages": [{...}]}        |
|<------------------------------------|
|                                     |
| Immediately re-send:                |
| GET /messages?since=new_last_id     |
|------------------------------------>|
|                                     | (holds again...)
Key Characteristics
- Near real-time: Messages arrive as soon as the server has them (no polling interval delay).
- HTTP-based: Works with existing HTTP infrastructure - load balancers, proxies, firewalls all support it.
- Server holds connections: Each client has a pending HTTP request. With 100K users, the server holds 100K open connections.
- Timeout and reconnect: The server should time out after 30-60 seconds and return an empty response. The client immediately reconnects.
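On the server side, the "hold, then answer or time out" behavior can be sketched as a small in-memory registry of pending requests. This is a minimal sketch under stated assumptions: `LongPollHub`, `wait`, and `publish` are hypothetical names, the HTTP layer is omitted, and `respond` stands in for writing the HTTP response.

```javascript
// Minimal in-memory long-poll hub (hypothetical names; no HTTP layer shown).
class LongPollHub {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.waiters = new Set(); // requests currently being held open
  }

  // Hold a request open. `respond` is called exactly once: with new
  // messages, or with an empty array when the timeout expires.
  wait(respond) {
    const waiter = { respond, timer: null };
    waiter.timer = setTimeout(() => {
      this.waiters.delete(waiter);
      respond([]); // timeout: empty response, the client polls again
    }, this.timeoutMs);
    this.waiters.add(waiter);
  }

  // New data arrived: answer every held request immediately.
  // Returns how many held requests were answered.
  publish(message) {
    const pending = [...this.waiters];
    this.waiters.clear();
    for (const waiter of pending) {
      clearTimeout(waiter.timer);
      waiter.respond([message]);
    }
    return pending.length;
  }
}

const hub = new LongPollHub(30000);
const received = [];
hub.wait((messages) => received.push(...messages)); // Bob's held request
hub.publish({ from: 'Alice', text: 'Hi!' });        // Alice sends a message
console.log(received); // [ { from: 'Alice', text: 'Hi!' } ]
```

A real server would also need to track which messages each client has already seen (the `since` parameter in the diagram), which this sketch leaves out.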
Advantages
- Works everywhere (every browser, every proxy, every CDN supports HTTP).
- No special protocol or library needed.
- Easy to implement on both client and server.
- Reduces wasted empty responses compared to short polling.
Limitations
- One-directional push: The server can push to the client, but the client cannot push to the server over the same connection. Sending data from client to server requires a separate HTTP request (for example, a POST).
- Connection overhead: Each response/reconnection cycle requires a new HTTP request with full headers.
- Server resource usage: Holding many open connections consumes server resources (threads, memory).
When to Use Long Polling
- Simple notification systems: "You have 3 new messages" - low frequency, simple data.
- Email clients: Check for new emails - low to medium frequency.
- Environments where WebSockets are blocked: Some corporate firewalls block WebSocket connections.
- Prototype or MVP: Quick to implement, no special infrastructure needed.
Server-Sent Events (SSE)
Server-Sent Events (SSE) is a standard that allows the server to push events to the client over a single, long-lived HTTP connection. Unlike long polling, the connection stays open and the server can send multiple events over time.
How SSE Works
Bob's Client                        Server
|                                     |
| GET /events (Accept: text/event-stream)
|------------------------------------>|
|                                     |
| HTTP/1.1 200 OK                     |
| Content-Type: text/event-stream     |
| Connection: keep-alive              |
|<------------------------------------|
|                                     |
| data: {"type":"message",            |
|        "from":"Alice",              |
|        "text":"Hi!"}                |
|<------------------------------------|
|                                     |
| ... connection stays open ...       |
|                                     |
| data: {"type":"typing",             |
|        "user":"Alice"}              |
|<------------------------------------|
|                                     |
SSE Event Format
event: message
data: {"from": "Alice", "text": "Hi!"}
id: 42

event: typing
data: {"user": "Alice"}

event: notification
data: {"count": 3}
id: 43
retry: 5000

- event: Event type (clients can listen for specific types).
- data: The payload (usually JSON).
- id: Event ID for resuming after disconnection.
- retry: Reconnection interval in milliseconds.
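To make the wire format concrete, here is a small sketch of how a server might serialize one event. `formatSseEvent` is a hypothetical helper, not part of any framework. It relies on two rules of the event-stream format: each field goes on its own line, and a blank line terminates the event.

```javascript
// Serialize one event into the text/event-stream wire format (illustrative helper).
function formatSseEvent({ event, data, id, retry }) {
  const lines = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  // A payload containing newlines must be sent as one "data:" line per line.
  for (const line of String(data).split('\n')) lines.push(`data: ${line}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  if (retry !== undefined) lines.push(`retry: ${retry}`);
  // The trailing blank line tells the client the event is complete.
  return lines.join('\n') + '\n\n';
}

const frame = formatSseEvent({
  event: 'message',
  data: JSON.stringify({ from: 'Alice', text: 'Hi!' }),
  id: 42,
});
process.stdout.write(frame);
```

In an HTTP handler, the returned string would be written to the open response stream each time there is something to send.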
Built-in Browser Support
const source = new EventSource('/events');

source.addEventListener('message', (event) => {
  const data = JSON.parse(event.data);
  console.log('New message:', data);
});

source.addEventListener('notification', (event) => {
  const data = JSON.parse(event.data);
  updateBadge(data.count);
});

// Automatic reconnection with Last-Event-ID header
source.onerror = () => console.log('Connection lost, reconnecting...');
Advantages
- Simple: Uses standard HTTP. The EventSource API handles reconnection automatically.
- Efficient: Single connection, no reconnection overhead (unlike long polling).
- Automatic reconnection: The browser reconnects automatically with the Last-Event-ID header, so the server can resume from where it left off.
- Text-based: Easy to debug (you can see events in browser dev tools).
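Resuming "from where it left off" requires the server to keep recent events around. A minimal sketch of that replay step follows, assuming a hypothetical in-memory buffer ordered oldest first with increasing numeric ids. `eventsToReplay` is an illustrative name, and a real system would bound the buffer's size or age.

```javascript
// Pick the buffered events a reconnecting client has not seen yet.
// `lastEventId` is the raw Last-Event-ID header value (a string) or undefined.
function eventsToReplay(buffer, lastEventId) {
  // No header means a first-time connection: nothing to replay.
  if (lastEventId === undefined || lastEventId === null || lastEventId === '') {
    return [];
  }
  const lastSeen = Number(lastEventId);
  return buffer.filter((evt) => evt.id > lastSeen);
}

const recent = [
  { id: 41, data: 'a' },
  { id: 42, data: 'b' },
  { id: 43, data: 'c' },
];
const missed = eventsToReplay(recent, '42'); // header values arrive as strings
console.log(missed); // [ { id: 43, data: 'c' } ]
```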
Limitations
- Server-to-client only: SSE is unidirectional. The client cannot send data over the SSE connection - it must use separate HTTP requests.
- Limited connections: Browsers limit SSE connections to ~6 per domain (HTTP/1.1). HTTP/2 largely solves this with multiplexing.
- Text only: SSE transmits text data (UTF-8). Binary data must be Base64-encoded.
- No native support in some environments: While all modern browsers support SSE, some server frameworks and proxies need configuration.
When to Use SSE
- Live feeds: News tickers, social media feeds, sports scores - server pushes updates, client just displays them.
- Notifications: Real-time notification counts, alerts.
- Progress updates: File upload progress, long-running job status.
- Stock tickers: Continuous price updates from server to client.
WebSockets
WebSockets provide a full-duplex, bidirectional communication channel over a single, long-lived TCP connection. Both the client and server can send messages at any time without waiting for a request.
How WebSockets Work
The Handshake (HTTP Upgrade)
WebSocket connections start as a regular HTTP request with an Upgrade header:
Client -> Server:
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Server -> Client:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
After the handshake, the protocol switches from HTTP to WebSocket. The TCP connection remains open.
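The Sec-WebSocket-Accept value is not arbitrary. RFC 6455 defines it as the Base64-encoded SHA-1 hash of the client's Sec-WebSocket-Key concatenated with a fixed GUID. The sketch below reproduces the header values from the handshake above using Node's built-in crypto module; `computeAcceptKey` is an illustrative name.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept header from the client's Sec-WebSocket-Key.
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This proves to the client that the server understood the WebSocket upgrade, rather than being an HTTP server or cache that echoed the request.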
Bidirectional Communication
Client                              Server
|                                     |
| [HTTP Upgrade Handshake]            |
|<----------------------------------->|
|                                     |
| === WebSocket Connection Open ===   |
|                                     |
| {"type":"message","text":"Hi!"}     |
|------------------------------------>|
|                                     |
| {"type":"message","text":"Hello!"}  |
|<------------------------------------|
|                                     |
| {"type":"typing","user":"Alice"}    |
|<------------------------------------|
|                                     |
| {"type":"message","text":"Bye"}     |
|------------------------------------>|
|                                     |
| Ping/Pong (keepalive)               |
|<----------------------------------->|
|                                     |
Key Characteristics
- Full-duplex: Both sides can send messages simultaneously.
- Low overhead: After the handshake, each message carries only a small frame header (2-14 bytes, depending on payload size and masking) vs. hundreds of bytes for HTTP headers.
- Persistent connection: The connection stays open for the lifetime of the session.
- Binary and text: Supports both text (UTF-8) and binary data.
- Ping/Pong: Built-in heartbeat mechanism to detect dead connections.
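Ping/pong only helps if the server acts on missing replies. One way to sketch the bookkeeping is below, assuming the server sends pings periodically and records when each pong arrives. `HeartbeatTracker` and its methods are hypothetical names, and timestamps are passed in explicitly so the logic is easy to follow.

```javascript
// Track the last pong per connection and report connections that went silent.
class HeartbeatTracker {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.lastPongAt = new Map(); // connectionId -> time (ms) of the last pong
  }

  recordPong(connectionId, nowMs) {
    this.lastPongAt.set(connectionId, nowMs);
  }

  // Return the ids silent for longer than the timeout, and forget them.
  reapDead(nowMs) {
    const dead = [];
    for (const [id, seenAt] of this.lastPongAt) {
      if (nowMs - seenAt > this.timeoutMs) dead.push(id);
    }
    for (const id of dead) this.lastPongAt.delete(id);
    return dead;
  }
}

const tracker = new HeartbeatTracker(30000);
tracker.recordPong('alice', 0);
tracker.recordPong('bob', 0);
tracker.recordPong('alice', 25000);           // alice answered a later ping
const deadConnections = tracker.reapDead(40000); // bob has been silent for 40 s
console.log(deadConnections); // [ 'bob' ]
```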
Advantages
- True real-time: Either side can send the moment it has data, with no per-message request setup, so delivery latency is close to the underlying network latency.
- Minimal overhead: No HTTP headers on each message. Ideal for high-frequency messaging.
- Bidirectional: Client and server are peers - either can initiate communication.
- Binary support: Efficient for sending images, audio, or binary protocols.
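How small is the per-message overhead? Under RFC 6455 a frame header is 2 bytes, plus 2 or 8 bytes of extended length for larger payloads, plus a 4-byte masking key on frames sent by the client. The helper below (an illustrative name, not a library function) computes the header size.

```javascript
// Size in bytes of a WebSocket frame header for a given payload length.
function frameHeaderBytes(payloadLength, maskedByClient) {
  let bytes = 2; // FIN/opcode byte + mask bit and 7-bit length byte
  if (payloadLength > 65535) bytes += 8;    // 64-bit extended length
  else if (payloadLength > 125) bytes += 2; // 16-bit extended length
  if (maskedByClient) bytes += 4;           // client-to-server frames carry a masking key
  return bytes;
}

console.log(frameHeaderBytes(50, false));    // 2  (small server-to-client message)
console.log(frameHeaderBytes(50, true));     // 6  (small client-to-server message)
console.log(frameHeaderBytes(100000, true)); // 14 (large client-to-server message)
```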
Limitations
- Stateful connections: The server must maintain an open connection for each client. This complicates scaling.
- Load balancing complexity: Standard HTTP load balancers distribute requests. WebSocket connections are long-lived, so you need sticky sessions or a connection-aware load balancer.
- Not cacheable: WebSocket messages cannot be cached by CDNs or HTTP proxies.
- Firewall/proxy issues: Some corporate networks block WebSocket connections (port 80/443 with Upgrade header).
- Reconnection logic: Unlike SSE, WebSockets have no built-in automatic reconnection. You must implement retry logic yourself.
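A common shape for that retry logic is exponential backoff with jitter. The sketch below uses the "full jitter" variant, where the delay is a random value between zero and an exponentially growing ceiling. `reconnectDelayMs` and its default values are assumptions chosen for illustration.

```javascript
// Delay before reconnect attempt number `attempt` (0-based), with full jitter.
function reconnectDelayMs(
  attempt,
  { baseMs = 500, capMs = 30000, random = Math.random } = {}
) {
  // The ceiling doubles with every failed attempt, up to a cap.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Randomizing within the ceiling spreads clients out so they do not
  // all reconnect at the same instant after a server restart.
  return Math.floor(random() * ceiling);
}

// Example wiring in a browser (not executed here):
// socket.onclose = () => setTimeout(connect, reconnectDelayMs(attempt++));

for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt}: wait up to ${Math.min(30000, 500 * 2 ** attempt)} ms`);
}
```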
When to Use WebSockets
- Chat applications: WhatsApp, Slack, Discord - bidirectional, high-frequency messaging.
- Collaborative editing: Google Docs - real-time cursor positions, text changes.
- Multiplayer gaming: Low-latency, bidirectional game state updates.
- Financial trading: Real-time order book updates, trade execution.
- Live dashboards: Monitoring systems with real-time metrics.
Comparing All Four Techniques
Head-to-Head Comparison
| Feature | Short Polling | Long Polling | SSE | WebSocket |
|---|---|---|---|---|
| Direction | Client pull (request/response) | Server -> Client | Server -> Client | Bidirectional |
| Latency | High (polling interval) | Low-Medium | Low | Very Low |
| Connection overhead | New request per poll | New request per event | Single persistent connection | Single persistent connection |
| HTTP compatible | Yes | Yes | Yes | Initial handshake only |
| Proxy/CDN friendly | Yes | Mostly | Yes | Varies (some block) |
| Browser support | Universal | Universal | All modern browsers | All modern browsers |
| Auto-reconnection | N/A (client controls) | Client responsibility | Built-in | Must implement |
| Binary data | Yes (in body) | Yes (in body) | No (text only) | Yes |
| Server complexity | Low | Medium | Low-Medium | High |
| Scaling difficulty | Low | Medium | Medium | High |
| Message frequency | Low | Low-Medium | Medium-High | High |
Decision Matrix: Which Technique for Which Problem?
| Use Case | Best Choice | Why |
|---|---|---|
| Simple notifications ("3 new emails") | Long Polling or SSE | Low frequency, server-to-client only |
| Live sports scores | SSE | Server pushes updates, no client-to-server needed |
| Chat application | WebSocket | Bidirectional, high frequency |
| Stock ticker | SSE or WebSocket | SSE for display-only; WebSocket if client sends trades |
| Online multiplayer game | WebSocket | Low latency, bidirectional, binary data |
| Collaborative editing | WebSocket | Bidirectional, real-time cursor and text sync |
| Social media feed (new posts) | SSE | Server pushes new posts; "like" is a separate HTTP POST |
| IoT sensor data | WebSocket | High-frequency bidirectional data + commands |
| Long-running job progress | SSE | Server reports progress, client just displays |
| Presence indicators ("user is online") | WebSocket | Bidirectional heartbeats, real-time status |
Scaling Real-Time Systems
Real-time connections are stateful, which makes scaling fundamentally different from stateless HTTP APIs.
The Core Challenge
With a stateless REST API, any server can handle any request. With WebSockets, a client has a persistent connection to a specific server. If you have 3 WebSocket servers, User A might be connected to Server 1 and User B to Server 2. When A sends a message to B, Server 1 must somehow deliver it to Server 2.
Solution 1: Pub/Sub Message Broker
[Client A] <--WS--> [Server 1] --publish--> [Redis Pub/Sub] --subscribe--> [Server 2] <--WS--> [Client B]
- When Server 1 receives a message from Client A destined for Client B, it publishes the message to a Redis Pub/Sub channel.
- Server 2 subscribes to relevant channels and forwards the message to Client B over the existing WebSocket connection.
- This is one of the most common patterns in real-time systems; Socket.IO, for example, offers a Redis adapter for exactly this purpose.
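The flow above can be sketched in a single process with an in-memory stand-in for the broker. Everything here is hypothetical: `Broker` imitates only the publish/subscribe shape of something like Redis Pub/Sub, `GatewayServer` stands in for a WebSocket server, and the `send` callbacks stand in for real sockets.

```javascript
// In-memory stand-in for a pub/sub broker (single process, illustration only).
class Broker {
  constructor() {
    this.subscribers = new Map(); // channel -> Set of handler functions
  }
  subscribe(channel, handler) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, new Set());
    this.subscribers.get(channel).add(handler);
  }
  publish(channel, message) {
    for (const handler of this.subscribers.get(channel) ?? []) handler(message);
  }
}

class GatewayServer {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker;
  }
  // A client connected here: subscribe to that user's channel and
  // forward anything published on it over the client's socket.
  connect(userId, send) {
    this.broker.subscribe(`user:${userId}`, (message) => send(message));
  }
  // A message arrived on a local socket: publish it so that whichever
  // server holds the recipient's connection can deliver it.
  handleIncoming(fromUserId, toUserId, text) {
    this.broker.publish(`user:${toUserId}`, { from: fromUserId, text });
  }
}

const broker = new Broker();
const server1 = new GatewayServer('server-1', broker);
const server2 = new GatewayServer('server-2', broker);
const bobInbox = [];
server1.connect('alice', () => {});
server2.connect('bob', (message) => bobInbox.push(message));
server1.handleIncoming('alice', 'bob', 'Hi!');
console.log(bobInbox); // [ { from: 'alice', text: 'Hi!' } ]
```

Server 1 never needs to know where Bob is connected; the channel name is the only coupling between the two servers.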
Solution 2: Consistent Hashing
Route all connections for a specific chat room or user to the same server using consistent hashing. This reduces cross-server communication but limits horizontal scaling.
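A minimal hash ring illustrates the routing. This is a sketch under assumptions: `HashRing` is a hypothetical class, the hash is a simple FNV-1a-style function chosen for brevity, and each server gets 100 virtual nodes to even out the distribution.

```javascript
// Simple 32-bit FNV-1a-style string hash (adequate for a sketch).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class HashRing {
  constructor(servers, virtualNodes = 100) {
    this.ring = []; // sorted list of { point, server }
    for (const server of servers) {
      for (let v = 0; v < virtualNodes; v++) {
        this.ring.push({ point: fnv1a(`${server}#${v}`), server });
      }
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // Walk clockwise from the key's hash to the first server point.
  lookup(key) {
    if (this.ring.length === 0) throw new Error('ring is empty');
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].server; // wrap past the end
  }
}

const ring = new HashRing(['ws-1', 'ws-2', 'ws-3']);
console.log(ring.lookup('room:42') === ring.lookup('room:42')); // true
```

When a server is removed, only the keys that mapped to it move to another server; keys on the remaining servers stay put, which is what keeps reconnect churn low.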
Solution 3: Dedicated Connection Manager
Separate the connection management (which client is on which server) from the business logic:
[Connection Manager]          [Business Logic Servers]
- Tracks all WS               - Stateless
  connections                 - Processes messages
- Routes messages             - Queries databases
  to correct server
Scaling Challenges & Solutions
| Challenge | Solution |
|---|---|
| Connection limits (OS file descriptors) | Increase ulimit; use epoll/kqueue for efficient I/O multiplexing. A single server can handle ~100K-1M connections. |
| Memory per connection | Minimize per-connection state. A WebSocket connection uses ~2-10KB of memory. 1M connections = 2-10GB RAM. |
| Load balancer routing | Use sticky sessions (based on cookie or IP) or use a layer 4 (TCP) load balancer that forwards the initial HTTP Upgrade. |
| Cross-server messaging | Redis Pub/Sub, Kafka, or a dedicated message broker for routing messages between servers. |
| Reconnection storms | If a server restarts, all its clients reconnect simultaneously. Implement exponential backoff with jitter. |
| Health monitoring | WebSocket connections do not generate regular HTTP requests. Use ping/pong frames to detect dead connections. |
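The memory and connection-limit rows lend themselves to a quick capacity estimate. The sketch below uses the figures from the table; `estimateCapacity` is a hypothetical helper, and the 500K connections-per-server figure is an assumption picked from the middle of the 100K-1M range.

```javascript
// Rough capacity estimate for a fleet of connection gateways.
function estimateCapacity(connections, { bytesPerConnection, connectionsPerServer }) {
  return {
    // Total memory for connection state alone, in decimal gigabytes.
    memoryGB: (connections * bytesPerConnection) / 1e9,
    // Servers needed if each one holds at most `connectionsPerServer` connections.
    gatewayServers: Math.ceil(connections / connectionsPerServer),
  };
}

const estimate = estimateCapacity(10_000_000, {
  bytesPerConnection: 10_000,    // upper end of the ~2-10KB range
  connectionsPerServer: 500_000, // assumed, within the ~100K-1M range
});
console.log(estimate); // { memoryGB: 100, gatewayServers: 20 }
```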
Real-World Scale Numbers
- Discord: Handles millions of concurrent WebSocket connections across its gateway fleet, which is built on Elixir, with Rust used for some performance-critical services.
- Slack: Uses a combination of WebSockets for real-time messaging and HTTP for API calls, with a connection manager service.
- WhatsApp: Uses a custom protocol over TCP (not standard WebSockets) to handle 2+ billion users with a remarkably small server fleet.
Real-Time Communication in Interviews
How to Choose in an Interview
When an interviewer asks you to design a real-time feature, follow this decision process:
Step 1: Identify the communication direction
- Server-to-client only? -> SSE or Long Polling
- Bidirectional? -> WebSocket
Step 2: Assess the message frequency
- Low frequency (notifications, email)? -> Long Polling or SSE
- Medium frequency (live feed, scores)? -> SSE
- High frequency (chat, gaming)? -> WebSocket
Step 3: Consider the scaling requirements
- Small scale or prototype? -> Long Polling (simplest)
- Large scale, server-to-client? -> SSE (efficient, simple)
- Large scale, bidirectional? -> WebSocket + Pub/Sub
Step 4: Address the scaling challenge
- Always mention how you would handle cross-server message delivery (Redis Pub/Sub, Kafka).
- Mention sticky sessions or layer 4 load balancing for WebSocket connections.
- Discuss reconnection strategy (exponential backoff with jitter).
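Steps 1 and 2 can be condensed into a tiny decision function. This is a deliberate simplification: `chooseTechnique` is a hypothetical helper, and it assumes that any server-to-client-only workload above low frequency is served by SSE.

```javascript
// Condensed version of the direction and frequency checks (illustrative only).
function chooseTechnique({ bidirectional, frequency }) {
  if (bidirectional) return 'WebSocket';               // Step 1: both directions
  if (frequency === 'low') return 'Long Polling or SSE'; // Step 2: infrequent push
  return 'SSE';                                        // Step 2: regular server push
}

console.log(chooseTechnique({ bidirectional: true, frequency: 'high' }));    // WebSocket
console.log(chooseTechnique({ bidirectional: false, frequency: 'low' }));    // Long Polling or SSE
console.log(chooseTechnique({ bidirectional: false, frequency: 'medium' })); // SSE
```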
Example: "Design a Notification System"
Good answer: "For real-time notifications, I would use Server-Sent Events. The client opens a single SSE connection to receive notifications. SSE is simpler than WebSockets and sufficient here because notifications only flow from server to client. If the connection drops, the browser automatically reconnects with the Last-Event-ID header so the server can replay missed events. For users who are offline, notifications are stored in a database and delivered on next connection."
Why this is strong: Justifies the choice (SSE over WebSocket), explains the reconnection strategy, and handles the offline case.
Example: "Design WhatsApp"
Good answer: "For real-time messaging, I would use WebSockets. Each client maintains a persistent WebSocket connection to a gateway server. When User A sends a message to User B, the gateway publishes the message to a Redis Pub/Sub channel keyed by User B's connection. The gateway server holding User B's connection receives the event and forwards it. For offline users, messages are stored in a queue and delivered when they reconnect. I would use consistent hashing to route connections by user ID to reduce cross-server communication."
Why this is strong: Chooses WebSocket with clear reasoning, addresses cross-server routing, handles offline delivery, and optimizes with consistent hashing.
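The "deliver now or queue for later" part of that answer can be sketched as follows. `MessageRouter` is a hypothetical in-memory class; a production system would persist the pending queue rather than hold it in process memory.

```javascript
// Deliver to a live connection if there is one, otherwise queue for later.
class MessageRouter {
  constructor() {
    this.online = new Map();  // userId -> send function for the live connection
    this.pending = new Map(); // userId -> messages queued while offline
  }

  deliver(toUserId, message) {
    const send = this.online.get(toUserId);
    if (send) {
      send(message);
      return 'delivered';
    }
    if (!this.pending.has(toUserId)) this.pending.set(toUserId, []);
    this.pending.get(toUserId).push(message);
    return 'queued';
  }

  // Mark the user online, then flush any backlog in arrival order.
  connect(userId, send) {
    this.online.set(userId, send);
    const backlog = this.pending.get(userId) ?? [];
    this.pending.delete(userId);
    for (const message of backlog) send(message);
    return backlog.length;
  }

  disconnect(userId) {
    this.online.delete(userId);
  }
}

const router = new MessageRouter();
const inbox = [];
console.log(router.deliver('bob', 'Hi!'));                  // queued (bob is offline)
console.log(router.connect('bob', (m) => inbox.push(m)));   // 1 (one message replayed)
console.log(router.deliver('bob', 'Are you there?'));       // delivered
console.log(inbox); // [ 'Hi!', 'Are you there?' ]
```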
Real-World Examples
How real systems implement this in production
Slack uses WebSockets for real-time messaging and presence indicators. Each client maintains a persistent WebSocket connection to a gateway service. When a message is sent, it is routed through backend services and pushed to all relevant connected clients via their WebSocket connections. Slack falls back to long polling if WebSockets are blocked.
Trade-off: WebSockets provide the best real-time experience but require connection management infrastructure and sticky load balancing. The long polling fallback ensures universal accessibility at the cost of slightly higher latency.
GitHub uses Server-Sent Events to stream build logs and workflow status updates in real-time. When you view a running GitHub Actions workflow, an SSE connection pushes new log lines as they are generated. This is unidirectional (server to client) and fits SSE perfectly.
Trade-off: SSE is simpler to implement and scale than WebSockets for this use case, since log streaming is purely server-to-client. The trade-off is that SSE does not support binary data, so log content must be text.
Discord handles millions of concurrent WebSocket connections using a custom gateway built on Elixir, with Rust used for some performance-critical components. Each gateway server manages thousands of connections and communicates with backend services via a message broker. Discord uses heartbeats to detect dead connections and implements session resumption so clients can reconnect without missing messages.
Trade-off: Discord invested heavily in custom infrastructure (custom gateway, session resumption, connection management) to achieve sub-second message delivery at massive scale. This engineering investment is justified by their real-time chat and voice use case.
Common Interview Questions
Questions you might be asked about this topic
How do you scale WebSocket connections across many servers?
Sticky sessions via load balancer, pub/sub (Redis) for cross-server messaging, separate connection manager from business logic, horizontal scaling with consistent hashing, heartbeats for cleanup.
When would you choose SSE over WebSockets?
SSE for server-to-client only (dashboards, notifications, live scores). Simpler: works over HTTP, auto-reconnects, no special infrastructure. WebSockets for bidirectional (chat, gaming, collaboration).
How do you handle dropped connections and missed messages?
Client-side exponential backoff, last-event-ID for resumption, server-side message buffering with TTL, queue undelivered messages per connection, idempotent message processing.
What is the difference between short polling, long polling, SSE, and WebSockets?
Short polling: repeated requests (wasteful). Long polling: hold request until data (near real-time). SSE: server push over HTTP (one-way). WebSockets: full-duplex persistent connection (both ways). Each has trade-offs in complexity, scalability, and browser support.
Interview Tips
How to discuss this topic effectively
Always justify your real-time technique choice. Do not just say 'I will use WebSockets.' Say 'I will use WebSockets because this chat system requires bidirectional communication with low latency. SSE would not work because the client also needs to send messages in real-time.'
When designing a WebSocket-based system, immediately address the scaling challenge: 'Since WebSocket connections are stateful, I need a message broker (Redis Pub/Sub) for cross-server message delivery, and I will use sticky sessions at the load balancer level.'
Mention the fallback strategy. 'If WebSockets are blocked by a corporate firewall, the client falls back to long polling.' This shows you think about real-world deployment constraints.
For notification systems, SSE is often the better choice over WebSockets. Notifications flow server-to-client, and SSE has built-in reconnection with event replay. Mentioning this shows nuanced understanding.
Know the connection limits: a single server can handle ~100K-1M WebSocket connections (depending on memory and message frequency). If the interviewer asks about 10M concurrent users, you need multiple gateway servers with a pub/sub layer.
Always handle the offline case: 'Messages sent to offline users are stored in a message queue. When the user reconnects, missed messages are replayed from the queue.'
Common Mistakes
Pitfalls to avoid in interviews
Using WebSockets for everything that needs 'real-time' updates
Many 'real-time' features only need server-to-client push (notifications, live feeds, progress updates). SSE is simpler, has built-in reconnection, and scales more easily than WebSockets. Reserve WebSockets for truly bidirectional use cases like chat and gaming.
Ignoring the statefulness of WebSocket connections when discussing scaling
WebSocket connections are stateful - each client is connected to a specific server. This means you cannot simply add more servers behind a stateless load balancer. You need sticky sessions, a connection registry, and a message broker for cross-server communication.
Forgetting to handle reconnection and missed messages
Connections drop. Networks fail. Servers restart. Your design must handle reconnection gracefully: implement exponential backoff with jitter, store messages for offline users, and use message IDs or timestamps to replay missed events.
Assuming the load balancer will 'just work' with WebSockets
Standard HTTP load balancers (layer 7) may not correctly handle the WebSocket upgrade handshake. You need either a layer 4 (TCP) load balancer or a layer 7 load balancer specifically configured for WebSocket support (e.g., AWS ALB supports WebSockets natively).
