The Stripe system design round I sat through was 60 minutes, anchored on a single prompt: design a webhook delivery system. I had been warned that Stripe loops over-index on payments-shaped reliability questions, and this round was the cleanest example of why. The diagram I produced in the first 20 minutes was fine. The diagram I produced in the last 20 minutes, after the interviewer pushed back twice, was the one that actually got me to the next round.
I received the offer about two weeks later and signed it. I am writing this so the next person who walks into a Stripe design round understands what the interviewer is actually grading, which is not the topology.
How a payments-shaped design round differs from a generic one
The prompt was deliberately under-specified. Closely paraphrased: "Design a system that lets us deliver event notifications to merchant endpoints over HTTPS, with delivery guarantees we can defend." That phrasing, with "guarantees we can defend", was the tell. In the loops I have done, FAANG-style design rounds tend to grade on whether you cover the standard four boxes (load balancer, queue, worker pool, store). The Stripe round graded on whether I could state the failure modes I was choosing to live with, by name, with the merchant impact spelled out.
I started with the obvious skeleton:
The interviewer let me run for about eight minutes on this and then asked the question I was not prepared for: "What does the merchant see when our worker crashes after the HTTPS request goes out but before we record the ack?"
The failure mode I missed
I had implicitly assumed at-least-once delivery and an idempotency key on the merchant side. That is the textbook answer. The interviewer pushed me past textbook by asking what the merchant sees in the specific failure window where the request lands at their server, their handler runs (charging a card, sending an email, mutating their DB), and our worker dies before persisting the ack. On the next retry, the merchant gets the same event again and, if their idempotency layer is anything less than rigorous, the side effect runs twice.
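To make the "textbook answer" concrete, here is a minimal sketch in Python of what merchant-side deduplication on an event ID looks like, next to a handler that skips it. Everything in it is hypothetical: the class names, the `handle` signature, and the in-memory set standing in for the merchant's database. It is not any real merchant's or Stripe's code.

```python
from dataclasses import dataclass, field


@dataclass
class NaiveHandler:
    """Hypothetical merchant handler with no deduplication."""
    charges: list = field(default_factory=list)

    def handle(self, event_id: str, amount_cents: int) -> str:
        # The side effect runs on every delivery, including retries.
        self.charges.append(amount_cents)
        return "processed"


@dataclass
class IdempotentHandler:
    """Hypothetical merchant handler that dedupes on the event ID."""
    seen_event_ids: set = field(default_factory=set)
    charges: list = field(default_factory=list)

    def handle(self, event_id: str, amount_cents: int) -> str:
        # Check before acting. In a real system the check, the side effect,
        # and the insert would need to share one transaction; this sketch
        # is single-threaded and in-memory, so it ignores that.
        if event_id in self.seen_event_ids:
            return "duplicate_ignored"
        self.charges.append(amount_cents)
        self.seen_event_ids.add(event_id)
        return "processed"


# The same event delivered twice, as happens after a lost ack.
naive, careful = NaiveHandler(), IdempotentHandler()
for handler in (naive, careful):
    handler.handle("evt_1", 500)
    handler.handle("evt_1", 500)

print(naive.charges)    # [500, 500]
print(careful.charges)  # [500]
```

The second handler is only as good as the atomicity of its check-then-act step, which is the "anything less than rigorous" caveat in practice.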
My first instinct was to wave at the merchant: "They should be idempotent." The interviewer's response, paraphrased: "That is true, and it is also the response that gets us paged at 2am when a real merchant has a real outage. What can our system do?"
This was the moment the round turned. The expected answer was not a single fix. It was a layered set of mitigations, each with a cost stated:
Each line of that list was a four-to-five-minute sub-discussion. The interviewer was not looking for me to invent these mitigations. They were looking for me to acknowledge that any honest answer involved an explicit set of tradeoffs the platform owner has to defend in writing.
The artifact that closed the round
With about 12 minutes left, the interviewer asked for a sketch of the worker's persistence write path. I drew this, in pseudocode, on the whiteboard:
The comment about the crash window mattered more than the code. I said it out loud while writing it: "There is a window between the response landing and the persistence write where a process crash will cause a duplicate on retry. We cannot remove this window without two-phase commit on the merchant side, which they will not do. We make it observable instead."
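As a separate illustration of that window, and not the whiteboard pseudocode, here is a minimal Python sketch of a write path that records intent before sending and the outcome after. Every name (`DeliveryLog`, `deliver`, `AttemptState`, `WorkerCrash`, the `crash_after_send` flag) is hypothetical, the dict stands in for a durable store, and recording intent first is one assumed way to make the window observable, not necessarily the approach discussed in the round.

```python
import enum
from dataclasses import dataclass, field


class AttemptState(enum.Enum):
    STARTED = "started"  # intent recorded, outcome unknown
    ACKED = "acked"      # merchant returned 2xx and we persisted that
    FAILED = "failed"    # merchant returned a non-2xx status


class WorkerCrash(Exception):
    """Simulates the worker process dying mid-delivery."""


@dataclass
class DeliveryLog:
    """In-memory stand-in for a durable attempt store (assumption)."""
    attempts: dict = field(default_factory=dict)  # (event_id, attempt_no) -> state

    def record(self, event_id, attempt_no, state):
        self.attempts[(event_id, attempt_no)] = state

    def ambiguous(self):
        # Attempts with recorded intent but no recorded outcome: the
        # merchant may or may not have acted on these.
        return [k for k, s in self.attempts.items() if s is AttemptState.STARTED]


def deliver(log, event_id, attempt_no, send, crash_after_send=False):
    # 1. Persist intent before the request leaves the process.
    log.record(event_id, attempt_no, AttemptState.STARTED)
    # 2. Send; `send` is a caller-supplied function returning an HTTP status.
    status = send(event_id, attempt_no)
    # Crash window: the merchant may have run its handler, but no outcome
    # has been persisted yet. The flag simulates dying exactly here.
    if crash_after_send:
        raise WorkerCrash()
    # 3. Persist the outcome.
    state = AttemptState.ACKED if 200 <= status < 300 else AttemptState.FAILED
    log.record(event_id, attempt_no, state)
    return state


received = []  # what the merchant endpoint saw


def fake_send(event_id, attempt_no):
    received.append((event_id, attempt_no))
    return 200


log = DeliveryLog()
try:
    deliver(log, "evt_1", 1, fake_send, crash_after_send=True)
except WorkerCrash:
    pass
deliver(log, "evt_1", 2, fake_send)  # retry after recovery

print(received)         # [('evt_1', 1), ('evt_1', 2)]
print(log.ambiguous())  # [('evt_1', 1)]
```

The duplicate still happens. What changes is that the first attempt is left behind in the `STARTED` state, so the platform can count, alert on, or flag possibly duplicated deliveries instead of discovering them from a merchant complaint.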
The interviewer wrote that down. After the round, the recruiter relayed that the design panel had specifically called out "named the unfixable window and instrumented around it" as the moment the round turned positive.
The unfixable window is the round
The first 20 minutes of a Stripe design round will feel like a normal load-balancer-and-queue exercise. The signal you are being graded on starts when the interviewer asks about a specific failure window and you have to choose between three uncomfortable answers (live with it, push the cost to the merchant, or carry the cost on the platform). Have a position. Defend it with the cost stated. The interviewer is not testing the topology. They are testing whether you have ever owned the pager on a system shaped like this.
