Most system design rounds I have sat are 60 minutes. This one was 45, on purpose, and the interviewer told me at the start that the abbreviated length was the test. The company was a mid-size B2B SaaS in NYC, around 800 engineers, the kind of place that does not show up on prep guides because they are not public-facing brand names. I was interviewing for a senior fullstack role and the design round was one of four onsite rounds.
I received the offer two weeks later and I declined for unrelated reasons (I took a competing offer at a different mid-size company with better infrastructure tenure). I am writing this round up because the 45-minute format taught me something the 60-minute format had not: which decisions are actually load-bearing and which decisions are decoration.
Why 45 minutes is a different round than 60
The interviewer said it directly at the top: "In 45 minutes, you cannot cover the full surface area. We are looking at how you triage. Which decisions did you spend time on, and which did you skip on purpose?"
That framing is the round. The standard 60-minute round rewards thoroughness. The 45-minute version rewards explicit triage, where you say out loud "I am skipping the auth layer because for this prompt it is not differentiating" and the interviewer either accepts the skip or pulls you back if your skip was wrong.
The prompt, paraphrased: "Design an in-product activity feed for a B2B collaboration tool. Per-tenant. Tens of thousands of tenants, low-thousands of users per tenant on the larger end. Feed events come from many sources inside the product. Reads dominate writes by roughly 100 to 1."
How I prepped specifically for the abbreviated format
The recruiter told me a week before onsite that the design round would be 45 minutes, not the 60 I had been practicing for. That single piece of information changed everything about how I prepped. I ran six 45-minute mocks in five days, with a kitchen timer and a friend playing interviewer. The first three were rough. I kept producing the same 60-minute design, just compressed, which meant I was racing the clock and never quite landing any single component. My rule for the second three was different: pick the one or two load-bearing decisions for the prompt, decide them in the first ten minutes, and accept that everything else gets a one-sentence skip with a defensible reason.
The mocks taught me a second thing I had not learned from 60-minute practice. The round that goes well in 45 minutes is the round where you and the interviewer agree on what the load-bearing call is within the first five minutes. If you are still discovering the load-bearing call at minute fifteen, you have already lost the round, because the back-half of the round is supposed to be defending and refining a decision, not arriving at one. The kitchen-timer mocks made this visceral in a way that abstract advice never had.
My triage in the first six minutes
I started by writing a triage list on the whiteboard before drawing anything:
The interviewer wrote that down and said, paraphrased: "Good. Two of those skips I would have called out. The other two are fine." I asked which two and they declined to say, which is fair ("go figure it out from the round"). In retrospect, the skip the interviewer might have wanted me to defend was the per-tenant-isolation one, because at this scale the noisy-neighbor case is real. I did spend time on it, just later. I think the thing they would have called out was "schema migrations" because feed schemas evolve and a poorly-designed event envelope locks you in.
The fan-out decision and why I committed to hybrid in minute eight
The load-bearing call was push-vs-pull. With reads dominating writes 100:1 and tenants ranging from 5 to several thousand users, a pure-pull design is wasteful at read time and a pure-push design has fan-out amplification on the write side that gets expensive for active tenants. I committed to hybrid in minute eight, with the rule:
The interviewer pushed on the boundaries ("why 200?") and my answer was that the boundaries were illustrative and the real numbers would come from measuring read latency under load on a sample tenant of each size. They accepted that. They pushed harder on the middle band: "Active in last 24h is a moving target. What happens when a user comes back after 48h?" My answer was a small reconciliation step at session-resume time, where the client asks the server for any events newer than the last cursor it has, and the server runs a pull-style query to fill the gap. The interviewer liked this and pushed once more: "What if the gap is large? User has been gone for two weeks." My answer was that we cap the fill, show a banner, and the user can pull older history on scroll.
The small bit of code that anchored this:
The read-path latency budget I produced
For the read path I gave a quick budget, in the same shape as the cloudflare round I had practiced for:
The interviewer's question: "Your pull-tenant total is close to the budget. What is the regression mode?" My answer was that the pull query is bounded by the per-tenant event-table size and the permission filter is bounded by the user's group membership, so the regression mode is a single tenant whose event volume grows out of band, not a global scaling problem. The mitigation is per-tenant rate limiting on the write path, with an alarm when a tenant's daily event count crosses a threshold. They wrote that down.
Permissioning was the round's hidden test
With twelve minutes left, the interviewer asked about permissioning at read time. "Not all events are visible to all users. How do you filter without blowing the budget?" This was the round's hidden test. The naive answer is to filter at read time, which works for small tenants and falls apart for the large ones. The better answer is to denormalize the per-event ACL into a compact bitmap stored on the event, so the read-time filter is a single bitwise-and against the user's group bitmap. I drew this:
The interviewer's pushback was on the size of the bitmap ("what if the tenant has thousands of groups?") and my answer was that for tenants above a threshold, the bitmap moves to a compressed representation (Roaring) and the filter becomes an intersection-non-empty check rather than a bitwise-and. They accepted this.
The sentence the interviewer wrote down
With two minutes left I gave a one-sentence summary: "The thing I am most worried about in this design is the boundary between the push tenants and the pull tenants, because the boundary is a moving target as tenants grow, and the wrong move on a tenant straddling the boundary will spike either write cost or read latency."
The interviewer wrote that sentence down on their notes. The recruiter mentioned it later as the moment the round flipped from a hire to a strong hire. Triage early, name the load-bearing call, defend it with numbers, and end with the part of the design that scares you. Forty-five minutes is enough time for that. It is not enough time for anything else.
