Behavioral Interview Guide
Driving Results & Delivering Impact
Difficulty: Medium
Driving-results questions are the execution probe. They test whether you can take a project from kickoff to a measured outcome, owning the result rather than the activity. This lesson defines the difference between delivering work and driving results, walks through how to demonstrate end-to-end ownership when the credit is shared, breaks down the four sub-skills interviewers grade (anticipating blockers, removing them proactively, working through cross-team stalls, and not confusing effort with impact), and provides fully worked model STAR answers for the six prompts you will hear most. After this lesson you will be able to take any shipped project and tell the story so the rubric reads ownership of outcome, not just hours worked.
Why This Competency Matters
When interviewers ask 'tell me about a time you exceeded expectations' or 'describe a project where you owned the outcome end to end', they are not testing whether you completed work assigned to you. They are probing four signals at once:
[ Outcome ownership ] Did you treat the result as yours, or just the output?
[ Anticipation ] Did you see blockers coming, or only react?
[ Cross-team push ] Did you move other teams when the project required it?
[ Calibration ] Did you measure what mattered, not what was easy?
This matters at every level but the unit of work shifts. At L3 and L4, the result is usually a feature or a service component. At L5, it is a multi-team initiative. At staff and above, it is a metric the org cares about, often without a single team that owns it. The same competency, three different scales.
Candidates underperform on this competency for one of three reasons. They tell stories about work they did rather than outcomes they owned. They confuse effort (long hours, weekend work, late nights) with impact. Or they describe a delivery story but never say what changed downstream after the launch, leaving the rubric empty on the most important row. This lesson fixes all three.
The Difference Between Delivering Work and Driving Results
The distinction is sharp and the rubric grades it directly:
[ Delivered work ] The thing was built and shipped
[ Drove results ] The metric the work was supposed to move actually moved
A candidate who shipped a feature on time but cannot say how it affected the metric the feature was meant to improve has delivered work, not driven results. A candidate who shipped a feature one week late but can show that engagement on the relevant flow rose 18% sustained over the next quarter has driven results.
The three hallmarks of a story that scores as 'drove results' rather than 'delivered work':
- The metric is named in advance. You can say what 'success' meant before launch, in numbers, and you measured against it after.
- The downstream impact is real. Something that mattered to the org changed because of the work, not just because you finished it.
- The ownership outlasted the launch. You kept watching the metric for a quarter or more and acted on what you saw, rather than declaring victory at the launch and moving on.
If any of those three is missing, the story you have is a delivery story. It can still score, but it will not score on this competency at the highest level.
What Great Looks Like (Rubric)
Strong driving-results answers tend to score on six named signals.
1. The success metric was defined before the work started.
Strong candidates name the success metric in the Situation or Task: 'we agreed up front that the project would be considered successful if monthly active users on the import flow rose by 15% sustained over a quarter'. Without this, the Result row reads as retrofit.
2. You anticipated blockers, not just reacted.
The rubric distinguishes between candidates who handled blockers as they came up and candidates who saw them coming and headed them off. Strong stories include at least one beat like 'I noticed in week two that the data team's pipeline was going to be the long pole, so I started the conversation about prioritisation in week two rather than week six when we would have hit the wall'.
3. You removed cross-team stalls actively.
Most projects of any size hit a stall on a peer team's roadmap. Strong stories show the candidate driving through that stall: a structured ask, an explicit escalation if needed, an offer to do part of the work. The story does not have to end with the stall fully resolved; it can end with 'we accepted the slip and adjusted scope', as long as the candidate owned the response.
4. The result is quantified and downstream.
Not just 'we shipped on time' but 'we shipped on time and the metric we cared about moved by X%, sustained for Y'. The downstream-and-sustained part is what separates results stories from delivery stories.
5. The story names what was deferred or cut.
Real driving-results work involves scope decisions. Strong stories name what was de-scoped explicitly: 'we cut the admin-UI work because it would have slipped the launch by three weeks; we shipped a Slack-based interim that covered 80% of the use case'. This signals that the candidate understands the difference between 'shipped everything' and 'shipped the right thing'.
6. The reflection is about what you would change in execution, not just outcome.
A reflection like 'I would have caught the data dependency two weeks earlier' is execution-level and reads as growth. A reflection like 'we should have done a different project' is outcome-level and reads as second-guessing the original decision rather than improving how you drive.
The Four Sub-Skills Interviewers Grade
1. Anticipating blockers. The cheapest way to remove a blocker is to see it before it lands. Strong candidates name a blocker they saw coming three weeks ahead of when it would have hit and what they did about it. This is the highest-leverage sub-skill in this competency.
2. Removing blockers proactively. When a blocker does land, did you wait or did you act? 'I noticed the integration test environment was unstable on day three; I spent half a day fixing it rather than letting it slow the team for two weeks' is the kind of small story that fits inside a bigger answer and signals proactive ownership.
3. Working through cross-team stalls. Most consequential projects stall on someone else's roadmap. The candidate who can describe a structured ask to the other team's lead, an offer to absorb part of the cost, and an escalation path held in reserve, scores on this sub-skill. Cross-link with the leading-without-authority lesson; the mechanisms overlap.
4. Not confusing effort with impact. A story that leans on 'I worked weekends' or 'we pulled long hours' as the proof of ownership reads as low-leverage. Strong candidates name the choices that made the work possible (scope cuts, sequencing, reusing existing tooling), not the hours that made it painful. Effort is sometimes part of the story, but it should not be the main signal.
Common Questions & Model Answers
The six prompts below cover roughly 90% of how this competency is probed. Each comes with a two-minute model STAR answer that scores on the rubric above.
Prompt 1: 'Tell me about a time you exceeded expectations.'
Model answer (strong, beat-the-target with explicit metric)
'In Q3 2023 I was an L4 engineer at a 400-person SaaS company, owning the project to migrate our pricing-page conversion flow off our legacy form library. The team had committed to a 4% improvement in conversion as the success criterion, against a baseline of 7.8% checkout-completion rate. We had eight weeks.
I owned the outcome of the project, including the metric, not just the migration.
What I did differently from a straight migration was front-loading the user-experience review. In week one, I sat with our PM and pulled three months of session-replay data from the legacy form. I noticed two specific failure patterns: about 12% of users were dropping off at the credit-card field on mobile because of an autocomplete conflict, and another 6% were bouncing on the address field because the validation fired before the user had finished typing. Neither of these would have been fixed by a clean migration; they needed deliberate work. I added two narrowly-scoped fixes to the migration plan and de-scoped the redesign of the success page (which was on the original list) to make room.
I anticipated one cross-team blocker: our analytics team was the only group that could measure the new conversion metric reliably, and they had a packed quarter. I had a 30-minute conversation with their lead in week two, agreed on the specific instrumentation events we needed, and shipped the events into the new flow in week four so they had time to wire them into the dashboard before launch.
We launched on schedule. In the first 30 days post-launch, the checkout-completion rate hit 9.6%, against the 8.1% target (the original 4% target was relative to the 7.8% baseline). I kept watching for the next two quarters; the lift held. The downstream effect was approximately $1.8M in annualised revenue from the conversion lift, by the finance team's accounting. The thing I would do differently is run the session-replay analysis a quarter earlier; the two fixes that made the difference came from data we had had access to for months.'
What lands: explicit success metric defined upfront, an anticipated blocker handled in week two not week six, a scope cut that made room for higher-value work, the metric named precisely (with the relative-vs-absolute distinction), and a sustained-result horizon (two quarters of monitoring).
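The relative-versus-absolute distinction in that answer is easy to fumble out loud, so it is worth checking the arithmetic before the interview. A minimal sketch in Python, using only the figures quoted in the model answer; the function names are illustrative and not taken from any library:

```python
def relative_target(baseline: float, relative_lift: float) -> float:
    """Absolute target rate implied by a relative improvement over a baseline."""
    return baseline * (1 + relative_lift)


def relative_change(baseline: float, actual: float) -> float:
    """Change of actual versus baseline, expressed as a fraction of the baseline."""
    return (actual - baseline) / baseline


baseline = 7.8  # checkout-completion rate before the migration, in percent
actual = 9.6    # checkout-completion rate in the first 30 days post-launch, in percent

# The committed 4% improvement is relative to the baseline, not 4 percentage points.
target = relative_target(baseline, 0.04)

print(f"target rate: {target:.1f}%")                                  # 8.1%
print(f"relative lift achieved: {relative_change(baseline, actual):.0%}")  # 23%
print(f"absolute lift achieved: {actual - baseline:.1f} points")      # 1.8 points
```

Quoting both forms ("about a 23% relative lift, 1.8 points absolute") leaves no room for the interviewer to misread which kind of percentage the target was.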
Prompt 2: 'Describe a time you delivered under pressure.'
Model answer (strong, time pressure with a credible scope decision)
'In Q4 2022 I was on a four-engineer team at a startup, and we had committed to launching a new compliance feature by the end of the quarter to support a contract we had signed with a regulated customer. About three weeks before the deadline, our customer added a requirement we had not expected: the feature needed to be operable from their existing single-sign-on identity provider, which we did not support. The contract terms made it effectively non-negotiable.
I owned the delivery outcome and the scope decisions to make the deadline.
What I did was sequence the new requirement against the rest of the work. I made three calls. First, I cut the admin UI from the launch scope, replaced it with a Slack-based workflow that covered the customer-facing case but skipped the internal management UI; I had pre-checked with the customer that they would not see the admin UI directly, so this was a defensible cut. Second, I split the SSO work into a minimum-viable integration (one provider, one identity attribute) rather than the multi-provider abstraction we had originally been considering. Third, I asked our security lead for two days of pair-programming time on the SSO integration in week one of the new requirement, because their context would save more time than two engineers grinding alone.
We shipped on time. The customer onboarded successfully in the first week post-launch. The admin UI shipped two months later, on the originally promised timeline minus the launch-pressure week. SSO has since been extended to two more providers using the same shape, and the early scope decision held up.
The reflection is about pressure tolerance: I knew within 36 hours of getting the new requirement that we were going to need to cut something. I sat with the discomfort for another week before making the call, hoping a better path would appear. It did not, and the week of delay narrowed our margin. I now make the scope cut decision within 48 hours of identifying the pressure, even when it is uncomfortable, because the cost of waiting is concrete and the upside of waiting rarely materialises.'
What lands: a real pressure source (contract, regulated customer), three specific scope decisions named, a structured pairing ask to a cross-functional partner, a sustained outcome (the SSO pattern reused twice), and a reflection that names a behavioural change in how the candidate now handles scope-cut pressure.
Prompt 3: 'Walk me through a project where you owned the outcome end to end.'
Model answer (strong, multi-quarter ownership with sustained metric)
'In Q2 2024 I was a senior engineer at a B2B startup leading the project to reduce our customer onboarding time, which had become the largest source of churn in the first 90 days. Average time from signup to first-value (defined as the customer running their first successful workflow) was 11 days against a target of under 3, and we were losing about 18% of new accounts in that window.
I owned the metric, not just the engineering work that fed into it.
I started by writing a one-page diagnosis: where the 11 days actually went. The data showed roughly 4 days waiting on us (manual provisioning steps), 3 days on the customer side (account verification), and 4 days in a back-and-forth on our integration setup that involved three internal teams. The work then split three ways: automating the provisioning (engineering), redesigning the verification flow (PM and design), and consolidating the integration handoff into a single contact (customer success). I owned the engineering work directly and acted as the project lead across the other two streams without formal title.
Anticipated blocker, handled early: I noticed in week two that the design work would not be done in time for an end-of-quarter launch unless we lifted constraints from the existing pattern library. I had a 45-minute conversation with our design lead in week three, we agreed to use existing patterns for v1 with a v2 redesign scheduled for the following quarter, and the launch stayed on track.
Cross-team stall, handled actively: in week six, customer success was overloaded and unable to write the new playbook. I drafted a first version myself based on the calls I had observed, sent it to their lead with a 'is this roughly right, please edit', and they returned it within three days with the changes they cared about. The work landed about 10 days later than originally planned but did not slip the launch.
Outcome: time-to-first-value dropped from 11 days to 4.3 in the first 60 days post-launch, against the under-3 target. We did not hit the target, but we cut the largest source of churn meaningfully. The 90-day retention rate rose from 82% to 91%, and we tracked it for the following four quarters; the lift held. The two follow-up projects (the v2 design, and a further automation pass) closed the remaining gap to 2.8 days within two more quarters. The thing I would do differently is set up the cross-team coordination on a weekly cadence from week one, not week three; the two-week lag in setting up the coordination cost us about a week of slack.'
What lands: end-to-end ownership including non-engineering work the candidate did not nominally own, anticipated and resolved blockers, a sustained metric (four quarters), honesty about not hitting the target, a specific number-on-number movement, and a process reflection.
Prompt 4: 'Tell me about a time you had to push through significant obstacles to deliver.'
Model answer (strong, multiple blockers in sequence)
'In Q1 2023 I was leading the rebuild of our reporting service, a six-week project that ended up taking nine. We hit three obstacles in sequence: an unexpected regulatory requirement that landed in week two and added two weeks of scope, a key engineer leaving the company in week four (which forced me to redistribute work across the remaining three), and a third-party API outage in week seven that cost us 18 hours of working time at the worst possible moment.
I owned the delivery and the response to each obstacle.
For the regulatory scope, I made the call in 48 hours to absorb the slip rather than cut a planned feature, because the feature had already been promised to a customer we did not want to renegotiate with. I communicated the slip to stakeholders in week three with a revised end date and a brief on the trade-off. For the engineer leaving, I redistributed work the same week and pulled in a half-time contractor for the most context-dependent piece. For the API outage, we already had a fallback plan, but the test coverage on the fallback was thin; I spent that day pair-programming on backup tests with the team rather than waiting for the API to come back, which turned the outage into useful coverage work rather than dead time.
We shipped at week nine, three weeks late versus the original plan, on the revised plan I had communicated in week three. Customer onboarding adopted the new reporting in the first month. The downstream effect over the next two quarters: the customer support team reported a 60% drop in reporting-related tickets, against the 40% improvement we had targeted. The fallback test coverage we wrote during the API outage caught two real failures in the following quarter that would have caused incidents. The reflection: I should have communicated the original timeline as a range, not a point estimate. I had built in 20% slack but the slack was implicit; communicating it as a 6-to-7 week range up front would have made the eventual three-week slip a smaller stakeholder cost.'
What lands: three real obstacles each handled with a different mechanism, the slip communicated honestly and early, a quantified outcome that beat the 40% target, a serendipitous benefit (test coverage from outage time), and a specific communication-style reflection.
Prompt 5: 'Describe a project where you had to drive results through people you did not manage.'
Model answer (strong, cross-functional with explicit no-authority framing)
'In Q3 2022 I was an L5 engineer leading the project to reduce our support ticket volume on the billing service, which had been growing at about 8% per quarter. The work crossed four other teams: our team owned the service, but the actual reduction came from changes in product (clearer error messages), customer support (better tagging and triage), docs (a new self-service section), and finance (a billing-policy clarification). I had no formal authority over any of those four.
I owned the metric (ticket volume) and the cross-team coordination, even without title.
I started with the diagnosis: a one-week analysis of three months of tickets showed that 60% of the volume came from four issue patterns, each of which had a root cause owned by a different team. I wrote a one-page note, sent it to the four leads, and asked for a 30-minute meeting to align on which patterns each team would take on. Three of the four said yes immediately; the fourth (finance) initially said no because of bandwidth. For finance, I scoped the work down to a single policy clarification that would close one of the four patterns, framed it in terms of audit risk reduction (which they cared about), and they agreed to a half-day investment over two weeks.
The work shipped in waves over the following quarter. I held a weekly 20-minute sync where each team reported progress on their slice. I did the project tracking myself, in a single shared doc, because no one else had the time and the value of the visibility was concentrated. When the docs team slipped two weeks because of an unrelated launch, I picked up the most critical pattern myself and they shipped the rest on a slower cadence.
The outcome: support ticket volume on billing dropped 38% in the first 60 days post-launch, against a target of 25%, and the reduction held over the next two quarters. The reflection: I underestimated how much value the weekly 20-minute sync added; it was the highest-leverage 20 minutes of my week for that quarter, and I now default to a similar cadence on any cross-team initiative.'
What lands: explicit cross-functional surface (four teams), data-led diagnosis, a concession on scope to bring the reluctant team along, the candidate doing project-management work themselves because no one else could, a quantified result, and a generalised lesson about cadence.
Prompt 6: 'Tell me about a time you delivered a result you were not sure you could.'
Model answer (strong, calibrated uncertainty before commit)
'In Q4 2023 I was an L5 engineer and I committed to delivering a 40% reduction in build time for our monorepo over a single quarter. Build time had crept from about 14 minutes to 38, and our directors had asked for proposals from any senior engineer who wanted to own the project. I volunteered with about 60% confidence I could hit 40%; the realistic range I gave my manager was 25 to 50%, with the midpoint as my honest expectation.
I owned the work and the metric.
I started with profiling and got a clear picture in the first two weeks: caching gaps, an over-eager test parallelism setting that was actually serialising under load, and a dependency graph that was forcing recompilation of unrelated modules. I sequenced the three streams: the cache work first because it was the lowest-risk and would also clarify the measurement, the test parallelism second because the cache fix would change the profile, and the dependency graph last because it was the deepest change.
Two thirds of the way through the quarter, I had cut build time from 38 minutes to 24, which was about 37%. The dependency-graph work, which I had estimated as the largest remaining lever, turned out to require a 1.5-quarter rewrite to do safely; I had under-estimated the blast radius. Rather than over-commit, I cut a smaller version of that work that addressed the highest-impact 30% of the graph, and shipped it in the last three weeks.
Final outcome at end of quarter: build time at 19 minutes, which was a 50% reduction. The remaining graph work was scheduled for the following two quarters and brought build time down to 14 minutes by month nine. The reflection: I had estimated my confidence at 60% going in, and the result landed in the upper half of my range. The thing I would change is that I had under-quantified the dependency-graph work in my proposal; I should have flagged in the original doc that this lever had a wide variance and could land as a follow-on. I now default to wider confidence intervals on any work that touches a dependency graph or a build system, because they are systematically harder to estimate than they look.'
What lands: explicit confidence and range stated up front, sequenced work that allowed for adjustment, an honest 'I under-estimated this lever', a final result that beat the calibrated commit, and a generalised lesson about estimation bias for specific kinds of work.
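The percentages in that answer all come from the same baseline of 38 minutes, and an interviewer may check them in their head. A small Python sketch for verifying that kind of claim before you tell the story; the figures are the ones quoted in the model answer, and the helper name and checkpoint labels are illustrative only:

```python
def reduction(before: float, after: float) -> float:
    """Fractional reduction from before to after, measured against before."""
    return (before - after) / before


start_minutes = 38                 # build time when the project started
committed_range = (0.25, 0.50)     # the range given to the manager up front

# Build times quoted at each point in the story, in minutes.
checkpoints = {
    "two thirds through the quarter": 24,
    "end of quarter": 19,
    "month nine": 14,
}

for label, minutes in checkpoints.items():
    r = reduction(start_minutes, minutes)
    low, high = committed_range
    if r < low:
        position = "below the committed range"
    elif r <= high:
        position = "inside the committed range"
    else:
        position = "beyond the committed range"
    print(f"{label}: {minutes} min, {r:.1%} reduction, {position}")
```

Running it shows roughly 36.8% at the two-thirds mark, 50.0% at the end of the quarter, and 63.2% by month nine, which matches the figures the candidate states.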
Pitfalls Specific to This Competency
Four traps that show up most often in driving-results answers:
1. Effort framed as impact. 'I worked weekends to make the deadline' reads as low-leverage. Strong candidates name the choices that made the deadline possible (scope cuts, sequencing, reusing existing tooling), not the hours. Effort is sometimes part of the story but should not be the main signal.
2. Delivery-only stories with no downstream metric. 'We shipped on time and the team was happy' is a delivery story, not a results story. Force the answer past the launch into what changed downstream over the following quarter or two.
3. The 'and then I worked harder' close. Stories where every obstacle is overcome by the candidate just doing more read either as superhero (low credibility) or as a candidate with no leverage other than their own time. Strong stories overcome obstacles with structural moves: scope cuts, escalations, reusing existing work, getting help from the right person.
4. Retroactive metric-fishing. A success metric named only after the work was done, or chosen because it happens to look good, reads as retrofit. Strong stories name the success metric in the Situation or Task, ideally as a number agreed with stakeholders before the work started.
Practice Prompts & Exercises
For each prompt below, draft a 250 to 350 word STAR answer. For each story, mark explicitly the success metric (in numbers, defined before the work) and at least one anticipated blocker you handled before it landed.
- Tell me about a time you exceeded expectations on a project.
- Describe a time you delivered under significant pressure.
- Walk me through a project where you owned the outcome end to end.
- Tell me about a time you had to push through obstacles to deliver.
- Describe a project where you drove results through people you did not manage.
- Tell me about a result you were not initially sure you could deliver.
- Walk me through a delivery where you had to make a hard scope cut.
For every story, run the rubric: did you name the metric in advance, did you anticipate at least one blocker, did you have a sustained-result horizon, did you name what you cut, was your reflection about execution rather than choice of project?
Bridge / Cross-References
This lesson closes the Leadership & Ownership category and ties together threads from the prior three lessons. The most useful Foundations companions:
- quantifying-impact is the lesson that powers the Result row in every model answer above. Real numbers, real baselines, real downstream effects come straight from that lesson.
- tailoring-stories-to-role covers how to scale the same delivery story for L4, L5, staff, and management contexts.
- interviewing-for-senior-roles covers level-calibration for the unit of work; staff and above expect cross-team metric ownership, not feature-level delivery.
The next lesson begins the Teamwork & Collaboration category, starting with Cross-Team Collaboration. The two categories overlap heavily: most strong driving-results stories also have a cross-team collaboration beat, and many cross-team stories are also driving-results stories. The difference at the rubric level: this category grades the outcome ownership, the next grades the collaboration mechanics.
Test Your Understanding
Self-check questions to confirm you grasped this lesson
Delivering work means the thing was built and shipped. Driving results means the metric the work was supposed to move actually moved, sustained for a meaningful window. The three hallmarks of a results story: the success metric was defined before the work started, the downstream impact was real and measured, and the ownership outlasted the launch into the following quarter or two. A delivery story can still score, but not at the highest level on this competency.
Anticipation is the rare skill. Most candidates can describe how they handled a blocker once it landed. Far fewer can describe seeing a blocker three weeks before it would have hit and heading it off. Anticipation signals leverage: the candidate uses information available to them now to remove future cost. Reaction signals competence at execution but not foresight. Strong stories include at least one anticipated and pre-handled blocker.
Always, when the cut was deliberate and the criterion that drove it was named. Saying 'we cut the admin UI to make room for the SSO requirement, because the customer would not see the admin UI directly' shows judgement: the candidate understood the difference between shipped-everything and shipped-the-right-thing. A story that claims no cuts were necessary usually means the candidate did not pick, which is itself a low-judgement signal at senior levels.
An execution-level reflection is about how you would have done the work differently: 'I would have caught the data dependency two weeks earlier' or 'I should have set up the cross-team sync from week one'. An outcome-level reflection is about whether you should have done the work at all: 'we should have done a different project' or 'the original commit was wrong'. Execution-level reflections read as growth and tie back to driving-results execution sub-skills. Outcome-level reflections second-guess the original decision and belong in a different competency (decision-making) if they belong anywhere.
Common Interview Questions
Real prompts an interviewer might ask, with answer outlines
Open with the success metric defined upfront and the baseline. Show the choice that made the over-performance possible (often a scope decision or a session-replay-style insight). Anticipate one blocker and handle it early. Quantify the result against the explicit target. Track for a quarter or two post-launch. Close with a process reflection.
Establish the pressure source concretely (deadline, contract, regulatory). Name three scope decisions: what stayed, what was cut, what was sequenced. Show one structured ask to a cross-functional partner. Quantify the on-time outcome and the downstream effect. Close with a behavioural change you now apply when pressure lands again.
Establish that you owned the metric, not just the engineering. Show the diagnosis (where the bottleneck actually lived). Walk through the cross-functional surface (which streams of work touched which teams). Anticipate one blocker, work through one cross-team stall actively, do project-management work yourself if no one else could. Sustained metric over multiple quarters. Reflection on cadence or coordination.
Pick a project with multiple obstacles in sequence (regulatory scope change, attrition, third-party outage). Resolve each with a different mechanism: absorbing a slip with stakeholder communication, redistributing work, turning dead time into useful coverage work. Quantify the result against the revised plan. Reflection focused on communication or estimation.
Establish the cross-functional surface (three to four teams). Lead with data diagnosis. Be willing to scope down for a reluctant team to bring them along. Set a weekly cadence and do the coordination work yourself if needed. Quantify the metric and track for at least a quarter past launch. Close with a generalised principle about cadence or coordination.
Interview Tips
How to discuss this topic effectively
Name the success metric in the Situation or Task, before describing the work. Defining 'success' in numbers up front is the single biggest lever for moving a story from delivery to results in the rubric.
Show anticipation, not just reaction. Strong stories include at least one beat where you saw a blocker coming three weeks ahead and acted on it, rather than handling it as it landed. The rubric weights anticipation heavily because it is the rare skill.
Track the result for at least a quarter past launch. Stories that end at the launch date read as delivery; stories that include 'I watched the metric for two quarters and the lift held' read as ownership of outcome.
Name what you cut or deferred. Real driving-results work involves scope decisions; saying 'we shipped everything we wanted' usually means you did not pick. State the cut explicitly and the criterion that drove it.
Frame execution mistakes, not project mistakes, in the reflection. 'I would have caught the data dependency two weeks earlier' is execution-level and reads as growth. 'We should have done a different project' reads as second-guessing the original commit.
Common Mistakes
Pitfalls to avoid in interviews
Framing effort as the proof of ownership
'I worked weekends to make the deadline' reads as low-leverage, even when true. Strong candidates name the choices that made the work possible: scope cuts, sequencing, reusing existing tooling, getting help from the right person. Effort can be part of the story but should not be the main signal. If the only thing your story shows is that you worked hard, replace it with one where you also chose what to spend that time on.
Stopping the story at the launch with no downstream metric
'We shipped on time and the team was happy' is a delivery story, not a results story. Push past the launch into what changed downstream: revenue, retention, support volume, latency, customer count, sustained over a quarter or two. Without that, the rubric reads the highest-value row (outcome ownership) as empty.
Inventing the success metric after the work was done
A metric named only post-hoc, or chosen because it happens to look good, reads as retrofit. Strong stories name the metric in the Situation or Task, ideally as a number agreed with stakeholders before the work started: 'we agreed up front the project would be considered successful if X rose by Y%'. If you did not have a metric at the time, name that honestly and explain how you measured success after the fact.
Every obstacle solved by the candidate working harder
Stories where the candidate is the heroic answer to every problem read as either inflated or as low-leverage execution. Strong stories overcome obstacles with structural moves: a scope cut, a structured ask to a peer team, a sequenced re-plan, a partial deferral. Mix in at least one obstacle that was resolved by something other than the candidate's hours.
Reflection focused on the wrong project rather than the execution
'In retrospect, we should have done a different project' second-guesses the original decision and does not belong in the reflection on a results story. The reflection here should be about how you would execute differently: an earlier diagnosis, a tighter cadence, a bolder scope cut, a faster escalation. That kind of reflection reads as growth and ties back to the rubric for execution sub-skills.
