This is a real shape of code. I have written it. You have written it. It works. It also has eight responsibilities crammed into one function, an N+1 in the middle, three different kinds of error returns, two implicit business rules with magic numbers, and a tax calculation that lives where nobody will ever find it again. The next person to add a feature will copy the pattern and the function will grow. By the time the team notices, it is a 400-line monolith with a test file that mocks 14 things.
This article covers the six refactoring moves I reach for almost weekly to keep code like this from happening, in the order I apply them. None of them is "rewrite the function". All of them are small enough that the resulting PR is easy to review and easy to revert. Together they turn the function above into something a junior can read on day two of the job.
Move 1: extract a named variable
The cheapest refactor in the catalog. Whenever I see a value used in a condition or expression with no name, I pull it into a variable whose name explains why the value is what it is.
No behavior change. The diff is two lines into seven. The payoff is that the next reader does not have to guess what 100 and 10 represent, and the next person who adjusts the threshold has one place to change it. I do this move dozens of times a week and it is by far my highest-frequency refactor.
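A minimal sketch of the move. The meaning given to 100 and 10 here (a free-shipping threshold and a flat shipping fee) is an assumption made for illustration, as are all the names:

```typescript
// Before: the reader has to guess what 100 and 10 mean.
function shippingFeeBefore(subtotal: number): number {
  return subtotal > 100 ? 0 : 10;
}

// After: each value has a name that explains why it is what it is.
// The meanings below are hypothetical; the original listing may differ.
const FREE_SHIPPING_THRESHOLD = 100;
const FLAT_SHIPPING_FEE = 10;

function shippingFeeAfter(subtotal: number): number {
  const qualifiesForFreeShipping = subtotal > FREE_SHIPPING_THRESHOLD;
  return qualifiesForFreeShipping ? 0 : FLAT_SHIPPING_FEE;
}
```

Both versions return the same value for every input; only the names differ.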
Move 2: extract a named function
When I see a block of three to ten lines doing one logical thing, with comments above it that explain what it is doing, I delete the comments and turn the block into a function whose name is the comment.
The extracted function is now testable in isolation. The caller reads as a single line, const subtotal = await calculateLineItemTotal(cart.items), and the high-level structure of handleCheckout is no longer obscured by the loop body.
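As a rough sketch of what the extraction might look like: the price table and findPriceCents below are hypothetical stand-ins for a real product lookup, and amounts are in cents to keep the arithmetic exact.

```typescript
interface CartItem {
  productId: string;
  quantity: number;
}

// Hypothetical in-memory price table (cents) standing in for the product store.
const PRICE_CENTS_BY_PRODUCT: Record<string, number> = { apple: 50, pear: 75 };

// Hypothetical async lookup; a real one would query the database.
async function findPriceCents(productId: string): Promise<number> {
  const price = PRICE_CENTS_BY_PRODUCT[productId];
  if (price === undefined) throw new Error(`Unknown product: ${productId}`);
  return price;
}

// Formerly an inline loop under a "calculate line item total" comment.
// The function name now carries what the comment used to say.
async function calculateLineItemTotal(items: CartItem[]): Promise<number> {
  let total = 0;
  for (const item of items) {
    total += (await findPriceCents(item.productId)) * item.quantity;
  }
  return total;
}

// The caller is reduced to a single line.
async function handleCheckout(cart: { items: CartItem[] }): Promise<number> {
  const subtotal = await calculateLineItemTotal(cart.items);
  return subtotal;
}
```

Note that this sketch still does one lookup per item. Extraction only names the block; it does not fix the N+1 by itself.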
The rule of thumb that keeps me from over-extracting: I extract when naming the block makes the caller easier to read. I do not extract when the block is already obvious in context. A two-line block called calculateTax that just does return total * 0.13 is a worse function than the inline expression with a comment about which jurisdiction the rate is for.
Move 3: replace a flag argument with two functions
Any time I see a boolean parameter that switches a function's behavior at the top, my hand twitches toward the keyboard. The pattern is almost always cleaner as two functions.
The call sites get clearer too: priceLine(p, q, true) told me nothing without jumping to the definition; priceLineWholesale(p, q) tells me what is happening at the call site.
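A sketch of the split, assuming for illustration that wholesale means a flat 20% discount. The discount, the priceLineRetail name, and the cent-based amounts are all hypothetical:

```typescript
// Hypothetical business rule, used only for this illustration.
const WHOLESALE_DISCOUNT_PERCENT = 20;

// Before: a boolean flag switches behavior at the top of the function.
function priceLine(unitPriceCents: number, quantity: number, isWholesale: boolean): number {
  if (isWholesale) {
    return Math.round((unitPriceCents * quantity * (100 - WHOLESALE_DISCOUNT_PERCENT)) / 100);
  }
  return unitPriceCents * quantity;
}

// After: two functions, each with one behavior and a name that says which.
function priceLineRetail(unitPriceCents: number, quantity: number): number {
  return unitPriceCents * quantity;
}

function priceLineWholesale(unitPriceCents: number, quantity: number): number {
  const retailCents = priceLineRetail(unitPriceCents, quantity);
  return Math.round((retailCents * (100 - WHOLESALE_DISCOUNT_PERCENT)) / 100);
}
```

A call such as priceLineWholesale(1000, 3) now says at the call site which pricing rule applies, with no trailing true to decode.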
The exception I make to this rule: if the boolean is forwarded down through three layers of calls and turned into many flag arguments, the right move is sometimes to keep one function and pass a mode: 'retail' | 'wholesale' parameter (a string-literal union), then switch on it at the top of the function. That is a topic for its own article. The rule still holds: a single boolean that flips behavior at the leaf is two functions in disguise.
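For the exception case, a small sketch of the mode-parameter shape. The PricingMode type name and the 20% wholesale discount are assumptions for illustration:

```typescript
// A string-literal union instead of a boolean: the call site names the mode.
type PricingMode = "retail" | "wholesale";

// Hypothetical discount, used only for this illustration.
const WHOLESALE_DISCOUNT_PERCENT = 20;

function priceLine(unitPriceCents: number, quantity: number, mode: PricingMode): number {
  const retailCents = unitPriceCents * quantity;
  // Switching on the union lets the compiler flag a missing case
  // if a third mode is ever added.
  switch (mode) {
    case "retail":
      return retailCents;
    case "wholesale":
      return Math.round((retailCents * (100 - WHOLESALE_DISCOUNT_PERCENT)) / 100);
  }
}
```

Unlike a bare true, priceLine(1000, 3, "wholesale") stays readable at every layer the mode is forwarded through.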
Move 4: invert the conditional and return early
Deeply nested conditionals are exhausting to read. The fix is almost always to flip each guard and return early.
The "after" version reads top-to-bottom as a list of preconditions: if any of them fails, return null; otherwise ship. Indentation drops from five levels to one. The cognitive load drops more than the line count suggests.
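A sketch of the pyramid and its flattened form. The Order and Shipment shapes and the specific preconditions are hypothetical, chosen only to give five levels of nesting:

```typescript
interface Order {
  paid: boolean;
  cancelled: boolean;
  items: string[];
  address: string | null;
}

interface Shipment {
  to: string;
  items: string[];
}

// Before: five nested guards; the happy path sits at the deepest level.
function shipBefore(order: Order | null): Shipment | null {
  if (order) {
    if (order.paid) {
      if (!order.cancelled) {
        if (order.items.length > 0) {
          if (order.address) {
            return { to: order.address, items: order.items };
          }
        }
      }
    }
  }
  return null;
}

// After: each guard is inverted and returns early; the happy path is last.
function shipAfter(order: Order | null): Shipment | null {
  if (!order) return null;
  if (!order.paid) return null;
  if (order.cancelled) return null;
  if (order.items.length === 0) return null;
  if (!order.address) return null;
  return { to: order.address, items: order.items };
}
```

The guards are checked in the same order in both versions, so the two functions agree on every input.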
This is the move I push back on most often when reviewers tell me single-return is cleaner. In my experience, single-return is cleaner if the function is short. As soon as the function has guards, early return is better than the pyramid.
Move 5: pull the data fetching out of the logic
The checkout function above mixes database calls with business logic. That mix is the single biggest reason it is hard to test. The refactor that fixes it is to take the data fetches and run them all up front, then pass the data to a pure function.
This pattern has names: "functional core, imperative shell" and "hexagonal architecture" both describe it. The name does not matter; the property does. The pure middle (computeOrderTotals) takes a cart, the user, the request body, and returns a number. It can be unit tested with a single object literal as input. The I/O at the edges (loadCartWithProducts, persistOrder) is small enough that integration tests cover it without a labyrinth of mocks.
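A minimal sketch of that split. The data shapes, the CheckoutDeps interface, the loadUser helper, the coupon field, and the 13% tax rate are assumptions for illustration; only the names computeOrderTotals, loadCartWithProducts, persistOrder, and handleCheckout come from the text above.

```typescript
interface LineItem { unitPriceCents: number; quantity: number }
interface Cart { items: LineItem[] }
interface User { isTaxExempt: boolean }
interface CheckoutBody { couponCents: number }

// Hypothetical rate, used only for this illustration.
const TAX_RATE = 0.13;

// Pure core: no I/O, same inputs always give the same total (in cents).
function computeOrderTotals(cart: Cart, user: User, body: CheckoutBody): number {
  const subtotal = cart.items.reduce(
    (sum, item) => sum + item.unitPriceCents * item.quantity,
    0,
  );
  const discounted = Math.max(0, subtotal - body.couponCents);
  const tax = user.isTaxExempt ? 0 : Math.round(discounted * TAX_RATE);
  return discounted + tax;
}

// I/O at the edges, passed in so the shell itself stays small.
interface CheckoutDeps {
  loadCartWithProducts(userId: string): Promise<Cart>;
  loadUser(userId: string): Promise<User>; // hypothetical helper
  persistOrder(userId: string, totalCents: number): Promise<void>;
}

// Imperative shell: fetch everything up front, compute, then persist.
async function handleCheckout(
  deps: CheckoutDeps,
  userId: string,
  body: CheckoutBody,
): Promise<number> {
  const [cart, user] = await Promise.all([
    deps.loadCartWithProducts(userId),
    deps.loadUser(userId),
  ]);
  const totalCents = computeOrderTotals(cart, user, body);
  await deps.persistOrder(userId, totalCents);
  return totalCents;
}
```

A unit test for the core is then one call with object literals, for example computeOrderTotals({ items: [{ unitPriceCents: 1000, quantity: 2 }] }, { isTaxExempt: true }, { couponCents: 0 }), with no database involved.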
In the roughly 90% of services I work on that have real complexity, this single move is the one that pays back the most over time. Every test I have ever written for computeOrderTotals ran in milliseconds with no database. Every test I have ever written for the original interleaved version needed a Postgres container.
Move 6: rename so the call sites read as English
The last move is the one I do near the end of a refactor when most of the structural work is done. I rename functions and variables so the code reads as English when I look at the call sites.
Good call site:
Bad call site (same code, worse names):
The good version reads as a four-step recipe in plain language. The bad version reads as code that requires you to remember what t and p are. Renaming is one of the cheapest moves on this list, and I do it last because the right names usually only become obvious once the structure is clean.
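To make the contrast concrete, here is a hypothetical pair of call sites. Every function name and data shape below is invented for illustration; the short names are plain aliases, so both versions run exactly the same code.

```typescript
interface Cart { userId: string; pricesCents: number[] }
interface Order { userId: string; totalCents: number }

// Hypothetical helpers with descriptive names.
function loadCart(userId: string): Cart {
  return { userId, pricesCents: [1200, 800] }; // fixed cart for the example
}
function computeOrderTotal(cart: Cart): number {
  return cart.pricesCents.reduce((sum, price) => sum + price, 0);
}
function createOrder(cart: Cart, totalCents: number): Order {
  return { userId: cart.userId, totalCents };
}
function formatReceipt(order: Order): string {
  return `Order for ${order.userId}: ${order.totalCents} cents`;
}

// Good call site: load, compute, create, format.
function checkout(userId: string): string {
  const cart = loadCart(userId);
  const totalCents = computeOrderTotal(cart);
  const order = createOrder(cart, totalCents);
  return formatReceipt(order);
}

// Bad call site: the same functions under worse names.
const get = loadCart;
const calc = computeOrderTotal;
const mk = createOrder;
const fmt = formatReceipt;

function proc(u: string): string {
  const c = get(u);
  const t = calc(c);
  const p = mk(c, t);
  return fmt(p);
}
```

Nothing about the behavior differs between checkout and proc; the only cost of the second version is paid by the reader.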
The order matters
I apply these six moves in roughly the order I listed them, and the order is not random. Extracting variables (move 1) and inverting conditionals (move 4) are nearly free and clean up the diff before any restructuring starts. Extracting functions (move 2) and replacing flags (move 3) are next; they need the variables and the early returns to be in place to be obvious. Pulling the data fetching out (move 5) is the biggest move and benefits from moves 1-4 already being done because the pure function is now small enough to actually extract. Renaming (move 6) is the polish pass that only makes sense once the structure is final.
If you skip ahead to move 5 first, the resulting diff is hard to review because the structural change is mixed with all the cleanup. If you do moves 1-4 first as separate PRs, the move-5 diff is mostly reorganization, and the reviewer can see the shape at a glance.
How I avoid breaking things while refactoring
The failure mode of refactoring is the silent regression. The change passes the existing tests, looks cleaner, and ships. Two weeks later a customer hits an edge case the refactor accidentally moved, and the bug bisects to my supposedly-safe cleanup commit. I have caused this at least four times. The lessons are all the same.
Run the existing test suite before and after every move. Not at the end. After every move. If the tests passed before move 1 and fail after move 1, the diff for move 1 is small enough to read in a minute. If five moves have happened, the bisect is harder.
Add characterization tests before structural changes. If the function I am about to refactor has thin test coverage, I write a few tests that pin down the current behavior, even the behavior I think is wrong. That way, when my refactor changes something subtle, the test catches it. Once the refactor is done, I can decide whether the pinned behavior was intentional or a bug, and adjust accordingly.
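A framework-free sketch of a characterization test. The legacyShippingFee function and its pinned values are hypothetical; the point is that the expected values are recorded from what the code does today, including the boundary case that might look wrong.

```typescript
// Hypothetical legacy function about to be refactored (amounts in cents).
function legacyShippingFee(subtotalCents: number): number {
  if (subtotalCents > 10000) return 0;
  return 1000;
}

// Pinned behavior: [input, output observed from the current code].
// An order of exactly 10000 still pays the fee. That may or may not be
// intended, but it is pinned as-is until the refactor is done.
const pinnedCases: Array<[number, number]> = [
  [0, 1000],
  [9999, 1000],
  [10000, 1000],
  [10001, 0],
];

// Returns a description of every pinned case the candidate breaks.
function characterize(candidate: (subtotalCents: number) => number): string[] {
  const failures: string[] = [];
  for (const [input, expected] of pinnedCases) {
    const actual = candidate(input);
    if (actual !== expected) {
      failures.push(`input ${input}: expected ${expected}, got ${actual}`);
    }
  }
  return failures;
}
```

Running characterize against the refactored function after each move should return an empty list. A refactor that quietly turns > into >= would show up as one failure at the 10000 boundary.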
Keep refactors and behavior changes in separate commits. If I am cleaning up a function and I notice a bug, I do not fix the bug in the same commit. I finish the refactor, ship it, then open a separate PR for the bug fix. This sounds pedantic. It saves me every time I have to bisect a regression.
A short rant about "the big rewrite"
The other path I see teams take, and the one I have stopped advocating for, is the big rewrite. "This whole module is bad, let's redo it from scratch". I have done it. It has worked twice. It has failed about ten times. The failure mode is always the same: the rewrite re-introduces the original bugs, the team loses six weeks, and morale takes a hit.
The six moves above are explicitly designed to be the opposite. Each one is small, each one ships independently, each one is reversible if reviewers find a problem. Over a quarter, a team using this playbook can refactor a 4,000-line module into shape with no scary commits in the history. I would much rather ship 30 small PRs than one large one, and the team I work with most has come around to that view too.
What I do on Friday afternoon
My Friday afternoon ritual, the one I have been running for years, is to pick the file I touched most this week, open it, and apply moves 1, 4, and 6 to anything that catches my eye. Twenty minutes of low-risk cleanup before the weekend. Over a year that is roughly a thousand small improvements to the code I work in most. The diffs are tiny, they merge with rubber-stamp reviews, and the codebase stays a place I want to come back to on Monday. That is the real argument for the playbook: it makes the next week of work cheaper, and the cumulative effect over a year is bigger than any rewrite I have ever shipped.
