On a codebase I joined in 2024, the team estimated they had "about 18 months of technical debt" to pay down before they could ship the next major feature. We spent a week doing a real audit of the codebase. The number we found was closer to 6 weeks. The other 16 months or so of "debt" was old code that worked, was a little awkward, and would have been written differently if the team were starting over, but that nobody had any real reason to touch right now.
The difference between those two categories matters. One of them costs the business real money every day it goes unaddressed. The other is just code that has been around for a while. Conflating them is what burns six months of an engineering team on a refactor that does not produce any business outcome anyone can name. This article is the test I use to tell debt from age, and the rule I follow when deciding whether to actually pay it down.
The original metaphor
Ward Cunningham coined the phrase "technical debt" in 1992, and his original definition was specific: shipping code you knew was incomplete, in exchange for learning something from production, and paying it back by refactoring before the cost compounded. The key word is knew. Debt was a deliberate trade-off, not a description of any code that had aged.
The industry has stretched the term until it covers, roughly, "any code I do not like". This is a problem because it makes the metaphor useless. If everything is debt, nothing is. The bank loan analogy only works if some code is loans (with interest) and other code is paid-off equity (no interest, even if it is old).
The test I keep coming back to: code is debt if it is actively costing the team something measurable right now. Code is just old if it works, nobody touches it, and the cost of leaving it is approximately zero.
The four kinds of code on every codebase
When I audit a codebase for debt, I sort every file or module into one of four categories. The four categories tell me how to act. The mistake I see teams make is treating all four the same.
Live code, healthy. This is the code that is changed often, has tests, and is well-structured. The team enjoys working in it. No action needed. The test for this category is whether the engineer who last touched it can describe its boundaries in one sentence.
Live code, painful. This is real technical debt. The team changes this code often, every change is harder than it should be, and bugs keep coming back to it. The test for this category is whether the team's velocity in this code is noticeably slower than in comparably sized, healthier code. If the answer is yes, this is debt and it is accruing interest.
Dormant code, healthy. This is the boring, working module nobody has touched in 18 months. It does its job, the tests pass, the API is stable. People sometimes propose refactoring it because it is "old". This is not debt. This is paid-off equity. Touch it only if you are going to materially change what it does.
Dormant code, broken. This is code that nobody touches, but when they do touch it, it breaks. Often a legacy auth flow, an old integration with a vendor that nobody understands, a billing report that runs once a quarter and surprises someone every time. This is the most expensive category to ignore because the cost is invisible until it explodes. The test: when this code last needed a change, how long did the change take, and how many bugs followed?
The two categories that demand action are Live code, painful and Dormant code, broken. The other two are usually a distraction.
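The sort above comes down to two yes/no questions per module: is it changed often, and does changing it hurt? A minimal sketch of that grid follows. The names Module, categorize, and needs_action are hypothetical, and the two boolean inputs are assumed to come from your own judgment or from measurements you already trust.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    LIVE_HEALTHY = "live, healthy"
    LIVE_PAINFUL = "live, painful"
    DORMANT_HEALTHY = "dormant, healthy"
    DORMANT_BROKEN = "dormant, broken"


@dataclass
class Module:
    name: str
    changed_often: bool    # assumption: the team touches it regularly
    hurts_to_change: bool  # assumption: changes are slow or bug-prone


def categorize(module: Module) -> Category:
    """Place a module in one of the four categories."""
    if module.changed_often:
        if module.hurts_to_change:
            return Category.LIVE_PAINFUL
        return Category.LIVE_HEALTHY
    if module.hurts_to_change:
        return Category.DORMANT_BROKEN
    return Category.DORMANT_HEALTHY


def needs_action(category: Category) -> bool:
    """Only the painful-live and broken-dormant categories demand action."""
    return category in (Category.LIVE_PAINFUL, Category.DORMANT_BROKEN)
```

For example, a quarterly billing report that nobody touches but that breaks whenever someone does would be Module("billing_report", changed_often=False, hurts_to_change=True), which lands in the dormant-broken category.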
How to tell live painful code from "just old"
This is the part teams get wrong most often. The proposal goes "we should refactor module X, it has 15 years of history and the abstractions are dated". The question I ask in response is always the same: "how often have we changed module X this quarter, and what did each change cost?" If the answers are "once" and "a normal amount", the module is not debt. It is in the dormant-healthy category, and the proposed refactor is, at best, a personal preference.
The diagnostics I look at:
- Change frequency over the last quarter: git log --since='3 months ago' --oneline -- path/to/module | wc -l. If the count is below 5, it is dormant. If it is above 30, it is live. Change frequency is the closest thing to a free signal.
- Bug recurrence. Has the same module shown up in three or more bug reports in the last quarter? Recurring bugs in one module mean the structure is not handling the use cases.
- Onboarding cost. When a new engineer touches this code, how long until their first PR ships? If the answer is "two weeks of pairing", the code is opaque even to the team that has worked in it for years.
- Test pain. When you change this module, are the tests easy to update or do they require restructuring? Tests that fight every change point at structural debt.
None of these are perfect, but two or three of them lighting up at the same time is a reliable signal. A module that is touched weekly, has had four bugs this quarter, and where the tests have to be rewritten on every change is debt by any reasonable definition.
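The diagnostics can be sketched as a small script. This is an illustration under stated assumptions, not a tool I am prescribing: the function names are hypothetical, the "unclear" label for counts between 5 and 30 is my own placeholder because the thresholds above leave that band open, and the bug, onboarding, and test-pain inputs are assumed to be filled in by hand from your tracker and your team.

```python
import subprocess


def count_commits(path: str, since: str = "3 months ago") -> int:
    """Count commits touching `path`, mirroring the git log one-liner above.

    Must be run from inside a git working tree.
    """
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--oneline", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return len(result.stdout.splitlines())


def activity(commit_count: int) -> str:
    """Apply the thresholds from the text: below 5 is dormant, above 30 is live."""
    if commit_count < 5:
        return "dormant"
    if commit_count > 30:
        return "live"
    return "unclear"  # assumption: the text gives no label for the 5..30 band


def signals_lit(
    commit_count: int,
    bug_reports: int,
    onboarding_is_slow: bool,
    tests_fight_changes: bool,
) -> int:
    """Count how many of the four diagnostics are lighting up."""
    return sum(
        [
            activity(commit_count) == "live",
            bug_reports >= 3,  # three or more bug reports in the quarter
            onboarding_is_slow,
            tests_fight_changes,
        ]
    )


def looks_like_debt(signal_count: int) -> bool:
    """Two or more signals at once is treated as a reliable indicator."""
    return signal_count >= 2
```

A module touched weekly (about 13 commits in a quarter) with four bugs and tests that are rewritten on every change gives signals_lit(13, 4, False, True), which lights up the bug and test signals and so passes looks_like_debt even though the commit count alone sits in the unlabelled middle band.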
The rule I follow when paying it down
My rule, after several false starts on "debt sprints" that did not produce business outcomes, is this: I do not pay down debt as a project. I pay it down as a tax on the work I am already doing.
In practice that looks like the boy-scout rule with teeth. When I touch a file in the live painful category to add a feature, I budget 20-30% of the PR for cleanup of the area I am touching. I rename the variables that confused me on the way in, I extract the function I had to read twice, I add the test the original author should have written. The feature ships in the same PR. The cleanup is part of the cost of the feature.
Why this works better than a debt sprint:
- The cleanup is targeted at the code that just hurt me, which is also the code that was about to hurt the next person. There is no question whether it was worth doing.
- The PR has a business outcome (the feature) and the cleanup is a side effect. It is much easier to justify to a non-engineering stakeholder than "we are spending two weeks on internal cleanup".
- The team's velocity goes up gradually instead of crashing during a debt sprint and recovering after.
- The compound effect over a year is enormous. Every painful file gets touched eventually, and every touch makes it slightly less painful.
The failure mode of the rule is that the dormant broken category never gets touched, because by definition nothing is forcing me to. For that category, I run a once-a-quarter pass: pick the single most-feared piece of dormant code, schedule one engineer for one week to make it understandable (add tests, add a runbook, document the assumptions), and then move on. One week per quarter is enough to keep the broken-dormant category from accumulating into the next emergency.
What about the rewrite?
There is a small set of cases where the right move is a rewrite, not incremental cleanup. The cases where I have seen a rewrite be the right call:
- The system's load profile has changed by more than an order of magnitude since the original design, and the original design's assumptions are now wrong (e.g. a single-server architecture that needs to become distributed).
- The platform underneath the system is being deprecated (e.g. the database vendor end-of-lifed the version) and the migration is large enough that you might as well restructure on the way.
- The team owns five very similar systems that should be one, and consolidating them is comparable in cost to maintaining them separately.
In each of those cases, the rewrite is justified by a business reason, not by "the abstractions are dated". If you cannot describe the business reason in one sentence, you are not in a rewrite case, you are in an itch-scratching case, and the boy-scout-rule approach is going to be more honest.
Talking about debt with non-engineers
The last piece of advice on this topic is about how to discuss debt with people who do not write code. The phrase "technical debt" is now a loaded term in a lot of organizations: when an engineer says it, a product manager hears "give me time off the roadmap", and the conversation gets adversarial.
The re-framing that has worked for me, on every team where I have used it: I do not say "we have technical debt". I say "feature X will take 3 weeks because the area we are changing has accumulated structural problems. Without the cleanup, the same change in 6 months will take 5 weeks. Here is what the cleanup is." That sentence describes a real cost in real currency (engineer-weeks) and a real return (faster future work). It is a business case. It does not invoke the metaphor.
The second sentence I have learned to use: "the cost of not fixing this in this PR is that the same fragile area will block us again in the next feature". That sentence converts an internal-quality argument into a roadmap-velocity argument, which is the language the rest of the business is already speaking.
Your codebase is older than the company that depends on it
A good way to end this is to remind myself that most production codebases I work in are older than the customer relationships they support. The code carries the business. Treating every quirky-but-working part of it as something to fix on the next refactor is, most of the time, a waste of the team's time. Treating the parts that hurt every week as worth fixing inside the work that hurts is, most of the time, where the wins are. Sort your code into the four categories above, run the boy-scout rule on the painful one, schedule an occasional intervention on the broken-dormant one, and leave the rest alone. The codebase calms down. The team ships features faster. Nobody runs an 18-month "pay down debt" project that produces no customer outcome anyone can name.
