Community Article

Understanding Big-O Notation

What Big-O actually measures, why constants still matter at small N, and the four traps that catch even strong engineers in code review.

Understanding Big-O Notation

What Big-O actually measures, why constants still matter at small N, and the four traps that catch even strong engineers in code review.

big-o
asymptotic-analysis
time-complexity
fundamentals
interview-prep
astridcole

By @astridcole

November 30, 2025

·

Updated May 20, 2026

236 views

1

4.5 (14)

On a 64KB array, a quadratic loop runs about 4 billion comparisons. On the same hardware, a linearithmic sort runs about 21 million. The first takes roughly half a minute. The second takes a few hundred milliseconds. Same input, same machine, two orders of magnitude on the wall clock. That gap is what Big-O is trying to warn you about, and it is the only reason the notation has survived sixty years of engineering taste shifts.

I am going to argue that Big-O is the single most useful piece of theoretical CS for working programmers, but only if you understand what it does NOT measure. Most arguments I see on PRs are not about Big-O, they are about the things Big-O quietly hides.

What the notation actually says

Big-O describes how the work an algorithm does grows as the input grows. It is an upper bound on the rate of growth, ignoring constant factors and lower-order terms. When I write O(n^2), I am saying that for large enough n, the runtime grows no faster than some constant times n squared.

Three things follow from that definition that beginners often miss.

First, Big-O is asymptotic. It only describes behaviour as n heads toward infinity. At small n, an O(n^2) algorithm with a tiny constant can easily beat an O(n log n) one with a huge constant. This is exactly why TimSort (the algorithm Python and Java use for sort) switches to insertion sort below a small threshold.

Second, Big-O is an upper bound, not a tight bound. O(n) is technically also O(n^2), because anything that grows linearly also grows no faster than quadratically. If you want a tight bound, the right notation is Theta, but in industry we abuse O to mean Theta and that is fine as long as everyone knows we are.

Third, the constants you drop can dominate in practice. 100n and n are both O(n). They are not the same in production.

The growth rates that show up in real code

This is the table I keep open in a tab when I am estimating. The rightmost column is the one I actually look at. Most code does not need to be fast, it needs to not be catastrophically slow on the inputs you actually see.

Big-ONamen=10n=1,000n=1,000,000What it usually is in real code
O(1)constant111Hash table lookup, array index
O(log n)logarithmic31020Binary search, balanced tree op
O(n)linear101,0001,000,000Single pass over a list
O(n log n)linearithmic3310,00020,000,000Comparison sorting, FFT
O(n^2)quadratic1001,000,00010^12Nested loop over the same array
O(2^n)exponential1,024astronomicalimpossibleNaive subset / power-set generation

The practical line for most product code sits somewhere between O(n log n) and O(n^2). If your input never exceeds about 10,000 items, quadratic might be fine. Cross 100,000 and quadratic starts to bite. Cross a million and quadratic is over.

A worked example: why Array.includes inside a loop bites

This is the bug I see in code review more than any other, in every language. Nobody writes it on purpose. Everyone writes it once.

function findDuplicates(items) {
    const dupes = [];
    for (const item of items) {
        if (items.includes(item) && !dupes.includes(item)) {
            // do something
            dupes.push(item);
        }
    }
    return dupes;
}

It looks fine. It even works on the test data. The author probably ran it on 50 items and called it done. But every iteration of the outer loop triggers a linear scan inside includes, which makes the whole function O(n^2). At 10,000 items it does 100 million comparisons. At 100,000 it does 10 billion, and the user clicking the export button decides to take a coffee break.

The fix is straight Big-O improvement: trade one pass for a Set.

function findDuplicates(items) {
    const seen = new Set();
    const dupes = new Set();
    for (const item of items) {
        if (seen.has(item)) dupes.add(item);
        else seen.add(item);
    }
    return [...dupes];
}

The second version is O(n) because Set.has and Set.add are amortized O(1). On 100,000 items it finishes in a few milliseconds. The shape of the code did not change much. The growth rate did.

The four traps Big-O does not save you from

Big-O tells you how runtime grows with n, and that is all it tells you. The traps below are real performance issues that the notation will not catch.

  1. Cache locality. A linked-list traversal and an array traversal are both O(n). The array is roughly five times faster on modern hardware because it streams sequential memory into the L1 cache. The linked list pointer-chases and triggers a cache miss on most nodes. Same Big-O, very different stopwatch.

  2. Hidden multiplicative constants. Array.prototype.sort in V8 is O(n log n). So is your handwritten merge sort. The handwritten one is usually 3-10x slower because the engine sort has been micro-optimized for years. Reach for the standard library by default.

  3. Allocations and garbage collection. A pure functional version of an algorithm that returns a new array per step is the same Big-O as the in-place version. The in-place one runs faster because it does not pressure the GC. This is why Array.prototype.map chained five times is sometimes slower than one explicit for loop, even at the same growth rate.

  4. The amortized vs worst-case distinction. A dynamic array push is amortized O(1). The single push that triggers the resize is O(n). If you have a real-time deadline (say, a 16ms render budget), amortized is not good enough. You need the worst-case bound, and Big-O alone will not tell you which one you are reading.

How I estimate in code review

My quick read on any function is roughly this. I look for nested loops over the same collection (instant O(n^2) flag). I look for .includes, .indexOf, .find, or array .filter inside another loop (same flag). I look for recursive calls without memoization (could be exponential). I look for sorting (probably O(n log n)). I treat hashed lookups, set membership, and array indexing as O(1) and move on.

For the cases where it actually matters, I write down the worst case input that production might see, plug it into the table above, and ask whether the answer is comfortably under our latency budget. That whole exercise takes about ninety seconds and catches most of the issues I would otherwise miss.

Two rules and a warning

If you take three things from this article, take these.

First, the rule I follow: any time you have a loop that searches inside a collection you also iterate over, you almost certainly want a hash set or hash map. The shape change is small, the Big-O change is huge, and code review will be on your side.

Second, the habit: when you write a function that processes user input, ask out loud what n is in production. "It is small" is not an answer. Ten thousand, a hundred thousand, ten million? That number decides whether linear is fine or whether you need logarithmic.

The warning: Big-O is necessary but not sufficient. I have seen O(n) code crash a server because each step allocated a fresh megabyte object. I have seen O(n^2) code ship to production happily because the upstream system caps n at 200. The notation gives you a vocabulary for talking about scaling. The scaling problem still has to be solved with judgement, profiling, and a willingness to read the actual numbers when the scaling estimate is wrong.

Back to Articles