Community Python Snippet
bisect Instead of sort() on Every Insert (Python)
I had a leaderboard insert loop running list.append + list.sort. Swapping to bisect.insort cut the loop from 4.2s to 0.14s on 50k inserts. The 5-line rewrite plus the keyed variant is here.
bisect Instead of sort() on Every Insert (Python)
I had a leaderboard insert loop running list.append + list.sort. Swapping to bisect.insort cut the loop from 4.2s to 0.14s on 50k inserts. The 5-line rewrite plus the keyed variant is here.
By @sarahwilson
May 14, 2026
·
Updated May 18, 2026
935 views
21
4.4 (10)
The two loops produce identical output but with very different complexity. sort() on every insert is O(N log N) per call, so the loop is O(N^2 log N); bisect.insort is O(log N) for the lookup plus O(N) for the shift, so the loop is O(N^2). The log factor disappearing only halves the work in theory, but in practice the constant factor of list.sort (which is a full TimSort) is enormous compared to the C-level shift inside insort, so the measured speedup is much larger than the asymptotic analysis predicts. I have used this exact swap in a real leaderboard hot loop and seen 30x; the playground is shorter so the numbers here are smaller, but the ratio holds.
On Python 3.10+ this whole class collapses to bisect.insort(rows, item, key=lambda r: r['score']), but on 3.8 (which is the playground's runtime, plenty of laptops, and most embedded Linux distros) the key= argument does not exist. The parallel-keys list is the canonical workaround: we sort by the key array and apply every position change to the data array in lockstep. The cost is one extra O(N) insert per call; for sizes up to a few thousand it is fine. For larger sizes I either upgrade the Python version or switch to sortedcontainers.SortedKeyList, which uses skiplist-backed O(log N) insert and is a drop-in replacement.
The off-by-one bug between bisect_left and bisect_right is the most common bisect bug I have shipped. The right way to remember it: bisect_left(a, v) returns the count of items strictly less than v, bisect_right(a, v) returns the count of items less than or equal to v, and the difference is the count of items equal to v. Range queries fall out cleanly: a[bisect_left(a, lo) : bisect_right(a, hi)] is the canonical idiom for 'all items in the closed interval [lo, hi]'. The insort_left vs insort_right choice matters when stability matters (a leaderboard that ties on score should usually do insort_right so the older entry stays first); in production I have flipped this exactly twice.
