Community Python Snippet
Pandas Melt-Then-Pivot: The Shape I Always Need
I keep reaching for melt-then-pivot to reshape wide tables for charting. Here is the pandas-style transform written in pure stdlib Python so it runs anywhere, plus the multi-key pivot variant.
By @carlosherrera
May 10, 2026 · Updated May 20, 2026
Melt is the 'unpivot' direction: every metric column becomes a row, with the original column name stored in a variable field and its cell in a value field. I reach for it before charting, because most plotting libraries want one row per data point rather than a wide grid. The 15-line stdlib version is enough for any in-memory workload up to a few hundred thousand rows; past that I switch to pandas or polars. The output shape mirrors pandas.melt(df, id_vars=...): the id columns plus variable and value, so when the dataset grows you can swap implementations without rewriting downstream code.
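A minimal sketch of a melt along these lines, assuming each row is a plain dict. The function name and the `id_vars`, `var_name`, and `value_name` parameters are borrowed from pandas' naming for illustration; they are assumptions, not necessarily the original snippet.

```python
def melt(rows, id_vars, var_name="variable", value_name="value"):
    """Unpivot dict rows: one output row per (input row, metric column)."""
    id_set = set(id_vars)
    out = []
    for row in rows:
        # Identifier fields are copied onto every output row.
        ids = {k: row[k] for k in id_vars}
        for col, cell in row.items():
            if col in id_set:
                continue
            # The column name moves into var_name, the cell into value_name.
            out.append({**ids, var_name: col, value_name: cell})
    return out


wide = [
    {"name": "alice", "jan": 3, "feb": 5},
    {"name": "bob", "jan": 2, "feb": 7},
]
long_rows = melt(wide, id_vars=["name"])
# long_rows[0] is {"name": "alice", "variable": "jan", "value": 3}
```

One difference to keep in mind: this sketch emits rows in input-row order, while pandas.melt stacks one value column at a time, so sort the result if downstream code depends on row order.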
Pivot fills the inverse role: turn (key, column-name, value) triples into a row-keyed dict where each column-name is its own field. The defaultdict(dict) keeps the per-key bag of measurements, and a final pass flattens it into uniform records with fill patching the holes. The most common bug I have hit is forgetting that fill matters: if alice has no 'mar' reading, the chart should show 0 (or None for 'no data', which is a different decision). Make the fill value explicit at the call site rather than letting it default.
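The pivot described above can be sketched roughly as follows. The name `pivot` and its `index`, `columns`, and `values` parameters are illustrative assumptions; `fill` is made keyword-only with no default so the caller has to choose between 0 and None explicitly.

```python
from collections import defaultdict


def pivot(rows, index, columns, values, *, fill):
    """Turn (key, column-name, value) rows into one wide record per key."""
    cells = defaultdict(dict)   # key -> {column-name: value}
    col_order = {}              # dict used as an insertion-ordered set
    for row in rows:
        col = row[columns]
        cells[row[index]][col] = row[values]  # duplicates: last write wins
        col_order.setdefault(col, None)
    out = []
    for key, bag in cells.items():
        record = {index: key}
        for col in col_order:
            # Patch missing cells so every record has the same fields.
            record[col] = bag.get(col, fill)
        out.append(record)
    return out


long_rows = [
    {"name": "alice", "variable": "jan", "value": 3},
    {"name": "alice", "variable": "feb", "value": 5},
    {"name": "bob", "variable": "jan", "value": 2},
    {"name": "bob", "variable": "mar", "value": 4},
]
wide = pivot(long_rows, index="name", columns="variable", values="value", fill=0)
# alice has no 'mar' reading, so wide[0]["mar"] is the fill value 0
```

Passing `fill=None` instead would mark the same cell as 'no data', which most charting code treats differently from a zero.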
The change from the single-key pivot is small but powerful: the index is now a tuple such as (region, plan), and each cell collects a list of values so the aggregator (sum, max, len, statistics.mean) decides how to combine duplicates. Real event logs always have duplicates, so the agg slot saves you from a silent overwrite bug. I have used this exact shape to build cohort signup grids and per-team latency tables; the only thing missing for production is column ordering control, which I usually add by passing a custom column-key sort. When the data outgrows memory I switch to a SQL GROUP BY or to polars; below that boundary, this function is dependency-free and obvious.
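A sketch of the multi-key variant under the same assumptions (dict rows, illustrative parameter names). The `pivot_agg` name, the sample `events` data, and the optional `col_key` argument for column ordering are all hypothetical, added here only to show the shape.

```python
from collections import defaultdict


def pivot_agg(rows, index, columns, values, agg, *, fill, col_key=None):
    """Pivot on a tuple of index fields, combining duplicates with agg."""
    cells = defaultdict(lambda: defaultdict(list))  # key tuple -> column -> values
    col_order = {}                                  # insertion-ordered set
    for row in rows:
        key = tuple(row[k] for k in index)
        col = row[columns]
        cells[key][col].append(row[values])  # collect, never overwrite
        col_order.setdefault(col, None)
    # Optional ordering control; defaults to first-seen order.
    cols = sorted(col_order, key=col_key) if col_key else list(col_order)
    out = []
    for key, bag in cells.items():
        record = dict(zip(index, key))
        for col in cols:
            record[col] = agg(bag[col]) if col in bag else fill
        out.append(record)
    return out


events = [
    {"region": "eu", "plan": "pro", "month": "jan", "signups": 3},
    {"region": "eu", "plan": "pro", "month": "jan", "signups": 4},
    {"region": "eu", "plan": "free", "month": "feb", "signups": 10},
    {"region": "us", "plan": "pro", "month": "feb", "signups": 1},
]
grid = pivot_agg(
    events, index=("region", "plan"), columns="month",
    values="signups", agg=sum, fill=0,
)
# The two eu/pro/jan events are summed: grid[0]["jan"] == 7
```

Swapping `agg=sum` for `len` turns the same call into an event count per cell, and `statistics.mean` gives an average, without touching the collection logic.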
