Community Python Snippet

Streaming JSONL Parser Without Loading the File

When the file is 8 GB, you cannot json.load it. Here is the generator-based JSONL reader I ship in every data pipeline, plus the malformed-line policy that has saved me twice.

Python
py-generators
stream-processing
data-pipeline

By @clarachoi

December 21, 2025 · Updated May 18, 2026

The parser is a generator over a line iterator, so memory stays constant in the number of lines: it is bounded by the longest single line, not by file size. I take Iterable[str] rather than a path because that lets me feed the same parser a real file, an io.StringIO, a network stream, or a gzip.open handle (opened in text mode, so it yields str rather than bytes). Skipping blank lines is not part of strict JSONL, but the loose variant used in the wild needs it; producers like Cloud Logging emit trailing blank lines all the time. The eight lines here are enough for a clean dataset; malformed lines need a policy on top.
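A minimal sketch of the shape described above, for readers who want something concrete to run. The function name `iter_jsonl` and the sample data are assumptions made for illustration, not necessarily the names used in the original snippet. It covers only the clean-dataset case: a malformed line raises `json.JSONDecodeError` instead of being handled by any policy.

```python
import io
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-blank line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            # Blank or whitespace-only line: skip it instead of failing.
            continue
        # Raises json.JSONDecodeError on malformed input; no recovery here.
        yield json.loads(line)


# Any iterable of strings works; io.StringIO stands in for a real file.
sample = io.StringIO('{"id": 1}\n\n{"id": 2}\n\n')
records = list(iter_jsonl(sample))
print(records)
```

Because the function only needs an iterable of strings, the same call should work with `open(path, encoding="utf-8")` or `gzip.open(path, "rt", encoding="utf-8")`. In a real pipeline you would iterate over the generator directly rather than wrapping it in `list`, which is used here only to show the result.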