feat(ingestion): streaming JSON tokenizer for high-volume ingestion (… by vaishnavidesai09 · Pull Request #552 · StellarFlow-Network/stellarflow-backend

vaishnavidesai09 · 2026-06-28T10:36:11Z

feat(ingestion): streaming JSON tokenizer for high-volume ingestion (#500)

Closes #500

Problem

Large nested JSON payloads were loaded entirely into memory before any
price attribute could be extracted, causing spikes in RSS during
high-volume ingestion bursts.

Solution

Added iter_price_events_from_stream and build_segments_from_stream to
src/ingestion/parser.py. Both functions use ijson's SAX-style event
emitter (ijson.parse) to walk the token stream incrementally — only the
current candidate frame dict is assembled at any point, so memory usage
is O(frame) rather than O(document).

API

from ingestion.parser import iter_price_events_from_stream, build_segments_from_stream

# file-like object or raw bytes
for asset_id, price, ts, seq, flags in iter_price_events_from_stream(response.raw):
    ...

segments = build_segments_from_stream(open("dump.json", "rb"), segment_size=256)

What's unchanged

iter_flat_ticker_tuples, flatten_telemetry_frames, build_telemetry_segments — untouched
src/ingestion/parsers.py compat re-exports updated to include new symbols

Tests

13 new tests in StreamingParserTests; all 17 tests pass.

…tellarFlow-Network#500)

drips-wave · 2026-06-28T10:36:19Z

@vaishnavidesai09 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

vaishnavidesai09 · 2026-06-28T10:37:56Z

Hi @Sadeequ! Here's my PR for issue #500.
Refactored src/ingestion/parser.py to add iter_price_events_from_stream and build_segments_from_stream which use ijson's SAX-style tokenizer to parse JSON incrementally — memory usage stays bounded to a single frame dict regardless of document size. Updated parsers.py compat exports and added 13 new tests (17 total, all passing).
Please let me know if any changes are needed!

feat(ingestion): streaming JSON tokenizer for high-volume ingestion (S…

33fe938

…tellarFlow-Network#500)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(ingestion): streaming JSON tokenizer for high-volume ingestion (…#552

feat(ingestion): streaming JSON tokenizer for high-volume ingestion (…#552
vaishnavidesai09 wants to merge 1 commit into
StellarFlow-Network:mainfrom
vaishnavidesai09:issue-500-streaming-json

vaishnavidesai09 commented Jun 28, 2026

Uh oh!

drips-wave Bot commented Jun 28, 2026

Uh oh!

vaishnavidesai09 commented Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

vaishnavidesai09 commented Jun 28, 2026

Problem

Solution

API

What's unchanged

Tests

Uh oh!

drips-wave Bot commented Jun 28, 2026

Uh oh!

vaishnavidesai09 commented Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant