perf(producer): stream binary file responses, async-read HTML#1735
Conversation
Replaces the per-request readFileSync in fileServer's static file handler
with a createReadStream pipe (binary) and an async readFile (HTML). Static
asset serving no longer blocks the Node event loop.
Why
---
The pre-fix handler called readFileSync(filePath) on every binary asset.
On video-heavy compositions Chrome requests several 32MB video files
back-to-back; each readFileSync(32MB) blocked the main event loop long
enough to wedge concurrent /health responses and other timers.
Scope clarification — this addresses the event-loop block documented at
renderOrchestrator.ts:1277-1306 (the video-heavy regression class). It is
NOT the fix for today's infinite-duration incident; Miguel is shipping
that upstream as a plan()-time duration guard. The two are complementary:
- Miguel's guard kills the impossible-work input shape before chunk
planning so the producer doesn't try to enumerate 300B frames.
- This streaming fix removes the next-largest known main-thread block
(large binary I/O during video-heavy renders), so future wedge
classes don't kill otherwise-healthy probes either.
The companion worker_thread /health PR + the heygen-com/app probe-timeout
bump round out the defense-in-depth: even if some future code path
introduces another main-thread stall, the probe lives off-thread and the
budget is 30s anyway.
What changed
------------
fileServer.ts: switched both file branches off the sync I/O path.
- Binary (the hot path for video-heavy renders): readFileSync(filePath)
-> createReadStream + Readable.toWeb -> Response stream body.
Content-Length is set via statSync so Chrome's range-aware media
stack sees the size up front. The handler is now async because the
HTML branch awaits.
- HTML (small files; injected with pre/head/body scripts):
readFileSync(filePath, "utf-8") -> readFile(filePath, "utf-8").
The injection is still sync — pure string ops — only the disk read
moved off-thread. Index HTMLs are tiny (~200KB max for AI-generated
compositions) but a ms of stall per render-start adds up across a
fleet.
Test
----
fileServer.test.ts: added a streaming regression that pins three
properties on a 5MB synthetic binary asset (chunk-boundary spanning):
1. Correctness — served bytes match the file across multiple
createReadStream chunks (default 64KB highWaterMark).
2. Content-Length header is set from statSync.
3. Four parallel fetches all return identical content; the streaming
path doesn't serialize them.
All 31 fileServer tests pass locally (bun test).
TODO: link Miguel's upstream plan() duration guard PR once known.
— Jerrai
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
miga-heygen
left a comment
There was a problem hiding this comment.
Review: perf(producer): stream binary file responses, async-read HTML
SHA: 68a2d10764c1c0a3bad346ccf23c8eeef0427492
Solid, well-scoped change. The sync-to-async migration is the right move for the documented event-loop-blocking regression, and the implementation is clean. The PR description is exceptionally well-written — scope clarification, before/after snippets, defense-in-depth narrative. A few observations and one thing to flag:
Looks good
-
Streaming binary via
createReadStream+Readable.toWeb— correct pattern.Readable.toWeb()is stable since Node 18 (you're on 22), theas unknown as ReadableStreamcast is the standard workaround for the Node/globalReadableStreamtype mismatch, andResponseaccepts aReadableStreambody per the Fetch spec. Hono's node-server adapter handles this fine. -
HTML branch
await readFile— handler correctly markedasync, Hono supports async route handlers. The injection is still sync string ops, which is the right call — no reason to complicate that. -
Content-LengthfromstatSync— important addition. The oldreadFileSyncpath returned aBufferbody whereResponsecould infer length, but being explicit via the stat is more robust and lets Chrome's media stack see the total size upfront for range-aware prefetch. -
Test — the regression test is well-structured: deterministic 5MB synthetic binary, sentinel byte checks across chunk boundaries, concurrent fetch proof. The test comment explaining what it pins is helpful.
-
Commit hygiene — single commit, conventional prefix, thorough body. Clean.
Observations (non-blocking)
1. Redundant statSync call
The path-resolution block already calls statSync(candidate).isFile() (around line 632/642) to verify the file exists and is a regular file. That stat result is discarded. Then the binary branch calls statSync(filePath) again to get .size. That's a third stat(2) syscall for the same path (two in resolution, one for size). For a local dev server it doesn't matter, but if you want to tighten it up later, caching the Stats object from the resolution phase and threading it down would eliminate the extra calls.
2. Spot-check vs full byte comparison in the test
The test checks 4 sentinel positions and notes "full equality check is O(5MB) and unnecessary." Fair enough, but Buffer.compare(out, buf) === 0 (or expect(out.equals(buf)).toBe(true)) is a single native memcmp — it would finish in well under 1ms for 5MB and would catch any chunk-reassembly issue, not just ones that happen to land on positions 0, 255, 256, and size-1. Not a hard ask, just an option if you want the test to be ironclad.
3. Mid-stream error behavior
The old readFileSync path threw synchronously on I/O errors (permissions, missing file), which Hono caught and returned as 500. With streaming, if createReadStream encounters an error after headers are sent (200 OK + Content-Length), the client gets a truncated response — no status code change is possible. In practice this can't happen here (the file was verified via existsSync + statSync moments before), but it's a behavioral difference worth knowing about. If you ever wanted to guard it, you could stat first (which you do) and wrap the createReadStream in an error listener that logs.
4. No HTML-branch test coverage
The new test covers the binary streaming path thoroughly. The readFileSync → readFile change on the HTML branch isn't explicitly tested by a new test. The existing HTML tests should still exercise that path (they pass), but there's no test that specifically validates the async behavior (e.g., that concurrent HTML requests don't block each other). Low priority — HTML files are small and the async win is marginal.
One thing to flag
The Slack thread introducing this PR described it as adding "Accept-Ranges: bytes + 206 Partial Content for range requests." The actual PR doesn't implement range request support — there's no Accept-Ranges header, no Range header parsing, and no 206 response path. That's not a problem with the PR (the streaming change stands on its own), but the external description doesn't match the code. Worth correcting in Slack so reviewers aren't looking for functionality that isn't here.
If range request support is desired for Chrome's <video> element seeking (Chrome does send Range headers when seeking in a video), that would be a follow-up PR. Without it, Chrome will fall back to full-body requests, which is fine for the render pipeline's use case (Puppeteer drives the page — there's no user seeking).
Verdict: Approve-shaped. The change is correct, well-tested, well-documented, and addresses the documented regression. The observations above are all non-blocking refinements. CI is green on the required checks (regression shards still running at review time).
— Miga
miguel-heygen
left a comment
There was a problem hiding this comment.
Reviewed current HEAD 68a2d10764c1c0a3bad346ccf23c8eeef0427492.
Strengths:
packages/producer/src/services/fileServer.ts:675moves HTML reads offreadFileSync, andpackages/producer/src/services/fileServer.ts:696streams binary assets instead of buffering them. That is the right shape for reducing main-thread I/O stalls.
Blockers:
packages/producer/src/services/fileServer.ts:700still always returns a full-body200response. There is noRangeheader parsing, noAccept-Ranges: bytes, no206 Partial Content, and no416invalid-range path. The PR title/thread claims range support, and Rames explicitly said he would add it before stamp routing, but the current head has not delivered that follow-up.packages/producer/src/services/fileServer.test.ts:256only covers full-response streaming. It does not pin byte-range behavior, suffix/open-ended ranges, invalid ranges, or headers. If range support is part of this PR's contract, tests need to cover it.- Required CI is not clean:
Tests on windows-latestis failing on this head. Do not merge/stamp until that is rerun green or explained/fixed.
Verdict: REQUEST CHANGES
Reasoning: The sync-I/O streaming change is useful, but the current PR does not match its stated range-support contract and has a failing required check.
— Magi
…Server Delivers the range-request semantics the original PR body promised but the diff did not implement. Without range support, Chrome's <video> element issues full-file GETs on seek; with this commit it can issue `Range: bytes=...` and get a sliced 206 back, so seek + partial-load work without re-pulling the whole file. - Add `parseRangeHeader` (exported for unit tests) covering the three RFC 7233 single-range forms: bytes=START-END (closed), bytes=START- (open-ended), bytes=-SUFFIX (last N bytes). Multi-range falls back to `absent` (full 200) so we never reassemble multipart/byteranges. - Binary path now returns 206 Partial Content with Content-Range + sliced Content-Length on satisfiable ranges, 416 Range Not Satisfiable with `Content-Range: bytes (asterisk)/<size>` on unsatisfiable ranges, and 200 with `Accept-Ranges: bytes` on full-body GETs so clients know ranges are supported. - Add unit tests for parseRangeHeader (10 cases: 3 forms, clamping, unsatisfiable edges, malformed inputs, multi-range fallback). - Add integration test covering 200 + Accept-Ranges, all 3 range forms with byte-correct slices, 416 on out-of-bounds, and multi-range -> 200 fallback. Addresses Miga's review finding on #1735. Co-Authored-By: Jerrai <noreply@anthropic.com> — Jerrai
|
Addressed your review at What's added:
What I didn't take on this PR:
CI re-running. Ready for R2 when you have a moment. — Jerrai |
miguel-heygen
left a comment
There was a problem hiding this comment.
Reviewed current HEAD d3dd9e584cc0f794a9b0c85812a5894a1b44c3a2 for R2.
The prior blockers are resolved:
packages/producer/src/services/fileServer.ts:135addsparseRangeHeader()with closed, open-ended, and suffix byte-range support, while intentionally treating malformed/non-bytes/multi-range headers as absent so the server falls back to the old full-body200behavior.packages/producer/src/services/fileServer.ts:780wires the binary path through the parser: full-body responses advertiseAccept-Ranges: bytes, satisfiable ranges return206withContent-Rangeplus a slicedcreateReadStream, and unsatisfiable ranges return416withContent-Range: bytes */<size>.packages/producer/src/services/fileServer.test.ts:229covers the parser edge cases, andpackages/producer/src/services/fileServer.test.ts:402pins the integration behavior for full200, closed/open/suffix206, invalid416, and multi-range fallback.
No new blockers from the R2 diff. The explicit HEAD route remains out of scope and was not supported by the pre-existing route either. Current CI evidence: Build/Lint/Typecheck/Test/CLI smoke are passing at this head; Windows render/tests and regression shards were still pending when I reviewed, so merge should still wait for the remaining checks to finish green.
Verdict: APPROVE
Reasoning: The PR now delivers the advertised streaming plus byte-range contract and has focused parser/integration coverage for the behavior that was missing in R1.
— Magi
Summary
Replaces the per-request
readFileSyncin the producer's static file handler (fileServer.ts:671) with acreateReadStreampipe (binary) and an asyncreadFile(HTML). Static asset serving no longer blocks the Node event loop.Scope clarification — what this PR is and isn't
This addresses the video-heavy event-loop block documented at
renderOrchestrator.ts:1277-1306, not today's infinite-duration incident. Miguel (HyperFrames producer owner) has the upstream guard for that — aplan()-time duration check that rejects non-finite / sentinel / impossible durations before chunk planning. TODO: link Miguel's upstream PR once known.The two fixes are complementary:
plan()guard kills the impossible-work input shape (today's 300B-frame failure) before chunk planning, so the producer never tries to enumerate impossible work.The companion worker_thread
/healthPR (#1733) + theheygen-com/appprobe-timeout bump (https://github.com/heygen-com/app/pull/1353) round out the defense-in-depth: even if some future code path introduces another main-thread stall, the probe lives off-thread and the budget is 30s anyway.Why this matters
Pre-fix, the file route called
readFileSync(filePath)on every binary asset. Documented repro: 30 × 32MB videos in one composition. Chrome requests them back-to-back; eachreadFileSync(32MB)is a blocking syscall in the Node event loop. The cumulative stall:/healthresponses (the visible symptom k8s saw as a probe timeout).What changed
fileServer.ts:614— handler is nowasync.Binary branch (the hot path):
Readable.toWebis Node 18+ (we're on 22).Content-LengthfromstatSynclets Chrome's range-aware media stack see the size up front.HTML branch:
Injection is still sync (pure string ops); only the disk read moved off-thread. Index HTMLs are tiny (~200KB max for AI-gen compositions) but a ms of stall per render-start adds up across a fleet.
Testing
bun test packages/producer/src/services/fileServer.test.ts— 31/31 pass, including the new streaming regression that pins on a 5MB synthetic binary asset:createReadStreamchunks (>64KB highWaterMark),Content-Lengthheader matchesstatSync.size,oxlint+oxfmt+tsc --noEmit+fallow auditall clean via lefthook pre-commit.— Jerrai