fix(engine): hold last frame when a clip's media is shorter than its slot#1726
Conversation
c915485 to
3ee90e8
Compare
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewing at 3ee90e82 (rebase of c9154852, tree unchanged). No prior peer reviews — first pass; Via may layer on the player-semantics side independently. Focused this read on cross-stack consumers, sibling-handler symmetry, observability shape, and the constant-drift between core's clamp epsilon and the engine's hold tolerance.
Net: solid fix. The freeze trigger condition (source-exhausted AND globalTime within holdTolerance of clip end) is the right shape — not racing browser ended events, not relying on scheduled-tick timing. The hold path mirrors what the runtime already does at the visibility layer (init.ts:480-482 keeps elements visible through currentTime <= computedEnd), so the recorded output now matches preview. The recording / preview divergence risk I was specifically looking for doesn't materialize — both paths go through the same FrameLookupTable.getActiveFramePayloads, and audio's apad=whole_dur=${totalDuration} (audioMixer.ts:396) silently pads the gap with silence rather than holding the last audio sample, which is the correct symmetry (frozen frame + silence is what a held playback would produce). Nice end-to-end thinking.
A few things worth surfacing, in order of weight:
🟡 Constant duplication between core and engine — videoFrameExtractor.ts:973 declares MEDIA_SLOT_SHORTFALL_TOLERANCE_SECONDS = 0.05 and the comment explicitly says "Mirror of core's MEDIA_DURATION_CLAMP_EPSILON_SECONDS." But the source constant at core/src/compiler/timingCompiler.ts:52 is not exported from core/src/compiler/index.ts or core/src/index.ts. The engine duplicates the literal 0.05 instead of importing. The whole point of the hold-tolerance flooring is that it's tied to the compiler's clamp decision — if someone bumps the core epsilon to 0.07 for a future ffprobe-precision concern, the engine still uses 0.05 and the half-frame seam re-opens above 40fps. Cheap fix: export MEDIA_DURATION_CLAMP_EPSILON_SECONDS from core/src/compiler/index.ts (and re-export at core/src/index.ts) and import it directly in videoFrameExtractor.ts. The risk is low today but the comment already promises mirroring; making the dependency real costs nothing.
🟡 Producer warning is video-only; audio clips clamp silently too — producer/services/htmlCompiler.ts:259 gates the log.warn on r.tagName === "video", but clampResults includes audio clips (line 239 filters by !loop only, no tag restriction) and clampList.push lines 253-254 apply the clamp to audio too. So a <audio data-duration="10"> with a 6s source gets silently shortened — the same author-confusion mode the warning is trying to prevent. The validate audit at validate.ts:93 also only walks video[data-duration]. Worth either extending the warning to audio (a separate [compile] line with audio-appropriate wording, since there's no hold-frame analogue) or documenting in the warning thesis why audio is intentionally exempt. The PR body says the warning is for "videos" specifically, which is fine if the design is "audio-clamp-silent is OK because there's no visible analogue to the black flash" — just say it.
🟡 Hold tolerance can intentionally swallow a 50ms shortfall, but the warning floor is the same 50ms — MEDIA_SLOT_SHORTFALL_TOLERANCE_SECONDS = 0.05 is doing double duty: (a) the render path holds for shortfalls ≤ 50ms (no warning needed, no visible seam), (b) the warning fires for shortfalls > 50ms. A clip 51ms short of its slot would hold the last frame for one frame past the source end AND emit a warning — that's the intended belt-and-suspenders. But it also means the warning catches the case where the renderer is already covering it cleanly (51ms hold is still seamless at 60fps: 2 frames). Not a bug, but worth one line in the warning explaining "this case renders fine but the longer slot is probably author error" so an author with a 51ms shortfall doesn't think they have a render bug to chase.
🟡 Test pins frameIndex, not pixels — the new test at videoFrameExtractor.test.ts:379-404 asserts the held frameIndex === 42, which is the right unit invariant. The PR body says "Verified end-to-end: the boundary that rendered pure black (luma 14) now holds the last frame (luma 237)" — that's a manual verification, not a regression test. Per the parity-test reflexivity worry: the test asserts the function returns the index it's supposed to, but doesn't pin that the rendered pixels at t=3.44 actually display the held frame instead of the page background. If there's a producer-side integration / regression-harness suite that exercises a real ffmpeg-trimmed clip end-to-end with pixel assertions, hooking the 1.45s-slot/1.433s-media case in there would close the loop. Existing-state question: is there a fixture in producer/regression-harness.ts that could carry this? Not blocking, but the manual verification doesn't re-run on every change to getActiveFramePayloads.
🟡 Metadata-loaded race in the validate audit — cli/commands/validate.ts:246 waits opts.timeout ?? 3000 ms after domcontentloaded before running auditVideoClipDurations. Line 112 silently skips clips with clip.duration non-finite or zero ("metadata not loaded yet"). For local video files that resolves fast. For S3-hosted videos behind a slow CORS preflight, 3s might not be enough — and the failure mode is silent no-warn, which is the worst kind for a "we'll catch the mistake" workflow. Worth either waiting on loadedmetadata for each <video> explicitly before reading .duration, OR logging "skipped N videos awaiting metadata" so the author knows the audit was incomplete. Per the observability-failure-path lens: the audit's whole reason for existing is to flag mistakes, and silently passing on a slow load is exactly the case it should NOT miss.
💭 Telemetry / counter for "hold-frame applied" — small thought: this fix is intentionally invisible when working. Without any signal (counter, debug log, anything), we won't know how often it fires in production renders. A log.debug at videoFrameExtractor.ts:1127 ("held last frame for clip X for Ys") in the render pipeline would let you grep render logs after this lands to confirm the fix is exercising on real compositions, not just unit tests. Optional, but cheap insurance.
✅ Things I checked and was happy with:
- Loop branch (
videoFrameExtractor.ts:1106) is gated before the hold branch — looping clips don't accidentally hit the new code path. - Active-set drop at
globalTime > video.end(refreshActiveSetline 1077,<= candidate.endinclusive) means the held frame doesn't bleed past the clip end into a subsequent black region. Clean inverse-operation handoff. - Audio side (
audioMixer.ts:396apad=whole_dur) produces silence in the trailing window — matching the held frame's visual stillness rather than producing audible glitches or unintended loops. compileStage.ts:124correctly threadslogintocompileForRender. Other consumers (compilationRunner.ts:49,121,regression-harness.ts:906) call withoutlog, but those are test-harness paths where suppressing the warning is fine.analyzeClipMediaFithas solid unit coverage including theloop=true, NaN, zero-slot edge cases.
— Rames D Jusso
vanceingalls
left a comment
There was a problem hiding this comment.
Fixes a one-frame black flash at fast-cut boundaries where a <video> clip's source media is sub-frame shorter than its declared data-duration slot (the common case — ffmpeg -t 1.45 at 30 fps emits 43 frames = 1.433 s, ~17 ms short). FrameLookupTable.getActiveFramePayloads now holds the last extracted frame for the tail of the slot once the source is exhausted, gated by holdTolerance = max(2 frames / fps, MEDIA_SLOT_SHORTFALL_TOLERANCE_SECONDS=0.05s) (videoFrameExtractor.ts:1120-1124). The flooring at the compiler's clamp epsilon is the load-bearing trick: above 40 fps the pure 2-frame term is smaller than 0.05 s, so without the floor 60 fps renders would still seam at the clamp boundary. A shared analyzeClipMediaFit({slotSeconds, mediaSeconds, loop?}) at videoFrameExtractor.ts:981 underpins two new warnings on the same threshold: a [compile] log.warn at the producer clamp site (htmlCompiler.ts:260-268) and an auditVideoClipDurations pass in validate that reads each <video>'s live .duration via headless Chrome (validate.ts:75-127).
Verdict: LGTM at 3ee90e823d2e63527720569ff74022d9b17b5213 — converging with Rames's read at the same SHA (rebased from d6f3bb5a between read-start and pre-post check; comment-tightening pass only, verified via compare/...).
Verified the tolerance math at the regime boundaries.
- 30 fps clip [2, 3.45], 43 source frames (1.433 s media) — the documented sub-frame case:
holdTolerance = max(2/30, 0.05) = max(0.067, 0.05) = 0.067 s.- At t=3.4: frameIndex=42,
42 < 43→ normal path returns frame 42. ✓ - At t=3.44: frameIndex=43, source exhausted, hold branch entered.
3.44 ≥ 3.45 − 0.067 = 3.383→ hold frame 42. ✓ - At t=3.45: same, hold. ✓ (matches the new test at
videoFrameExtractor.test.ts:379-403.)
- 60 fps clip same shape —
holdTolerance = max(2/60, 0.05) = max(0.033, 0.05) = 0.05. The floor wins, hold window stays ≥ the compiler's clamp epsilon, so the seam is closed at any fps the compiler decides to leave unclamped. The test ofanalyzeClipMediaFit(1.45, 1.433)returning null confirms the threshold matches between hold-window and warning-emission.
Verified the deliberately-much-shorter case still mostly blanks. Existing test holds the last frame at the clip end even when the source is shorter than the window (videoFrameExtractor.test.ts:357) — clip [0, 5], 30 source frames @ 30 fps (1 s of media in a 5 s slot):
- At t=1.5: hold branch entered,
1.5 ≥ 5 − 0.067 = 4.933→ false → blank. ✓ - At t=5.0:
5.0 ≥ 4.933→ true → hold last frame. ✓
So the binary "much-shorter still blanks for the tail" framing in the PR body is correct in spirit, but technically the last holdTolerance window (~67 ms at 30 fps, 50 ms at 60 fps) will now freeze instead of blank — below human perception threshold and consistent with the inclusive-end hold the runtime already shows, but worth being precise that the change isn't strictly "only sub-frame shortfalls now hold": it's "any clip's last 50-67 ms now holds once source is exhausted."
Verified the runtime/recording symmetry (concur with Rames). Rames's specific check on init.ts:480-482 (runtime keeps elements visible through currentTime <= computedEnd) confirms the recording now matches the preview's visibility contract — the recording/preview divergence risk that's the historical failure mode here doesn't materialize. Both go through the same FrameLookupTable.getActiveFramePayloads. And the audio side: audioMixer.ts:396 apad=whole_dur=${totalDuration} silently pads the slot's audio tail with silence, which is exactly the right symmetry — held frame + silence is what a held playback would produce, not held frame + repeating final-audio-sample (which would be perceptually wrong).
Findings (numbered, with severity tag):
-
NIT (cross-package coupling) — concur with Rames #1 + adopt his exact fix shape —
videoFrameExtractor.ts:973introducesMEDIA_SLOT_SHORTFALL_TOLERANCE_SECONDS = 0.05with a comment that explicitly says "Mirror of core'sMEDIA_DURATION_CLAMP_EPSILON_SECONDS." But the source constant atcore/src/compiler/timingCompiler.ts:52is not exported fromcore/src/compiler/index.tsorcore/src/index.ts. Rames's proposed fix is correct: export fromcore/src/compiler/index.ts, re-export atcore/src/index.ts, and import directly invideoFrameExtractor.ts. The whole point of the hold-tolerance flooring is its tie to the compiler's clamp decision — if someone bumps the core epsilon to0.07for a future ffprobe-precision concern, the engine still uses0.05and the half-frame seam re-opens above 40 fps. The comment already promises mirroring; making the dependency real costs nothing. -
NIT (audio-clamp coverage gap) — concur with Rames #2, full credit, this was a miss on my pass —
producer/services/htmlCompiler.ts:259gates thelog.warnonr.tagName === "video", butclampResults(line 239) filters only by!loop— no tag restriction — and the clamp at lines 253-254 applies to<audio>too. So<audio data-duration="10">with a 6 s source gets silently shortened, same author-confusion mode the warning is meant to prevent. Thevalidate.tsaudit at line 93 also walksvideo[data-duration]only, so neither warning surface catches audio shortfalls. Two paths from here, author's call: (a) extend the warning to<audio>with audio-appropriate wording (no hold-frame analogue — slot just gets shorter), or (b) document the intentional exemption in either the warning thesis or the PR body. Right now the PR body says "videos" specifically without explaining why audio is silent, and the design is defensible if the answer is "audio-clamp-silent is fine because there's no visible analogue to the black flash" — just make that explicit. -
NIT (double-duty 0.05 constant + warning floor) — sharper framing from Rames #3 —
MEDIA_SLOT_SHORTFALL_TOLERANCE_SECONDS = 0.05does double duty: (a) the render path holds for shortfalls ≤ 50 ms (no visible seam), (b)analyzeClipMediaFitreturns non-null and fires the warning for shortfalls > 50 ms. A clip 51 ms short of its slot would BOTH be held seamlessly AND emit a warning. That's the intended belt-and-suspenders, but the warning text reads like "your render is broken" when in fact 51-ms shortfalls render fine at 60 fps (2 frames). Rames's proposed fix is right: one line in the warning ("this case still renders cleanly, but the longer slot is probably author error") clears up the implied severity. -
NIT (precision in PR body / docstring framing) — As noted in the verified-walkthrough above: "much-shorter clips still blank for the tail" is true for the bulk of the tail but the last
holdTolerancewindow (~50-67 ms) now holds instead of blanking. Below human perception, but the framing in the engine docstring ("a video that is substantially shorter than its slot still goes blank for the tail (unchanged)") slightly overstates the unchangedness. Adjusting to "blanks for the tail except the lastholdTolerancewindow before slot end" would be precise. Optional copy-edit. -
NIT (convention question —
ponytail:strikes again) —htmlCompiler.ts:257-259continues the same// ponytail: ...prefix pattern as HF #1725 yesterday (same Miguel-author shape). Rames read it on the previous PR as "probably TL;DR:" / dictation residue. If this is intentional shorthand for "callout that's not a TODO but you want a reader to notice," great — but no other contributor in the repo will know the meaning, so it lands cryptic. Two paths: (a) replace withNOTE:/CAVEAT:/// XXX:so the repo's grep tools recognize it; (b) keep it but document the convention. Asking because the second occurrence in 24 h makes it look like an intentional pattern rather than a typo. -
NIT (validate-time metadata-loaded race) — concur with Rames #5 —
validateInBrowserwaitsopts.timeout(default 3 s) afterdomcontentloadedbefore runningauditVideoClipDurations. TheNumber.isFinite(clip.duration) || clip.duration <= 0guard silently skips clips with metadata not yet loaded. For local files that's fast; for S3-hosted videos behind a slow CORS preflight, 3 s might be insufficient — and the failure mode is silent no-warn, which is exactly the case the audit exists to catch. Rames's two proposed fixes are both right: (a)page.waitForFunction(() => Array.from(...).every(v => Number.isFinite(v.duration) && v.duration > 0), { timeout: 5000 })before the audit, or (b) log "skipped N videos awaiting metadata" so the author knows the audit was incomplete. Author discretion. -
NIT (test coverage —
frameIndexpinned, pixels not) — concur with Rames #4 — The new test atvideoFrameExtractor.test.ts:379-404asserts the correct unit invariant (frameIndex === 42), but the PR body's "luma 14 → 237" verification is manual, not a regression test. Ifproducer/regression-harness.tscarries a fixture that runs a real ffmpeg-trimmed clip end-to-end with pixel assertions, hooking the 1.45 s-slot / 1.433 s-media case there would close the loop and re-run on everygetActiveFramePayloadschange. Not blocking; the unit test is the right granularity for the function, but the pixel-level promise is currently load-bearing on author memory. -
NIT (sub-composition warning coverage — undersold in PR body) — Per
htmlCompiler.ts:1391"Inline sub-compositions into the main HTML so the runtime takes the same path" — sub-composition<video>elements get inlined into the top-level document in the default render mode. Thevalidate-timeauditVideoClipDurationsusesdocument.querySelectorAll("video[data-duration]")on that inlined document, so it likely already covers inline-mode sub-comp videos. The iframe-render exception (the runtime-hint athtmlCompiler.ts:112-116switches to iframe mode when<iframe>is detected) is the genuine gap for the validate-time audit. The PR body and theponytail:comment frame the gap as "sub-comp videos aren't warned at all" — strictly true for the compile-time warning (logger not threaded throughparseSubCompositions), but conservative for validate-time. Either confirm I'm reading the inlining correctly and update the description, or if iframe-mode sub-comps are common in practice, document the iframe-mode coverage gap explicitly. The PR is more thorough than its own description.
Side observation (not a finding): the warning text mentions three remediation options ("Set data-duration to ~Xs, trim data-media-start, or use a longer/looping source if that isn't intended") which is a nicely actionable triple. The corresponding compile-time log.warn message in htmlCompiler.ts:260-268 lists the same three. Consistency between the two warning surfaces is good — an author who hits one and fixes it will recognize the other if they switch tools.
Things I checked and was happy with (overlapping with Rames):
- Loop branch gating:
video.loop && localTime >= loopDuration → localTime %= loopDuration(line 1119) keeps looping clips in range; the defensiveloop && frameIndex >= totalFramesclamp at 1123-1129 still pins to the last frame for the wrap edge. The new hold-branch logic only fires for non-looping clips after source exhaustion, so loop semantics are unchanged.analyzeClipMediaFit({loop: true})correctly returns null. - Active-set drop at
globalTime > video.end(refreshActiveSetline 1077,<= candidate.endinclusive) means the held frame doesn't bleed past the clip end into a subsequent black region. Clean inverse-operation handoff (Rames's framing). compileStage.ts:124correctly threadslogintocompileForRender. Other consumers (compilationRunner.ts:49,121,regression-harness.ts:906per Rames's sweep) call withoutlog, but those are test-harness paths where suppressing the warning is fine. So the compile-time warning only fires in production-render path.analyzeClipMediaFithas solid unit coverage includingloop=true, NaN, zero-slot edge cases.
Cross-PR coherence with HF #1725 (yesterday) and EF #40577 (2 days ago): Three Miguel-author retry / playback correctness PRs in rapid succession. EF #40577 is the outer Temporal-workflow retry envelope; HF #1725 is the inner producer-side adaptive-parallelism retry; HF #1726 is the render-correctness layer at the frame-lookup table. Together they're shaping up as a coordinated render-reliability arc — broken renders now bail fast at two layers AND a render that succeeds doesn't have a one-frame black flash at the cut. The observability gaps each PR has (telemetry counter on the no-progress bail; debug log on hold-frame application per Rames's thought-bubble #6 here; per-cohort counters across the stack) cohere into a single oncall query story: "broken render bailed fast in producer ⟶ workflow gave up after 3 attempts" + "render succeeded, hold-frame fired N times for clip X" → could be a useful Datadog board if it ever gets built.
CI state at 3ee90e823d2e63527720569ff74022d9b17b5213: new rollup just kicked off after the rebase — most fast gates green (Format, Semantic PR title, Detect changes across workflows, CodeQL Python + actions, File size check); slower gates (Build, Lint, Typecheck, Test, Tests on windows-latest, Render on windows-latest, regression-shards 1-8, Preview parity, CLI: npx shim matrix, CLI smoke (required), Smoke: global install, SDK unit/contract/smoke, Test: runtime contract, Studio: load smoke, Fallow audit) IN_PROGRESS or QUEUED. Test: skills SKIPPED (path-filtered — no skills/ changes). The change between SHAs is comment-tightening only (verified above), so the in-flight checks should track the previous-SHA results. No required failures expected; should land clean modulo regression-shard luck.
Prior reviewer state: Rames COMMENTED LGTM at 3ee90e82 (22:41 UTC) with five findings + a thought-bubble (telemetry counter on hold-frame application) + a five-bullet "happy with" list. Convergent with my read at the same SHA — Rames went deeper on the audio-clamp coverage gap (Finding #2) which was a substantive miss on my first pass; full credit. My distinct angles: tolerance math at the 30/60 fps regime boundaries, the "last 50-67 ms of much-shorter clips also now holds" precision point, and the validate-time sub-composition coverage analysis. No daylight on verdict.
Review by Via
…slot Renders showed the page background (a one-frame black flash) right before a cut when a video clip's source media was a hair shorter than its data-duration slot — the common case, since `ffmpeg -t 1.45` emits 43 frames = 1.433s at 30fps. The frame lookup only held the last frame at the exact clip end, so the sub-frame remainder rendered blank. - Hold the last extracted frame for the rest of the slot once the source is exhausted, within a tolerance floored at the compiler's 0.05s clamp epsilon so the seam is covered at any fps (2 frames alone is < 0.05s above 40fps). Clips deliberately much shorter than their slot still blank for the tail (unchanged). - Warn when the compiler clamps a video's data-duration down to its media length (slot longer than source by more than the clamp epsilon): a render-time `[compile]` warning in the producer, plus a matching `validate` warning that reads each <video>'s live duration in headless Chrome (static HTML lint can't see media durations). A shared `analyzeClipMediaFit` keeps both on one threshold. Adds engine unit tests for the hold behavior and the analyzer.
3ee90e8 to
4de9233
Compare
|
Thanks — addressed the actionable ones (pushed):
|
Problem
Fast-cut compositions showed a one-frame black flash right before each cut. Root cause: when a
<video>clip's source media is even a hair shorter than itsdata-durationslot — the common case, sinceffmpeg -t 1.45emits 43 frames = 1.433s at 30fps — the frame lookup only held the clip's last frame at its exact end. The sub-frame remainder rendered the page background instead.Fix
FrameLookupTable.getActiveFramePayloadsnow holds the last extracted frame for the rest of the slot once the source is exhausted, within a tolerance floored at the compiler's 0.05s clamp epsilon (MEDIA_DURATION_CLAMP_EPSILON_SECONDS). Flooring matters:2 frames / fpsalone is< 0.05sabove 40fps, so 60fps renders could still seam. Clips deliberately much shorter than their slot still blank for the tail (existing behavior, still tested).Warnings (so the mismatch isn't silent)
When the compiler clamps a video's
data-durationdown to its media length (slot longer than source by more than the clamp epsilon), it's now surfaced:[compile]log.warnat the clamp site (the only place the authored slot is still known; everything downstream sees the clamped value).validate— reads each<video>'s live.durationin headless Chrome and warns. Staticlintcan't do this (no media durations), sovalidateis the right home.A shared
analyzeClipMediaFitkeeps both on one threshold. The warning fires for top-level videos; sub-composition videos still get the hold-frame fix (the warning would need a logger threaded throughparseSubCompositions— skipped as an edge case).Tests
Engine unit tests for the hold behavior (sub-frame shortfall held; large shortfall still blanks) and for
analyzeClipMediaFit. Verified end-to-end: the boundary that rendered pure black (luma 14) now holds the last frame (luma 237), both warnings fire on a slot-longer-than-media clip, and no false positive when durations match.