Skip to content

fix(engine): support AMD AMF GPU encoding#1082

Merged
miguel-heygen merged 1 commit into
mainfrom
fix/amd-amf-gpu-encoder
May 26, 2026
Merged

fix(engine): support AMD AMF GPU encoding#1082
miguel-heygen merged 1 commit into
mainfrom
fix/amd-amf-gpu-encoder

Conversation

@miguel-heygen

Copy link
Copy Markdown
Collaborator

Problem

hyperframes render --gpu only detected and mapped NVENC, VideoToolbox, VAAPI, and QSV. On Windows machines with AMD GPUs, FFmpeg exposes h264_amf / hevc_amf, but HyperFrames had no AMF encoder type or argument path.

Closes #1079

What this fixes

  • Adds AMD AMF as a supported GPU encoder backend.
  • Maps AMF to FFmpeg's h264_amf and hevc_amf encoder names.
  • Uses AMF-compatible constant-QP arguments when no target bitrate is provided.
  • Applies the same H.264 B-frame disable compatibility guard to AMF as the other hardware encoders.
  • Updates rendering, CLI, producer, engine, and HDR docs to list AMF support.

Root cause

GPU detection used FFmpeg's compiled encoder list as the selection truth and stopped at the first known encoder. Some FFmpeg Windows builds can list encoders that are compiled in but unusable on the current hardware, so a machine with AMD AMF support could still be routed toward an unusable NVENC path. The encoder name mapper also fell back to CPU encoders for unknown backends, so AMF could never be selected intentionally.

This changes detection into a two-step flow: parse compiled hardware encoder candidates, then probe each candidate with a tiny FFmpeg encode and choose the first usable backend.

Verification

Local checks

  • bun run build:hyperframes-runtime
  • bun run --cwd packages/engine test -- src/utils/gpuEncoder.test.ts src/services/chunkEncoder.test.ts src/services/streamingEncoder.test.ts
  • bun run --cwd packages/engine typecheck
  • bunx oxlint packages/engine/src/utils/gpuEncoder.ts packages/engine/src/utils/gpuEncoder.test.ts packages/engine/src/services/chunkEncoder.ts packages/engine/src/services/chunkEncoder.test.ts packages/engine/src/services/streamingEncoder.ts packages/engine/src/services/streamingEncoder.test.ts
  • bunx oxfmt --check packages/engine/src/utils/gpuEncoder.ts packages/engine/src/utils/gpuEncoder.test.ts packages/engine/src/services/chunkEncoder.ts packages/engine/src/services/chunkEncoder.test.ts packages/engine/src/services/streamingEncoder.ts packages/engine/src/services/streamingEncoder.test.ts packages/cli/src/docs/rendering.md docs/packages/engine.mdx docs/guides/rendering.mdx docs/packages/cli.mdx docs/guides/hdr.mdx docs/packages/producer.mdx
  • git diff --check
  • ffprobe on the rendered MP4 confirmed H.264, 1920x1080, 10fps, 15s.
  • ffmpeg -v error -i <rendered mp4> -f null - decoded the render without errors.

Browser verification

  • Rendered registry/examples/kinetic-type with local GPU encoding:
    • bun run --cwd packages/cli dev -- render /Users/miguel07code/.codex/worktrees/0da0/hyperframes-oss/registry/examples/kinetic-type --gpu --quality draft --workers 1 --fps 10 --output /tmp/hyperframes-issue-1079-artifacts/issue-1079-amf-gpu-render.mp4
  • detectGpuEncoder() returned videotoolbox on this Mac, so the end-to-end render exercised the shared GPU encode path locally.
  • agent-browser opened the rendered MP4 and verified browser playback advanced with:
    • readyState=4
    • currentTime=1.460343
    • duration=15.020996
    • videoWidth=1920
    • videoHeight=1080
    • paused=false
  • Screenshot: /tmp/hyperframes-issue-1079-artifacts/issue-1079-browser-playback.png
  • Recording: /tmp/hyperframes-issue-1079-artifacts/issue-1079-browser-playback.webm

Notes

I could not exercise real AMF hardware on this Mac. AMF-specific behavior is covered by unit tests for detection fallthrough, encoder naming, and FFmpeg argument generation; the real render/browser proof used the locally available VideoToolbox backend.

The pre-commit hook's fallow step still flags inherited duplication across chunkEncoder.ts and streamingEncoder.ts when both files are touched, plus an existing unused type re-export. I did not include a broad encoder refactor in this issue fix.

@mintlify

mintlify Bot commented May 26, 2026

Copy link
Copy Markdown

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
hyperframes 🟢 Ready View Preview May 26, 2026, 4:44 PM

💡 Tip: Enable Workflows to automatically generate PRs for you.

@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown

Fallow audit report

Found 62 findings.

Dead code (1)
Severity Rule Location Description
major fallow/unused-type packages/engine/src/services/streamingEncoder.ts:32 Type re-export 'EncoderOptions' is never imported by other modules
Duplication (57, showing 50)
Severity Rule Location Description
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:86 Code clone group 1 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:120 Code clone group 1 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:232 Code clone group 2 (15 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:264 Code clone group 3 (14 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:275 Code clone group 4 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:294 Code clone group 5 (5 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:321 Code clone group 6 (10 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:424 Code clone group 7 (9 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:426 Code clone group 8 (6 lines, 3 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:440 Code clone group 7 (9 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:442 Code clone group 8 (6 lines, 3 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:466 Code clone group 9 (5 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:472 Code clone group 10 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:502 Code clone group 10 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:567 Code clone group 8 (6 lines, 3 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:587 Code clone group 9 (5 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:709 Code clone group 11 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:730 Code clone group 12 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:745 Code clone group 12 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:751 Code clone group 11 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:809 Code clone group 5 (5 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:821 Code clone group 6 (10 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.test.ts:857 Code clone group 13 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:110 Code clone group 14 (65 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:258 Code clone group 15 (38 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:303 Code clone group 16 (24 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:352 Code clone group 17 (197 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:411 Code clone group 18 (22 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:414 Code clone group 19 (19 lines, 3 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:431 Code clone group 20 (9 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:546 Code clone group 21 (9 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:590 Code clone group 21 (9 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:657 Code clone group 22 (11 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/chunkEncoder.ts:698 Code clone group 22 (11 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:73 Code clone group 13 (6 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:179 Code clone group 23 (11 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:194 Code clone group 23 (11 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:251 Code clone group 2 (15 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:272 Code clone group 3 (14 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:282 Code clone group 4 (7 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:427 Code clone group 24 (7 lines, 5 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:449 Code clone group 24 (7 lines, 5 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:473 Code clone group 24 (7 lines, 5 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:497 Code clone group 25 (8 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:522 Code clone group 24 (7 lines, 5 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:550 Code clone group 24 (7 lines, 5 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:572 Code clone group 25 (8 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:596 Code clone group 26 (14 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.test.ts:629 Code clone group 26 (14 lines, 2 instances)
minor fallow/code-duplication packages/engine/src/services/streamingEncoder.ts:200 Code clone group 14 (65 lines, 2 instances)

Showing 50 of 57 findings. Run fallow locally or inspect the CI output for the full report.

Health (4)
Severity Rule Location Description
critical fallow/high-crap-score packages/engine/src/services/chunkEncoder.ts:82 'buildEncoderArgs' has CRAP score 68.0 (threshold: 30.0, cyclomatic 57)
minor fallow/high-cognitive-complexity packages/engine/src/services/chunkEncoder.ts:484 'encodeFramesChunkedConcat' has cognitive complexity 16 (threshold: 15)
critical fallow/high-crap-score packages/engine/src/services/streamingEncoder.ts:140 'buildStreamingArgs' has CRAP score 58.4 (threshold: 30.0, cyclomatic 50)
minor fallow/high-cognitive-complexity packages/engine/src/utils/gpuEncoder.ts:91 'getGpuEncoderName' has cognitive complexity 16 (threshold: 15)

Generated by fallow.

@jrusso1020 jrusso1020 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read the full diff + traced buildEncoderArgs / buildStreamingArgs end-to-end. Two-step detection (parse compile-list → probe each candidate) is a strict upgrade over the prior first-match-wins — the exact "AMD Windows machine with NVENC compiled in but no NVIDIA hardware" scenario the PR exists to fix.

Pinned invariants worth keeping

  • getCompiledGpuEncoders returns candidates in GPU_ENCODER_CANDIDATES priority order (nvenc → videotoolbox → vaapi → qsv → amf). The test in gpuEncoder.test.ts asserts the exact returned array; if anyone reorders the candidate array, that test breaks. AMD goes last — correct, since on a mixed Windows host (NVIDIA + AMD discrete + Intel iGPU) the existing nvenc/qsv-first preference is preserved.
  • selectUsableGpuEncoder test pins fallthrough from compiled-but-unusable NVENC → AMF. That's the load-bearing assertion for this PR; keep it as-is, don't refactor into a generic parameterized case.
  • Intra-file parity between chunkEncoder.ts and streamingEncoder.ts is symmetric — AMF case in the rate-control switch, AMF added to the H.264 B-frame guard, parallel tests in both files. The pre-disclosed Fallow duplication red is exactly because both files duplicate this logic; the parity fix did the right thing by touching both at once.

Non-blocking observations

  1. AMF rate-control = CQP, not CRF-equivalent. The branch picks -rc cqp -qp_i N -qp_p N for the no-bitrate case. This is consistent with how the other GPU paths handle quality (-cq on nvenc, -qp on vaapi, -global_quality on qsv) — i.e. fixed-quantizer, not perceptual rate-control. If the Onee/Donato thread today is steering toward CRF-like behavior across the board, AMF won't be CRF-aligned after this PR even though it's now supported. AMF has -rc qvbr (quality-defined VBR, closer to CRF semantics) — would be the migration target for the broader CRF direction. Not for this PR; just noting AMF inherits the same fixed-QP shape as the other GPU paths.

  2. No preset/speed mapping for AMF. The switch only emits rate-control args — no -preset or -quality push. Compare:

    • nvenc / qsv: explicit -preset mapped via mapPresetForGpuEncoder
    • videotoolbox / vaapi / amf: no preset arg at all

    Side effect: --quality draft (which maps to preset="ultrafast") on an AMD box uses FFmpeg's h264_amf default of -quality quality (the slowest of the three AMF modes). Users who chose --quality draft for speed will be confused about why their AMF encode is slow. Same gap exists for videotoolbox/vaapi today, so not a regression — but the AMF "balanced" / "speed" / "quality" trio is well-defined and an obvious follow-up for the next AMF iteration.

  3. Probe is silent on failure. When canUseGpuEncoder(...) returns false, there's no log of why — stderr is "ignore". A future "I have an AMD GPU and it's falling through to CPU" debug session has no signal. Consider a process.env.HYPERFRAMES_DEBUG_GPU_PROBE=1 toggle that flips stderr to inherited and logs the chosen encoder + each probe's exit code. Cheap; saves a long debug loop the first time a real AMD user hits a probe-but-fails path.

  4. Probe exercises 1 frame at 16x16. Says "encoder can encode at all." Doesn't validate B-frame support, color-space tagging, HDR transfer, or real-frame parameters. Real failures from those will surface at first chunk-encode time — acceptable for a baseline detection check, just noting the fidelity ceiling.

  5. No real AMF hardware run — author pre-disclosed in PR body. First AMD Windows user is the canary; structural correctness covered by unit tests, real-render covered by the locally-available videotoolbox path. Fine for the kind of change this is, but worth a Slack call-out if/when an AMD-Windows internal test machine becomes available.

  6. Fallow audit red — pre-disclosed in PR body as inherited chunkEncoder.tsstreamingEncoder.ts duplication, not introduced. ✓ Acknowledged appropriately; not blocking.

Verdict

Structurally sound. The "compile-list + probe" detection is a real upgrade. AMF arg generation matches the FFmpeg h264_amf / hevc_amf contract. Test coverage pins the AMD-Windows fallthrough scenario.

CI is mid-flight (regression shards + Windows tests in progress) — assuming those land green, James/Miguel to merge.

— Rames Jusso (hyperframes)

@vanceingalls vanceingalls left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strengths

  • getCompiledGpuEncoders + canUseGpuEncoder probe (gpuEncoder.ts:29-149) is a meaningful correctness improvement beyond just adding AMF — it kills the "compiled means usable" false-positive class for all backends, including NVENC on machines where the driver is absent. That was the actual bug behind #1079.
  • B-frame guard applied symmetrically to AMF in both chunkEncoder.ts:160 and streamingEncoder.ts:243. No asymmetry between the two encoder paths.
  • AMF included in the mapPresetForGpuEncoder pass-through parameterized test — author thought about the preset surface, not just the codec name.

Important

1. void selectUsableGpuEncoder(...).then(resolve) — no rejection handler (gpuEncoder.ts:59)

If selectUsableGpuEncoder rejects (it can't today, but the contract isn't enforced), detectGpuEncoder's outer Promise hangs forever instead of resolving to null. getCachedGpuEncoder (and every render call behind it) would stall indefinitely. One line fix:

void selectUsableGpuEncoder(candidates, canUseGpuEncoder)
  .then(resolve)
  .catch(() => resolve(null));

2. SIGTERM-only on probe timeout — no SIGKILL escalation (gpuEncoder.ts:121)

ffmpeg.kill("SIGTERM") works for clean exits; a hung or unresponsive driver (exactly the failure mode this PR guards against) may ignore SIGTERM. finish(false) is called immediately so the JS side continues correctly, but the child process stays alive. On a render workload that spawns multiple probes, zombie ffmpeg processes accumulate. Recommend SIGKILL after ~1 s:

const kill = setTimeout(() => ffmpeg.kill("SIGKILL"), 1000);
// in finish():
clearTimeout(kill);

3. First-call latency is now serial × 5 s per candidate — up to ~25 s worst case (gpuEncoder.ts:42-60)

Before this PR: detection was one ffmpeg -encoders call, instant. After: one ffmpeg -encoders call plus up to 5 serial probe processes (5 s timeout each). On a Windows AMD machine with NVENC compiled in but unavailable (common), NVENC probe times out before AMF is tried — 5 s added to the first --gpu render. Worst case (5 candidates, only the last usable): ~25 s.

getCachedGpuEncoder memoizes, so this is one-time, not per-chunk. That contains the blast radius, but it's still a UX regression for first render on Windows. Consider parallelizing probes with Promise.all across candidates and picking the first winner by priority order, or at least capping the per-probe timeout lower (2 s is plenty — a usable driver responds fast).

4. Hardcoded VAAPI device /dev/dri/renderD128 in probe (gpuEncoder.ts:102)

Pre-PR behavior: selected VAAPI immediately based on compiled list, then failed at render if the device was wrong — also wrong. Post-PR behavior: better on the common case, but on multi-GPU Linux where the usable VAAPI device is renderD129, the probe reports VAAPI unusable and falls through to the next candidate. Silently breaks VAAPI for those setups.

This is pre-existing behavior + a net improvement, not a regression introduced by this PR. Fine to track as a follow-up issue rather than blocking.


Nit

  • GPU_ENCODER_CANDIDATES priority order (gpuEncoder.ts:19-25): on a Windows hybrid Intel iGPU + AMD discrete, QSV probes successfully first and AMF is never tried, even though AMD discrete is usually faster. The PR solves the reported bug. Reordering for perf is a follow-up consideration.

Note on Fallow

The fallow/code-duplication flood is pre-existing duplication across chunkEncoder.ts and streamingEncoder.ts that this PR mirrors for AMF, not new structural debt it introduces. Author calls this out in the PR body and explicitly scoped out the broad refactor. Acknowledged.

fallow/high-crap-score on buildEncoderArgs (CRAP 68, cyclomatic 57) and buildStreamingArgs (CRAP 58.4): these are pre-existing and grow with every encoder added. Not a blocker for this PR, but the complexity ceiling for these functions is real — worth a dedicated refactor before the next encoder lands.


Audited: packages/engine/src/utils/gpuEncoder.ts (end-to-end), packages/engine/src/services/chunkEncoder.ts (new AMF paths), packages/engine/src/services/streamingEncoder.ts (new AMF paths), all test files touched.

Verdict: APPROVE
Reasoning: The detection refactor is a correctness fix that benefits all GPU backends, not just AMF. No correctness blockers. The important findings (rejection-unhandled promise, SIGTERM-only kill, serial probe latency) are real concerns worth fixing but don't break the primary case — a Windows AMD box with only AMF usable — which is what the PR ships.

— Vai

@miguel-heygen miguel-heygen force-pushed the fix/amd-amf-gpu-encoder branch from 157ce9e to a53aad5 Compare May 26, 2026 16:55
@miguel-heygen miguel-heygen merged commit 0e052e4 into main May 26, 2026
42 of 43 checks passed
@miguel-heygen miguel-heygen deleted the fix/amd-amf-gpu-encoder branch May 26, 2026 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AMD coder (AMF) support in --gpu mode

3 participants