
feat(gpu): Apple Silicon MPS (Metal) GPU Support #136

Open
vizionik25 wants to merge 3 commits into calesthio:main from vizionik25:feat/apple-silicon-mps-support


@vizionik25

This PR adds comprehensive support for Apple Silicon M-series Metal GPU acceleration (using PyTorch's mps backend) across all LOCAL_GPU tools in OpenMontage.

Summary of Changes

  1. MPS Device Resolution: Added a unified get_torch_device() helper in tools/video/_shared.py to select the best available hardware: cuda -> mps -> cpu.
  2. Video Tools: Patched load_diffusers_pipeline() in _shared.py to route to mps on Apple Silicon. Guarded enable_model_cpu_offload() (which is CUDA-only) and restricted bfloat16 to CUDA since it behaves unstably on current MPS/CPU diffusers backends.
  3. Enhancement Tools: Updated upscale.py and face_restore.py to pass the resolved MPS device to RealESRGANer and GFPGANer and to fall back to float32 on MPS (half-precision is only safe/supported on CUDA for these architectures).
  4. Docs & Registry: Updated local install instructions for all local GPU tools (Wan, LTX, Hunyuan, CogVideo, Upscale, FaceRestore) to correctly guide macOS users on M-series chips.
  5. Environment Parser Fix: Resolved a silent bug in tools/base_tool.py where a Unicode em dash (U+2014) inside .env comments was parsed as a variable value, causing HuggingFace token validation or other network requests to throw ASCII encoding exceptions.
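
For reviewers who want to picture the parser fix in item 5, here is a minimal sketch of stripping inline comments from a .env line so that comment text never ends up in a value. It is not the code in tools/base_tool.py; the function name parse_env_line and the quoting rules are assumptions made for illustration only.

```python
def parse_env_line(line):
    """Return (key, value) for a KEY=VALUE line, or None for blanks and comments.

    Hypothetical helper: the name and exact rules are illustrative assumptions.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None

    key, _, raw = stripped.partition("=")
    raw = raw.strip()

    if raw[:1] in ("'", '"'):
        # Quoted value: keep everything up to the matching closing quote.
        quote = raw[0]
        end = raw.find(quote, 1)
        value = raw[1:end] if end != -1 else raw[1:]
    else:
        # Unquoted value: a '#' at the start, or preceded by whitespace,
        # begins an inline comment and is dropped along with what follows.
        value = raw
        for i, ch in enumerate(raw):
            if ch == "#" and (i == 0 or raw[i - 1].isspace()):
                value = raw[:i]
                break
        value = value.rstrip()

    return key.strip(), value
```

With this rule, a line such as `HF_TOKEN=  # optional \u2014 your token` yields an empty value, so the non-ASCII character in the comment can no longer leak into a value that is later encoded as ASCII.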

Physical Verification

  • Tested tensor math and NN backpropagation on M-series Metal GPU.
  • Verified that the wan_video local tool correctly runs end-to-end on mps and successfully outputs the video in 17.87 seconds.
  • Unit tests (tests/tools/test_mps_device.py) pass.

@vizionik25 vizionik25 requested a review from calesthio as a code owner June 22, 2026 22:23
@calesthio

Owner

Thanks for putting this together. I do think Apple Silicon/MPS support is a very useful direction for OpenMontage, especially for contributors and users running local workflows on Macs. The core idea here is worth pursuing.

Before this is merge-ready, I think the PR needs a cleanup pass so we can review and land it safely:

  1. Please remove unrelated churn from this PR:

    • remotion-composer/package.json
    • remotion-composer/package-lock.json
    • diagram.png
    • docs/superpowers/plans/2026-06-22-apple-silicon-mps-support.md unless a small part of it is converted into concise user-facing docs
  2. Keep the PR scoped to the actual MPS support path:

    • tools/video/_shared.py
    • tools/enhancement/upscale.py
    • tools/enhancement/face_restore.py
    • focused tests for device selection / MPS behavior
    • any minimal setup docs directly needed for Apple Silicon users
  3. Please revisit the tool_registry.py change. Treating local_gpu tools as simple setup-offer candidates may be misleading, because local GPU setup is not always a quick env-var fix. It can involve PyTorch install details, model downloads, hardware support, memory limits, and provider-specific constraints.

  4. Please tighten the device/dtype handling:

    • CPU fallback should likely use torch.float32, not float16.
    • MPS detection should be guarded for torch builds where torch.backends.mps may not exist.
    • Consider checking both MPS build support and availability where the torch API supports it.
  5. Please verify RealESRGANer / GFPGANer compatibility with the added device= argument. The current mocked tests do not prove that the installed dependency versions accept that constructor argument. A version/signature guard or fallback would make this safer.

  6. Please remove unrelated test/schema changes unless they are strictly required to keep this branch green against current main. In particular, the TTS provider assertion change looks unrelated to MPS support and should probably be a separate PR.
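
To make item 5 concrete, a signature guard could look roughly like the sketch below. The helper names accepts_kwarg and build_with_optional_device are hypothetical, and whether a given RealESRGANer or GFPGANer release accepts device= is exactly what the guard is meant to discover at runtime, not something assumed here.

```python
import inspect


def accepts_kwarg(callable_obj, name):
    """Return True if callable_obj can be called with keyword argument `name`."""
    try:
        params = inspect.signature(callable_obj).parameters
    except (TypeError, ValueError):
        # Some builtins and C extensions expose no introspectable signature.
        return False

    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    for param in params.values():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # a **kwargs catch-all accepts any keyword
        if param.name == name and param.kind in keyword_kinds:
            return True
    return False


def build_with_optional_device(cls, device, **kwargs):
    """Construct cls, passing device= only when the constructor accepts it."""
    if accepts_kwarg(cls, "device"):
        kwargs["device"] = device
    return cls(**kwargs)
```

A call such as `build_with_optional_device(RealESRGANer, device, scale=4, ...)` would then fall back to the library's own device selection on versions whose constructor lacks the parameter, instead of raising a TypeError.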

Overall: useful PR and directionally aligned, but it needs to be narrowed and hardened before we can merge it confidently.

- Add get_torch_device() helper in _shared.py: cuda > mps > cpu
- Guard MPS detection for torch builds lacking torch.backends.mps
- Check both is_built() and is_available() for MPS
- Route load_diffusers_pipeline() to resolved device instead of hardcoded cuda
- Use float32 on CPU (float16 is emulated/unreliable), float16 on MPS, bfloat16 on CUDA
- Guard enable_model_cpu_offload() to CUDA-only; fall back to .to(device) on MPS
- Enable attention slicing for MPS memory safety
- Add inspect-based signature guard for device= arg on RealESRGANer/GFPGANer
- Update install_instructions on all LOCAL_GPU tools to mention MPS/Apple Silicon
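
The device and dtype rules listed in this commit message can be sketched as follows. This is an illustration, not the code in tools/video/_shared.py: the names resolve_device, resolve_dtype and place_pipeline are assumptions, and the torch module is passed in as an argument so the logic can be exercised without a GPU.

```python
# Dtype policy from the commit message: bfloat16 on CUDA, float16 on MPS,
# float32 on CPU.
DTYPE_BY_DEVICE = {"cuda": "bfloat16", "mps": "float16", "cpu": "float32"}


def resolve_device(torch_module):
    """Pick 'cuda', then 'mps', then 'cpu', touching only attributes that exist."""
    cuda = getattr(torch_module, "cuda", None)
    if cuda is not None and cuda.is_available():
        return "cuda"

    # Older or stripped-down torch builds may lack torch.backends.mps entirely.
    backends = getattr(torch_module, "backends", None)
    mps = getattr(backends, "mps", None)
    if mps is not None:
        is_built = getattr(mps, "is_built", None)
        is_available = getattr(mps, "is_available", None)
        built = is_built() if callable(is_built) else True
        if built and callable(is_available) and is_available():
            return "mps"
    return "cpu"


def resolve_dtype(torch_module, device):
    """Map a device name to the torch dtype object for the policy above."""
    return getattr(torch_module, DTYPE_BY_DEVICE.get(device, "float32"))


def place_pipeline(pipe, device):
    """Apply device placement to a diffusers-style pipeline object."""
    if device == "cuda":
        pipe.enable_model_cpu_offload()  # offload is kept CUDA-only
    else:
        pipe.to(device)
    if device == "mps" and hasattr(pipe, "enable_attention_slicing"):
        pipe.enable_attention_slicing()  # lowers peak memory on MPS
    return pipe
```

In real use this would be called as `device = resolve_device(torch)` after `import torch`, with `torch_dtype=resolve_dtype(torch, device)` passed when loading the pipeline.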
@vizionik25 vizionik25 force-pushed the feat/apple-silicon-mps-support branch from 13115e0 to 7cd8fbf Compare June 24, 2026 21:33
@vizionik25

Author

Hi @calesthio,

I have cleaned up the PR and squashed the history as requested. Here is a summary of the updates:

  1. Reverted Unrelated Churn:

    • Reverted remotion-composer/package.json, remotion-composer/package-lock.json, and diagram.png back to main.
    • Reverted tools/tool_registry.py (restoring the previous local GPU tool listing behavior).
    • Reverted unrelated test files (tests/contracts/test_phase3_contracts.py and tests/qa/test_08_end_to_end.py).
    • Removed the internal design/plan document (docs/superpowers/plans/2026-06-22-apple-silicon-mps-support.md).
  2. Scoped the PR to the MPS Path:

    • Added get_torch_device() helper (verifying both is_built() and is_available() for torch.backends.mps and handling missing backend environments) in tools/video/_shared.py.
    • Integrated inspect-based signature guards for RealESRGANer/GFPGANer device parameters in upscale.py and face_restore.py, ensuring half-precision is only enabled on cuda to prevent NaN artifacts on MPS.
    • Added focused unit tests in tests/tools/test_mps_device.py.
    • Created a concise user-facing guide detailing local setup for Apple Silicon users in docs/apple-silicon-mps.md.
  3. Squashed Commit History:

    • Squashed into exactly 3 clean, focused commits covering implementation, unit tests, and documentation.

The PR is now ready for your review!

