feat(gpu): Apple Silicon MPS (Metal) GPU Support #136
Thanks for putting this together. I do think Apple Silicon/MPS support is a very useful direction for OpenMontage, especially for contributors and users running local workflows on Macs. The core idea here is worth pursuing. Before this is merge-ready, I think the PR needs a cleanup pass so we can review and land it safely:
Overall: useful PR and directionally aligned, but it needs to be narrowed and hardened before we can merge it confidently.
- Add get_torch_device() helper in _shared.py: cuda > mps > cpu
- Guard MPS detection for torch builds lacking torch.backends.mps
- Check both is_built() and is_available() for MPS
- Route load_diffusers_pipeline() to resolved device instead of hardcoded cuda
- Use float32 on CPU (float16 is emulated/unreliable), float16 on MPS, bfloat16 on CUDA
- Guard enable_model_cpu_offload() to CUDA-only; fall back to .to(device) on MPS
- Enable attention slicing for MPS memory safety
- Add inspect-based signature guard for device= arg on RealESRGANer/GFPGANer
- Update install_instructions on all LOCAL_GPU tools to mention MPS/Apple Silicon
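For reviewers, the device, dtype, and signature-guard items in the commit message above can be sketched roughly as follows. This is not the PR's code: only the name get_torch_device() and the cuda > mps > cpu order come from the commit message. The helpers dtype_name_for() and accepts_kwarg() are hypothetical names, and returning "cpu" when torch is not installed is an assumption made so the sketch runs anywhere.

```python
import inspect


def get_torch_device() -> str:
    """Return the preferred device name in the order cuda > mps > cpu."""
    try:
        import torch
    except ImportError:
        # Assumption: without torch installed, report CPU instead of raising.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # Some torch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_built() and mps.is_available():
        return "mps"

    return "cpu"


def dtype_name_for(device: str) -> str:
    """Hypothetical helper: dtype name per device, as listed in the commit message."""
    return {"cuda": "bfloat16", "mps": "float16"}.get(device, "float32")


def accepts_kwarg(cls: type, name: str) -> bool:
    """Hypothetical helper: True if cls.__init__ explicitly declares `name`."""
    return name in inspect.signature(cls.__init__).parameters
```

A caller would resolve the dtype with getattr(torch, dtype_name_for(device)) and pass device= to a third-party class only when accepts_kwarg(cls, "device") is true, so releases whose constructors lack that parameter keep working.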
13115e0 to 7cd8fbf
Hi @calesthio, I have cleaned up the PR and squashed the history as requested. Here is a summary of the updates:
The PR is now ready for your review!
This PR adds comprehensive support for Apple Silicon M-series Metal GPU acceleration (using PyTorch's mps backend) across all LOCAL_GPU tools in OpenMontage.
Summary of Changes
- Added a get_torch_device() helper in tools/video/_shared.py to select the best available hardware: cuda -> mps -> cpu.
- Updated load_diffusers_pipeline() in _shared.py to route to mps on Apple Silicon. Guarded enable_model_cpu_offload() (which is CUDA-only) and restricted bfloat16 to CUDA, since it behaves unstably on current MPS/CPU diffusers backends.
- Updated upscale.py and face_restore.py to pass the resolved MPS device to RealESRGANer and GFPGANer and fall back to float32 on MPS (since half precision is only safe/supported on CUDA for these architectures).
- Fixed a bug in tools/base_tool.py where a Unicode em dash (U+2014) inside .env comments was parsed as a variable value, causing HuggingFace token validation or other network requests to throw ASCII encoding exceptions.
Physical Verification
- The wan_video local tool runs end-to-end on mps and outputs the video in 17.87 seconds.
- The tests in tests/tools/test_mps_device.py pass.
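To make the .env fix described above easier to review, here is a minimal sketch of inline-comment stripping for a single KEY=VALUE line. It is not the code in tools/base_tool.py, which this thread does not show. The function name parse_env_line, the quoting rules, and the HF_TOKEN key in the usage note are all assumptions for illustration.

```python
def parse_env_line(line: str):
    """Return (key, value) for a KEY=VALUE line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None

    key, _, value = stripped.partition("=")
    value = value.strip()

    if value[:1] in ("'", '"'):
        # Quoted value: keep everything up to the matching closing quote.
        end = value.find(value[0], 1)
        value = value[1:end] if end != -1 else value[1:]
    else:
        # Unquoted value: a '#' at the start or after whitespace begins a comment.
        for i, ch in enumerate(value):
            if ch == "#" and (i == 0 or value[i - 1].isspace()):
                value = value[:i]
                break
        value = value.strip()

    return key.strip(), value
```

With this approach, a line such as HF_TOKEN=abc123 followed by a comment containing a non-ASCII dash yields only abc123, so the value stays ASCII when it is later placed in a request header.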