Metal backend: Add operator implementations #15023

manuelcandales · 2025-10-10T21:01:51Z

Adds working bfloat16/float32 implementations of the following AOTI shim ops:

aoti_torch_mps_mm_out
aoti_torch_mps_convolution
aoti_torch_mps__scaled_dot_product_attention_math_for_mps

Adds a stub implementation of aoti_torch_mps_addmm_out
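
For readers unfamiliar with the two matrix ops named above, here is a minimal CPU reference sketch of their semantics. It is not the code in this PR (the real shims run on the GPU through MPSGraph); `mm_ref` and `addmm_ref` are hypothetical names, tensors are assumed to be dense row-major float buffers, and `addmm_ref` assumes `input` already has the result shape (no broadcasting).

```cpp
#include <cstddef>
#include <vector>

// Reference matmul on row-major buffers: out[m x n] = a[m x k] @ b[k x n].
std::vector<float> mm_ref(const std::vector<float>& a,
                          const std::vector<float>& b,
                          std::size_t m, std::size_t k, std::size_t n) {
  std::vector<float> out(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t p = 0; p < k; ++p) {
      for (std::size_t j = 0; j < n; ++j) {
        out[i * n + j] += a[i * k + p] * b[p * n + j];
      }
    }
  }
  return out;
}

// Reference addmm: out = beta * input + alpha * (mat1 @ mat2).
// Assumes input is already [m x n]; broadcasting is not handled here.
std::vector<float> addmm_ref(const std::vector<float>& input,
                             const std::vector<float>& mat1,
                             const std::vector<float>& mat2,
                             std::size_t m, std::size_t k, std::size_t n,
                             float beta = 1.0f, float alpha = 1.0f) {
  std::vector<float> out = mm_ref(mat1, mat2, m, k, n);
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = beta * input[i] + alpha * out[i];
  }
  return out;
}
```

A reference like this could serve as an oracle when comparing GPU results within a tolerance, especially for bfloat16 inputs.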


manuelcandales · 2025-10-10T21:01:52Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2025-10-10T21:01:55Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15023

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 4367977 with merge base 6e0c9f6:

NEW FAILURE - The following job has failed:

pull / unittest-arm-backend-with-no-fvp (test_pytest_ops) / linux-job (gh)
RuntimeError: Command docker exec -t 11deac715a6050bb49e61b228d677b755973a19b0083817e367c501d49b48dc4 /exec failed with exit code 1

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / test-models-linux (linear, portable, linux.2xlarge) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

This comment was automatically generated by Dr. CI and updates every 15 minutes.


backends/apple/metal/runtime/shims/et_metal_ops.mm

mergennachin · 2025-10-12T17:36:31Z

backends/apple/metal/runtime/shims/et_metal_ops.h

+ * ExecutorTorch implementation of aoti_torch_mps_mm_out.
+ * Performs simple matrix multiplication: out = self @ mat2
+ */
+AOTITorchError aoti_torch_mps_mm_out(


Do custom ops use a caching mechanism like the ETMetalShaderLibrary?

No, not yet. These fallback ops are implemented using MPSGraph, so here we would be caching the graph. This is something I want to look into later when optimizing performance, but it deserves dedicated time. In particular, I never understood why MPSGraph operations have a non-trivial CPU overhead in PyTorch, even though PyTorch has a caching mechanism for MPSGraphs.
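
To make the idea of "caching the graph" concrete, here is a rough sketch of what a keyed graph cache could look like. Everything in it is an assumption for illustration: `CachedGraph`, `GraphCache`, and `make_graph_key` are hypothetical names, the key layout (op name, dtype, input shapes) is one plausible choice, and a real entry would hold a built MPSGraph rather than a string.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a built graph; a real entry would wrap an MPSGraph.
struct CachedGraph {
  std::string key;
};

// Builds a cache key from the op name, dtype, and input shapes (assumed layout).
std::string make_graph_key(const std::string& op, const std::string& dtype,
                           const std::vector<std::vector<std::int64_t>>& shapes) {
  std::string key = op + ":" + dtype;
  for (const auto& shape : shapes) {
    key += ":[";
    for (std::size_t i = 0; i < shape.size(); ++i) {
      if (i > 0) key += ",";
      key += std::to_string(shape[i]);
    }
    key += "]";
  }
  return key;
}

class GraphCache {
 public:
  using Builder = std::function<std::shared_ptr<CachedGraph>()>;

  // Returns the cached graph for key, invoking build only on a miss.
  std::shared_ptr<CachedGraph> lookup_or_build(const std::string& key,
                                               const Builder& build) {
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      ++hits_;
      return it->second;
    }
    ++misses_;
    std::shared_ptr<CachedGraph> graph = build();
    cache_.emplace(key, graph);
    return graph;
  }

  std::size_t hits() const { return hits_; }
  std::size_t misses() const { return misses_; }

 private:
  std::unordered_map<std::string, std::shared_ptr<CachedGraph>> cache_;
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
};
```

Note that building the key itself costs CPU time on every call, so a cache like this would need profiling before assuming it removes the overhead mentioned above.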

mergennachin · 2025-10-12T17:41:41Z

backends/apple/metal/runtime/shims/et_metal_ops.mm

+        // For attention weights, zero-fill the GPU buffer (shared memory allows CPU memset)
+        std::memset(attn_contents_ptr, 0, attn_size_bytes);


Do you need zero-filling here?

Well, I thought it was nicer to return 0 rather than some random stuff.
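
As a small illustration of why a byte-wise memset is enough here: all-zero bytes decode to 0.0 in both float32 and bfloat16, the two dtypes this PR targets. The sketch below uses plain `malloc` as a stand-in for a CPU-visible shared GPU buffer, and `alloc_zeroed_placeholder` and `bf16_bits_to_float` are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Allocates a buffer and zero-fills it so readers never see uninitialized
// memory. malloc stands in for a shared GPU buffer (assumption for this
// sketch). The caller releases the buffer with std::free.
void* alloc_zeroed_placeholder(std::size_t size_bytes) {
  void* ptr = std::malloc(size_bytes);
  if (ptr != nullptr) {
    std::memset(ptr, 0, size_bytes);
  }
  return ptr;
}

// Decodes a bfloat16 bit pattern by widening it into the upper 16 bits of a
// float32, which is how bfloat16 relates to float32.
float bf16_bits_to_float(std::uint16_t bits) {
  std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float value;
  std::memcpy(&value, &widened, sizeof(value));
  return value;
}
```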

mergennachin · 2025-10-12T17:46:20Z

backends/apple/metal/runtime/shims/et_metal_ops.mm

+
+        // Set output tensor handles
+        *ret0 = out_tensor_handle;
+        *ret1 = attn_tensor_handle;


Is ret1 actually populated or just zeroed?

This is just zeroed.
We are using MPSGraph's scaledDotProductAttention, which only returns the output tensor.
We need to return an attention tensor to match the _scaled_dot_product_attention_math_for_mps signature, but we don't really need it; it gets thrown away here.
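
For context on what the second return value would hold if it were populated, here is a minimal single-head CPU reference of the math variant, which yields both the output and the softmax attention weights. This is only a sketch under simplifying assumptions (no mask, no dropout, default scale of 1/sqrt(head_dim), row-major float buffers, kv_len > 0); `sdpa_math_ref` and `SdpaResult` are hypothetical names, not code from this PR.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct SdpaResult {
  std::vector<float> out;   // [q_len x head_dim]
  std::vector<float> attn;  // [q_len x kv_len], softmax attention weights
};

// Single-head reference: attn = softmax(q @ k^T * scale), out = attn @ v.
SdpaResult sdpa_math_ref(const std::vector<float>& q,
                         const std::vector<float>& k,
                         const std::vector<float>& v,
                         std::size_t q_len, std::size_t kv_len,
                         std::size_t head_dim) {
  const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
  SdpaResult r;
  r.out.assign(q_len * head_dim, 0.0f);
  r.attn.assign(q_len * kv_len, 0.0f);
  for (std::size_t i = 0; i < q_len; ++i) {
    float* row = r.attn.data() + i * kv_len;
    float max_score = -std::numeric_limits<float>::infinity();
    // Scaled dot-product scores for query i against every key.
    for (std::size_t j = 0; j < kv_len; ++j) {
      float dot = 0.0f;
      for (std::size_t d = 0; d < head_dim; ++d) {
        dot += q[i * head_dim + d] * k[j * head_dim + d];
      }
      row[j] = dot * scale;
      max_score = std::max(max_score, row[j]);
    }
    // Numerically stable softmax over the row.
    float sum = 0.0f;
    for (std::size_t j = 0; j < kv_len; ++j) {
      row[j] = std::exp(row[j] - max_score);
      sum += row[j];
    }
    // Normalize the weights and accumulate the weighted sum of values.
    for (std::size_t j = 0; j < kv_len; ++j) {
      row[j] /= sum;
      for (std::size_t d = 0; d < head_dim; ++d) {
        r.out[i * head_dim + d] += row[j] * v[j * head_dim + d];
      }
    }
  }
  return r;
}
```

A fused attention op can compute `out` without ever materializing `attn`, which is consistent with the reply above: the weights exist only to satisfy the signature.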


Adds bfloat16/float32 working implementations of the following AOTI shim ops:
- aoti_torch_mps_mm_out
- aoti_torch_mps_convolution
- aoti_torch_mps__scaled_dot_product_attention_math_for_mps

Adds a stub implementation of aoti_torch_mps_addmm_out

ghstack-source-id: 61b8cc4
ghstack-comment-id: 3392300522
Pull-Request: pytorch#15023

Update 3bea537

manuelcandales requested review from cccclai and shoumikhin as code owners October 10, 2025 21:01

meta-cla bot added the CLA Signed label Oct 10, 2025

manuelcandales requested review from larryliu0820 and mergennachin and removed request for cccclai and shoumikhin October 10, 2025 21:03

Update de83a9f

mergennachin reviewed Oct 12, 2025

manuelcandales added 2 commits October 13, 2025 12:46

Update 2f092af
Update e9b3372

manuelcandales added the release notes: none label Oct 13, 2025

manuelcandales added 2 commits October 13, 2025 18:47

Update aec8796
Update 3229b92

manuelcandales added 5 commits October 14, 2025 20:57

Update 7f178d3
Update 780d883
Update 61ead64
Update 750badf
Update 6a6ba04

manuelcandales added 2 commits October 15, 2025 15:40

Update 7c1b9b2
Update 4367977
