[Group Partitioner] leverage group partitioner for config-based partitioner #12845
Stack from ghstack (oldest at bottom):
We use the new group-based partitioner in the ConfigerationBasedPartitioner. This solves issues in the XnnpackPartitioner where required dependencies end up in different partitions. For example, consider the following case:
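To make the case concrete, here is an illustrative sketch (plain Python, not the executorch API; the node names are hypothetical) of the graph shape being described: one dynamic-quantization chain on a shared activation feeding two linear layers.

```python
# Toy dependency graph: each node maps to the nodes it consumes.
# The choose_qparams -> quantize -> dequantize chain dynamically
# quantizes a single activation that is shared by two linears.
graph = {
    "choose_qparams": ["input"],
    "quantize":       ["input", "choose_qparams"],
    "dequantize":     ["quantize", "choose_qparams"],
    # both linears consume the same dequantized activation
    "linear_1":       ["dequantize", "weight_1"],
    "linear_2":       ["dequantize", "weight_2"],
    "output":         ["linear_1", "linear_2"],
}

# Both linears depend on the one shared dq chain.
consumers = [n for n, deps in graph.items() if "dequantize" in deps]
print(consumers)  # ['linear_1', 'linear_2']
```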
In this case, we have two linear layers sharing the same activation, and thus the same dynamically quantized linear chain. The capability-based partitioner does greedy partitioning from the bottom up, which means we could end up with something like this:
This is bad because, when we process the second partition, we lose the semantics of the dynamically quantized tensor! The dynamic quant chain needs to be grouped with the linears, which is why the XNNPACK Partitioner needs the group-based partitioner. It lets us enforce that dependencies stay in the same partition, giving us a more correct result like this:
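The core idea can be sketched as a small helper (hypothetical, not executorch's actual implementation): given partitions proposed by a greedy pass and a set of dependency groups, merge any partitions that split a group, so every group lands whole in one partition.

```python
def enforce_groups(partitions, groups):
    """partitions: list of sets of node names proposed greedily.
    groups: list of sets of nodes that must stay together.
    Returns new partitions in which each group is fully contained
    in exactly one partition."""
    parts = [set(p) for p in partitions]
    for group in groups:
        # every partition that touches this group
        touching = [p for p in parts if p & group]
        if len(touching) > 1:
            # merge all touching partitions so the group stays whole
            merged = set().union(*touching) | group
            parts = [p for p in parts if not (p & group)]
            parts.append(merged)
        elif len(touching) == 1:
            touching[0] |= group
    return parts

# Greedy bottom-up partitioning split the shared dq chain in two:
greedy = [
    {"linear_1", "dequantize"},
    {"linear_2", "quantize", "choose_qparams"},
]
dq_group = [{"choose_qparams", "quantize", "dequantize"}]
result = enforce_groups(greedy, dq_group)
# both linears and the full dq chain end up in a single partition
print(result)
```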
This resolves the issues we've seen with the MobileBert model, and allows us to efficiently partition and lower it.
Dynamically Quantized Mobilebert
Differential Revision: D79020721