Question: How to generate an MPS gputrace #6506
Comments
Unfortunately we don't have any docs. At some point I spent a considerable amount of time trying to learn Metal Debugger / Xcode Instruments in order to generate some useful information about the Metal performance, but I just got completely lost. If someone who is more familiar with Metal is interested in contributing, instructions on how to do profiling with these tools would be a very useful addition. |
Thanks for the information, I'm familiar from the Rust side so let me see if it's easy enough to port to this repository 👍 |
Is there currently any way to see the time cost of each Metal shader during inference? I am lost on how to profile each shader. Can anyone share a method? Thanks~ |
Hey @bitxsw93, I realize I forgot to post back here with the change required to dump timings: Here is the change you can make to output a gputrace file to
|
@ggerganov pretty small change required to get these outputs during development, not sure if you think it's worth integrating this in with a more configurable setting somewhere to enable the dump. |
I tried to apply the patch to see if it works, but I get the following error:
Any ideas what could be wrong? |
Ah right, there is a super secret, super fun environment variable you also need to set: METAL_CAPTURE_ENABLED=1. See https://developer.apple.com/documentation/xcode/capturing-a-metal-workload-programmatically |
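A minimal sketch of setting that variable from a small launcher script, assuming the patched binary is started as a subprocess. The helper names `metal_capture_env` and `run_with_metal_capture` are hypothetical and not part of llama.cpp; only `METAL_CAPTURE_ENABLED=1` comes from the comment above.

```python
import os
import subprocess
import sys
from typing import Mapping, Optional, Sequence


def metal_capture_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with Metal capture enabled.

    The input mapping is never modified; os.environ is used when no
    base environment is given.
    """
    env = dict(os.environ if base is None else base)
    env["METAL_CAPTURE_ENABLED"] = "1"
    return env


def run_with_metal_capture(cmd: Sequence[str]) -> int:
    """Run cmd with METAL_CAPTURE_ENABLED=1 and return its exit code."""
    return subprocess.run(list(cmd), env=metal_capture_env()).returncode


if __name__ == "__main__":
    # Everything after the script name is treated as the command to launch,
    # for example the patched inference binary and its usual arguments.
    if len(sys.argv) < 2:
        sys.exit("usage: launcher.py <command> [args...]")
    sys.exit(run_with_metal_capture(sys.argv[1:]))
```

The variable only permits programmatic capture. The code change that actually starts the capture and writes the gputrace file is still needed in the binary itself.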
Oh wow, this is super cool! This is very useful and it looks like something we can use to improve the Metal backend performance (some compute gaps between the encoders are immediately visible).
Sure, we should add some way to do this. Open to suggestions. Maybe an environment variable specifying the |
Maybe rename the existing |
Yup, that's an option. But maybe to avoid changing the |
We're doing some work over at https://github.com/huggingface/candle to improve our Metal backend. I've been collecting various gputraces for the different frameworks and was wondering if there is a documented or known way to generate one for llama.cpp during model inference.
Specifically, I'm talking about this type of debugger output: https://developer.apple.com/documentation/xcode/metal-debugger