metal : reduce command encoding overhead #9698
examples/perf-metal/perf-metal.cpp
Outdated
It might make more sense to move this example to ggml instead, since it does nothing specific to llama.cpp. The same applies to the benchmark-matmult example, although that one could probably be removed entirely, since test-backend-ops can now measure mat mult FLOPs.
699eaab to 43b9d69
// TODO: how to avoid this allocation? I tried initializing it in ggml_backend_metal_set_n_cb but it crashes.
ctx->encode_async = ^(size_t iter) {
    const int cb_idx = iter;
This callback should not be created on every compute. Instead, it should be created once in ggml_backend_metal_set_n_cb(). But for some reason, when I do it that way, we crash on the first compute. I'm missing some understanding of how Obj-C lifetimes work; hopefully someone will figure this out in the future and fix it. For now, we keep creating the callback on each compute.
* metal : reduce command encoding overhead
ggml-ci
* metal : add comments
fix #9507
Submit the first 128 nodes from the main thread and, while they are being processed, enqueue and submit the rest of the command buffers.
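To make the split concrete, here is a minimal C++ sketch of how the graph nodes could be partitioned under this scheme. It is not the code from this PR: the names `node_range`, `encode_plan` and `make_encode_plan` are hypothetical, and dividing the remaining nodes evenly across `n_cb` command buffers is an assumption. Only the 128-node main-thread chunk comes from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Half-open range [begin, end) of graph node indices (hypothetical helper type).
using node_range = std::pair<int, int>;

struct encode_plan {
    node_range              main_thread; // encoded and submitted first, from the main thread
    std::vector<node_range> deferred;    // one range per additional command buffer
};

// Hypothetical helper: n_main = 128 follows the PR description; the even split
// of the remaining nodes across n_cb command buffers is an assumption.
static encode_plan make_encode_plan(int n_nodes, int n_cb, int n_main = 128) {
    assert(n_nodes >= 0 && n_main >= 0);
    n_cb = std::max(n_cb, 1); // always keep at least one buffer for the remainder

    encode_plan plan;
    const int n_nodes_0 = std::min(n_main, n_nodes); // nodes handled by the main thread
    const int n_nodes_1 = n_nodes - n_nodes_0;       // nodes left for the other buffers
    plan.main_thread = {0, n_nodes_0};
    if (n_nodes_1 == 0) {
        return plan; // small graph: everything fits in the first submission
    }

    // Ceiling division so that every remaining node lands in some buffer.
    const int n_per_cb = (n_nodes_1 + n_cb - 1) / n_cb;
    for (int i = 0; i < n_cb; ++i) {
        const int begin = n_nodes_0 + i * n_per_cb;
        const int end   = std::min(begin + n_per_cb, n_nodes);
        if (begin >= end) {
            break; // fewer remaining nodes than command buffers
        }
        plan.deferred.push_back({begin, end});
    }
    return plan;
}
```

For a graph of 1000 nodes and `n_cb = 2`, this sketch assigns nodes [0, 128) to the main thread and [128, 564) and [564, 1000) to the two remaining command buffers. A graph with 128 nodes or fewer produces no deferred ranges at all.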
API Changes
ggml_backend_metal_set_n_cb
Benches