Member

@ggerganov ggerganov commented Sep 13, 2025

ref #15832 (comment)

Instead of allocating buffers from MTLHeap we expand the dst tensors with enough space to write the necessary scratch data.
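
As a rough illustration of the idea (not the actual ggml code), the sketch below reserves the scratch space directly behind the tensor data. The names `tensor_sketch`, `op_extra_size`, `op_alloc_size` and `op_scratch_ptr` are hypothetical, and the "one float per row" scratch requirement is an assumption chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a tensor: a data pointer plus its payload size.
struct tensor_sketch {
    uint8_t * data;   // start of the tensor's allocation
    size_t    nbytes; // size of the actual tensor data
};

// Hypothetical helper: extra scratch bytes the op needs.
// Assumption for this sketch: one float of scratch per row.
static size_t op_extra_size(size_t n_rows) {
    return n_rows * sizeof(float);
}

// Size the allocator should reserve: tensor data followed by the scratch space.
static size_t op_alloc_size(size_t nbytes, size_t n_rows) {
    return nbytes + op_extra_size(n_rows);
}

// The scratch region starts right after the tensor data.
static uint8_t * op_scratch_ptr(const struct tensor_sketch * t) {
    assert(t->data != NULL);
    return t->data + t->nbytes;
}
```

Because the allocation size and the scratch pointer both come from the same small helper, the size reported to the allocator and the size used by the kernel cannot drift apart.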

TODO:

  • Better way to keep the get_alloc_size sizes in-sync with the actual implementation (Edit: use simple functions to get the extra sizes)
  • Think about whether this could interfere with the memory ranges logic introduced in #15929 (metal : allow ops to run concurrently). We might need to start using the extended sizes in ggml_mem_ranges instead of just ggml_nbytes
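
To make the second point concrete, here is a minimal sketch of why an overlap check based only on the tensor payload could miss a conflict with the scratch region. `range_sketch`, `make_range` and `ranges_overlap` are hypothetical names for this example and are not the ggml_mem_ranges API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical half-open byte range [p0, p1) inside a buffer.
struct range_sketch {
    uint64_t p0;
    uint64_t p1;
};

// Build the range covered by a tensor at `offset`, optionally including
// `extra` scratch bytes that are written right after the tensor data.
static struct range_sketch make_range(uint64_t offset, size_t nbytes, size_t extra) {
    struct range_sketch r = { offset, offset + nbytes + extra };
    return r;
}

// Two half-open ranges overlap if each one starts before the other ends.
static bool ranges_overlap(struct range_sketch a, struct range_sketch b) {
    return a.p0 < b.p1 && b.p0 < a.p1;
}
```

For example, a dst tensor at offset 0 with 256 bytes of data and 64 bytes of scratch does not overlap an access starting at byte 256 when only the payload is tracked, but it does once the extended size is used.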

@github-actions github-actions bot added the labels ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) Sep 13, 2025
@ggerganov ggerganov marked this pull request as ready for review September 13, 2025 16:37
@ggerganov ggerganov force-pushed the gg/metal-remove-mem-pool branch from cfe86b2 to 25e82c9 Compare September 14, 2025 08:46
@ggerganov
Member Author

@slaren The trick with expanding the allocated size of the tensors seems to work without any problems. Let me know if you have any additional thoughts about utilizing this approach.

@ggerganov ggerganov force-pushed the gg/metal-remove-mem-pool branch from 3bc8fd4 to 158526e Compare September 14, 2025 16:25
Member

@slaren slaren left a comment

LGTM. I suppose there could be an issue where the extra data is not aligned in some cases, but since these operations always return F32, I guess it will always be aligned to at least 4 bytes. Not sure if there could be an advantage to using a larger alignment.
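
The alignment remark can be sketched as follows, assuming a 4-byte `float` and a scratch region placed directly after the F32 data. `pad_to` and `scratch_offset` are hypothetical helpers that only show what padding to a larger alignment would look like, not what the merged code does.

```c
#include <assert.h>
#include <stddef.h>

// Round n up to the next multiple of align (align must be a power of two).
static size_t pad_to(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

// Hypothetical: byte offset of the scratch region, relative to the start of
// the tensor data, if the scratch were aligned to `align` bytes.
static size_t scratch_offset(size_t n_elements, size_t align) {
    return pad_to(n_elements * sizeof(float), align);
}
```

With 5 F32 elements the payload is 20 bytes, so a 4-byte alignment leaves the scratch at offset 20, while a 16-byte alignment would move it to offset 32 and cost 12 bytes of padding.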

@ggerganov ggerganov merged commit 9dcd200 into master Sep 14, 2025
53 of 55 checks passed
@ggerganov ggerganov deleted the gg/metal-remove-mem-pool branch September 14, 2025 19:02