Skip to content

Conversation

@kpouget
Copy link
Contributor

@kpouget kpouget commented Jan 30, 2026

This PR improves the code of the ggml-virtgpu backend to make it thread safe, by using mutex for accessing the host<>guest shared memory buffers, and by pre-caching, during the initialization, the constant values queried from the backend.

The unused buffer_type_is_host method is also deprecated.

@github-actions github-actions bot added python python script changes ggml changes relating to the ggml tensor library for machine learning labels Jan 30, 2026
Copy link
Collaborator

@taronaeo taronaeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that some of the GGML_ABORT statements do not include __func__. Would it be better to include them for debugging and error tracing in the future? :)

@kpouget
Copy link
Contributor Author

kpouget commented Jan 30, 2026

thanks for the feedback, I followed them.

I see that some of the GGML_ABORT statements do not include __func__. Would it be better to include them for debugging and error tracing in the future? :)

I'm not sure about this one, that's not a common pattern in the code base,

$ gg GGML_ABORT | grep __func__ | wc -l
37 # including it
$ gg GGML_ABORT | grep -v __func__ | wc -l
446 # not including it

and when I hit an abort, I get automatically get stack trace:

ggml/src/ggml-virtgpu/virtgpu-common.cpp:19: calling abort
bin/libggml-base.so.0(+0x75ce) [0x720ce69285ce]
bin/libggml-base.so.0(ggml_print_backtrace+0x25f) [0x720ce692884c]
bin/libggml-base.so.0(ggml_abort+0x160) [0x720ce6928a1b]
libggml-virtgpu.so.0(+0x8be5) [0x720ce6a14be5]
libggml-virtgpu.so.0(+0x8c08) [0x720ce6a14c08]

so I don't think it's necessary, do you?

@taronaeo
Copy link
Collaborator

taronaeo commented Jan 30, 2026

so I don't think it's necessary, do you?

I usually do it for bug reporting purposes so it's easier to identify which line of code is failing, and the order of sequence it happened. But it's just a suggestion, feel free to ignore it if we aren't expecting X specific lines of GGML_ABORT to hit/fail often haha

Edit: Also another thought. Most users are end-consumers who do not compile from source and rather, use a pre-built release binary. IIRC, I may be wrong, release builds do not show full backtrace information on the failing lines of code leading to the abort, or may have just been optimized out; much harder to debug.

@kpouget
Copy link
Contributor Author

kpouget commented Jan 30, 2026

Edit: Also another thought. Most users are end-consumers who do not compile from source and rather, use a pre-built release binary. IIRC, I may be wrong, release builds do not show full backtrace information on the failing lines of code leading to the abort, or may have just been optimized out; much harder to debug.

good point, I missed that
but I was thinking about it as well, and I think what matters is that the message is unique, so that you can locate it easily when you get user log trace.

anyway, I'll think about it, I want to improve the error message when running in an unsupported environment (no virtgpu, this one should be good, but unpatched virglrenderer, this one can be improved I guess)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning python python script changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants