Tests, examples fail to build with GGML_BACKEND_DL=ON #1120
Comments
Many tests are written assuming that the backends are linked statically and are not compatible with The way to deal with this and support dynamically loadable backends is to use |
Ah I see the problem, I thought we were on the llama.cpp repository. The problem is that the tests We should instead address that by moving all ggml tests and build scripts to the ggml directory, rather than having this weird situation where llama.cpp has a completely different way of building and running the ggml tests. |
Thanks for the clarification, and for the example.
That's fine. I was more curious from pure user POV: do users need to treat an installation of ggml built with The tests themselves are not that important to me, they were just a proxy for how one might typically develop against libggml. Concrete example: I'm building Debian packages, and I'd like to offer all possible backends, but as individually installable packages ( But I'd not go down that path if this is not the typical use case. |
That's great, and that's exactly the way it is intended to be used, so you are absolutely on the right path. However, I feel that we need to polish some details before this can be done. For one, the only example of using dynamically loadable backends is llama.cpp; we need to update and add examples to the ggml repository. We need to start versioning ggml so that different applications can use the same ggml package. We also need to deal with issues such as duplicated devices from different backends: if you install both the CUDA and Vulkan backends, both may expose the same devices. We may also need to change the way dynamic backends are located because, as it is, they are searched for only in the same directory as the executable, and that may not fit well with the way the filesystem is organized in most Linux distributions. I imagine that you would like to install the backends to So I would say it's pretty much a work in progress at this point. Any input that would help us create better packages for Linux would be appreciated; ultimately that's one of the reasons |
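To illustrate the point about where dynamic backends are located, here is a minimal sketch of an ordered directory search, assuming a POSIX system. It is not how ggml locates backends (per the comment above, the search is currently limited to the executable's directory); the function name `find_in_dirs` and any directory list passed to it are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: look for `file` in each directory of `dirs`, in order.
 * On success, writes the full path to `out` and returns the index of the
 * directory that matched. Returns -1 (and an empty `out`) if nothing matched.
 */
static int find_in_dirs(const char *file, const char *const *dirs, size_t n_dirs,
                        char *out, size_t out_size) {
    for (size_t i = 0; i < n_dirs; i++) {
        int len = snprintf(out, out_size, "%s/%s", dirs[i], file);
        if (len < 0 || (size_t) len >= out_size) {
            continue; /* path does not fit in the buffer, skip this directory */
        }
        if (access(out, R_OK) == 0) {
            return (int) i; /* first readable match wins */
        }
    }
    if (out_size > 0) {
        out[0] = '\0';
    }
    return -1;
}
```

A caller could put the executable's directory first and a distribution-specific private libdir after it, so that the existing behaviour is kept while packaged installs still resolve. Which directories belong in that list, and in what order, is exactly the open question discussed in this thread.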
Awesome, you touched on many other open questions that I collected during the packaging. Knowing that these are open challenges or at least topics with potential for change, rather than already-made "not our use case" design decisions, helps a lot downstream. Some patches we add locally because they are Debian-specific, but work we can merge back upstream is strongly preferable.
I'm very glad to hear that and look forward to sharing feedback back upstream in future. Just a brief outlook on the Debian side: for now, a (local) design decision will probably be for ggml to be installed in a private libdir, as by policy, the lack of a stable SOVER precludes installation in one of the system paths ( We'd also just build "fat" packages for now, so eg: The goal is to get something usable and performant (on par with built-from-source) out as soon as possible, and to frequently iterate on improvement. Knowing what improvement ideas are on or off the table really helps for future work. Side note: Incidentally, being open-source focused, we've been doing a lot of work getting HIP packaged, and have our own CI with currently 17 different AMD GPU architectures ranging from consumer to Instinct. So on that end, we may be able to provide useful feedback upstream. That's also where the tests come in, as we continuously evaluate our built packages on dependency/reverse-dependency changes. |
If the private libdir also includes the executables, then using Note that whisper.cpp does not support |
As I have been discussing separately with @ckastner, this is the approach I have taken for our internal deb packaging since
(for i386 and arm64 only the libggml-cpu variant is built.) This could be improved when the mechanisms searching for the backends are refined. But on the other hand it feels like it fits well with the ggml architecture to only have libggml and libggml-base exposed as shared libraries at system level (and for building upon), while the backends are flexibly gathered "out of sight" in libexec, depending on a given set of constraints and requirements. But, as I understand from @ckastner, such an approach causes problems at this stage with other architectures/settings, and portability is crucial for Debian as well as for the ggml-related projects (especially given how fast hardware is moving in this field).
Yes, that is blocking me from finalizing a clean whisper.cpp packaging based on the above approach. But I also had issues with parts of whisper.cpp's code requiring a libggml-cpu backend in order to build. It does not work cleanly in terms of packaging if a shared library is not properly exposed and packaged like libggml and libggml-base are above. Is it something I could raise in the whisper.cpp issues? Ideally, whisper.cpp code would not need to link with libggml-cpu(-*). |
Thanks for the detailed explanation, that's very helpful.
Can you give me more details about the issues that you are seeing in other systems? It may also make sense to add a symbolic link in
whisper.cpp should support |
When building with GGML_BACKEND_DL=ON, linker failures start to appear. For example, from test-opt.c:843:
I was looking into a PR to fix this, but I'm not sure how best to go about it, or if this even needs fixing because it's working as intended.
Using the above example, when using GGML_BACKEND_DL, how would upstream recommend using ggml_backend_cpu_set_n_threads correctly? Should these be dlsym'ed by the caller?
My naive approach would have been to expect that there is some shim that takes care of this and that errors out when certain functionality is not available, which could be guarded by a ggml_backend_is_<foo>, but the latter two seem to be defined in the modules. But I'm not that familiar with the codebase yet, so perhaps this already exists.
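The "shim that errors out when functionality is not available" idea can be sketched generically with POSIX `dlopen`/`dlsym`. This is only an illustration of the pattern, assuming a Linux/glibc system; it does not use any ggml API, and the names `lookup_optional`, `call_optional` and `unary_fn_t` are hypothetical. The test uses `cos` from `libm.so.6` purely as a stand-in for an optional backend function.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical signature of an optional entry point that lives in a module. */
typedef double (*unary_fn_t)(double);

/*
 * Resolve `symbol` from `library` at run time.
 * Returns NULL if the library cannot be loaded or the symbol is missing.
 * On success the library handle is intentionally left open so that the
 * returned pointer stays valid for the lifetime of the process.
 */
static void *lookup_optional(const char *library, const char *symbol) {
    void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        return NULL;
    }
    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        dlclose(handle); /* nothing usable in this module, release it */
    }
    return sym;
}

/*
 * Shim: calls the optional function if it is available.
 * Returns 0 and stores the result in `*out` on success,
 * or -1 when the functionality is not available.
 */
static int call_optional(const char *library, const char *symbol,
                         double x, double *out) {
    unary_fn_t fn = NULL;
    /* POSIX idiom for converting the object pointer from dlsym to a function pointer */
    *(void **) (&fn) = lookup_optional(library, symbol);
    if (fn == NULL) {
        return -1;
    }
    *out = fn(x);
    return 0;
}
```

With a shim like this, the caller has no link-time dependency on the module, so a missing backend becomes a run-time error code rather than a linker failure. On older glibc versions the program may need to be linked with `-ldl`.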