nvidia-modprobe to potentially early out when nvidia blacklisted (Wayland + driver init issue) #5
Comments
Is it the I would have expected that |
You're right... it appears By default, it appears to mmap the entire module before a syscall out to init_module which tries to map it into kernel space (and only then does the blacklisting seem to apply as part of the init_module syscall) Would there be any downside to Edit: Yes, it is |
This allows modprobe to early out in the event that the nvidia driver has been blacklisted NVIDIA#5
FWIW, the pull request (#6) has been working well in my usage across Arch and Fedora
I've been exploring the purpose of nvidia-modprobe recently, and the implications for anyone using a dual-gpu setup and occasionally needing to blacklist the nvidia drivers. I'm using Wayland exclusively.
It's my understanding that `nvidia-modprobe` is provided as a fallback mechanism to ensure the nvidia driver is initialised with root privileges (should it not already be properly initialised). The mechanism for calling `nvidia-modprobe` appears to be triggered by the nvidia libraries themselves when they are invoked by the relevant ICD, eg:
I've found that even when the nvidia drivers themselves are blacklisted, any program that tries to invoke or interrogate the ICDs for available devices causes `nvidia-modprobe` to be called (which in turn attempts to `modprobe nvidia` as root).
Unfortunately, modprobe isn't the quickest in town and it takes a while for it to fail when the nvidia drivers are blacklisted (close to 1 second in my testing).
The problem is compounded by diagnostic tools such as `inxi`.
For example, `inxi -Fxz` will repeatedly poll the ICD layer (approximately 33 times), which in turn loads the nvidia shared libraries (33 times), which triggers `nvidia-modprobe` (33 times).
This chain of events takes approximately 30 seconds to complete, while my journal logs show (correctly) that `Module nvidia is blacklisted` (33 times).
This isn't the end of the world, though I've tried to mitigate the issue as follows:
Workaround
It's been suggested that I should be able to move `nvidia-modprobe` out of the way, short-circuiting this chain of events somewhat. This does have the desired effect when the nvidia drivers are blacklisted.
Problem
This has a side effect when the nVidia drivers are not blacklisted.
Specifically, despite the nvidia module being present and accounted for (via lsmod) it seems the appropriate device files have not been created (or the driver otherwise not fully initialised).
This is evidenced by the likes of eglinfo / vulkaninfo not showing the nVidia device whatsoever.
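For anyone wanting to confirm they are in the same state, here is a rough diagnostic sketch in Python. It assumes the "appropriate device files" are `/dev/nvidiactl` and `/dev/nvidia0` (the usual nodes for a single NVIDIA GPU; other setups may differ), and the helper names are my own, not part of any NVIDIA tooling.

```python
import os

# Assumed device nodes for a single-GPU system; adjust for your setup.
EXPECTED_NODES = ("/dev/nvidiactl", "/dev/nvidia0")


def module_loaded(proc_modules_text, module="nvidia"):
    """True if `module` is the first field of any line in /proc/modules-style text."""
    return any(line.split()[:1] == [module]
               for line in proc_modules_text.splitlines())


def missing_device_nodes(existing_paths, expected=EXPECTED_NODES):
    """Return the expected device nodes that are not in `existing_paths`."""
    return [path for path in expected if path not in existing_paths]


def diagnose(proc_modules_text, existing_paths):
    """Summarise whether the module is loaded and its device nodes exist."""
    if not module_loaded(proc_modules_text):
        return "module not loaded"
    missing = missing_device_nodes(existing_paths)
    if missing:
        return "module loaded, device nodes missing: " + ", ".join(missing)
    return "module loaded and device nodes present"


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        modules_text = handle.read()
    present = {path for path in EXPECTED_NODES if os.path.exists(path)}
    print(diagnose(modules_text, present))
```

If this reports the module as loaded but the nodes as missing, that would match the situation described above.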
This can be rectified by one of the following approaches:
- `nvidia-modprobe`
- `vulkaninfo` as root
- `nvidia-debugdump --list` as root
Theory
I believe that this isn't an issue for X11 users, as the Xorg service runs as root and thus has no trouble when the nvidia shared libraries are instantiated (thus, the driver fully initialises without need for the nvidia-modprobe fallback mechanism).
For GDM and Wayland users, this isn't the case. Since these services do not run with superuser privileges, the nvidia drivers will ultimately be loaded without special privileges and will try to initiate the fallback mechanism by default. That obviously does not work if `nvidia-modprobe` cannot be found.
So, to restate the problem (with the above taken into account)...
A linux system running Wayland without nvidia-modprobe will be unable to initialise the nVidia device without user intervention
Potential paths forward
- `nvidia-modprobe` will trigger a modprobe any time a userspace application tries to query or use the ICDs available - and that this may not be immediate.
or we could consider a check within `nvidia-modprobe` (or indeed the shared libraries/drivers themselves) such that:
- `nvidia-modprobe` proactively check if the nvidia drivers are blacklisted before calling out to `/sbin/modprobe` and fail fast if that is the case
- `nvidia-modprobe` mechanism
# 1 is a minor irritation (it drove me to research this issue)
# 2 could be scripted around via user code or udev rules, but doesn't help the wider community.
Perhaps # 3 or # 4 could be considered, if it doesn't introduce too much complexity?
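To make the "fail fast" idea concrete, here is a minimal sketch of what a proactive blacklist check could look like. It is written in Python for brevity (nvidia-modprobe itself is C) and is not how any existing patch does it. It assumes blacklisting is done either through `blacklist <module>` lines in modprobe.d files or through the `modprobe.blacklist=` / `module_blacklist=` kernel parameters; the function names and the directory list are my own choices, and other mechanisms (such as `install` overrides) are not covered.

```python
import glob

# Assumed configuration directories; distributions may use others as well.
MODPROBE_DIRS = ("/etc/modprobe.d", "/run/modprobe.d", "/usr/lib/modprobe.d")


def normalise(name):
    """Module names treat '-' and '_' as equivalent."""
    return name.strip().replace("-", "_")


def blacklisted_in_modprobe_conf(conf_text, module):
    """True if modprobe.d-style text contains a `blacklist <module>` line."""
    target = normalise(module)
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist" and normalise(parts[1]) == target:
            return True
    return False


def blacklisted_on_cmdline(cmdline, module):
    """True if the kernel command line blacklists `module`."""
    target = normalise(module)
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("modprobe.blacklist", "module_blacklist"):
            if target in (normalise(item) for item in value.split(",")):
                return True
    return False


def is_blacklisted(module="nvidia"):
    """Check /proc/cmdline and the assumed modprobe.d directories."""
    try:
        with open("/proc/cmdline") as handle:
            if blacklisted_on_cmdline(handle.read(), module):
                return True
    except OSError:
        pass
    for directory in MODPROBE_DIRS:
        for path in sorted(glob.glob(directory + "/*.conf")):
            try:
                with open(path) as handle:
                    if blacklisted_in_modprobe_conf(handle.read(), module):
                        return True
            except OSError:
                continue
    return False
```

A check along these lines only reads a few small files, so in principle it could return well before `/sbin/modprobe` is spawned. Whether it matches modprobe's own blacklist handling in every case is something I have not verified.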
Background:
My specific setup includes a GTX 960 with drivers 545.29.06
I've been testing across both Arch Linux and Fedora Linux (same drivers + kernel). It's worth noting that on Arch I'm using regular kernel modules, while Fedora uses akmods. I do not observe any difference in behaviour between the two.
I'm also running with an AMD RX 580
For development purposes, I frequently switch between nvidia, nouveau and amdgpu drivers using boot time kernel parameters to blacklist as appropriate.
Related forum posts here and here