cri-o fails to start after nvidia-ctk runtime configure: conmon executable file not found in $PATH #681

Open
kznrluk opened this issue Sep 7, 2024 · 2 comments

Comments

@kznrluk
kznrluk commented Sep 7, 2024

I set up the NVIDIA Container Toolkit in a cri-o environment by following the document below, but afterward cri-o failed to start.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-cri-o

sudo nvidia-ctk runtime configure --runtime=crio

journalctl shows that the conmon executable could not be found while validating the nvidia runtime:

Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283245578Z" level=info msg="AppArmor is disabled by the system or at CRI-O build-time"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283252429Z" level=info msg="No blockio config file specified, blockio not configured"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283259598Z" level=info msg="RDT not available in the host system"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283270449Z" level=info msg="Using conmon executable: /usr/libexec/crio/conmon"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283969678Z" level=info msg="Conmon does support the --sync option"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.283985508Z" level=info msg="Conmon does support the --log-global-size-max option"
Sep 07 10:11:39 aki1 crio[7354]: time="2024-09-07 10:11:39.284016218Z" level=fatal msg="validating runtime config: monitor fields translation: failed to translate monitor fields for runtime nvidia: exec: \"conmon\": executable file not found in $PATH"
Sep 07 10:11:39 aki1 systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
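
The log above shows cri-o finding conmon at /usr/libexec/crio/conmon, then failing to find "conmon" by name in $PATH for the nvidia runtime. A rough way to check both lookups on a host is sketched below. The function name check_conmon and the CONMON_PATH variable are made up for this example, and the default path is simply the one from the log, so adjust it if yours differs.

```shell
# Path reported by cri-o in the log above; override via the environment if needed.
CONMON_PATH="${CONMON_PATH:-/usr/libexec/crio/conmon}"

# Report whether conmon resolves by name in PATH and whether it exists
# at the absolute path that cri-o printed.
check_conmon() {
    if command -v conmon >/dev/null 2>&1; then
        echo "conmon found in PATH: $(command -v conmon)"
    else
        echo "conmon not found in PATH"
    fi
    if [ -x "$CONMON_PATH" ]; then
        echo "conmon exists at $CONMON_PATH"
    else
        echo "no executable at $CONMON_PATH"
    fi
}

check_conmon
```

If the first line reports "not found in PATH" while the second reports that the file exists, the host is probably in the same state as described in this issue.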

In my environment, I worked around the issue by running ln -s /usr/libexec/crio/conmon /usr/local/bin/conmon, but a proper fix is probably needed so that this manual step is not required.
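
For anyone scripting the workaround, here is a minimal sketch that creates the symlink only when the source exists and the destination is free. The helper name link_conmon is hypothetical; the paths in the usage comment are the ones from this issue. It is only the symlink workaround, not a fix for the underlying configuration.

```shell
# Create a symlink dst -> src unless dst already exists.
# Returns non-zero if src is not an executable file.
link_conmon() {
    src="$1"
    dst="$2"
    if [ ! -x "$src" ]; then
        echo "source $src is not executable" >&2
        return 1
    fi
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "$dst already exists, leaving it alone"
        return 0
    fi
    ln -s "$src" "$dst"
}

# Usage (needs root for /usr/local/bin), then restart the service:
#   link_conmon /usr/libexec/crio/conmon /usr/local/bin/conmon
#   systemctl restart crio
```

Leaving an existing destination untouched avoids overwriting a conmon that was installed some other way.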

> uname -a
Linux aki1 6.8.0-41-generic #41-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug  2 20:41:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

> crictl -v
crictl version v1.31.1
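
To make reports like the one above easier to compare, the environment details can be gathered in one go. This is only a sketch: the function name report_versions is invented here, and the tool list is an assumption about what is relevant to this issue. Tools that are not installed are skipped.

```shell
# Print kernel info plus the version of each tool that is installed.
report_versions() {
    uname -a
    for tool in crictl crio nvidia-ctk; do
        if command -v "$tool" >/dev/null 2>&1; then
            "$tool" --version
        else
            echo "$tool: not installed"
        fi
    done
}

report_versions
```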
@Serret
Serret commented Sep 10, 2024

Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.865508935+01:00" level=info msg="Installing default AppArmor profile: crio-default"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.890008458+01:00" level=info msg="No blockio config file specified, blockio not configured"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.890036521+01:00" level=info msg="RDT not available in the host system"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.890057593+01:00" level=info msg="Using conmon executable: /usr/libexec/crio/conmon"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.890994267+01:00" level=info msg="Conmon does support the --sync option"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.891010121+01:00" level=info msg="Conmon does support the --log-global-size-max option"
Sep 10 01:02:44 odin crio[8386]: time="2024-09-10 01:02:44.891055289+01:00" level=fatal msg="validating runtime config: monitor fields translation: failed to translate monitor fields for runtime nvidia: exec: \"conmon\": executable file not found in $>
Sep 10 01:02:44 odin systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Linux odin 6.8.0-41-generic #41-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug  2 20:41:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

crictl version v1.29.0

Slightly different crictl version, but the same failure here.
I can confirm that the symlink workaround suggested above fixed it.

@plaurin84
I can also confirm that the symlink trick from @kznrluk works perfectly.

Tested with:
CRI-O 1.31.1
Kubeadm v1.31.2
NVIDIA Container Runtime Hook version 1.17.0
