
Can't run website in a container via make container-serve due to the image absence #49460

Open
shurup opened this issue Jan 16, 2025 · 12 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/docs Categorizes an issue or PR as relevant to SIG Docs. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@shurup
Member

shurup commented Jan 16, 2025

This is a Bug Report

Problem:

Running make container-serve in kubernetes/website:main (to run the website locally in a container) leads to an error:

Unable to find image 'gcr.io/k8s-staging-sig-docs/k8s-website-hugo:v0.133.0-af5f894e895c' locally
docker: Error response from daemon: manifest for gcr.io/k8s-staging-sig-docs/k8s-website-hugo:v0.133.0-af5f894e895c not found: manifest unknown: Failed to fetch "v0.133.0-af5f894e895c" from request "/v2/k8s-staging-sig-docs/k8s-website-hugo/manifests/v0.133.0-af5f894e895c".
See 'docker run --help'.
make: *** [Makefile:119: container-serve] Error 125

It happens on Linux/amd64. This behaviour is confirmed by a few people.

Proposed Solution:

Following the discussion in #49444, the fix should not be to run make container-image first (building the images locally); instead, the Hugo images should be available to pull from GCR. As @sftim noted in Slack, they might be absent due to a recent Docsy upgrade. We need to have them back.

@shurup shurup added kind/bug Categorizes issue or PR as related to a bug. sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Jan 16, 2025
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 16, 2025
@niranjandarshann
Contributor

niranjandarshann commented Jan 16, 2025

I am also facing the same issue.

@sftim
Contributor

sftim commented Jan 16, 2025

/triage accepted
/priority important-soon

Only affects contributors, not website visitors, but we should get a fix in place; we may need to revisit how we build and publish the container image(s)

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 16, 2025
@sftim sftim pinned this issue Jan 17, 2025
@sftim
Contributor

sftim commented Jan 21, 2025

Until we fix this, there are two workarounds you can use:

  • reference an older container image
    make container-serve CONTAINER_IMAGE=gcr.io/k8s-staging-sig-docs/k8s-website-hugo:v0.133.0-a5ef70d3da97
  • build your own image
    make container-image # only needed one time
    make container-serve
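
The two workarounds above can be combined into one helper. This is only a minimal sketch, not something from the repository: `serve_site` and `image_available` are hypothetical names, the `DOCKER` and `MAKE` variables exist only so the commands can be swapped out, and it assumes your Docker CLI supports `docker manifest inspect` and can reach the registry.

```shell
#!/bin/sh
# Sketch: serve from a published image if the registry has it, else build locally.
# DOCKER and MAKE are overridable (an assumption of this sketch, not a Makefile feature).
DOCKER="${DOCKER:-docker}"
MAKE="${MAKE:-make}"

# Succeeds if the registry returns a manifest for the given image reference.
image_available() {
  "$DOCKER" manifest inspect "$1" >/dev/null 2>&1
}

# serve_site [IMAGE]: hypothetical helper; IMAGE is optional.
serve_site() {
  image="${1:-}"
  if [ -n "$image" ] && image_available "$image"; then
    # Workaround 1: reference an image that is known to exist.
    "$MAKE" container-serve CONTAINER_IMAGE="$image"
  else
    # Workaround 2: build the image locally, then serve.
    "$MAKE" container-image && "$MAKE" container-serve
  fi
}

# Example (run from a kubernetes/website checkout):
#   serve_site gcr.io/k8s-staging-sig-docs/k8s-website-hugo:v0.133.0-a5ef70d3da97
```

Checking the manifest first avoids the `Error 125` from `docker run` when the tag was never published.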

If you like helping out, SIG Docs can guide you towards working on a longer term fix. The best place to offer help is the #sig-docs channel on Slack.

@SayakMukhopadhyay
Contributor

SayakMukhopadhyay commented Jan 22, 2025

Here is the link to the failing Prow job from Jan 15: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/post-website-push-image-k8s-website-hugo/1879257326494420992

A cursory glance makes me think that the issue might have something to do with npm and package.json but I will need to look into it more.

EDIT: Maybe it's related to ed1cc81

@sftim
Contributor

sftim commented Jan 22, 2025

We should check that the container build works for AArch64 (ideally: we also add some CI checks to reject PRs that break image builds)

I think it builds fine for AMD64, but maybe only for AMD64. In the cloud we build a multiarch image.
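
A check along those lines could be sketched roughly as follows. This is a hedged illustration, not the project's CI: `check_image_build` is a hypothetical name, the build context and any extra arguments (such as the build args the real Makefile passes) are assumptions you would need to fill in, and it assumes a Buildx setup that can build for non-native platforms (for example via QEMU emulation).

```shell
#!/bin/sh
# Sketch: build the image once per platform, without pushing, and report failures.
DOCKER="${DOCKER:-docker}"

# check_image_build [CONTEXT] [extra buildx args...]
check_image_build() {
  context="${1:-.}"
  if [ "$#" -gt 0 ]; then
    shift
  fi
  failed=""
  for platform in linux/amd64 linux/arm64; do
    # Building each platform separately shows which architecture breaks.
    if ! "$DOCKER" buildx build --platform "$platform" "$@" "$context"; then
      failed="$failed $platform"
    fi
  done
  if [ -n "$failed" ]; then
    echo "image build failed for:$failed" >&2
    return 1
  fi
}

# Example: check_image_build . --build-arg SOME_ARG=value
```

Looping over platforms rather than passing a comma-separated list is a design choice here: it is slower, but a failure is attributed to one architecture.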

@ameukam
Member

ameukam commented Jan 23, 2025

@sftim
Contributor

sftim commented Jan 23, 2025

I (still) suspect that a local AArch64 build would also fail.

@ameukam
Member

ameukam commented Jan 24, 2025

Yeah. I see:

#24 84.50 npm error npm error ld-linux-aarch64.so.1: /root/.npm/_cacache/tmp/git-cloneXXXXXXMFelHl/node_modules/hugo-extended/vendor/hugo: Not a valid dynamic program
#24 84.50 npm error npm error ✖ Hugo installation failed. :(
#24 84.50 npm error npm error node:internal/errors:984
#24 84.50 npm error npm error   const err = new Error(message);

Looks like an issue with npm rather than the infrastructure. Possibly the Alpine image no longer has this library for aarch64.

@SayakMukhopadhyay
Contributor

SayakMukhopadhyay commented Jan 30, 2025

I was able to reproduce this locally with a multi-arch build. After some testing, I found that the issue is not AArch64-specific; it is specific to Alpine + AArch64. The hugo-extended binary just doesn't run on Alpine + AArch64. It's not the hugo-extended npm library's fault, as all it does is download the binary and attempt to execute it once. That attempt is what causes npm ci to fail, and the failure is a sign of a bigger problem.

The thing is, only the plain hugo binary works on Alpine + AArch64, whereas both the hugo and hugo-extended binaries work on Debian + AArch64. Looking into the binaries, here's what the file command gives:

  • hugo-extended-amd64

hugo: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, Go BuildID=diiSa0Qe8VOXD3Ch_MwU/k9QnCktDCLZbp9fOWtYK/kXU2EiN95-OVeBmcnn3D/TED5qvYgX7CVfxfU1NBx, BuildID[sha1]=bb899409ab5a900e5105fae2bbf6ce0dd9b09f3d, stripped

  • hugo-extended-aarch64

hugo: ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, Go BuildID=1Tr54E5QnsvcLulJcEu8/zbqdZMB13x98JUbG7mGA/jPKlLSBJNWtSLfS0TLy_/s-5yF5kRotNo1p5bWIJW, BuildID[sha1]=eb227841f2f9dcccdc1c2e61c257f307f8a39c47, stripped

  • hugo-aarch64

hugo: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, Go BuildID=WKoC5cPWKFCuD4_Wbl-d/Xk1LdOTak19kxpdVP_Cr/7zko5eJrtSPBTFZZmx1J/GCKiCIfXOUUZzTeSZBJC, stripped

I think alpine is missing something critical since hugo-extended is dynamically linked.
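
To compare binaries the same way, the interpreter path can be pulled out of the `file` output. This is a small sketch under assumptions: `elf_interpreter` is a hypothetical helper, and it relies on the `interpreter <path>,` wording seen in the output above.

```shell
#!/bin/sh
# Sketch: print the ELF interpreter named in one line of `file` output.
# Prints nothing when no interpreter is listed (e.g. statically linked binaries).
elf_interpreter() {
  printf '%s\n' "$1" | sed -n 's/.*interpreter \([^,]*\),.*/\1/p'
}

# Example, inside the container you want to inspect:
#   interp="$(elf_interpreter "$(file ./hugo)")"
#   [ -n "$interp" ] && ls -l "$interp"
```

Whether that path exists in the image is only a first check; a loader at that path can still refuse to run the binary.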

On a side note, existing images built for AArch64 won't work on it as the Dockerfile downloads the amd64 binary only (see https://github.com/kubernetes/contributor-site/blob/148e2934d70ecc89ab2a770407defb973dbab8d4/Dockerfile#L27)
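
For a Dockerfile that downloads a prebuilt binary, the architecture would have to be chosen at build time instead of being hard-coded. A rough sketch follows, with loud assumptions: `hugo_arch` and `hugo_asset` are hypothetical helpers, and the asset name pattern `hugo_extended_<version>_linux-<arch>.tar.gz` is assumed rather than taken from that Dockerfile, so check it against the actual release assets.

```shell
#!/bin/sh
# Sketch: map a machine name (default: `uname -m`) to an architecture suffix.
hugo_arch() {
  machine="${1:-$(uname -m)}"
  case "$machine" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $machine" >&2; return 1 ;;
  esac
}

# hugo_asset VERSION [MACHINE]: build the (assumed) release asset file name.
hugo_asset() {
  arch="$(hugo_arch "${2:-}")" || return 1
  echo "hugo_extended_${1}_linux-${arch}.tar.gz"
}

# Example: hugo_asset 0.133.0 aarch64
```

Inside a BuildKit build, the automatic `TARGETARCH` build argument already carries `amd64` or `arm64`, which may be simpler than calling `uname -m`.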

EDIT: Welp, I found this issue gohugoio/hugo#10839 and the "solutions" don't work either, the solutions being adding the libstdc++ package and symlinking ln -s /lib/libc.so.6 /usr/lib/libresolv.so.2.

@sftim
Contributor

sftim commented Jan 31, 2025

> On a side note, existing images built for AArch64 won't work on it as the Dockerfile downloads the amd64 binary only (see https://github.com/kubernetes/contributor-site/blob/148e2934d70ecc89ab2a770407defb973dbab8d4/Dockerfile#L27)

That's a link to a different Git repository; did you mean to link there @SayakMukhopadhyay ?

@sftim
Contributor

sftim commented Jan 31, 2025

If we can avoid NPM wanting to download Hugo, that'll help. But not a small fix.

@SayakMukhopadhyay
Contributor

SayakMukhopadhyay commented Jan 31, 2025

> > On a side note, existing images built for AArch64 won't work on it as the Dockerfile downloads the amd64 binary only (see https://github.com/kubernetes/contributor-site/blob/148e2934d70ecc89ab2a770407defb973dbab8d4/Dockerfile#L27)
>
> That's a link to a different Git repository; did you mean to link there @SayakMukhopadhyay ?

I did mean to point to the contributor site Dockerfile, as that's the one I was testing the npm install on (since it's smaller and faster). But now I see that the website uses Go as a base image and builds Hugo from source. Let me test that, as building from source might be the best option.

6 participants