Add vagrant boxes caching to CI #11485
Conversation
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: VannTen
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
We're hitting 429 responses when running lots of jobs, presumably because we are downloading lots of boxes again and again. This should cache them instead.
Force-pushed from e7af709 to 2803946 (Compare)
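For illustration, the kind of change this PR aims at boils down to a fragment like the one below (a rough sketch only, not the actual diff; the cache key, the paths and the use of VAGRANT_HOME are assumptions):

```yaml
# Keep downloaded boxes inside the project directory so the GitLab
# runner cache can archive and restore them between jobs.
variables:
  VAGRANT_HOME: "$CI_PROJECT_DIR/.vagrant.d"

.vagrant-cache:
  cache:
    key: vagrant-boxes
    paths:
      - .vagrant.d/boxes/
```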
This is apparently not enough: the CI runners can't extract the cache because they don't have gitlab-runner.
@ant31 if you have an idea
The 429 is from which service?
https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/7698277164#L524
I think it's app.vagrantup.com for the vagrant boxes used by molecule.
/retest
@VannTen: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/hold
This needs a change in the CI cluster (configuring either an S3 or a volume cache for the gitlab-runner) before it can work.
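For reference, the runner-side piece could look roughly like this, assuming the runners are deployed with the official gitlab-runner Helm chart and MinIO is reachable in-cluster (service address, bucket and secret names below are hypothetical):

```yaml
# values.yaml fragment for the gitlab-runner Helm chart (illustrative only).
runners:
  config: |
    [[runners]]
      [runners.cache]
        Type = "s3"
        Shared = true
        [runners.cache.s3]
          ServerAddress = "minio.minio.svc.cluster.local:9000"
          BucketName = "runner-cache"
          Insecure = true
  cache:
    # Kubernetes secret holding the "accesskey" / "secretkey" pair for MinIO.
    secretName: minio-runner-cache
```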
I feel like this is happening a lot recently, but maybe we just have more activity in the CI...
Yes, we probably hit the rate limit quicker: we're running more jobs in parallel, and since each job is faster we can probably download more boxes at a time. Can we host the boxes ourselves somewhere?
Can we host the boxes ourselves somewhere?
Probably, I guess vagrant gets boxes over HTTP.
But I think using the runner caching mechanism would have a broader use (pip packages, ansible collections, pre-commit hooks, etc.).
I've never deployed a self-hosted MinIO S3 on k8s, but I was skimming their docs last week and it does not look too hard.
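For what it's worth, a bare-bones single-node MinIO on Kubernetes is roughly the manifest below (a sketch only, assuming the operator or Helm chart isn't used; names are made up and a real setup would add a Service in front and a PersistentVolumeClaim for the data):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio
          args: ["server", "/data", "--console-address", ":9001"]
          envFrom:
            - secretRef:
                name: minio-root-credentials  # MINIO_ROOT_USER / MINIO_ROOT_PASSWORD
          ports:
            - containerPort: 9000  # S3 API
            - containerPort: 9001  # web console
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          emptyDir: {}  # placeholder; use a PVC for real data
```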
Is there anything I can do to help with the HTTP 429 CI issues?
I haven't finished the MinIO installation + gitlab-runner S3 integration on the CI cluster unfortunately, and I'm on holidays :/
One thing that can help is doing the gitlab-ci side; this PR is a very rough draft.
(In particular, it seems gitlab-ci has the cache policies push / pull / pull-push, which we should leverage to pull once per pipeline / have a default cache common to all pipelines; see the sketch below.)
I'm not sure how practical that is without the cache working in the first place though. Gitlab-ci has a local executor I think, but I don't know if it can help for that kind of stuff.
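Something along these lines, as a sketch of the policy split (job names, stages and the box are placeholders, not the actual kubespray layout):

```yaml
.vagrant-cache:
  cache:
    key: vagrant-boxes       # one shared key => cache common to all pipelines
    paths:
      - .vagrant.d/boxes/

warm-vagrant-cache:
  extends: .vagrant-cache
  stage: build
  script:
    - vagrant box add generic/ubuntu2204 --provider libvirt || true
  cache:
    policy: pull-push        # only this job uploads an updated cache

molecule-job:
  extends: .vagrant-cache
  stage: test
  script:
    - tests/scripts/molecule_run.sh
  cache:
    policy: pull             # everything else only downloads it
```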
Another possibility I've thought about (but not worked on) is to convert everything (molecule, vagrant jobs) to use the "packet" provisioning method. Molecule, for instance, has an example in its docs using kubevirt that we could convert/apply to our setup.
This would also have the long-term advantage that we could ditch the custom gitlab-ci executors.
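To make the kubevirt option more concrete: each test VM would become a KubeVirt VirtualMachine in the CI cluster instead of a vagrant box, roughly like the sketch below (the name and container disk image are just examples; the actual molecule wiring would follow the example in their docs):

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: molecule-test-node
spec:
  running: true
  template:
    spec:
      domain:
        resources:
          requests:
            memory: 2Gi
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/ubuntu:22.04  # example cloud image
```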
Yes, dropping vagrant could be a good idea if we can replace it with kubevirt. Instead of an S3 cache we could also host the vagrant boxes ourselves in the cluster itself (if that's easier); it would be kind of equivalent.
Regarding a self-hosted Vagrant server: unless we have the resources to do permission control (and to maintain the custom gitlab-ci executors), I think it would be better to move to packet. To summarize, maybe I can try to move some of the vagrant work to packet first.
I don't know, it seems like a lot of work without an immediate upside.
MinIO is working. Still missing is the configuration of the custom kubevirt gitlab executor (it looks like we need to add the gitlab-runner binary inside the container).
MinIO is working.
Thanks!
Imo the benefits of moving to "packet" (btw, maybe we should pick a more descriptive name?) also include reducing the need for custom executors and the associated maintenance burden.
/close
Replaced by: #11671
What type of PR is this?
/kind feature
What this PR does / why we need it:
We're hitting 429 responses when running lots of jobs, presumably
because we are downloading lots of boxes again and again.
This should cache them instead.
Does this PR introduce a user-facing change?:
/ok-to-test