
image garbage collection in dev clusters #4228

Open
nicks opened this issue Feb 19, 2021 · 7 comments

Labels
enhancement New feature or request

Comments

nicks (Member) commented Feb 19, 2021

In the #tilt channel, Dan C-P writes:

We seem to be seeing bloat in our tilt-managed kind control plane that I've narrowed down to what I think are dangling images (or whatever) that build up over time.

This problem is probably unique to Tilt, because Tilt can create a lot of temporary images.

When I dug around a bit, I saw a comment that implied that image garbage collection is disabled in Kind

https://github.com/kubernetes-sigs/kind/blob/master/pkg/cluster/internal/kubeadm/config.go#L231

I'm not totally sure yet how/if Tilt (or ctlptl) should address this. I need to talk to Kubernetes experts who know more about the expected interop here than I do. You could imagine a controller that goes into the cluster and deletes these images, similar to Tilt's local garbage collector.

nicks added the enhancement (New feature or request) label Feb 19, 2021
nicks (Member Author) commented Feb 19, 2021

I verified that execing into the node like

docker exec -it kind-control-plane bash

and running

crictl rmi --prune

can sometimes fix a lot of problems

nicks (Member Author) commented Feb 19, 2021

Here's a good discussion of this problem in more detail: kubernetes-sigs/kind#735

Feels like this also ties into #2102, in that Tilt knows (in some cases) that an image is ephemeral and isn't going to be used again, but needs a way to propagate that to everyone who needs to know.

djcp commented Feb 25, 2021

In our experience, running crictl rmi --prune frees a LOT of space held by dangling images in the control plane; the bloat is exacerbated by long-lived kind clusters.

We recently implemented a feature to run the in-control-plane pruning on tilt down, at most once every four hours. It's still early, but I'm hopeful we'll see promising returns.
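A minimal sketch of that rate limit, assuming a stamp file's mtime tracks the last prune. NODE and STAMP are illustrative names, not taken from djcp's actual setup:

```shell
#!/bin/sh
# Hypothetical sketch: prune dangling images in the kind control plane at most
# once every four hours, using a stamp file's mtime as the rate limiter.
NODE="kind-control-plane"
STAMP="${TMPDIR:-/tmp}/last-kind-prune"

# True if the stamp file is missing or older than 240 minutes.
should_prune() {
  [ -z "$(find "$STAMP" -mmin -240 2>/dev/null)" ]
}

# Only attempt the live prune when docker and the node container exist.
if should_prune \
   && command -v docker >/dev/null 2>&1 \
   && docker inspect "$NODE" >/dev/null 2>&1; then
  docker exec "$NODE" crictl rmi --prune
  touch "$STAMP"
fi
```

Hooking this into tilt down could be as simple as calling the script from a wrapper, since Tilt itself has no built-in hook for this yet.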

majidaldo commented Nov 7, 2021

Is there a way to log out old EXPECTED_REFs?

I think Tilt should have a custom_prune counterpart to custom_build.

nicks (Member Author) commented Nov 8, 2021

@majidaldo for what it's worth, the existing garbage collector doesn't look at old expected refs; it just searches the image index for images built by Tilt. The code is here:

https://github.com/tilt-dev/tilt/blob/master/internal/engine/dockerprune/docker_pruner.go#L144

(There's not really a strong reason why it needs to be part of Tilt itself; it could pretty easily be run as an extension. You can read the images currently in use from the Tilt API with tilt describe imagemap, to ensure you're not deleting an in-use image.)
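The extension idea could be sketched in shell. This is a hypothetical version with two loudly-labeled assumptions: that `tilt describe imagemap` prints in-use refs on lines containing "Image:", and that naive substring matching between those refs and `crictl images -q` IDs is good enough — neither is a documented contract:

```shell
#!/bin/sh
# Hedged sketch of an out-of-process pruner: delete images in the kind node
# that the Tilt API does not list as in use.

# Print the stdin image refs that do NOT appear in the in-use list ($1).
# Matching is naive substring matching (grep -F) and is illustrative only.
filter_unused() {
  in_use="$1"
  while read -r img; do
    printf '%s\n' "$in_use" | grep -qF "$img" || printf '%s\n' "$img"
  done
}

# Only attempt the live prune when the tools are actually present.
if command -v tilt >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
  # Assumption: "Image:" lines in `tilt describe imagemap` output carry the ref.
  in_use="$(tilt describe imagemap | awk '/Image:/ {print $2}')"
  docker exec kind-control-plane crictl images -q \
    | filter_unused "$in_use" \
    | xargs -r -n1 docker exec kind-control-plane crictl rmi
fi
```

The filtering step is the only part with real logic; the rest is plumbing between the Tilt API and crictl inside the node.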

I think what we should probably do is move docker-pruner into a separate repo, port it to use https://github.com/google/go-containerregistry (a library for interacting with different container image indexes), and have Tilt periodically run it against every image store it knows about: the local Docker image store, any registries it pushes to, and the kind CRI.

BenTheElder commented

FWIW, as mentioned upstream, I'd like to see kind handle this, but it's tricky: kubelet's existing GC isn't terribly well suited to kind.

There are maybe some layering violations, but Tilt could leverage better information here: since (AIUI) it could theoretically know when it has loaded a newer version of an image, it could periodically request deletion of the older versions ... 🤔

nicks (Member Author) commented Aug 9, 2022

Ya, Tilt attaches a bit more metadata to the image about "why" you're building it. We already do GC in the local image store; we just need to augment it a bit to tell Kubernetes which images we think should be cleaned up.
