Add custom choom helper to lower crocochrome's OOM score #122

Open: wants to merge 5 commits into base: main
9 changes: 8 additions & 1 deletion Dockerfile
@@ -9,19 +9,25 @@ ARG TARGETARCH
# Build with CGO_ENABLED=0 as grafana-build-tools is debian-based.
RUN --mount=type=cache,target=/root/.cache/go-build \
--mount=type=cache,target=/root/go/pkg \
-    CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /usr/local/bin/crocochrome ./cmd
+    CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /usr/local/bin/crocochrome ./cmd/crocochrome/ &&\
+    CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /usr/local/bin/choom ./cmd/choom/

# For setting caps, use the same image as the final layer to avoid pulling two distinct ones.
FROM ghcr.io/grafana/chromium-swiftshader-alpine:131.0.6778.139-r1-3.21.0@sha256:d3071cfe8721cee56fecf8e5d0bf77031d531bc1091b04b05bccf5f50a32365b AS setcapper

RUN apk --no-cache add libcap

COPY --from=buildtools /usr/local/bin/crocochrome /usr/local/bin/crocochrome
+COPY --from=buildtools /usr/local/bin/choom /usr/local/bin/choom

# The following capabilities are used by sm-k6-runner to sandbox the k6 binary. More details about what each cap is used
# for can be found in /sandbox/sandbox.go.
# WARNING: The container MUST be also granted all of the following capabilities too, or the CRI will refuse to start it.
RUN setcap cap_setuid,cap_setgid,cap_kill,cap_chown,cap_dac_override,cap_fowner+ep /usr/local/bin/crocochrome
+# Grant the sys_resource capability to the custom choom binary so it can lower OOM scores, and dac_override so it can
+# do so for other processes.
+# Same warning as above applies.
+RUN setcap cap_sys_resource,cap_dac_override+ep /usr/local/bin/choom

FROM ghcr.io/grafana/chromium-swiftshader-alpine:131.0.6778.139-r1-3.21.0@sha256:d3071cfe8721cee56fecf8e5d0bf77031d531bc1091b04b05bccf5f50a32365b

@@ -36,6 +42,7 @@ RUN find / -type f -perm -4000 -delete

# The crocochrome binary has extra capabilities, so we make sure only the k6 user (and not nobody) can run it.
COPY --from=setcapper --chown=k6:k6 --chmod=0500 /usr/local/bin/crocochrome /usr/local/bin/crocochrome
+COPY --from=setcapper --chown=k6:k6 --chmod=0500 /usr/local/bin/choom /usr/local/bin/choom

USER k6

3 changes: 2 additions & 1 deletion Makefile
@@ -11,7 +11,8 @@ buildtools = $(docker) run --rm -i \

.PHONY: build
build:
-	CGO_ENABLED=0 go build -v -o build/sm-k6-archiver ./cmd
+	CGO_ENABLED=0 go build -v -o build/crocochrome ./cmd/crocochrome/
+	CGO_ENABLED=0 go build -v -o build/choom ./cmd/choom/

image ?= test.local/sm-k6-archiver
.PHONY: build-container
15 changes: 15 additions & 0 deletions cmd/choom/README.md
@@ -0,0 +1,15 @@
# `choom`

`choom` is a portable, simplified, incompatible version of [`choom(1)`](https://man7.org/linux/man-pages/man1/choom.1.html) that changes a process's `oom_score_adj`, making the process more or less attractive to the kernel's OOM killer.

Contributor:

I feel you are leaving something out here, and I cannot spot what that might be.

If I look at choom.c, it's as simple as it gets. The only differences from the program you are adding are 1) the ability to display a PID's score and adjust value, and 2) reporting the previous adjust value before changing it. That latter point is arguably a useful piece of data.

So... what am I missing?

Member Author (@nadiamoe, Feb 1, 2025):

There is nothing special about this choom implementation compared to choom.c; they do the same thing. It exists because choom is not installed on Alpine by default, and I thought that implementing something this simple would be just as easy as installing a package with apk, while keeping the Dockerfile simple and avoiding having to deal with however Alpine wants to run choom (I suspect it relies on setuid, or just on being root, rather than on granting the specific capabilities so that a regular user can use it).

Contributor:

If there's no good reason for doing this, then just install the corresponding package providing choom (util-linux-misc) and copy over the file to the final image.

Member Author:

I don't see a very strong reason to do it, but I don't see a strong reason not to do it either.

Copying the file is tricky: There might be dependencies that need to be brought along, it might need special requirements, it might use setuid at some point without us noticing, etc. It may not be doing these things today, but it may do them tomorrow. The way I see it, procfs is a stable API, the inner workings of that particular choom implementation (which we are coupled with) are not.

I still see the homemade one as the safest option here.

Contributor:

> Copying the file is tricky: There might be dependencies that need to be brought along, it might need special requirements, it might use setuid at some point without us noticing, etc. It may not be doing these things today, but it may do them tomorrow. The way I see it, procfs is a stable API, the inner workings of that particular choom implementation (which we are coupled with) are not.

Uhm...

You are putting forward a hypothetical issue with backwards compatibility for a utility that hasn't seen any significant changes in over 6 years while introducing a custom incompatible version of that utility... seriously?

If this was less trivial (external libraries, complex logic, etc), I might be open to the argument, but at this level of triviality it doesn't make sense. I honestly thought there was something I was missing and that's why I asked.


This tiny program exists because binaries need a specific capability, `cap_sys_resource`, to lower OOM scores. This capability, however, is not in the [default set Docker uses](https://github.com/moby/moby/blob/master/oci/caps/defaults.go#L6-L19). Due to how Linux capabilities work, a binary with a given capability added will fail to start if the container has not been granted that capability as well. In practice, this means that if we granted `cap_sys_resource` to the main binary of the container and then ran the container naively with `docker run crocochrome:latest`, the container would cryptically fail to start.

This is a UX problem, as users unfamiliar with the codebase would need to troubleshoot a cryptic error message. It is also a problem for testing, as developers would need to leave the "standard route" and figure out how to add specific capabilities to local Kubernetes clusters, or testcontainers, in order to run tests against the container.

Instead, adding the capability to a tiny helper and calling that helper from the main binary allows the OOM score adjustment to fail gracefully. If the container does not have the required `sys_resource` capability, the OOM score is not adjusted and an error is logged:

```
{"time":"2025-01-30T12:50:37.771068928Z","level":"ERROR","msg":"Error changing OOM score. Assuming this is a test environment and continuing anyway.","err":"fork/exec /usr/local/bin/choom: operation not permitted"}
```

But execution will continue, and as OOM score adjusting is not critical to functionality whatsoever, tests will do their job just fine. In production, we grant the container the `sys_resource` capability to handle low-memory situations better.
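
The capability constraint described in the README can be inspected from inside a container. The following is a minimal sketch, not part of this PR: it decodes the `CapBnd` (bounding set) mask from `/proc/self/status` and tests the bit for `CAP_SYS_RESOURCE`, which is capability number 24 on Linux. The names `hasCapability` and `boundingSet` are hypothetical and exist only for illustration.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capSysResource is the bit index of CAP_SYS_RESOURCE in Linux capability masks.
const capSysResource = 24

// hasCapability (hypothetical helper) reports whether the given bit is set in a
// hexadecimal capability mask, such as the CapBnd value in /proc/<pid>/status.
func hasCapability(hexMask string, bit uint) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(hexMask), 16, 64)
	if err != nil {
		return false, fmt.Errorf("parsing capability mask %q: %w", hexMask, err)
	}
	return mask&(uint64(1)<<bit) != 0, nil
}

// boundingSet (hypothetical helper) returns the CapBnd mask of the current process.
func boundingSet() (string, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return "", err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "CapBnd:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "CapBnd:")), nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("CapBnd not found in /proc/self/status")
}

func main() {
	mask, err := boundingSet()
	if err != nil {
		fmt.Println("could not read bounding set:", err)
		return
	}
	ok, err := hasCapability(mask, capSysResource)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("CapBnd=%s cap_sys_resource=%t\n", mask, ok)
}
```

If this reports `cap_sys_resource=false`, the container was presumably started without that capability, which is the situation in which the helper is expected to fail with `operation not permitted`.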
32 changes: 32 additions & 0 deletions cmd/choom/main.go
@@ -0,0 +1,32 @@
package main

import (
	"log"
	"os"
	"path/filepath"
)

func main() {
	if len(os.Args) < 3 {
		log.Fatalf("Usage: %s <pid> <oom_score_adj>", os.Args[0])
	}

	pid := os.Args[1]
	newOOMScore := os.Args[2]

	adjFile, err := os.OpenFile(filepath.Join("/", "proc", pid, "oom_score_adj"), os.O_WRONLY, os.FileMode(0o600))
	if err != nil {
		log.Fatalf("Opening adjust file: %v", err)
	}

	defer func() {
		if err := adjFile.Close(); err != nil {
			log.Fatalf("Closing file: %v", err)
		}
	}()

	_, err = adjFile.Write([]byte(newOOMScore))
	if err != nil {
		log.Fatalf("Writing oom_score_adj: %v", err)
	}
}
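
The helper above passes both arguments through unchecked: the PID is joined into a path and the score is written as-is. As a possible hardening step, and only as a sketch that is not part of this PR, the arguments could be validated first. The function `parseArgs` is hypothetical; the accepted range of -1000 to 1000 is the range the kernel documents for `oom_score_adj`.

```go
package main

import (
	"fmt"
	"strconv"
)

// Range the kernel accepts for /proc/<pid>/oom_score_adj.
const (
	minOOMScoreAdj = -1000
	maxOOMScoreAdj = 1000
)

// parseArgs (hypothetical helper) validates the two command-line arguments
// before anything is opened or written. It requires a positive numeric PID,
// so a value such as "self" or one containing path separators is rejected.
func parseArgs(pidArg, scoreArg string) (pid int, score int, err error) {
	pid, err = strconv.Atoi(pidArg)
	if err != nil || pid <= 0 {
		return 0, 0, fmt.Errorf("invalid pid %q", pidArg)
	}

	score, err = strconv.Atoi(scoreArg)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid oom_score_adj %q", scoreArg)
	}
	if score < minOOMScoreAdj || score > maxOOMScoreAdj {
		return 0, 0, fmt.Errorf("oom_score_adj %d out of range [%d, %d]", score, minOOMScoreAdj, maxOOMScoreAdj)
	}

	return pid, score, nil
}

func main() {
	pid, score, err := parseArgs("1234", "-500")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pid:", pid, "score:", score)
}
```

Whether this is worth the extra lines is a judgment call: the kernel already rejects out-of-range scores at write time, so the main gain would be clearer error messages and a stricter PID argument.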
21 changes: 21 additions & 0 deletions cmd/main.go → cmd/crocochrome/main.go
@@ -5,6 +5,8 @@ import (
"log/slog"
"net/http"
"os"
"os/exec"
"strconv"

"github.com/grafana/crocochrome"
crocohttp "github.com/grafana/crocochrome/http"
@@ -18,6 +20,17 @@ func main() {
Level: slog.LevelDebug,
}))

	const oomScore = -500
	if out, err := choom(oomScore); err != nil {
		logger.Error(
			"Error changing OOM score, assuming this is not a production environment and continuing anyway",
			"err", err,
			"choomOutput", string(out),
		)
	} else {
		logger.Info("Main process OOM score adjusted successfully", "oomScore", oomScore)
	}

mux := http.NewServeMux()

registry := prometheus.NewRegistry()
@@ -54,5 +67,13 @@ func main() {
err = http.ListenAndServe(address, mux)
if err != nil {
logger.Error("Setting up HTTP listener", "err", err)
os.Exit(1)
return
Comment on lines +70 to +71
Member Author:

This is unrelated to this PR, just couldn't resist.

}
}

// choom runs the choom helper (source included in this repo) to lower the current process OOM score.
func choom(score int) ([]byte, error) {
	choom := exec.Command("choom", strconv.Itoa(os.Getpid()), strconv.Itoa(score))
	return choom.CombinedOutput()
}