Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go/internal/test: test cache id does not include umask #69508

Open
twpayne opened this issue Sep 17, 2024 · 6 comments
Open

cmd/go/internal/test: test cache id does not include umask #69508

twpayne opened this issue Sep 17, 2024 · 6 comments

Comments

@twpayne
Copy link
Contributor

twpayne commented Sep 17, 2024

Go version

go version go1.23.1 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE='on'
GOARCH='arm64'
GOBIN='/Users/twp/.local/bin'
GOCACHE='/Users/twp/Library/Caches/go-build'
GOENV='/Users/twp/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/twp/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/twp'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.23.1/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.23.1/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='on'
GOTELEMETRYDIR='/Users/twp/Library/Application Support/go/telemetry'
GCCGO='gccgo'
GOARM64='v8.0'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/Users/twp/src/go.googlesource.com/go/src/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/sc/4d1t02x92z73bqkq4dvn25h40000gn/T/go-build3190885029=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I have several tests that create files and check the permissions of the created files. On UNIX systems, the permissions of the created files depend on the value of umask, which is a process-global variable managed by the OS, the same way that environment variables are process-global variables managed by the OS.

I expect to be able to change umask, run go test, and have go test re-run my tests with the new umask. However, after changing umask, go test incorrectly re-uses its cached test results:

$ go version
go version go1.23.1 darwin/arm64

$ umask 002                               # <--- set umask to 002

$ go test .
ok      tmp     0.252s                    # <--- tests pass when umask is 002

$ umask 022                               # <--- change umask to 022

$ go test .                               # <--- re-run tests
ok      tmp     (cached)                  # <--- BUG: test results re-used from cache

$ go test . -count=1                      # <--- disabling cache gives correct test result
--- FAIL: TestUmask (0.00s)
    umask_test.go:34: got 644, want 664
FAIL
FAIL    tmp     0.249s
FAIL

The example test is:

package main

import (
	"io/fs"
	"os"
	"path/filepath"
	"testing"
)

func TestUmask(t *testing.T) {
	// This tests passes when the umask is 002 (e.g. on Debian-based systems),
	// but fails when the umask is 022 (e.g. on RedHat-based systems and macOS).
	filename := filepath.Join(t.TempDir(), "file")
	if err := os.WriteFile(filename, nil, 0o664); err != nil {
		t.Fatal(err)
	}
	fileInfo, err := os.Stat(filename)
	if err != nil {
		t.Fatal(err)
	}
	// Ensure that the file's permissions are 0o666 &^ 0o002 == 0o664.
	if got, want := fileInfo.Mode().Perm(), fs.FileMode(0o664); got != want {
		t.Fatalf("got %03o, want %03o", got, want)
	}
}

Note that Go creates a test input ID that includes the values of environment variables and the mtimes and sizes of files accessed by the test. This test input ID should also include the initial umask value. I will create a CL that fixes this.

What did you see happen?

go test re-used a cached test result, when it should not have.

What did you expect to see?

go test should not re-use cached test results when the umask is changed.

@twpayne
Copy link
Contributor Author

twpayne commented Sep 17, 2024

@seankhliao
Copy link
Member

@ianlancetaylor
Copy link
Contributor

Copying in my comment from the mailing list thread:

We do not want to record all aspects of the external environment that could possibly affect the test. That would be too much. So we need some way to know what should be recorded and what should not. What should that be? Our current approach is simple: we record environment variables that the test reads and we record files opened within the module but not files opened outside of the module. Note that if a test opens some file outside of the module, we cache the test results even if that file changes.

What approach should we use for test caching that will tell us that umask should be cached?

@twpayne
Copy link
Contributor Author

twpayne commented Sep 19, 2024

Firstly, thank you Ian for your patience and open mind in discussing this!

We do not want to record all aspects of the external environment that could possibly affect the test. That would be too much.

Agreed.

So we need some way to know what should be recorded and what should not. What should that be?

Given that, as there is so much external state, it is impossible in practice for the go tooling to know what should be recorded.

As a crazy idea, how about enabling the tests themselves to say what should be recorded?

Our current approach is simple: we record environment variables that the test reads and we record files opened within the module but not files opened outside of the module.

This is fair but also a slight simplification. Currently, the go test cache only checks these files sizes and modification times.

For sure, we all agree that reading the full state that the cache depends on is prohibitively expensive, and that cache invalidation is a hard problem.

What approach should we use for test caching that will tell us that umask should be cached?

So, radical idea. What if tests could explicitly declare what they depend on?

Say I could write, for my tests that depend on umask, something like:

func TestThatDependsOnUmask(t *testing.T) {
    t.DependsOn("umask " + getUmask().String())
    // ...
}

where testing.T.DependsOn(s string) adds s to the test input IDs (i.e. modifies the test cache key).

This totally frees the Go authors from having to decide what to include or exclude from the test input ID. Instead, the decision is deferred to users, who are the people who know their test cache requirements best.

Thoughts?

@ianlancetaylor
Copy link
Contributor

It's an interesting idea but I think it would require cmd/go/internal/test/test.go to recognize the "umask" command. So DependsOn seems too flexible: it would be too easy to add a dependency that would not be recognized.

As a separate concern, more knobs is harder to use. It's already straightforward to test differing umask values within a test. And if someone wants to set umask outside of the test binary, it's easy to disable caching by passing -test.count=1. So I don't yet see the compelling argument for this approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants