Multiple hashes per release without comments makes --generate-hashes outputs hard to audit. #2155

Open
deeglaze opened this issue Jan 9, 2025 · 1 comment

deeglaze commented Jan 9, 2025

What's the problem this feature will solve?

When I run pip-compile --generate-hashes and propose its output as an update to a repository's dependencies, that output needs to be reviewed. Sorting the --hash strings alphabetically makes the output deterministic, but not auditable. If I say "here are greenlet-3.1.1's hashes" and there are 30-some digests, a meticulous reviewer must either double-check each digest or object that there are too many: we only need to support platforms x, y, z, so drop the unnecessary trust in the other digests.

Describe the solution you'd like

Line-continuation syntax makes it hard to attach a comment to each hash "line" giving the file name of the release archive it corresponds to. I'd therefore ask that --generate-hashes emit its hashes in the order the files appear on the release page: the source tarball first, then each binary distribution in displayed order.

I'm recommending software supply chain integrity improvements to the EDK2 firmware toolkit project, since it uses Python in its build system tooling and a supply chain attack can lead to firmware rootkits. Given the sensitive nature of firmware and this project's importance to many hardware manufacturers, reviewers set a high bar of scrutiny. Being asked to review the ~450-hash pip-compile output for https://github.com/tianocore/edk2/blob/master/pip-requirements.txt is troublesome for these experts.

Alternative Solutions

  • If package index release orderings are inconsistent, then generate a comment that includes the archive names that correspond to the hashes in the order they appear. I haven't experimented with index ordering as an invariant, so this may be a necessary fallback. The downside is even more verbose requirements files.

  • Amend the pip requirements grammar to allow for in-place comments, like ;#(<comment>) so the filenames can be next to their digests. This is more intrusive since it requires every pip installation to support the new syntax.

  • Avoid pip-compile altogether and cherry-pick each hash I think is useful? But there are too many industry partners whose development environments may or may not require a platform that has pre-built binaries.
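
To make the first alternative concrete, here is a rough sketch of what a generated comment could look like. It is not how pip-compile works today. The function names `fetch_release_files` and `render_annotated_requirement` are hypothetical, and the sketch assumes file metadata shaped like the `urls` entries of PyPI's JSON API (`filename` plus `digests.sha256`). The filenames go in a comment block above the requirement, since line continuations make per-hash comments awkward, and the hashes are emitted in whatever order the input list has. Whether that order matches the release page is exactly the open question.

```python
import json
from urllib.request import urlopen


def fetch_release_files(project: str, version: str) -> list:
    """Fetch per-file metadata for one release from PyPI's JSON API.

    Needs network access. The order of the returned list is whatever the
    index reports; treating it as stable is an assumption, not a guarantee.
    """
    url = f"https://pypi.org/pypi/{project}/{version}/json"
    with urlopen(url) as response:
        return json.load(response)["urls"]


def render_annotated_requirement(project: str, version: str, files: list) -> str:
    """Render a pinned requirement with a comment listing archive names.

    `files` is a list of dicts with "filename" and "digests" -> "sha256".
    The numbered comment lines and the --hash lines share the same order,
    so a reviewer can match the Nth filename to the Nth digest.
    """
    if not files:
        raise ValueError("no release files to render")

    lines = [f"# {project}=={version} archives, in the same order as the hashes:"]
    for index, entry in enumerate(files, start=1):
        lines.append(f"#   {index}. {entry['filename']}")

    lines.append(f"{project}=={version} \\")
    hash_lines = [
        f"    --hash=sha256:{entry['digests']['sha256']}" for entry in files
    ]
    # Every hash line except the last ends with a continuation backslash.
    lines.append(" \\\n".join(hash_lines))
    return "\n".join(lines)


# Example with made-up metadata (no network needed):
example_files = [
    {"filename": "demo-1.0.tar.gz", "digests": {"sha256": "a" * 64}},
    {"filename": "demo-1.0-py3-none-any.whl", "digests": {"sha256": "b" * 64}},
]
example_output = render_annotated_requirement("demo", "1.0", example_files)
```

With real data, `render_annotated_requirement("greenlet", "3.1.1", fetch_release_files("greenlet", "3.1.1"))` would produce the same shape for the greenlet release mentioned above. A reviewer who only cares about certain platforms could filter `files` by filename before rendering.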

@webknjaz
Member

I don't think that dependency resolvers are guaranteed to “see” the dists in a specific order, especially if they're available across multiple indexes.
