zipfile: add a structural validation feature #136891

Draft: wants to merge 1 commit into base: main

gpshead (Member) commented Jul 20, 2025

[DRAFT] I had Claude Sonnet 4 examine the zipfile module with an eye for what validation we were not doing and implement structural validation.

Status / Trustworthiness: This needs further review, both to see whether it is sufficient and to understand what is missing. It also needs to be run over a corpus of actual zip-ish format files to see what surprises pop up.

Is this the right API? Unknown / undecided. It is the type of thing I was thinking of though: opt-in via an API flag. The "strict" CRC validation option is probably the least important. The "structural" validation is more interesting.

I would remove the Enum in favor of module-level constants. If the CRC validation is kept, I might suggest these be combinable flags, rather than CRC checking only being available as a stricter mode on top of structural validation. People who want CRCs checked should be able to do that regardless. CRCs are rather poor in this day and age; outside of old file formats, most people validate data using a secure hash.

Quick-pass first review thoughts: Some of this is the right theme. Some bits are silly or not enough. Good start. Wouldn't ship this as is. Likely to just close this PR; I put it up for purposes of sharing.

The kinds of issues Claude wanted to fix are common zip format footguns. It saved an analysis and plan first; I could share those as GH comments.
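This description does not list the individual checks, so the following is only an assumed illustration of what a "structural" check could look like, not the PR's implementation. It uses the public `ZipFile.infolist()` API to flag two well-known hazards: several central directory entries with the same member name, and several entries pointing at the same local header offset. The function name `structural_findings` and the helper `build_archive` are hypothetical.

```python
import io
import warnings
import zipfile
from collections import Counter


def structural_findings(file):
    """Return a list of human-readable structural problems (empty if none found)."""
    with zipfile.ZipFile(file) as zf:
        infos = zf.infolist()

    findings = []

    # Ambiguity hazard: different tools may pick different entries for one name.
    name_counts = Counter(info.filename for info in infos)
    for name, count in sorted(name_counts.items()):
        if count > 1:
            findings.append(f"duplicate member name: {name!r} ({count} entries)")

    # Aliasing hazard: several entries reuse one local header and its data.
    offset_counts = Counter(info.header_offset for info in infos)
    for offset, count in sorted(offset_counts.items()):
        if count > 1:
            findings.append(f"{count} entries share local header offset {offset}")

    return findings


def build_archive(names):
    """Build an in-memory archive with one small member per name (demo helper)."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns about duplicate names on write but still writes them.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w") as zf:
            for name in names:
                zf.writestr(name, b"data")
    buf.seek(0)
    return buf


clean = structural_findings(build_archive(["a.txt", "b.txt"]))
suspect = structural_findings(build_archive(["a.txt", "a.txt"]))
```

Returning findings instead of raising leaves the policy decision (warn, reject, or ignore) to the caller, which fits an opt-in design.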

Author: me prompting Claude Sonnet 4


📚 Documentation preview 📚: https://cpython-previews--136891.org.readthedocs.build/
