fix(security): validate decompressed file size to prevent ZIP bomb attacks #960

Open
anshul23102 wants to merge 4 commits into imDarshanGK:main from anshul23102:fix/710-zipbomb-protection

Summary

This PR fixes the ZIP bomb vulnerability in the analyze_zip endpoint by validating the actual decompressed size of each file instead of trusting the spoofable size recorded in the ZIP central directory. The endpoint now rejects malicious ZIP files that declare a small size but expand to gigabytes.

Problem

The analyze_zip endpoint checked each entry's size using the value from the ZIP central directory header (info.file_size) before decompression. An attacker can forge this header to report a tiny size while the actual decompressed data runs to gigabytes, causing:

  • Out-of-memory crashes
  • Resource exhaustion / DoS
  • Service unavailability

Solution

  1. Decompression-first validation - check the actual decompressed size after archive.read(info)
  2. Per-file limit - enforce a 2 MB maximum per extracted file
  3. Cumulative limit - track the running total and enforce a 5 MB maximum across all files
  4. Early abort - reject the ZIP as soon as any file exceeds a limit
  5. Real-world safe - intended to stop gigabyte-scale expansions early, before the extracted data is analyzed

Changes

  • backend/app/routers/analyze.py:
    • Move file size check to after archive.read(info)
    • Use len(raw) (actual decompressed size) instead of info.file_size
    • Add MAX_PER_FILE_BYTES (2MB) constant
    • Check both per-file and cumulative limits
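
The change described above can be sketched roughly as follows. This is a minimal illustration, not the code in backend/app/routers/analyze.py: MAX_PER_FILE_BYTES comes from the PR, while MAX_TOTAL_BYTES, ZipLimitExceeded, and extract_with_limits are hypothetical names chosen for the example.

```python
import io
import zipfile

MAX_PER_FILE_BYTES = 2 * 1024 * 1024  # 2 MB per extracted file (constant named in the PR)
MAX_TOTAL_BYTES = 5 * 1024 * 1024     # 5 MB cumulative; hypothetical constant name


class ZipLimitExceeded(ValueError):
    """Hypothetical error raised when an archive exceeds a size limit."""


def extract_with_limits(data: bytes) -> dict[str, bytes]:
    """Read every file in the archive, enforcing limits on decompressed sizes."""
    total = 0
    files: dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            raw = archive.read(info)
            size = len(raw)  # actual decompressed size, not info.file_size
            if size > MAX_PER_FILE_BYTES:
                raise ZipLimitExceeded(
                    f"{info.filename}: {size} bytes exceeds per-file limit"
                )
            total += size
            if total > MAX_TOTAL_BYTES:
                raise ZipLimitExceeded(
                    f"archive exceeds cumulative limit of {MAX_TOTAL_BYTES} bytes"
                )
            files[info.filename] = raw
    return files
```

Because archive.read(info) returns the whole entry at once, the length check in this form runs only after that entry is in memory. Reading through archive.open(info) in fixed-size chunks is one variant that would bound memory during decompression, though the PR describes the read-then-check form.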

Security Impact

  • Prevents ZIP bomb attacks from bypassing size restrictions
  • Eliminates false sense of security from header validation
  • Protects service from resource exhaustion attacks

Testing

  • Normal ZIP files with legitimate sizes work correctly
  • ZIP files with spoofed small headers are rejected
  • Per-file limit prevents extraction of giant single files
  • Cumulative limit catches multiple moderately-sized files
  • Error messages are helpful for debugging
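
The cases above rely on archives that are small on disk but large once extracted. One way to build such a fixture in memory, assuming only the standard zipfile module, is sketched below; build_zip and describe are hypothetical helper names, not part of the PR.

```python
import io
import zipfile


def build_zip(entries: dict[str, bytes]) -> bytes:
    """Create an in-memory, deflate-compressed ZIP from name -> payload pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, payload in entries.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def describe(data: bytes) -> list[tuple[str, int, int, int]]:
    """Return (name, compressed size, declared size, actual decompressed size)."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            actual = len(archive.read(info))
            rows.append((info.filename, info.compress_size, info.file_size, actual))
    return rows


# 3 MB of zero bytes compresses to a few kilobytes, so the archive stays far
# below any upload cap while its single entry exceeds a 2 MB per-file limit.
fixture = build_zip({"zeros.bin": b"\0" * (3 * 1024 * 1024)})
rows = describe(fixture)
```

A fixture like this exercises the per-file limit; several entries of just under 2 MB each would exercise the cumulative limit in the same way.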

Related Issue

Closes #710


This contribution is part of GSSoC 2026. Please consider adding the gssoc-approved label when reviewed.

- Add _required_env() helper function to enforce mandatory configuration
- Update jwt_secret to use _required_env() instead of fallback to default
- Application now fails fast at startup if JWT_SECRET is not set
- Prevents token forgery attacks from known default secrets

Closes imDarshanGK#707
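
A fail-fast helper of the kind this commit describes might look like the sketch below. The body, the exception type, and load_jwt_secret are assumptions made for illustration; only the name _required_env and the JWT_SECRET variable come from the commit message.

```python
import os


def _required_env(name: str) -> str:
    """Return the value of a mandatory environment variable or fail loudly."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Assumption: a RuntimeError at startup; the real helper may differ.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def load_jwt_secret() -> str:
    """Hypothetical settings hook: no default secret is ever substituted."""
    return _required_env("JWT_SECRET")
```

Calling load_jwt_secret() during application startup means a missing secret stops the process instead of silently falling back to a known default.
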
- Import get_current_user dependency from security module
- Add authentication requirement to all history endpoints
- Filter history records by current_user.id to prevent cross-user access
- Ensure users can only access their own analysis history
- Save user_id when storing new history entries

Closes imDarshanGK#708
- Add TRUST_PROXY_HEADERS environment variable to control proxy header trust
- Only use X-Forwarded-For if explicitly enabled via TRUST_PROXY_HEADERS=true
- Default to disabled (uses direct connection IP) for security
- Use rightmost IP in X-Forwarded-For chain (most recent hop)
- Prevents trivial per-IP rate limit bypass from spoofed headers
- Improves rate limiting accuracy in production deployments

Closes imDarshanGK#709
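
The proxy-header logic in this commit could be sketched as below. The function names, the lowercase header key, and the plain mapping used for headers are assumptions for the example; only TRUST_PROXY_HEADERS, the disabled default, and the rightmost-IP rule come from the commit message.

```python
import os
from typing import Mapping, Optional


def trust_proxy_headers() -> bool:
    """Proxy headers are trusted only when TRUST_PROXY_HEADERS is exactly 'true'."""
    return os.environ.get("TRUST_PROXY_HEADERS", "false").strip().lower() == "true"


def client_ip(
    headers: Mapping[str, str],
    peer_ip: str,
    trust_proxy: Optional[bool] = None,
) -> str:
    """Pick the IP used as the rate-limit key.

    Assumes header names are already lowercased, as in a plain dict built by
    the caller; a framework's header object may handle case differently.
    """
    if trust_proxy is None:
        trust_proxy = trust_proxy_headers()
    if not trust_proxy:
        return peer_ip  # default: direct connection IP
    forwarded = headers.get("x-forwarded-for", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]
    # The rightmost entry is the most recent hop in the chain.
    return hops[-1] if hops else peer_ip
```

With trust disabled, a client-supplied X-Forwarded-For value has no effect on the rate-limit key, which is the bypass the commit sets out to close.
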
…tacks

- Move file size validation to occur AFTER decompression
- Use actual decompressed size (len(raw)) instead of spoofable central directory header
- Add per-file size limit (2MB) to catch individual bomb files
- Track cumulative decompressed size and enforce total limit
- Abort extraction if any limit exceeded during decompression
- Prevents ZIP bombs that report small compressed size but expand to gigabytes

Closes imDarshanGK#710
anshul23102 requested a review from imDarshanGK as a code owner on June 8, 2026, 13:01

Development

Successfully merging this pull request may close these issues.

[Security] ZIP bomb bypass: compressed size check uses uncompressed size from central directory header