📝 Walkthrough
This PR refactors the PDF compression functionality into a staged pipeline configured through environment variables, with stricter validation, added instrumentation, and new configuration options. It updates the worker's tool result structure to propagate compression metadata, adds extensive test coverage, documents the new pipeline, and bumps a web dependency.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed (1 warning)
Compare: acdaef6 to 95be773
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/worker/zenpdf_worker/tools.py (1)
923-1018: ⚠️ Potential issue | 🟠 Major
Recompute status after final output bytes are known.
Status and the “no smaller output” warning are decided before later post-processing can replace the output. If a later pass reduces size beyond the thresholds, tool_result can still report no_change. Recalculate status (and the warning) after output_bytes is computed.
🛠️ Suggested fix
-    if method == "original":
-        method = "passthrough"
-        warnings.append("No smaller output found; preserving original content.")
+    if method == "original":
+        method = "passthrough"
@@
-    savings_percent = round((savings_bytes / size_bytes) * 100, 2) if size_bytes else 0.0
+    savings_ratio = (savings_bytes / size_bytes) if size_bytes else 0.0
+    status = (
+        "success"
+        if (savings_bytes >= min_savings_bytes and savings_ratio >= savings_threshold)
+        else "no_change"
+    )
+    if status == "no_change":
+        warnings.append("No smaller output found; preserving original content.")
+    savings_percent = round(savings_ratio * 100, 2) if size_bytes else 0.0
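The idea behind this comment can be shown in isolation. The following is a minimal sketch, not the code in tools.py: the function name `classify_outcome` and the `CompressionOutcome` container are hypothetical, and the default thresholds are taken from the .env.example values quoted in this review (0.08 and 200000). It only illustrates deciding status and the warning once, from the final output size.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CompressionOutcome:
    """Hypothetical result container; the real tool_result shape may differ."""

    status: str
    savings_bytes: int
    savings_percent: float
    warnings: list[str] = field(default_factory=list)


def classify_outcome(
    size_bytes: int,
    output_bytes: int,
    min_savings_bytes: int = 200_000,  # assumed from ZENPDF_COMPRESS_MIN_SAVINGS_BYTES
    savings_threshold: float = 0.08,  # assumed from ZENPDF_COMPRESS_SAVINGS_THRESHOLD_PCT
) -> CompressionOutcome:
    """Decide status and warnings only after the final output size is known."""
    savings_bytes = size_bytes - output_bytes
    savings_ratio = (savings_bytes / size_bytes) if size_bytes else 0.0

    # Both the absolute and the relative threshold must be met.
    meets_thresholds = (
        savings_bytes >= min_savings_bytes and savings_ratio >= savings_threshold
    )
    status = "success" if meets_thresholds else "no_change"

    warnings: list[str] = []
    if status == "no_change":
        warnings.append("No smaller output found; preserving original content.")

    return CompressionOutcome(
        status=status,
        savings_bytes=savings_bytes,
        savings_percent=round(savings_ratio * 100, 2),
        warnings=warnings,
    )
```

Calling this after every post-processing pass has finished, with the byte count of whatever file is actually returned, keeps the reported status consistent with the delivered output.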
🤖 Fix all issues with AI agents
In `@apps/worker/.env.example`:
- Around line 28-48: The .env.example contains duplicated compression
environment variables (e.g., ZENPDF_COMPRESS_PROFILE,
ZENPDF_COMPRESS_AUTO_IMAGE_HEAVY, ZENPDF_COMPRESS_USE_ZOPFLI,
ZENPDF_COMPRESS_GS_PASSTHROUGH_JPEG, ZENPDF_COMPRESS_SAVINGS_THRESHOLD_PCT,
ZENPDF_COMPRESS_MIN_SAVINGS_BYTES, ZENPDF_COMPRESS_TIMEOUT_*,
ZENPDF_COMPRESS_ENABLE_IMAGE_OPT, ZENPDF_QPDF_OI_*,
ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT, ZENPDF_COMPRESS_ENABLE_JBIG2,
ZENPDF_COMPRESS_PDFSIZEOPT_ARGS) causing ambiguous defaults; remove the
duplicate definitions so each variable appears only once, keeping the intended
canonical values (either the earlier block or this block) and consolidating
timeout variables (e.g., ZENPDF_COMPRESS_TIMEOUT_SECONDS vs timeout
base/per-page/max) into a single clear set of keys, ensuring comments explain
defaults where appropriate.
# Compression tuning (defaults)
ZENPDF_COMPRESS_PROFILE=balanced
ZENPDF_COMPRESS_AUTO_IMAGE_HEAVY=1
ZENPDF_COMPRESS_USE_ZOPFLI=0
ZENPDF_COMPRESS_GS_PASSTHROUGH_JPEG=0
ZENPDF_COMPRESS_SAVINGS_THRESHOLD_PCT=0.08
ZENPDF_COMPRESS_MIN_SAVINGS_BYTES=200000
ZENPDF_COMPRESS_TIMEOUT_BASE_SECONDS=120
ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS=3
ZENPDF_COMPRESS_TIMEOUT_PER_PAGE_SECONDS=1.5
ZENPDF_COMPRESS_TIMEOUT_MAX_SECONDS=900
# Leave empty to use calculated timeout; set to a positive integer to override.
ZENPDF_COMPRESS_TIMEOUT_SECONDS=
ZENPDF_COMPRESS_ENABLE_IMAGE_OPT=0
ZENPDF_QPDF_OI_QUALITY=75
ZENPDF_QPDF_OI_MIN_WIDTH=128
ZENPDF_QPDF_OI_MIN_HEIGHT=128
ZENPDF_QPDF_OI_MIN_AREA=16384
ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT=0
ZENPDF_COMPRESS_ENABLE_JBIG2=0
ZENPDF_COMPRESS_PDFSIZEOPT_ARGS=
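For readers wondering how the timeout keys above might fit together, here is a rough sketch. The PR text only states that an empty ZENPDF_COMPRESS_TIMEOUT_SECONDS means "use calculated timeout" and that a positive integer overrides it; the additive formula (base plus per-MB plus per-page, capped at the max) and the helper names `_env_float` and `resolve_timeout_seconds` are assumptions for illustration, not the worker's confirmed logic.

```python
import os
from typing import Mapping, Optional


def _env_float(env: Mapping[str, str], key: str, default: float) -> float:
    """Read a float from the environment, falling back when unset or empty."""
    raw = env.get(key, "").strip()
    return float(raw) if raw else default


def resolve_timeout_seconds(
    size_bytes: int,
    page_count: int,
    env: Optional[Mapping[str, str]] = None,
) -> float:
    """Return the explicit override if set, otherwise a computed, capped timeout."""
    env = os.environ if env is None else env

    # A positive integer override wins; empty or invalid values fall through.
    override = env.get("ZENPDF_COMPRESS_TIMEOUT_SECONDS", "").strip()
    if override:
        try:
            value = int(override)
        except ValueError:
            value = 0
        if value > 0:
            return float(value)

    base = _env_float(env, "ZENPDF_COMPRESS_TIMEOUT_BASE_SECONDS", 120.0)
    per_mb = _env_float(env, "ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS", 3.0)
    per_page = _env_float(env, "ZENPDF_COMPRESS_TIMEOUT_PER_PAGE_SECONDS", 1.5)
    cap = _env_float(env, "ZENPDF_COMPRESS_TIMEOUT_MAX_SECONDS", 900.0)

    # Assumed formula: scale with file size and page count, then cap.
    size_mb = size_bytes / (1024 * 1024)
    computed = base + per_mb * size_mb + per_page * page_count
    return min(computed, cap)
```

Under these assumptions, a 10 MiB, 20-page file with the defaults resolves to 180 seconds, and any sufficiently large input is capped at 900.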
Remove duplicated compression keys to avoid ambiguous defaults.
This block redefines keys already set above, which can lead to confusion about which value is used. Please consolidate to a single definition per key (either move the earlier entries into this block or trim duplicates here).
🛠️ Suggested cleanup (trim duplicates in the new block)
ZENPDF_COMPRESS_SAVINGS_THRESHOLD_PCT=0.08
ZENPDF_COMPRESS_MIN_SAVINGS_BYTES=200000
-ZENPDF_COMPRESS_TIMEOUT_BASE_SECONDS=120
-ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS=3
-ZENPDF_COMPRESS_TIMEOUT_PER_PAGE_SECONDS=1.5
-ZENPDF_COMPRESS_TIMEOUT_MAX_SECONDS=900
# Leave empty to use calculated timeout; set to a positive integer to override.
ZENPDF_COMPRESS_TIMEOUT_SECONDS=
-ZENPDF_COMPRESS_ENABLE_IMAGE_OPT=0
-ZENPDF_QPDF_OI_QUALITY=75
-ZENPDF_QPDF_OI_MIN_WIDTH=128
-ZENPDF_QPDF_OI_MIN_HEIGHT=128
-ZENPDF_QPDF_OI_MIN_AREA=16384
-ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT=0
-ZENPDF_COMPRESS_ENABLE_JBIG2=0
-ZENPDF_COMPRESS_PDFSIZEOPT_ARGS=
🧰 Tools
🪛 dotenv-linter (4.0.0)
[warning] 30-30: [UnorderedKey] The ZENPDF_COMPRESS_AUTO_IMAGE_HEAVY key should go before the ZENPDF_COMPRESS_PROFILE key
(UnorderedKey)
[warning] 32-32: [UnorderedKey] The ZENPDF_COMPRESS_GS_PASSTHROUGH_JPEG key should go before the ZENPDF_COMPRESS_PROFILE key
(UnorderedKey)
[warning] 33-33: [UnorderedKey] The ZENPDF_COMPRESS_SAVINGS_THRESHOLD_PCT key should go before the ZENPDF_COMPRESS_USE_ZOPFLI key
(UnorderedKey)
[warning] 34-34: [UnorderedKey] The ZENPDF_COMPRESS_MIN_SAVINGS_BYTES key should go before the ZENPDF_COMPRESS_PROFILE key
(UnorderedKey)
[warning] 35-35: [DuplicatedKey] The ZENPDF_COMPRESS_TIMEOUT_BASE_SECONDS key is duplicated
(DuplicatedKey)
[warning] 35-35: [UnorderedKey] The ZENPDF_COMPRESS_TIMEOUT_BASE_SECONDS key should go before the ZENPDF_COMPRESS_USE_ZOPFLI key
(UnorderedKey)
[warning] 36-36: [DuplicatedKey] The ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS key is duplicated
(DuplicatedKey)
[warning] 36-36: [UnorderedKey] The ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS key should go before the ZENPDF_COMPRESS_USE_ZOPFLI key
(UnorderedKey)
[warning] 37-37: [DuplicatedKey] The ZENPDF_COMPRESS_TIMEOUT_PER_PAGE_SECONDS key is duplicated
(DuplicatedKey)
[warning] 37-37: [UnorderedKey] The ZENPDF_COMPRESS_TIMEOUT_PER_PAGE_SECONDS key should go before the ZENPDF_COMPRESS_USE_ZOPFLI key
(UnorderedKey)
[warning] 38-38: [DuplicatedKey] The ZENPDF_COMPRESS_TIMEOUT_MAX_SECONDS key is duplicated
(DuplicatedKey)
[warning] 38-38: [UnorderedKey] The ZENPDF_COMPRESS_TIMEOUT_MAX_SECONDS key should go before the ZENPDF_COMPRESS_TIMEOUT_PER_MB_SECONDS key
(UnorderedKey)
[warning] 40-40: [UnorderedKey] The ZENPDF_COMPRESS_TIMEOUT_SECONDS key should go before the ZENPDF_COMPRESS_USE_ZOPFLI key
(UnorderedKey)
[warning] 41-41: [DuplicatedKey] The ZENPDF_COMPRESS_ENABLE_IMAGE_OPT key is duplicated
(DuplicatedKey)
[warning] 41-41: [UnorderedKey] The ZENPDF_COMPRESS_ENABLE_IMAGE_OPT key should go before the ZENPDF_COMPRESS_GS_PASSTHROUGH_JPEG key
(UnorderedKey)
[warning] 42-42: [DuplicatedKey] The ZENPDF_QPDF_OI_QUALITY key is duplicated
(DuplicatedKey)
[warning] 43-43: [DuplicatedKey] The ZENPDF_QPDF_OI_MIN_WIDTH key is duplicated
(DuplicatedKey)
[warning] 43-43: [UnorderedKey] The ZENPDF_QPDF_OI_MIN_WIDTH key should go before the ZENPDF_QPDF_OI_QUALITY key
(UnorderedKey)
[warning] 44-44: [DuplicatedKey] The ZENPDF_QPDF_OI_MIN_HEIGHT key is duplicated
(DuplicatedKey)
[warning] 44-44: [UnorderedKey] The ZENPDF_QPDF_OI_MIN_HEIGHT key should go before the ZENPDF_QPDF_OI_MIN_WIDTH key
(UnorderedKey)
[warning] 45-45: [DuplicatedKey] The ZENPDF_QPDF_OI_MIN_AREA key is duplicated
(DuplicatedKey)
[warning] 45-45: [UnorderedKey] The ZENPDF_QPDF_OI_MIN_AREA key should go before the ZENPDF_QPDF_OI_MIN_HEIGHT key
(UnorderedKey)
[warning] 46-46: [DuplicatedKey] The ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT key is duplicated
(DuplicatedKey)
[warning] 46-46: [UnorderedKey] The ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT key should go before the ZENPDF_COMPRESS_GS_PASSTHROUGH_JPEG key
(UnorderedKey)
[warning] 47-47: [DuplicatedKey] The ZENPDF_COMPRESS_ENABLE_JBIG2 key is duplicated
(DuplicatedKey)
[warning] 47-47: [UnorderedKey] The ZENPDF_COMPRESS_ENABLE_JBIG2 key should go before the ZENPDF_COMPRESS_ENABLE_PDFSIZEOPT key
(UnorderedKey)
[warning] 48-48: [DuplicatedKey] The ZENPDF_COMPRESS_PDFSIZEOPT_ARGS key is duplicated
(DuplicatedKey)
[warning] 48-48: [UnorderedKey] The ZENPDF_COMPRESS_PDFSIZEOPT_ARGS key should go before the ZENPDF_COMPRESS_PROFILE key
(UnorderedKey)
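The DuplicatedKey warnings above can also be reproduced locally with a few lines of Python. This is a simplified stand-in, not how dotenv-linter itself is implemented; the function name `find_duplicate_env_keys` is hypothetical, and the parser only handles plain `KEY=value` lines, comments, and an optional `export ` prefix.

```python
from __future__ import annotations

from collections import defaultdict


def find_duplicate_env_keys(text: str) -> dict[str, list[int]]:
    """Map each key defined more than once to the 1-based lines where it appears."""
    seen: defaultdict[str, list[int]] = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key = stripped.split("=", 1)[0].strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        seen[key].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


if __name__ == "__main__":
    # The path is taken from the review; adjust it to your checkout.
    with open("apps/worker/.env.example", encoding="utf-8") as handle:
        for key, lines in find_duplicate_env_keys(handle.read()).items():
            print(f"{key} is defined on lines {lines}")
```

Running a check like this in CI would catch a reintroduced duplicate before it reaches review.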