feat: add inbound image vision for Telegram photos #140
Open
billyshipp wants to merge 1 commit into
Downloads photos sent via Telegram, resizes them to 1024px max with sharp, saves them as JPEG attachments, and passes them as multimodal content blocks so Claude can see and understand image content.

Security: path confinement added to the container agent-runner (all image paths validated to stay within /workspace/group/attachments/), and media type allowlisted before passing to the Claude API.

Outbound sendImage intentionally excluded; this is inbound vision only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
What
Adds image vision for Telegram — photos sent to the bot are downloaded, resized, and passed to Claude as multimodal content blocks so the agent can see and understand image content.
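For reviewers, here is a minimal sketch of the final step, assuming messages carry references of the form [Image: attachments/img-{ts}-{rand}.jpg] as described under "How it works". The regex, the `ImageRef` and `ImageBlock` types, and `toImageBlock()` are hypothetical illustrations; the actual parseImageReferences() in this PR may differ. The block shape follows the base64 image content block used by the Claude Messages API.

```typescript
// Hypothetical sketch, not the PR's actual implementation.

interface ImageRef {
  relativePath: string; // e.g. "attachments/img-1700000000-ab12.jpg"
}

interface ImageBlock {
  type: "image";
  source: { type: "base64"; media_type: "image/jpeg"; data: string };
}

// Extracts "[Image: attachments/<name>.jpg]" references from message text.
// The allowed filename characters are an assumption made for this sketch.
function parseImageReferences(text: string): ImageRef[] {
  const pattern = /\[Image: (attachments\/[A-Za-z0-9._-]+\.jpg)\]/g;
  const refs: ImageRef[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    refs.push({ relativePath: match[1] });
  }
  return refs;
}

// Wraps already base64-encoded JPEG data in an image content block.
function toImageBlock(base64Jpeg: string): ImageBlock {
  return {
    type: "image",
    source: { type: "base64", media_type: "image/jpeg", data: base64Jpeg },
  };
}
```

In this sketch, references that do not start with attachments/ are simply not matched, so they never reach the file-reading step.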
Why
PR #9 covers this but includes outbound sendImage with path traversal vulnerabilities (absolute paths from container agents passed verbatim to host sendImage, enabling host filesystem exfiltration). This PR delivers inbound vision only, with security fixes applied.
How it works
1. message:photo handler downloads the largest available photo from Telegram's file API
2. sharp resizes to max 1024px, converts to JPEG at quality 85, saves to groups/{folder}/attachments/
3. The saved image is referenced in the message as [Image: attachments/img-{ts}-{rand}.jpg]
4. parseImageReferences() extracts refs from messages before each agent run
Security vs PR #9
- path.resolve() + prefix check ensures image paths stay within /workspace/group/attachments/ before readFileSync (fixes audit Finding 3)
- Only image/jpeg, image/png, image/gif, and image/webp are accepted before passing to the Claude API (fixes Finding 6)
- file_path validated to not contain :// or .. before URL construction (fixes Finding 5)
How it was tested
Tested on a live NanoClaw + Telegram instance. Sent a photo in a registered Telegram group — agent correctly described the image content.
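To make the three checks listed under "Security vs PR #9" easier to review, here is a rough sketch of what they could look like. The function names, the `GROUP_ROOT` constant, and resolving references relative to /workspace/group are assumptions made for illustration; the code in this PR may be structured differently.

```typescript
import * as path from "path";

// Assumed container layout; only the attachments path appears in the PR text.
const GROUP_ROOT = "/workspace/group";
const ATTACHMENTS_ROOT = GROUP_ROOT + "/attachments";

const ALLOWED_MEDIA_TYPES: readonly string[] = [
  "image/jpeg",
  "image/png",
  "image/gif",
  "image/webp",
];

// Resolves a reference and returns the absolute path only if it stays inside
// the attachments directory. The trailing "/" in the prefix check stops
// sibling directories such as "attachments-evil" from passing.
function resolveConfinedPath(ref: string): string | null {
  const resolved = path.posix.resolve(GROUP_ROOT, ref);
  return resolved.indexOf(ATTACHMENTS_ROOT + "/") === 0 ? resolved : null;
}

// Accepts only the allowlisted media types.
function isAllowedMediaType(mediaType: string): boolean {
  return ALLOWED_MEDIA_TYPES.indexOf(mediaType) !== -1;
}

// Rejects Telegram file_path values containing "://" or "..".
function isSafeTelegramFilePath(filePath: string): boolean {
  return filePath.indexOf("://") === -1 && filePath.indexOf("..") === -1;
}
```

A caller would run resolveConfinedPath() before any readFileSync call and skip the image when it returns null, so a rejected path is never read from disk.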
Type of change