feat: add multi modal tool result support for mcp tool#703
feat: add multi modal tool result support for mcp tool#703mehtarac wants to merge 1 commit intostrands-agents:mainfrom
Conversation
|
|
||
| return record.type === 'text' && typeof record.text === 'string' | ||
| private _isImageFormat(format: string): format is ImageFormat { | ||
| return ['png', 'jpg', 'jpeg', 'gif', 'webp'].includes(format) |
There was a problem hiding this comment.
Issue: This duplicates the ImageFormat type definition from src/mime.ts. If ImageFormat is extended in the future, this list would need to be updated separately, creating a maintenance burden.
Suggestion: Consider one of these approaches:
- Export a constant array from
mime.tsthat can be used for both the type and runtime checks:
// In mime.ts
export const IMAGE_FORMATS = ['png', 'jpg', 'jpeg', 'gif', 'webp'] as const
export type ImageFormat = typeof IMAGE_FORMATS[number]- Or check if the format string satisfies the type using the existing
toMediaFormatreturn:
private _isImageFormat(format: MediaFormat): format is ImageFormat {
return format === 'png' || format === 'jpg' || format === 'jpeg' || format === 'gif' || format === 'webp'
}The current implementation works but couples this file to an implicit understanding of what ImageFormat contains.
| const mimeType = record.mimeType | ||
|
|
||
| if (typeof data !== 'string' || typeof mimeType !== 'string') { | ||
| logger.warn('mcp image content missing data or mimeType, falling back to json') |
There was a problem hiding this comment.
Suggestion: The structured logging format in AGENTS.md specifies field=<value> format. This message could be consistent with line 127 by including a field prefix:
logger.warn('content_type=<image> | mcp image content missing data or mimeType, falling back to json')This is a minor nit - current message is clear enough.
|
Assessment: Comment Good fix for properly handling multi-modal content from MCP tools. The implementation correctly maps image and resource content types to SDK blocks with appropriate fallbacks. Review Categories
The test coverage is thorough and the approach aligns with how other adapters (A2A) handle similar conversions. |
Description
MCP tools can return multi-modal content — images, embedded resources, and mixed content — but McpTool only handled text content blocks,falling back to JsonBlock for everything else. This meant image data from MCP servers was passed to the model as raw JSON (base64 strings) instead of proper ImageBlock instances, preventing models from actually interpreting the visual content.
Related Issues
Documentation PR
Type of Change
Bug fix
New feature
Breaking change
Documentation update
Other (please describe):
Testing
How have you tested the change?
npm run checkChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.