feat: add multi modal tool result support for mcp tool #703

Draft
mehtarac wants to merge 1 commit into strands-agents:main from mehtarac:mcp_tool_mm

Conversation

@mehtarac
Member

Description

MCP tools can return multi-modal content (images, embedded resources, and mixed content), but McpTool only handled text content blocks, falling back to JsonBlock for everything else. As a result, image data from MCP servers reached the model as raw JSON (base64 strings) instead of proper ImageBlock instances, so models could not interpret the visual content.
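
To make the intended mapping concrete, here is a minimal sketch of converting MCP content items into blocks with a JSON fallback. It is not the code in this PR: the `Block` union, `toBlock`, `isImageFormat`, and this `decodeBase64` are hypothetical stand-ins for the SDK's TextBlock, ImageBlock, JsonBlock, and utilities, and the sketch assumes image items arrive as `{ type: 'image', data, mimeType }` with `mimeType` of the form `image/<format>`.

```typescript
// Hypothetical, simplified stand-ins for the SDK's block classes.
type ImageFormat = 'png' | 'jpg' | 'jpeg' | 'gif' | 'webp'
type Block =
  | { kind: 'text'; text: string }
  | { kind: 'image'; format: ImageFormat; bytes: Uint8Array }
  | { kind: 'json'; json: unknown }

const KNOWN_FORMATS: ImageFormat[] = ['png', 'jpg', 'jpeg', 'gif', 'webp']

function isImageFormat(format: string): format is ImageFormat {
  return KNOWN_FORMATS.some((known) => known === format)
}

// Small dependency-free base64 decoder (illustrative only; the SDK has its own).
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
function decodeBase64(data: string): Uint8Array {
  const clean = data.replace(/=+$/, '')
  const out = new Uint8Array(Math.floor((clean.length * 6) / 8))
  let buffer = 0
  let bits = 0
  let index = 0
  for (let i = 0; i < clean.length; i++) {
    const value = ALPHABET.indexOf(clean.charAt(i))
    if (value < 0) throw new Error('invalid base64 character: ' + clean.charAt(i))
    buffer = (buffer << 6) | value
    bits += 6
    if (bits >= 8) {
      bits -= 8
      out[index++] = (buffer >> bits) & 0xff
    }
  }
  return out
}

// Map one MCP content item to a block; anything unrecognized stays JSON.
function toBlock(content: unknown): Block {
  if (typeof content !== 'object' || content === null) {
    return { kind: 'json', json: content }
  }
  const record = content as Record<string, unknown>
  const text = record.text
  const data = record.data
  const mimeType = record.mimeType
  if (record.type === 'text' && typeof text === 'string') {
    return { kind: 'text', text }
  }
  if (record.type === 'image' && typeof data === 'string' && typeof mimeType === 'string') {
    const format = mimeType.replace(/^image\//, '')
    if (isImageFormat(format)) {
      return { kind: 'image', format, bytes: decodeBase64(data) }
    }
  }
  return { kind: 'json', json: content }
}
```

With this shape, an unsupported MIME type or a missing `data` field degrades to the JSON block rather than throwing, which mirrors the fallback behavior described above.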

Related Issues

Documentation PR

Type of Change

Bug fix
New feature
Breaking change
Documentation update
Other (please describe):

Testing

How have you tested the change?

  • I ran npm run check

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

github-actions bot added the strands-running label (<strands-managed> Whether or not an agent is currently running) Mar 20, 2026

return record.type === 'text' && typeof record.text === 'string'
private _isImageFormat(format: string): format is ImageFormat {
return ['png', 'jpg', 'jpeg', 'gif', 'webp'].includes(format)

Issue: This duplicates the ImageFormat type definition from src/mime.ts. If ImageFormat is extended in the future, this list would need to be updated separately, creating a maintenance burden.

Suggestion: Consider one of these approaches:

  1. Export a constant array from mime.ts that can be used for both the type and runtime checks:
       // In mime.ts
       export const IMAGE_FORMATS = ['png', 'jpg', 'jpeg', 'gif', 'webp'] as const
       export type ImageFormat = typeof IMAGE_FORMATS[number]
  2. Or check if the format string satisfies the type using the existing toMediaFormat return:
       private _isImageFormat(format: MediaFormat): format is ImageFormat {
         return format === 'png' || format === 'jpg' || format === 'jpeg' || format === 'gif' || format === 'webp'
       }

The current implementation works but couples this file to an implicit understanding of what ImageFormat contains.
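
As a rough illustration of the first approach, the sketch below derives both the type and the runtime guard from one array, so the two cannot drift apart. The names `IMAGE_FORMATS` and `isImageFormat` are assumptions for illustration; the actual exports of mime.ts may differ.

```typescript
// Hypothetical contents of mime.ts: the array is the single source of truth.
const IMAGE_FORMATS = ['png', 'jpg', 'jpeg', 'gif', 'webp'] as const

// The union type is derived from the array instead of being written twice.
type ImageFormat = (typeof IMAGE_FORMATS)[number]

// Runtime guard backed by the same array. Comparing each known format to the
// input avoids the type error that includes() raises on a readonly tuple
// when it is given a plain string.
function isImageFormat(format: string): format is ImageFormat {
  return IMAGE_FORMATS.some((known) => known === format)
}

// Example of narrowing at a call site.
function describe(format: string): string {
  return isImageFormat(format) ? 'image:' + format : 'unsupported:' + format
}
```

Adding a format to `IMAGE_FORMATS` then updates the `ImageFormat` type and the guard together, which addresses the maintenance concern raised above.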

const mimeType = record.mimeType

if (typeof data !== 'string' || typeof mimeType !== 'string') {
logger.warn('mcp image content missing data or mimeType, falling back to json')

Suggestion: AGENTS.md specifies a field=<value> format for structured logging. For consistency with line 127, this message could include a field prefix:

logger.warn('content_type=<image> | mcp image content missing data or mimeType, falling back to json')

This is a minor nit - current message is clear enough.
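
If the prefix is adopted in several places, a small helper could keep the format uniform. The sketch below is hypothetical: `formatLog` is not part of the SDK, and joining multiple fields with a comma is an assumption, since only the single-field form appears in this review.

```typescript
// Hypothetical helper that renders "field=<value> | message" log lines.
// Assumption: multiple fields are joined with ", "; only the single-field
// form is confirmed by the example in this review.
function formatLog(fields: Record<string, string>, message: string): string {
  const prefix = Object.keys(fields)
    .map((name) => name + '=<' + fields[name] + '>')
    .join(', ')
  return prefix.length > 0 ? prefix + ' | ' + message : message
}

const line = formatLog(
  { content_type: 'image' },
  'mcp image content missing data or mimeType, falling back to json'
)
```

Here `line` holds the same string as the suggested message above, so it could be passed directly to `logger.warn`.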

@github-actions

Assessment: Comment

Good fix for properly handling multi-modal content from MCP tools. The implementation correctly maps image and resource content types to SDK blocks with appropriate fallbacks.

Review Categories
  • Testing: Comprehensive coverage with 8 new tests covering image content, text/blob resources, mixed content, and fallback scenarios
  • Maintainability: One suggestion regarding duplicated image format list that could drift from the canonical ImageFormat type
  • Code Quality: Clean implementation following existing patterns; good use of existing utilities (decodeBase64, toMediaFormat)

The test coverage is thorough and the approach aligns with how other adapters (A2A) handle similar conversions.

github-actions bot removed the strands-running label (<strands-managed> Whether or not an agent is currently running) Mar 20, 2026
mehtarac changed the title from "add multi modal tool result support for mcp tool" to "feat: add multi modal tool result support for mcp tool" Mar 20, 2026