You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have updated to the latest minor and patch version of Strands
I have checked the documentation and this is not expected behavior
I have searched ./issues and there are no duplicates of my issue
Strands Version
1.28.0
Python Version
3.12.8
Operating System
macOS 14.3
Installation Method
pip
Steps to Reproduce
Configure a Strands agent with Bedrock guardrails enabled (guardrail_redact_input=True, which is the default) and AgentCoreMemorySessionManager for session persistence.
Start a conversation with a legitimate message (e.g., "Suggest a metadata schema for our library").
Send a message that triggers the guardrail (e.g., sexually explicit content). The guardrail correctly blocks this and the agent returns the blockedInputMessaging text.
Send a completely legitimate follow-up message (e.g., "I'd like to have a catalog of books").
The guardrail blocks this legitimate message too, even though it contains no policy-violating content.
The conversation is now permanently stuck — every subsequent message is blocked by the guardrail.
Minimal reproduction:
fromstrandsimportAgentfromstrands.models.bedrockimportBedrockModelfrombedrock_agentcore.memory.integrations.strandsimport (
AgentCoreMemorySessionManager,
AgentCoreMemoryConfig,
)
model=BedrockModel(
model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
guardrail_id="YOUR_GUARDRAIL_ID",
guardrail_version="DRAFT",
# guardrail_redact_input=True # this is the default
)
session_manager=AgentCoreMemorySessionManager(
agentcore_memory_config=AgentCoreMemoryConfig(
memory_id="YOUR_MEMORY_ID",
actor_id="test_user",
session_id="test_session",
),
)
agent=Agent(model=model, session_manager=session_manager)
# Turn 1: legitimate message — works fineagent("Hello, help me organize my assets")
# Turn 2: offensive message — correctly blocked by guardrailagent("inappropriate content here")
# Turn 3: legitimate message — INCORRECTLY blocked# because the original unredacted offensive content from Turn 2# was persisted to AgentCore Memory and is replayed into contextagent("Actually, I'd like to organize my photos by date")
Expected Behavior
When a guardrail blocks a message and guardrail_redact_input=True (the default):
The offensive user message should be redacted both in-memory and in the persistent session store.
On subsequent turns, the conversation history loaded from AgentCore Memory should contain only the redacted version ("[User input redacted.]"), not the original offensive text.
Follow-up legitimate messages should not be blocked by the guardrail.
Actual Behavior
The redaction only happens in-memory but is never persisted to AgentCore Memory. Here's the chain of events:
When a guardrail intervenes, Agent._event_stream_handler calls:
RepositorySessionManager.redact_latest_message (line 81-93 in repository_session_manager.py) sets latest_agent_message.redact_message = redact_message and calls self.session_repository.update_message(...).
The problem: AgentCoreMemorySessionManager.update_message (line 490-511 in the bedrock-agentcore package) is effectively a no-op — it only logs at DEBUG level:
defupdate_message(self, session_id, agent_id, session_message, **kwargs):
# ...validation...logger.debug(
"Message update requested for message: %s (AgentCore Memory doesn't support updates)",
{session_message.message_id},
)
On the next turn, RepositorySessionManager.initialize loads messages from AgentCore Memory via list_messages → events_to_messages. Since the redaction was never persisted, the original unredacted offensive content is loaded back into agent.messages.
Even with guardrail_latest_message=True, the full conversation history (including the unredacted offensive text) is sent to Bedrock as context.
Without guardrail_latest_message=True (the default), the guardrail evaluates ALL messages, and the unredacted offensive message causes every subsequent turn to be blocked.
Additional Context
The update_message no-op in AgentCoreMemorySessionManager is documented with the comment "AgentCore Memory doesn't support updates", which is technically true — AgentCore Memory is an append-only event store. However, the SDK's redaction mechanism relies on update_message actually persisting the change.
The SessionMessage.to_message() method correctly returns redact_message if set, but since update_message is a no-op, the redact_message field is never stored, and when messages are reloaded from AgentCore Memory they have no redact_message set.
We confirmed this behavior with diagnostic hooks that log the full conversation history sent to the model — the original offensive text is visible in history on subsequent turns.
Possible Solution
Since AgentCore Memory is append-only and doesn't support in-place updates, the redaction strategy needs to change. Some options:
Deferred persistence: Buffer the user message in the session manager and only persist it after the model responds. If a guardrail intervenes and redacts the message, persist the redacted version instead. We implemented this as a workaround (a GuardrailSafeSessionManager wrapper) and confirmed it resolves the issue.
Pre-flight guardrail check: Before appending the user message to the session, call the Bedrock ApplyGuardrail API to check the content. If it violates the policy, persist only the redacted version.
Checks
Strands Version
1.28.0
Python Version
3.12.8
Operating System
macOS 14.3
Installation Method
pip
Steps to Reproduce
Configure a Strands agent with Bedrock guardrails enabled (
guardrail_redact_input=True, which is the default) andAgentCoreMemorySessionManagerfor session persistence.Start a conversation with a legitimate message (e.g., "Suggest a metadata schema for our library").
Send a message that triggers the guardrail (e.g., sexually explicit content). The guardrail correctly blocks this and the agent returns the
blockedInputMessagingtext.Send a completely legitimate follow-up message (e.g., "I'd like to have a catalog of books").
The guardrail blocks this legitimate message too, even though it contains no policy-violating content.
The conversation is now permanently stuck — every subsequent message is blocked by the guardrail.
Minimal reproduction:
Expected Behavior
When a guardrail blocks a message and
guardrail_redact_input=True(the default):"[User input redacted.]"), not the original offensive text.Actual Behavior
The redaction only happens in-memory but is never persisted to AgentCore Memory. Here's the chain of events:
When a guardrail intervenes,
Agent._event_stream_handlercalls:RepositorySessionManager.redact_latest_message(line 81-93 inrepository_session_manager.py) setslatest_agent_message.redact_message = redact_messageand callsself.session_repository.update_message(...).The problem:
AgentCoreMemorySessionManager.update_message(line 490-511 in the bedrock-agentcore package) is effectively a no-op — it only logs at DEBUG level:On the next turn,
RepositorySessionManager.initializeloads messages from AgentCore Memory vialist_messages→events_to_messages. Since the redaction was never persisted, the original unredacted offensive content is loaded back intoagent.messages.Even with
guardrail_latest_message=True, the full conversation history (including the unredacted offensive text) is sent to Bedrock as context.Without
guardrail_latest_message=True(the default), the guardrail evaluates ALL messages, and the unredacted offensive message causes every subsequent turn to be blocked.Additional Context
update_messageno-op inAgentCoreMemorySessionManageris documented with the comment "AgentCore Memory doesn't support updates", which is technically true — AgentCore Memory is an append-only event store. However, the SDK's redaction mechanism relies onupdate_messageactually persisting the change.SessionMessage.to_message()method correctly returnsredact_messageif set, but sinceupdate_messageis a no-op, theredact_messagefield is never stored, and when messages are reloaded from AgentCore Memory they have noredact_messageset.Possible Solution
Since AgentCore Memory is append-only and doesn't support in-place updates, the redaction strategy needs to change. Some options:
Deferred persistence: Buffer the user message in the session manager and only persist it after the model responds. If a guardrail intervenes and redacts the message, persist the redacted version instead. We implemented this as a workaround (a
GuardrailSafeSessionManagerwrapper) and confirmed it resolves the issue.Pre-flight guardrail check: Before appending the user message to the session, call the Bedrock
ApplyGuardrailAPI to check the content. If it violates the policy, persist only the redacted version.Related Issues