Skip to content

Conversation

@malhotra5
Copy link
Collaborator

@malhotra5 malhotra5 commented Nov 6, 2025

Summary

This PR does the following

  1. makes sure tool schemas is always has the security_risk field
  2. validates security_risk field at runtime.
    • when llm security analyzer is configured + security_risk is missing -> send an error event to correct itself (good for strong models)
    • when llm security analyzer is NOT configured + security_risk is missing -> default to SecurityRisk.UNKNOWN and proceed as usually

Benefits

  • Consistent schemas
  • Runtime enforcement of security_risk provides the desired flexibility
  • Graceful error handling (we send back an error event so the model has a chance to retry, good for stronger models)
  • No compromise on traceability (we have events registering when the security analyzer is added to a conversation; this is important since every event will contain security_risk field and they can be ignored depending on whether the analyzer has been configured)

The linked issue contains the following scenarios; this PR helps address each one of them

Scenario 1:
Security analyzer exists when starting conversation
System prompt includes tools with security_risk
Security analyzer is removed mid conversation
No risk field is expected, but LLM still believes it needs to pass this field because system prompt was not updated. Passing field will crash the conversation

In this scenario, the security_risk parameter will still exist. The LLM can continue passing the field. We simply won't interfere with the control loop even though the agent is passing risk values. The agent can also forget to pass a risk parameter which will be ignored.

Scenario 2:
Security analyzer was not included at start of conversation
System prompt does not include information on passing security_risk fields
Security analyzer is added later in the conversation
security_risk is a required field, but LLM is unaware of this requirement since system prompt did not include this information

This time security_risk information in the system prompt will always exist at the start of a conversation. The LLM is always aware of it. If analyzer is not configured and the LLM forgets to pass a value, we'll default to SecurityRisk.UNKNOWN and proceed as usual. When security analyzer is added later in the conversation, we'll make sure that at runtime an error event is raised if the LLM fails to pass the risk parameter.

Scenario 3:
Similar to 1, but:
The security analyzer is removed mid conversation
The conversation is fully reloaded
ActionEvents are created in history, and they have the old security_risk field
Validation fails, and leads to an infinite loop

The schema should always contain security_risk, so loading back events regardless of whether a section of conversation had or did not have a security analyzer should work.

Fixes #819

@malhotra5 can click here to continue refining the PR


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:053baee-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-053baee-python \
  ghcr.io/openhands/agent-server:053baee-python

All tags pushed for this build

ghcr.io/openhands/agent-server:053baee-golang-amd64
ghcr.io/openhands/agent-server:053baee-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:053baee-golang-arm64
ghcr.io/openhands/agent-server:053baee-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:053baee-java-amd64
ghcr.io/openhands/agent-server:053baee-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:053baee-java-arm64
ghcr.io/openhands/agent-server:053baee-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:053baee-python-amd64
ghcr.io/openhands/agent-server:053baee-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:053baee-python-arm64
ghcr.io/openhands/agent-server:053baee-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:053baee-golang
ghcr.io/openhands/agent-server:053baee-java
ghcr.io/openhands/agent-server:053baee-python

About Multi-Architecture Support

  • Each variant tag (e.g., 053baee-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 053baee-python-amd64) are also available if needed

- Modified tool schema generation to include security_risk field when add_security_risk_prediction=True for all tool types (including read-only tools)
- Updated LLM security analyzer validation to always require security_risk field when using LLMSecurityAnalyzer
- Added comprehensive test suite for security_risk validation behavior
- Fixed existing tests to reflect new behavior where security_risk is included for read-only tools when prediction is enabled
- Updated docstrings to clarify the new behavior

Co-authored-by: openhands <[email protected]>
@github-actions
Copy link
Contributor

github-actions bot commented Nov 6, 2025

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/agent
   agent.py1635466%139, 143–144, 151–152, 154–156, 158–160, 176, 191–193, 200–202, 204, 208, 211–212, 214, 221, 247, 252, 283, 287, 292, 303, 306, 322, 328, 338–339, 359–361, 363, 375–376, 381–382, 401–402, 407, 419–420, 425–426, 458, 465–466, 485
openhands-sdk/openhands/sdk/event
   security_analyzer.py30583%56, 63–64, 69, 81
TOTAL11813544053% 

malhotra5 and others added 26 commits November 6, 2025 10:53
…ation

- Created new SecurityAnalyzerConfigurationEvent class extending Event
- Added event type to EventType literal and exports
- Modified AgentBase.init_state to always emit SecurityAnalyzerConfigurationEvent
- Added comprehensive tests for event creation and emission
- Event tracks analyzer type (string name or None) and includes visualization methods

Co-authored-by: openhands <[email protected]>
- Test all 5 required scenarios with parameterized tests
- Case 1: LLM analyzer set, security risk passed, extracted properly
- Case 2: analyzer not set, security risk passed, extracted properly
- Case 3: LLM analyzer set, security risk not passed, ValueError raised
- Case 4: analyzer not set, security risk not passed, UNKNOWN returned
- Case 5: invalid security risk value passed, ValueError raised
- Include additional tests for error messages and argument mutation
- Follow existing test patterns and code style guidelines

Co-authored-by: openhands <[email protected]>
…ehavior in conversations

- Test new conversation initialization creates SystemPromptEvent and SecurityAnalyzerConfigurationEvent
- Test reinitializing with same analyzer type creates new events (different instances)
- Test reinitializing with same agent instance still creates new events
- Test switching between different analyzers creates appropriate events
- Test switching from no analyzer to analyzer creates events
- Test multiple reinitializations create correct event sequences
- Test event properties and methods validation
- Use parameterized tests and fixtures following existing patterns
- All 8 tests passing with proper edge case coverage

Co-authored-by: openhands <[email protected]>
- Updated test_conversation_event_id_validation to expect duplicate event at index 2 instead of 1
- Updated test_pause_basic_functionality to expect 2 events instead of 1
- Both tests now account for the new SecurityAnalyzerConfigurationEvent being added during conversation initialization

Co-authored-by: openhands <[email protected]>
…gurationEvent

- Updated expected event count to account for additional SecurityAnalyzerConfigurationEvent
- When conversation is loaded from persistence, agent initialization adds one more event
- Changed assertion from original_event_count to original_event_count + 1

Co-authored-by: openhands <[email protected]>
- Added __eq__ method to SecurityAnalyzerConfigurationEvent to compare only analyzer_type
- This prevents duplicate events when the same analyzer configuration is used
- Updated test expectations to account for SecurityAnalyzerConfigurationEvent being added during initialization
- Fixed test_conversation_event_id_validation to expect duplicate at index 2 (after SystemPromptEvent and SecurityAnalyzerConfigurationEvent)
- Fixed test_conversation_persistence_lifecycle to expect same event count when loading from persistence

Co-authored-by: openhands <[email protected]>
- Fix _extract_security_risk method to always pop security_risk from arguments before checking readOnlyHint
- Update test calls to include the new readOnlyHint parameter (3rd argument)
- Add comprehensive test for readOnlyHint=True scenario
- Ensure security_risk is properly removed from arguments in all cases

Co-authored-by: openhands <[email protected]>
@malhotra5 malhotra5 force-pushed the refactor/always-include-security-risk-fields branch from 3a8f6bb to 4d58824 Compare November 7, 2025 16:09
@malhotra5 malhotra5 marked this pull request as ready for review November 7, 2025 16:32
@neubig neubig requested a review from xingyaoww November 7, 2025 16:37
len(security_analyzer_configuration_events) == 0
or security_analyzer_event != security_analyzer_configuration_events[-1]
):
on_event(security_analyzer_event)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me, this security analyzer event solely act as some sort of flag?

Can we add a new field to ConversationState to keep this information instead of making it an event?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with this PR we always store security_risk param regardless of the security analyzer. So without knowing which analyzer was active at the time of the action being generated, a stored security_risk value is ambiguous: you can’t tell if it was enforced, advisory, or ignored.

SecurityAnalyzerConfigurationEvent provides a timestamped, minimal audit trail of the analyzer type in effect when the action was proposed. This is especially important as we don't record user confirmation events, they are implicitly assumed if the action was executed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we keep a variable inside State that record the latest timestamp and the security analyzer type instead of emitting an event?

I'm hesitant about throwing this into events as this make the logic here very hard to parse

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok sounds good! i figured it would be easier if all the information was in one place rather than having to cross reference

will make the change

Copy link
Collaborator Author

@malhotra5 malhotra5 Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one reason I can think of adding the information to agent event history, instead of the conversation state, is that the security_risk is a default specific to the Agent class. Other folks who build their own agent implementations will use the same conversation state but may not have a security_risk requirement the way we do. So for specificity is makes sense to isolate that customization to the history for the default agent, rather than the shared state object

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo we already have confirmation_policy in the state and it kinda make sense to have SecurityAnalyzer related information stored there as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah that makes sense; in that case how about moving the security_analyzer out of the agent class as well?

mid conversation analyzer switches are also tough to do because the agent is immutable. if its in the state we can easily update it - wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep! i think that's a more reasonable thing to do (although this means we are making breaking change 1 day after cutting the v1 release haha 🤣 )

@malhotra5 malhotra5 marked this pull request as draft November 7, 2025 19:07
openhands-agent and others added 4 commits November 7, 2025 19:09
- Add security_analyzer field to ConversationState class following confirmation_policy pattern
- Remove security_analyzer field from Agent base class
- Add backwards compatibility handling via custom model_validate method
- Update Agent class to use conversation.state.security_analyzer instead of self.security_analyzer
- Restore system_message as property for backwards compatibility, add get_system_message method
- Update all Agent class methods to pass security_analyzer parameter where needed
- Update is_confirmation_mode_active property to use state.security_analyzer
- Add security_analyzer property to ConversationStateProtocol
- Create comprehensive backwards compatibility tests
- Update existing test fixtures to work with new architecture
- All 1141 SDK tests passing, all pre-commit checks passing

Co-authored-by: openhands <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Comprehensive overview for difficulties with security analyzer

4 participants