Refactor: Always include risk fields #1052

malhotra5 · 2025-11-06T15:49:07Z

Summary

This PR does the following:

- Makes sure tool schemas always have the security_risk field.
- Validates the security_risk field at runtime:
  - When the LLM security analyzer is configured and security_risk is missing, an error event is sent back so the model can correct itself (good for strong models).
  - When the LLM security analyzer is NOT configured and security_risk is missing, the value defaults to SecurityRisk.UNKNOWN and execution proceeds as usual.
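
The runtime rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's code: the `SecurityRisk` enum is a local stand-in (only `UNKNOWN` is named in this PR; the other members are assumed), and `extract_security_risk` with its `analyzer_configured` flag is a hypothetical helper. The `ValueError` for the missing and invalid cases follows the test cases listed in this PR's commits; the agent is then expected to turn it into an error event.

```python
from enum import Enum


class SecurityRisk(str, Enum):
    """Local stand-in; only UNKNOWN is named in this PR, the rest are assumed."""

    UNKNOWN = "UNKNOWN"
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


def extract_security_risk(arguments: dict, analyzer_configured: bool) -> SecurityRisk:
    """Hypothetical helper: pop security_risk from tool-call arguments and validate it."""
    # Always remove the field so it never reaches the tool itself.
    raw = arguments.pop("security_risk", None)

    if raw is None:
        if analyzer_configured:
            # The caller is expected to convert this into an error event for the LLM.
            raise ValueError(
                "security_risk is required when the LLM security analyzer is configured"
            )
        # No analyzer: fall back to UNKNOWN and proceed as usual.
        return SecurityRisk.UNKNOWN

    try:
        return SecurityRisk(raw)
    except ValueError:
        raise ValueError(f"invalid security_risk value: {raw!r}") from None
```

Popping the field in every branch keeps the remaining arguments clean for the tool call, whichever path is taken.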

Benefits

- Consistent schemas
- Runtime enforcement of security_risk provides the desired flexibility
- Graceful error handling (we send back an error event so the model has a chance to retry, good for stronger models)
- No compromise on traceability (we have events registering when the security analyzer is added to a conversation; this is important since every event will contain a security_risk field, and its value may be ignored depending on whether the analyzer has been configured)

The linked issue contains the following scenarios; this PR helps address each one of them

Scenario 1:
Security analyzer exists when starting conversation
System prompt includes tools with security_risk
Security analyzer is removed mid conversation
No risk field is expected, but LLM still believes it needs to pass this field because system prompt was not updated. Passing field will crash the conversation

In this scenario, the security_risk parameter will still exist, so the LLM can continue passing the field. We simply won't interfere with the control loop even though the agent is passing risk values. If the agent forgets to pass a risk parameter, the omission is ignored.

Scenario 2:
Security analyzer was not included at start of conversation
System prompt does not include information on passing security_risk fields
Security analyzer is added later in the conversation
security_risk is a required field, but LLM is unaware of this requirement since system prompt did not include this information

This time, security_risk information will always exist in the system prompt at the start of a conversation, so the LLM is always aware of it. If the analyzer is not configured and the LLM forgets to pass a value, we'll default to SecurityRisk.UNKNOWN and proceed as usual. When a security analyzer is added later in the conversation, we'll make sure that an error event is raised at runtime if the LLM fails to pass the risk parameter.

Scenario 3:
Similar to 1, but:
The security analyzer is removed mid conversation
The conversation is fully reloaded
ActionEvents are created in history, and they have the old security_risk field
Validation fails, and leads to an infinite loop

The schema should always contain security_risk, so loading events back should work regardless of whether a given section of the conversation had a security analyzer.
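
One way to picture "the schema always contains security_risk" is a helper that adds the property to every tool's JSON-schema-style definition, read-only or not. The function name `with_security_risk`, the enum values, and the description text are assumptions for illustration; this PR page does not show the actual schema code. The sketch leaves any `required` list untouched, since whether the real schema marks the field as required is not stated here.

```python
import copy

# Assumed risk levels; only UNKNOWN is named in this PR.
RISK_LEVELS = ["UNKNOWN", "LOW", "MEDIUM", "HIGH"]


def with_security_risk(tool_schema: dict) -> dict:
    """Return a copy of a tool parameter schema that includes security_risk."""
    schema = copy.deepcopy(tool_schema)  # do not mutate the caller's schema
    properties = schema.setdefault("properties", {})
    # setdefault keeps an existing definition, so applying this twice is harmless.
    properties.setdefault(
        "security_risk",
        {
            "type": "string",
            "enum": RISK_LEVELS,
            "description": "Risk level the model assigns to this action.",
        },
    )
    return schema
```

Because the property is present in every schema, an event recorded with or without an active analyzer has the same shape when it is loaded back.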

Fixes #819

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:053baee-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-053baee-python \
  ghcr.io/openhands/agent-server:053baee-python

All tags pushed for this build

ghcr.io/openhands/agent-server:053baee-golang-amd64
ghcr.io/openhands/agent-server:053baee-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:053baee-golang-arm64
ghcr.io/openhands/agent-server:053baee-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:053baee-java-amd64
ghcr.io/openhands/agent-server:053baee-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:053baee-java-arm64
ghcr.io/openhands/agent-server:053baee-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:053baee-python-amd64
ghcr.io/openhands/agent-server:053baee-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:053baee-python-arm64
ghcr.io/openhands/agent-server:053baee-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:053baee-golang
ghcr.io/openhands/agent-server:053baee-java
ghcr.io/openhands/agent-server:053baee-python

About Multi-Architecture Support

Each variant tag (e.g., 053baee-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 053baee-python-amd64) are also available if needed

- Modified tool schema generation to include security_risk field when add_security_risk_prediction=True for all tool types (including read-only tools)
- Updated LLM security analyzer validation to always require security_risk field when using LLMSecurityAnalyzer
- Added comprehensive test suite for security_risk validation behavior
- Fixed existing tests to reflect new behavior where security_risk is included for read-only tools when prediction is enabled
- Updated docstrings to clarify the new behavior

Co-authored-by: openhands <[email protected]>

github-actions · 2025-11-06T15:52:20Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/agent
agent.py	163	54	66%	139, 143–144, 151–152, 154–156, 158–160, 176, 191–193, 200–202, 204, 208, 211–212, 214, 221, 247, 252, 283, 287, 292, 303, 306, 322, 328, 338–339, 359–361, 363, 375–376, 381–382, 401–402, 407, 419–420, 425–426, 458, 465–466, 485
openhands-sdk/openhands/sdk/event
security_analyzer.py	30	5	83%	56, 63–64, 69, 81
TOTAL	11813	5440	53%

…ation

- Created new SecurityAnalyzerConfigurationEvent class extending Event
- Added event type to EventType literal and exports
- Modified AgentBase.init_state to always emit SecurityAnalyzerConfigurationEvent
- Added comprehensive tests for event creation and emission
- Event tracks analyzer type (string name or None) and includes visualization methods

Co-authored-by: openhands <[email protected]>

- Test all 5 required scenarios with parameterized tests
- Case 1: LLM analyzer set, security risk passed, extracted properly
- Case 2: analyzer not set, security risk passed, extracted properly
- Case 3: LLM analyzer set, security risk not passed, ValueError raised
- Case 4: analyzer not set, security risk not passed, UNKNOWN returned
- Case 5: invalid security risk value passed, ValueError raised
- Include additional tests for error messages and argument mutation
- Follow existing test patterns and code style guidelines

Co-authored-by: openhands <[email protected]>

…ehavior in conversations

- Test new conversation initialization creates SystemPromptEvent and SecurityAnalyzerConfigurationEvent
- Test reinitializing with same analyzer type creates new events (different instances)
- Test reinitializing with same agent instance still creates new events
- Test switching between different analyzers creates appropriate events
- Test switching from no analyzer to analyzer creates events
- Test multiple reinitializations create correct event sequences
- Test event properties and methods validation
- Use parameterized tests and fixtures following existing patterns
- All 8 tests passing with proper edge case coverage

Co-authored-by: openhands <[email protected]>

- Updated test_conversation_event_id_validation to expect duplicate event at index 2 instead of 1
- Updated test_pause_basic_functionality to expect 2 events instead of 1
- Both tests now account for the new SecurityAnalyzerConfigurationEvent being added during conversation initialization

Co-authored-by: openhands <[email protected]>

…gurationEvent

- Updated expected event count to account for additional SecurityAnalyzerConfigurationEvent
- When conversation is loaded from persistence, agent initialization adds one more event
- Changed assertion from original_event_count to original_event_count + 1

Co-authored-by: openhands <[email protected]>

- Added __eq__ method to SecurityAnalyzerConfigurationEvent to compare only analyzer_type
- This prevents duplicate events when the same analyzer configuration is used
- Updated test expectations to account for SecurityAnalyzerConfigurationEvent being added during initialization
- Fixed test_conversation_event_id_validation to expect duplicate at index 2 (after SystemPromptEvent and SecurityAnalyzerConfigurationEvent)
- Fixed test_conversation_persistence_lifecycle to expect same event count when loading from persistence

Co-authored-by: openhands <[email protected]>
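
The de-duplication described in this commit can be sketched as below. The class is a simplified stand-in for SecurityAnalyzerConfigurationEvent (the real one extends the SDK's Event, which is not reproduced here), and `maybe_emit` is a hypothetical helper that mirrors the check in the reviewed diff: emit only when there is no previous configuration event or the latest one differs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4


@dataclass(eq=False)
class SecurityAnalyzerConfigurationEvent:
    """Simplified stand-in; id and timestamp differ per instance."""

    analyzer_type: Optional[str]
    id: str = field(default_factory=lambda: uuid4().hex)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __eq__(self, other: object) -> bool:
        # Compare only the analyzer type, ignoring id and timestamp.
        if not isinstance(other, SecurityAnalyzerConfigurationEvent):
            return NotImplemented
        return self.analyzer_type == other.analyzer_type


def maybe_emit(events: list, analyzer_type: Optional[str]) -> bool:
    """Hypothetical helper: append a configuration event only if it changed."""
    new_event = SecurityAnalyzerConfigurationEvent(analyzer_type)
    previous = [e for e in events if isinstance(e, SecurityAnalyzerConfigurationEvent)]
    if len(previous) == 0 or new_event != previous[-1]:
        events.append(new_event)
        return True
    return False
```

With this equality rule, reinitializing a conversation with the same analyzer type adds no new event, while switching analyzers (or removing one) does.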

…//github.com/OpenHands/software-agent-sdk into refactor/always-include-security-risk-fields

This reverts commit 160a6a2.

- Fix _extract_security_risk method to always pop security_risk from arguments before checking readOnlyHint
- Update test calls to include the new readOnlyHint parameter (3rd argument)
- Add comprehensive test for readOnlyHint=True scenario
- Ensure security_risk is properly removed from arguments in all cases

Co-authored-by: openhands <[email protected]>

openhands-sdk/openhands/sdk/agent/agent.py

xingyaoww · 2025-11-07T16:46:23Z

+            len(security_analyzer_configuration_events) == 0
+            or security_analyzer_event != security_analyzer_configuration_events[-1]
+        ):
+            on_event(security_analyzer_event)


It seems to me this security analyzer event acts solely as some sort of flag?

Can we add a new field to ConversationState to keep this information instead of making it an event?

So with this PR we always store security_risk param regardless of the security analyzer. So without knowing which analyzer was active at the time of the action being generated, a stored security_risk value is ambiguous: you can’t tell if it was enforced, advisory, or ignored.

SecurityAnalyzerConfigurationEvent provides a timestamped, minimal audit trail of the analyzer type in effect when the action was proposed. This is especially important as we don't record user confirmation events, they are implicitly assumed if the action was executed

How about we keep a variable inside State that records the latest timestamp and the security analyzer type instead of emitting an event?

I'm hesitant about throwing this into events as this makes the logic here very hard to parse

ok sounds good! i figured it would be easier if all the information was in one place rather than having to cross reference

will make the change

So one reason I can think of for adding the information to the agent event history, instead of the conversation state, is that security_risk is a default specific to the Agent class. Other folks who build their own agent implementations will use the same conversation state but may not have a security_risk requirement the way we do. So for specificity it makes sense to isolate that customization to the history for the default agent, rather than the shared state object

imo we already have confirmation_policy in the state and it kinda makes sense to have SecurityAnalyzer related information stored there as well?

ah that makes sense; in that case how about moving the security_analyzer out of the agent class as well?

mid-conversation analyzer switches are also tough to do because the agent is immutable. if it's in the state we can easily update it - wdyt?

yep! i think that's a more reasonable thing to do (although this means we are making a breaking change 1 day after cutting the v1 release haha 🤣 )

Co-authored-by: openhands <[email protected]>

- Add security_analyzer field to ConversationState class following confirmation_policy pattern
- Remove security_analyzer field from Agent base class
- Add backwards compatibility handling via custom model_validate method
- Update Agent class to use conversation.state.security_analyzer instead of self.security_analyzer
- Restore system_message as property for backwards compatibility, add get_system_message method
- Update all Agent class methods to pass security_analyzer parameter where needed
- Update is_confirmation_mode_active property to use state.security_analyzer
- Add security_analyzer property to ConversationStateProtocol
- Create comprehensive backwards compatibility tests
- Update existing test fixtures to work with new architecture
- All 1141 SDK tests passing, all pre-commit checks passing

Co-authored-by: openhands <[email protected]>
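
A rough sketch of the backwards-compatibility idea in this commit: when an older serialized conversation still carries security_analyzer on the agent, lift it onto the state while loading. The dictionary layout and the `migrate_legacy_state` name are assumptions for illustration; the commit describes doing this through a custom model_validate method, which is not reproduced here.

```python
import copy


def migrate_legacy_state(payload: dict) -> dict:
    """Hypothetical migration: move agent.security_analyzer onto the state."""
    data = copy.deepcopy(payload)  # leave the caller's payload untouched
    agent = data.get("agent")

    legacy = None
    if isinstance(agent, dict):
        # Remove the old field from the agent, remembering its value.
        legacy = agent.pop("security_analyzer", None)

    # Prefer a value already stored on the state; otherwise use the legacy one.
    if data.get("security_analyzer") is None:
        data["security_analyzer"] = legacy
    return data
```

Keeping the analyzer on the state rather than on the immutable agent is what makes a mid-conversation switch a plain field update, as discussed in the review thread.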

malhotra5 and others added 26 commits November 6, 2025 10:53

Update agent.py

0e5b697

send back error events

e087266

simplify risk field handling

7b5a9dd

fix comment

c5e6329

add comments

90eeb48

move event emit

42a2bfc

prevent dupe configuration events

e1e612c

rm fluff tests

951a405

write tests for event equality and serialization

35ce7db

Merge branch 'main' into refactor/always-include-security-risk-fields

625e509

fix merge conflicts

1106bdf

fix tests

e188cf3

Merge branch 'main' into refactor/always-include-security-risk-fields

480e53d

always default to adding risk prediction

160a6a2

Merge branch 'refactor/always-include-security-risk-fields' of https:…

8405598

…//github.com/OpenHands/software-agent-sdk into refactor/always-include-security-risk-fields

Revert "always default to adding risk prediction"

01646be

This reverts commit 160a6a2.

Update tool.py

248648f

simplify

2402da2

handle readonly case

4b3e817

malhotra5 force-pushed the refactor/always-include-security-risk-fields branch from 3a8f6bb to 4d58824 Compare November 7, 2025 16:09

Delete test_security_risk_schema_consistency.py

e222455

malhotra5 marked this pull request as ready for review November 7, 2025 16:32

Merge branch 'main' into refactor/always-include-security-risk-fields

66439c7

neubig requested a review from xingyaoww November 7, 2025 16:37

xingyaoww reviewed Nov 7, 2025

malhotra5 added 2 commits November 7, 2025 12:33

rename param

2077d67

rename param

7d33f82

malhotra5 marked this pull request as draft November 7, 2025 19:07

openhands-agent and others added 4 commits November 7, 2025 19:09

Fix line length issues in test comments

a0b1869

Co-authored-by: openhands <[email protected]>

override system prompt

503e577

record transition

360ee72

4 participants
