44 commits
3590c43
docs: LLM profiles design + example profile
openhands-agent Oct 18, 2025
9b1e3db
llm: add profile_id field to LLM (profile filename identifier) Co-…
openhands-agent Oct 18, 2025
21efefe
feat(llm): add ProfileManager and eagerly register profiles at conver…
openhands-agent Oct 18, 2025
46ca1b7
chore: stop tracking local runtime and worktree files; add to .gitignore
openhands-agent Oct 18, 2025
5efdaee
chore: only ignore bead databases
enyst Oct 18, 2025
9cbf67f
test: cover llm profile manager
enyst Oct 18, 2025
dfab517
Update .gitignore
enyst Oct 18, 2025
441eb25
Improve LLM profile manager persistence
enyst Oct 18, 2025
e7cd039
Add example for managing LLM profiles
enyst Oct 18, 2025
269610a
Document plan for profile references
enyst Oct 18, 2025
d0ab952
Integrate profile-aware persistence
enyst Oct 19, 2025
f74d050
Simplify profile registration logging
enyst Oct 19, 2025
df308fb
Normalize inline_mode naming
enyst Oct 19, 2025
4d293db
Simplify profile_id sync in ProfileManager
enyst Oct 19, 2025
7d1a525
Rename profile sync helper
enyst Oct 19, 2025
ec45ed5
LLMRegistry handles profile management
enyst Oct 19, 2025
1566df4
docs: clarify LLMRegistry profile guidance
enyst Oct 19, 2025
8f8b5b9
refactor: rename profile persistence helpers
enyst Oct 19, 2025
a3efa6e
refactor: split profile transform helpers
enyst Oct 19, 2025
17617aa
style: use f-strings in LLMRegistry logging
enyst Oct 19, 2025
9134aa1
Update openhands/sdk/llm/llm_registry.py
enyst Oct 19, 2025
36ab580
chore: stop tracking scripts/worktree.sh
enyst Oct 19, 2025
cea6a0d
Merge upstream main into agent-sdk-18-profile-manager
enyst Oct 21, 2025
12eec55
fix: remove runtime llm switching
enyst Oct 21, 2025
03b4600
style: use f-string for registry logging
enyst Oct 21, 2025
acf67e3
docs: expand LLM profile example
enyst Oct 21, 2025
218728e
Refine LLM profile persistence
enyst Oct 21, 2025
75e8ecd
Update LLM profile docs for usage_id semantics
enyst Oct 22, 2025
8511524
Merge remote-tracking branch 'upstream/main' into agent-sdk-18-profil…
enyst Oct 23, 2025
1f3adab
Merge branch 'main' into agent-sdk-18-profile-manager
enyst Oct 24, 2025
96ba8e9
Merge branch 'main' into agent-sdk-18-profile-manager
enyst Oct 25, 2025
142faee
fix LLM mutation for profiles to respect immutability; add docstring;…
enyst Oct 25, 2025
82138dd
refactor: keep LLM profile expansion at persistence layer
enyst Oct 25, 2025
b6511a9
Merge branch 'main' of github.com:All-Hands-AI/agent-sdk into agent-s…
enyst Oct 25, 2025
f5404b6
fix: restore LLM profile validation behavior
enyst Oct 26, 2025
85bc698
Merge branch 'main' into agent-sdk-18-profile-manager
enyst Oct 26, 2025
ba4bd50
harden profile handling
enyst Oct 26, 2025
99a422c
Merge branch 'main' into agent-sdk-18-profile-manager
enyst Nov 6, 2025
cbf886e
docs: capture runtime LLM switching investigation
enyst Oct 20, 2025
af1fd40
docs: outline runtime LLM switching plan
enyst Oct 20, 2025
2ec0f9f
feat: allow switching runtime LLM profiles
enyst Oct 20, 2025
3bf69db
docs: add runtime LLM switch example
enyst Oct 21, 2025
12d7264
docs: document inline mode switch rejection
enyst Oct 21, 2025
0b075c9
Delete .openhands/microagents/vscode.md
enyst Nov 6, 2025
5 changes: 3 additions & 2 deletions .gitignore
@@ -203,9 +203,10 @@ cache
/workspace/
openapi.json
.client/

# Local workspace files
+.beads/*.db
-*.db
.worktrees/
-agent-sdk.workspace.code-workspace
+*.code-workspace
+scripts/worktree.sh
101 changes: 101 additions & 0 deletions docs/llm_profiles.md
@@ -0,0 +1,101 @@
# LLM Profiles (design)

## Overview

This document records the design decision for "LLM profiles" (named LLM configuration files) and how they map to the existing LLM model and persistence in the SDK.

## Key decisions

- Reuse the existing LLM Pydantic model schema. A profile file is simply the JSON dump of an LLM instance (the same shape produced by `LLM.model_dump(exclude_none=True)` and accepted by `LLM.load_from_json`).
- Storage location: `~/.openhands/llm-profiles/<profile_name>.json`. The profile name is the filename stem (no extension) used to refer to the profile.
- Do not change ConversationState or Agent serialization format for now. Profiles are a convenience for creating LLM instances and registering them in the runtime LLMRegistry.
- Secrets: do NOT store plaintext API keys in profile files by default. Prefer referencing the API key through an environment variable (loaded via `LLM.load_from_env`) or keep it in the runtime `SecretsManager`. The `LLMRegistry.save_profile` API exposes an `include_secrets` flag; default False.
- LLM.usage_id semantics: keep current behavior (a small set of runtime identifiers such as 'agent', 'condenser', 'title-gen', etc.). Do not use usage_id as the profile name.
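
As a point of reference, a profile file can be produced directly from an `LLM` instance. The snippet below is a minimal sketch; the import path, constructor arguments, and model name are illustrative assumptions, not the final API.

```python
from pathlib import Path

from openhands.sdk.llm import LLM  # assumed import path

# Hypothetical profile: model name and usage_id are examples only.
llm = LLM(model="litellm_proxy/anthropic/claude-sonnet-4", usage_id="agent")

profile_path = Path.home() / ".openhands" / "llm-profiles" / "sonnet-default.json"
profile_path.parent.mkdir(parents=True, exist_ok=True)

# A profile file is just the JSON dump of the LLM (no secrets by default).
profile_path.write_text(llm.model_dump_json(exclude_none=True))
```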

## LLMRegistry profile API (summary)

- `list_profiles() -> list[str]`
- `load_profile(name: str) -> LLM`
- `save_profile(name: str, llm: LLM, include_secrets: bool = False) -> str` (returns the written path)
- `register_profiles(profile_ids: Iterable[str] | None = None) -> None`
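
A minimal usage sketch of the API above, assuming the signatures listed and the default profile directory; error handling is omitted:

```python
from openhands.sdk.llm import LLM, LLMRegistry  # assumed import path

registry = LLMRegistry()  # pass profile_dir=... to override the default location

# Persist an LLM as a named profile; secrets are stripped by default.
path = registry.save_profile(
    "sonnet-default",
    LLM(model="litellm_proxy/anthropic/claude-sonnet-4", usage_id="agent"),
)

# Enumerate stored profiles and rehydrate one by name.
names = registry.list_profiles()
llm = registry.load_profile("sonnet-default")

# Eagerly load every on-disk profile into the in-memory registry.
registry.register_profiles()
```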

## Implementation notes

- LLMRegistry is the single entry point for both in-memory registration and on-disk profile persistence. Pass `profile_dir` to the constructor to override the default location when embedding the SDK.
- Use `LLM.load_from_json(path)` for loading and `llm.model_dump(exclude_none=True)` for saving.
- Default directory: `os.path.expanduser('~/.openhands/llm-profiles/')`
- When loading, do not inject secrets. The runtime should reconcile secrets via ConversationState/Agent `resolve_diff_from_deserialized` or via SecretsManager.
- When saving, respect the `include_secrets` flag; if False, ensure secret fields (`api_key`, `aws_*` keys) are omitted or masked (see the sketch below).
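
One way the `include_secrets=False` path could be implemented; the exact list of secret fields is an assumption based on the fields named above:

```python
from openhands.sdk.llm import LLM  # assumed import path

# Assumed secret fields, per the note above (api_key and aws_* keys).
SECRET_FIELDS = ("api_key", "aws_access_key_id", "aws_secret_access_key")


def profile_payload(llm: LLM, include_secrets: bool = False) -> dict:
    """Dump an LLM for on-disk storage, omitting secret fields by default."""
    data = llm.model_dump(exclude_none=True)
    if not include_secrets:
        for field in SECRET_FIELDS:
            data.pop(field, None)
    return data
```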

## CLI

- Use a single flag: `--llm <profile_name>` to select a profile for the agent LLM.
- Also support an environment fallback: `OPENHANDS_LLM_PROFILE`.
- Provide commands: `openhands llm list` and `openhands llm show <profile_name>` (redacts secrets).
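
Selection precedence is the explicit flag first, then the environment fallback. A minimal sketch of that resolution (the helper name is illustrative):

```python
import os


def resolve_profile_name(cli_value: str | None) -> str | None:
    """Return the profile from --llm, else OPENHANDS_LLM_PROFILE, else None."""
    return cli_value or os.environ.get("OPENHANDS_LLM_PROFILE")
```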

## Migration

- Migration from inline configs to profiles: provide a migration helper script that extracts inline LLMs from `~/.openhands/agent_settings.json` and conversation `base_state.json` files into `~/.openhands/llm-profiles/<name>.json` and updates the references. The migration is a manual, opt-in step.

## Proposed changes for agent-sdk-19 (profile references in persistence)

### Goals
- Allow agent settings and conversation snapshots to reference stored LLM profiles by name instead of embedding full JSON payloads.
- Maintain backward compatibility with existing inline configurations.
- Enable a migration path so that users can opt in to profiles without losing existing data.

### Persistence format updates
- **Agent settings (`~/.openhands/agent_settings.json`)**
- Add an optional `profile_id` (or `llm_profile`) field wherever an LLM is configured (agent, condenser, router, etc.).
- When `profile_id` is present, omit the inline LLM payload in favor of the reference.
- Continue accepting inline definitions when `profile_id` is absent.
- **Conversation base state (`~/.openhands/conversations/<id>/base_state.json`)**
- Store `profile_id` for any LLM that originated from a profile when the conversation was created.
- Inline the full LLM payload only when no profile reference exists.
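
Illustrative shapes for the two storage modes (keys other than `profile_id` are examples, not the exact persisted schema):

```python
# Referenced storage: only the profile name is persisted.
llm_entry_referenced = {"profile_id": "sonnet-default"}

# Inline storage (legacy, or when no profile reference exists): full payload.
llm_entry_inline = {
    "model": "litellm_proxy/anthropic/claude-sonnet-4",
    "usage_id": "agent",
    # ...remaining fields as dumped by model_dump(exclude_none=True)
}
```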

### Loader behavior
- On startup, configuration loaders must detect `profile_id` and load the corresponding LLM via `LLMRegistry.load_profile(profile_id)`.
- If the referenced profile cannot be found, fall back to existing inline data (if available) and surface a clear warning.
- Inject secrets after loading (same flow used today when constructing LLM instances).
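
A sketch of that loader logic, assuming `load_profile` raises `FileNotFoundError` for a missing profile (the actual error type may differ):

```python
import logging

from openhands.sdk.llm import LLM, LLMRegistry  # assumed import path

logger = logging.getLogger(__name__)


def load_llm_entry(entry: dict, registry: LLMRegistry) -> LLM:
    """Prefer the profile reference; fall back to inline data with a warning."""
    profile_id = entry.get("profile_id")
    if profile_id:
        try:
            return registry.load_profile(profile_id)
        except FileNotFoundError:
            logger.warning(f"LLM profile {profile_id!r} not found; using inline data")
    return LLM.model_validate({k: v for k, v in entry.items() if k != "profile_id"})
```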

### Writer behavior
- When persisting updated agent settings or conversation snapshots, write back the `profile_id` whenever the active LLM was sourced from a profile.
- Only write the raw LLM configuration for ad-hoc instances (no associated profile), preserving current behavior.
- Respect the `OPENHANDS_INLINE_CONVERSATIONS` flag (default: true for reproducibility). When enabled, always inline full LLM payloads—even if `profile_id` exists—and surface an error if a conversation only contains `profile_id` entries.
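
A sketch of the write-side decision, assuming the flag is read from the environment (default true, per above) and a profile-backed LLM carries its `profile_id`:

```python
import os

from openhands.sdk.llm import LLM  # assumed import path


def dump_llm_entry(llm: LLM) -> dict:
    """Persist a profile reference when allowed; otherwise inline the payload."""
    inline = os.environ.get("OPENHANDS_INLINE_CONVERSATIONS", "true").lower() == "true"
    if llm.profile_id and not inline:
        return {"profile_id": llm.profile_id}
    return llm.model_dump(exclude_none=True)
```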

### Migration helper
- Provide a utility (script or CLI command) that:
1. Scans existing agent settings and conversation base states for inline LLM configs.
2. Uses `LLMRegistry.save_profile` to serialize them into `~/.openhands/llm-profiles/<generated-name>.json`.
3. Rewrites the source files to reference the new profiles via `profile_id`.
- Keep the migration opt-in and idempotent so users can review changes before adopting profiles.

### Testing & validation
- Extend persistence tests to cover:
- Loading agent settings with `profile_id` only.
- Mixed scenarios (profile reference plus inline fallback).
- Conversation snapshots that retain profile references across reloads.
- Add regression tests ensuring legacy inline-only configurations continue to work.

### Follow-up coordination
- Subsequent tasks (agent-sdk-20/21/22) will build on this foundation to expose CLI flags, update documentation, and improve secrets handling.


## Persistence integration review

### Conversation snapshots vs. profile-aware serialization
- **Caller experience:** Conversations that opt into profile references should behave the same as the legacy inline flow. Callers still receive fully expanded `LLM` payloads when they work with `ConversationState` objects or remote conversation APIs. The only observable change is that persisted `base_state.json` files can shrink to `{ "profile_id": "<name>" }` instead of storing every field.
- **Inline vs. referenced storage:** Conversation persistence previously delegated everything to Pydantic (`model_dump_json` / `model_validate`). The draft implementation added a recursive helper (`compact_llm_profiles` / `resolve_llm_profiles`) that walked arbitrary dictionaries and manually replaced or expanded embedded LLMs. This duplication diverged from the rest of the SDK, where polymorphic models rely on validators and discriminators to control serialization.
- **Relationship to `DiscriminatedUnionMixin`:** That mixin exists so we can ship objects across process boundaries (e.g., remote conversations) without bespoke traversal code. Keeping serialization rules on the models themselves, rather than sprinkling special cases in persistence helpers, lets us benefit from the same rebuild/validation pipeline.

### Remote conversation compatibility
- The agent server still exposes fully inlined LLM payloads to remote clients. Because the manual compaction was only invoked when writing `base_state.json`, remote APIs were unaffected. We need to preserve that behaviour so remote callers do not have to resolve profiles themselves.
- When a conversation is restored on the server (or locally), any profile references in `base_state.json` must be expanded **before** the state is materialised; otherwise, components that expect a concrete `LLM` instance (e.g., secret reconciliation, spend tracking) will break.

### Recommendation
- Move profile resolution/compaction into the `LLM` model:
- A `model_validator(mode="before")` can load `{ "profile_id": ... }` payloads with the `LLMRegistry`, while respecting `OPENHANDS_INLINE_CONVERSATIONS` (raise when inline mode is enforced but only a profile reference is available).
- A `model_serializer(mode="wrap")` can honour the same inline flag via `model_dump(..., context={"inline_llm_persistence": bool})`, returning either the full inline payload or a `{ "profile_id": ... }` stub. Callers that do not provide explicit context will continue to receive inline payloads by default.
- Have `ConversationState._save_base_state` call `model_dump_json` with the appropriate context instead of the bespoke traversal helpers. This keeps persistence logic co-located with the models, reduces drift, and keeps remote conversations working without additional glue.
- With this approach we still support inline overrides (`OPENHANDS_INLINE_CONVERSATIONS=true`), profile-backed storage, and remote access with no behavioural changes for callers.
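
A trimmed sketch of the recommended hooks. The field set is reduced to two fields, `load_profile_payload` is a hypothetical stand-in for the registry lookup, and serialization context requires a recent Pydantic v2; none of this is the final implementation.

```python
import os
from typing import Any

from pydantic import (
    BaseModel,
    ConfigDict,
    SerializationInfo,
    SerializerFunctionWrapHandler,
    model_serializer,
    model_validator,
)


def load_profile_payload(profile_id: str) -> dict:
    """Hypothetical stand-in for the LLMRegistry profile lookup."""
    raise NotImplementedError


class LLM(BaseModel):
    """Trimmed sketch; the real model has many more fields."""

    model_config = ConfigDict(protected_namespaces=())  # "model" field name

    model: str
    profile_id: str | None = None

    @model_validator(mode="before")
    @classmethod
    def _expand_profile_reference(cls, data: Any) -> Any:
        # A bare {"profile_id": ...} stub is rehydrated from the profile store,
        # unless inline persistence is enforced, in which case we raise.
        if isinstance(data, dict) and set(data) == {"profile_id"}:
            if os.environ.get("OPENHANDS_INLINE_CONVERSATIONS", "true") == "true":
                raise ValueError("inline mode enforced, got only a profile reference")
            payload = load_profile_payload(data["profile_id"])
            return {**payload, "profile_id": data["profile_id"]}
        return data

    @model_serializer(mode="wrap")
    def _maybe_compact(
        self, handler: SerializerFunctionWrapHandler, info: SerializationInfo
    ):
        # Callers opt into compact output via serialization context.
        ctx = info.context or {}
        if self.profile_id and not ctx.get("inline_llm_persistence", True):
            return {"profile_id": self.profile_id}
        return handler(self)
```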

68 changes: 68 additions & 0 deletions docs/llm_runtime_switch_investigation.md
@@ -0,0 +1,68 @@
# Runtime LLM Profile Switching – Investigation (agent-sdk-24)

## Current architecture

### LLMRegistry
- Keeps an in-memory mapping `service_to_llm: dict[str, LLM]`.
- Loads/saves JSON profiles under `~/.openhands/llm-profiles` (or a custom directory) via:
- `list_profiles()` / `get_profile_path()`
- `save_profile(profile_id, llm)` – strips secret fields unless explicitly asked not to.
- `load_profile(profile_id)` – rehydrates an `LLM`, ensuring the runtime instance’s `profile_id` matches the file stem via `_load_profile_with_synced_id`.
- `register_profiles(profile_ids=None)` – iterates `list_profiles()`, calling `load_profile` then `add` for each profile; skips invalid payloads or duplicates.
- `validate_profile(data)` – wraps `LLM.model_validate` to report pydantic errors as strings.
- `add(llm)` publishes a `RegistryEvent` to the optional subscriber and records the LLM in `service_to_llm` keyed by `llm.service_id`.
- Currently assumes a one-to-one mapping of service_id ↔ active LLM instance.

### Agent & LLM ownership
- `AgentBase.llm` is a (frozen) `LLM` Pydantic model. Agents may also own other LLMs (e.g., condensers) discovered via `AgentBase.get_all_llms()`.
- `AgentBase.resolve_diff_from_deserialized(persisted)` reconciles a persisted agent with the runtime agent:
- Calls `self.llm.resolve_diff_from_deserialized(persisted.llm)`; this only permits differences in fields listed in `LLM.OVERRIDE_ON_SERIALIZE` (api keys, AWS secrets, etc.). Any other field diff raises.
- Ensures tool names match and the rest of the agent models are identical.
- `LLM.resolve_diff_from_deserialized(persisted)` compares `model_dump(exclude_none=True)` between runtime and persisted objects, allowing overrides only for secret fields. Any other difference triggers a `ValueError`.

### Conversation persistence
- `ConversationState._save_base_state()` -> `compact_llm_profiles(...)` when `OPENHANDS_INLINE_CONVERSATIONS` is false, replacing inline LLM dicts with `{"profile_id": id}` entries.
- `ConversationState.create()` -> `resolve_llm_profiles(...)` prior to validation, so profile references become concrete LLM dicts loaded from `LLMRegistry`.
- When inline mode is enabled (`OPENHANDS_INLINE_CONVERSATIONS=true`), profiles are fully embedded and *any* LLM diff is rejected by the reconciliation flow above.

### Conversation bootstrapping
- `LocalConversation.__init__()` adds all LLMs from the agent to the registry and eagerly calls `register_profiles()` (errors logged at DEBUG level). This ensures the in-memory registry is primed with persisted profiles before a conversation resumes.

## Implications for runtime switching

1. **Registry as switch authority**
- Registry already centralizes active LLM instances and profile management, so introducing a “switch-to-profile” operation belongs here. That operation will need to:
- Load the target profile (if not already loaded).
- Update `service_to_llm` (and notify subscribers) atomically.
- Return the new `LLM` so callers can update their Agent / Conversation state.

2. **Agent/LLM reconciliation barriers**
- Current `resolve_diff_from_deserialized` logic rejects *any* non-secret field change. A runtime profile swap would alter at least `LLM.model` and possibly provider-specific params. We therefore need a sanctioned path that:
- Skips reconciliation when conversations are persisted with profile references (i.e., inline mode disabled).
- Refuses to switch when inline mode is required (e.g., evals with `OPENHANDS_INLINE_CONVERSATIONS=true`). Switching in inline mode would otherwise break diff validation.
- This aligns with the instruction to “REJECT SWITCH for eval mode,” but “JUST SWITCH” when persistence is profile-based.

3. **State & metrics consistency**
- After a switch we must ensure:
- `ConversationState.agent.llm` points at the new object (and any secondary LLM references, e.g., condensers, are updated if needed).
- `ConversationState.stats.service_to_metrics` either resets or continues per usage_id; we must decide what data should carry over when the service swaps to a different profile.
- Event persistence continues to work: future saves should store the new profile ID, and reloads should retrieve the same profile in the registry.

4. **Runtime API surface**
- Need an ergonomic call for agents/conversations to request a new profile by name (manual selection or automated policy). Potential entry points:
- `LLMRegistry.switch_profile(service_id, profile_id)` returning the active `LLM` (sketched after this list).
- Conversation-level helper (e.g., `LocalConversation.switch_llm(profile_id)`) that coordinates registry + agent updates + persistence.

5. **Observer / callback considerations**
- Registry already has a single `subscriber`. If multiple components need to react to switches, we might extend this to a small pub/sub mechanism. Otherwise we can keep a single callback and have the conversation install its own handler.
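
A hypothetical sketch of the registry-level entry point; `switch_profile` does not exist yet, and the subscriber notification is only noted in a comment because the exact `RegistryEvent` shape is not pinned down here.

```python
from openhands.sdk.llm import LLM, LLMRegistry  # assumed import path


def switch_profile(registry: LLMRegistry, service_id: str, profile_id: str) -> LLM:
    """Proposed operation: atomically swap the active LLM for a service."""
    llm = registry.load_profile(profile_id)    # load target (validates payload)
    registry.service_to_llm[service_id] = llm  # update the active mapping
    # A real implementation would also publish a RegistryEvent to subscribers.
    return llm  # caller updates Agent / ConversationState references
```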

## Open questions / risks
- What happens to in-flight operations when the switch occurs? (For initial implementation we can require the agent to be idle.)
- How should token metrics roll over? We likely reset or create a new entry keyed by the new profile.
- Tool / condenser LLMs: do we switch only the primary agent LLM, or should condensers also reference profiles? (Out of scope unless required by the plan.)
- Tests must cover: successful switch, rejected switch in inline mode, persistence after switch, registry events.

## Next steps
1. Capture the desired UX/API in the follow-up planning issue (agent-sdk-25).
2. Decide how to bypass reconciliation safely when profile references are used.
3. Define exact testing matrix (registry unit tests, conversation integration tests, persistence roundtrip).