feat(http)!: require host_config.providers; fail loudly on misconfig#69
Merged
Conversation
## Summary
This is a breaking change to `serve chat-completions` boot semantics.
The server now requires an explicit `host_config.providers` dict; the old
behavior (iterate a hardcoded KNOWN_PROVIDERS list, skip silently, fall back
to a placeholder model) is removed.
## Changed
### loader.py
- Added 'providers' to _VALID_TOP_LEVEL_KEYS.
- Added _validate_providers_registry() with closed schema for the new block:
each entry may have optional 'module' (defaults to the key, must be a known
provider) and optional 'config' (must be a dict). Unknown keys raise
config_unknown_key; wrong types raise config_invalid_type.
- Called from load_config() after the existing _validate_provider_module call.
### merger.py
- Added an explanatory comment: 'providers' is server-mode only and
deliberately not merged into bundle_modules by this function.
### app.py (lifespan)
- Removed KNOWN_PROVIDERS import and the loop that iterated it.
- Added import sys.
- New logic: reads app.state.host_config.get('providers'), exits 2 if the
block is missing or empty, collects one error per failing provider, exits 2
with the full error list if any provider fails. Passes per-provider 'config'
as extra_config to list_provider_models().
### routes/models.py
- Removed the placeholder fallback (lines 107-121). The route now returns
the served list directly; the lifespan guarantees it is non-empty or the
server has already exited.
### routes/chat_completions.py (three edits)
Edit A — model validation:
- Replaced served_registry.get(payload.model, 'anthropic') fallback with a
hard 400 / unknown_model when the model is not in the registry.
Edit B — honor stream: false:
- Changed ChatCompletionRequest.stream type from bool = False to
bool | None = None so 'field absent' is distinguishable from
'explicit False'.
- Added _collect_completion() helper that consumes the SSE generator and
returns a single chat.completion JSON object.
- In chat_completions(): if payload.stream is False, call _collect_completion
and return JSONResponse; otherwise return StreamingResponse.
Edit C — upstream errors as HTTP 502:
- Moved event_queue / display / approval / host_tool_yield_state creation and
turn_task start UP into chat_completions() (the route handler), before
StreamingResponse is returned, so Starlette has not yet committed the 200
status line.
- Added a 50 ms asyncio.wait pre-flight check: if turn_task completes within
the window (immediately-failing mock or provider that raises before its
first IO await), the handler raises HTTPException(502) with a structured
OpenAI-shape error envelope.
- _stream_chat_completion() now accepts pre-started components (turn_task,
signal_task, event_queue, display, host_tool_yield_state) instead of
creating them internally.
- Mid-stream errors (after the role chunk is yielded) continue to be embedded
in delta.content (no other option once SSE has started).
### _wire.py
- ChatCompletionRequest.stream: bool = False → bool | None = None
### tests/http/ (new)
- tests/http/__init__.py (empty, marks the package)
- tests/http/test_lifespan_providers.py: 8 tests for lifespan boot semantics
covering all exit-2 scenarios and two happy-path scenarios.
- tests/http/test_chat_completions_validation.py: 5 tests for Edit A/B/C
using a no-op lifespan fixture so TestClient does not trigger bundle loading.
### CHANGELOG.md
- Added [Unreleased] section above [0.8.0] documenting all breaking changes,
additions, internal changes, and migration steps.
## Migration
Server-mode users on <= 0.8.0: add providers to host_config.json.
Minimum (Anthropic only):
{"providers": {"anthropic": {}}}
Multi-provider:
{"providers": {"anthropic": {}, "openai": {"config": {"base_url": "https://api.openai.com/v1"}}}}
## Tests
- uv run pytest tests/config/ → 47 passed (unchanged)
- uv run pytest tests/http/ → 13 passed (new)
- ruff check + ruff format + pyright on all modified source files: clean
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)
Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Add operational lifecycle management for the chat-completions HTTP server.
## What changed
### New: src/amplifier_agent_cli/admin/serve_lifecycle.py
Single-responsibility module owning the state file at
~/.amplifier-agent/state/serve.json (mode 0600, parent dir 0700).
State file helpers:
- write_state_file(): atomic write via tempfile + os.replace; enforces
mode 0600 on the tempfile BEFORE rename so api_key is never readable
at a permissive mode even momentarily.
- read_state_file(): returns None when absent; raises ClickException on
unknown schema_version with a remediation hint.
- remove_state_file(): idempotent (missing_ok=True).
- is_pid_alive(pid): uses os.kill(pid, 0) — PermissionError treated as
'alive' (process exists, we don't own it).
- wait_for_exit(pid, timeout): 100 ms poll loop.
Three click commands:
- serve status: reads state file, validates PID is alive, probes
GET /v1/models, reports running/stale/not-running with provider
counts. Self-cleans stale state files (PID gone). Exits 0 on healthy,
1 on unreachable.
- serve stop: SIGTERM → graceful wait (--timeout, default 5s) →
SIGKILL escalation on expiry. --force skips straight to SIGKILL.
Exits 1 when no state file (nothing to stop).
- serve restart: captures launch args from state file, stops old server,
re-launches as detached subprocess (start_new_session=True), polls
for new state file (30s window) before reporting success.
### Modified: src/amplifier_agent_cli/admin/serve.py
- Import and register status_command, stop_command, restart_command on
serve_group via add_command().
- Set AMPLIFIER_AGENT_HTTP_BIND and AMPLIFIER_AGENT_HTTP_PORT env vars
before uvicorn.run() so load_config() can stash them in the state file.
- Install belt-and-suspenders SIGTERM/SIGINT handlers that call
remove_state_file(); lifespan finally-block is the primary cleanup path.
### Modified: src/amplifier_agent_http/_config.py
- Add host: str and port: int fields to ServerConfig.
- load_config() reads AMPLIFIER_AGENT_HTTP_BIND (default 127.0.0.1)
and AMPLIFIER_AGENT_HTTP_PORT (default 9099).
### Modified: src/amplifier_agent_http/app.py
- After all providers load (and before yield), compute providers_summary
{provider_id: model_count} and call write_state_file() with pid,
started_at, host, port, api_key, workspace, host_config_path,
providers_summary.
- In finally block, call remove_state_file() for graceful shutdown
cleanup.
### Modified: tests/http/test_lifespan_providers.py
- Add host='127.0.0.1' and port=9099 to _server_config() helper (new
required ServerConfig fields).
- Patch write_state_file and remove_state_file in base_mocks fixture
so tests never touch the real state file.
### New: tests/cli/test_serve_lifecycle.py
18 tests covering:
- write_state_file: mode bits (0600/0700), content round-trip, atomicity.
- read_state_file: missing → None, unknown schema_version → ClickException.
- remove_state_file: idempotent.
- is_pid_alive: live subprocess, dead subprocess, PID 1.
- serve status: not running, stale PID self-clean, running + reachable.
- serve stop: no state file → exit 1, graceful SIGTERM, --force SIGKILL,
timeout escalation.
- serve restart: no state file → exit 1, cmd assembly from stored args.
### Modified: CHANGELOG.md
- Added lifecycle commands bullet to [Unreleased] ### Added section.
## Test results
- tests/cli/test_serve_lifecycle.py: 18/18 passed
- tests/http/: 13/13 passed (no regression)
- tests/config/: 47/47 passed (no regression)
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)
Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
During a recent DTU exploration, we surfaced a set of backlog items (#4 #5 #6 #7 #8) around cryptic startup behavior in
amplifier-agent serve chat-completions:KNOWN_PROVIDERSlist and silently skipped providers whose credentials were missing or modules were not installed."amplifier"model entry.not_found_errorembedded indelta.content4 seconds later.stream: falsewas silently ignored.delta.contentat HTTP 200, making them impossible to distinguish from successful responses at the client level.This PR resolves all five issues with a single clean redesign: explicit per-provider config + fail loudly on any boot-time issue.
It also bundles the operational lifecycle commands that close the story — without them, users have to
pkill -f "amplifier-agent serve"every iteration, and can't tell whether the server is running, hung, or dead. Same audience, complementary change, single review.Changed
Breaking (server mode only):
amplifier-agent serve chat-completionsnow requireshost_config.providersto be a non-empty dict. Any provider declared there that cannot initialize (missing credentials, module not installed,list_models()raises, returns 0 models) causes the server to exit 2 with a structured error listing every problem. The previous behavior — iterating a hardcodedKNOWN_PROVIDERSlist, silently skipping unreachable providers, and falling back to an unusable placeholder model — is gone. Single-turn mode (amplifier-agent run) is unaffected; theprovider(singular) block continues to work for it.POST /v1/chat/completionsnow validatesmodelagainst the served registry. Requests with an unknown model return HTTP 400{"error": {"code": "unknown_model", ...}}immediately, instead of being silently routed to whichever provider loaded first and failing 4 seconds later with an upstreamnot_found_errorembedded indelta.content.stream: falseis now honored. Requests with that flag return a single JSON body; onlystream: true(or absent) uses SSE.Upstream errors raised before any content chunks are emitted now surface as HTTP 502 with a structured OpenAI-shape error envelope, instead of being embedded inside
delta.contentof a 200 SSE response./v1/modelsno longer falls back to a placeholder{"id": "amplifier", ...}entry. The lifespan now guaranteesserved_models_registryis non-empty (or the server exits at boot), so the fallback was unreachable in practice.Added
host_config.providers(plural) registry — declares which providers the server-mode lifespan loads and how to instantiate each. Schema:providers: {<provider_id>: {module?: str, config?: dict}}. Themoduledefaults to the provider_id when omitted. Each provider'sconfigis passed through as theextra_configarg tolist_provider_models()and then to the provider module's constructor.Lifecycle commands
Adds
serve status / stop / restartoperational commands. State persisted at~/.amplifier-agent/state/serve.json(mode 0600, parent dir 0700; api_key is sensitive — never logged). Status reports running/stale/not-running plus served-model counts by provider; stop honors graceful shutdown with optional--forceand--timeout; restart identity-restarts from the stored launch args (host, port, api-key, workspace, host_config).amplifier-agent serve statusReports whether the server is running, probes
GET /v1/modelsover the wire, and self-cleans stale state files (PID gone). Exits 0 on healthy, 1 when unreachable.amplifier-agent serve stop [--force] [--timeout SECONDS]Sends SIGTERM with a 5s graceful-exit window. Escalates to SIGKILL on timeout or when
--forceis given.amplifier-agent serve restartIdentity restart — captures host, port, api-key, workspace, and host_config_path from the state file, stops the old server, and re-launches as a detached background process.
Internal
_validate_providers_registry()inamplifier_agent_lib/config/loader.pyenforces the closed schema for the new block.tests/http/covering lifespan boot scenarios and chat-completions validation.src/amplifier_agent_cli/admin/serve_lifecycle.py— single owner of state file path and schema. Callers go throughread_state_file,write_state_file,remove_state_file.ServerConfiggainshostandportfields so the lifespan can write them to the state file without re-parsing CLI flags.Migration
For server-mode users on
<= 0.8.0: add aprovidersblock to yourhost_config.json. Minimum to keep working with just Anthropic:{ "providers": { "anthropic": {} } }Multi-provider example:
{ "providers": { "anthropic": {}, "openai": {"config": {"base_url": "https://api.openai.com/v1"}} } }If you don't pass
host_config.providers, the server will exit at boot with a clear error message rather than running in a broken half-state.Test Evidence