Merged
48 changes: 48 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,54 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Changed

- **Breaking (server mode only):** `amplifier-agent serve chat-completions` now requires `host_config.providers` to be a non-empty dict. If any provider declared there cannot initialize (missing credentials, module not installed, or `list_models()` raising or returning 0 models), the server exits with code 2 and a structured error listing every problem. The previous behavior (iterating a hardcoded `KNOWN_PROVIDERS` list, silently skipping unreachable providers, and falling back to an unusable placeholder model) is gone. Single-turn mode (`amplifier-agent run`) is unaffected; the `provider` (singular) block continues to work for it.

- **`POST /v1/chat/completions` now validates `model` against the served registry.** Requests with an unknown model return HTTP 400 `{"error": {"code": "unknown_model", ...}}` immediately, instead of being silently routed to whichever provider loaded first and failing 4 seconds later with an upstream `not_found_error` embedded in `delta.content`.

- **`stream: false` is now honored.** Requests with `stream: false` return a single JSON body; SSE is used only when `stream` is `true` or absent.

- **Upstream errors raised before any content chunks are emitted now surface as HTTP 502** with a structured OpenAI-shape error envelope, instead of being embedded inside `delta.content` of a 200 SSE response.

- **`/v1/models` no longer falls back to a placeholder `{"id": "amplifier", ...}` entry.** The lifespan now guarantees `served_models_registry` is non-empty (otherwise the server exits at boot), which made the fallback unreachable.
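
The request-handling changes above can be illustrated with a minimal sketch. It is not the project's handler: `validate_model`, `wants_stream`, the registry shape (model id mapped to provider id), and the `message` field are assumptions made for illustration. Only the HTTP 400 status, the `unknown_model` code, and the `stream` semantics come from the entries above.

```python
from __future__ import annotations

from typing import Any


def validate_model(body: dict[str, Any], served: dict[str, str]) -> tuple[int, dict[str, Any]] | None:
    """Return (status, error_body) for an unknown model, or None if the model is served.

    `served` is a hypothetical registry mapping model id -> provider id.
    """
    model = body.get("model")
    if isinstance(model, str) and model in served:
        return None
    return 400, {
        "error": {
            "code": "unknown_model",
            # The message text is an assumption; the changelog elides the other fields.
            "message": f"Model {model!r} is not in the served registry.",
        }
    }


def wants_stream(body: dict[str, Any]) -> bool:
    """SSE is used when `stream` is true or absent; `stream: false` means one JSON body."""
    return body.get("stream", True) is not False
```

Rejecting the request before any provider is contacted is what lets the error arrive as a plain HTTP 400 rather than inside a 200 SSE stream.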

### Added

- **`amplifier-agent serve status / stop / restart` subcommands:** operational lifecycle for the chat-completions HTTP server. `status` reports whether the server is running, where it is reachable, and how many models it serves from which providers; it also removes a stale state file when the recorded PID no longer exists. `stop` sends SIGTERM and waits for a configurable graceful-exit window (`--timeout`), escalating to SIGKILL on expiry or on `--force`. `restart` relaunches the server with the args stored at original launch (host, port, api-key, workspace, host_config). State is tracked in `~/.amplifier-agent/state/serve.json` (mode 0600, parent dir 0700; `api_key` is sensitive and is never logged).

- **`host_config.providers` (plural) registry:** declares which providers the server-mode lifespan loads and how to instantiate each. Schema: `providers: {<provider_id>: {module?: str, config?: dict}}`. `module` defaults to the provider id when omitted. Each provider's `config` is passed through as the `extra_config` argument to `list_provider_models()` and then to the provider module's constructor.
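
For the state-file handling described under the lifecycle subcommands, the following is a rough POSIX-only sketch of how a 0600 file in a 0700 directory and stale-PID cleanup might look. The helper names and the `pid` field are hypothetical; the actual layout of `serve.json` is not shown in this changelog.

```python
from __future__ import annotations

import json
import os
from pathlib import Path


def write_state(path: Path, state: dict) -> None:
    """Write state as JSON with mode 0600 inside a 0700 parent directory."""
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(path.parent, 0o700)  # mkdir's mode is masked by umask and ignored for existing dirs
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.chmod(path, 0o600)  # also tighten a file that already existed


def pid_alive(pid: int) -> bool:
    """Probe a PID with signal 0 (POSIX only); no signal is actually delivered."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_live_state(path: Path) -> dict | None:
    """Return the state if its PID is alive; otherwise delete the stale file."""
    try:
        state = json.loads(path.read_text())
    except FileNotFoundError:
        return None
    pid = state.get("pid")  # assumed field name
    if not isinstance(pid, int) or not pid_alive(pid):
        path.unlink(missing_ok=True)
        return None
    return state
```

Opening the file with an explicit mode avoids a window in which the API key would be readable by other users before a later `chmod`.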

### Internal

- New `_validate_providers_registry()` in `amplifier_agent_lib/config/loader.py` enforces the closed schema for the new block.
- HTTP-face tests introduced from scratch under `tests/http/` covering lifespan boot scenarios and chat-completions validation.
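
As an illustration of the closed `providers` schema, here is a minimal validation sketch. It is not the code in `amplifier_agent_lib/config/loader.py`: the function name `normalize_providers`, the error wording, and the use of `ValueError` are assumptions. The rules it applies (non-empty dict, only `module` and `config` allowed, `module` defaulting to the provider id, every problem reported together) are taken from the entries above.

```python
from __future__ import annotations

from typing import Any

ALLOWED_KEYS = {"module", "config"}


def normalize_providers(host_config: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Validate `providers` and fill defaults; raise ValueError listing every problem."""
    providers = host_config.get("providers")
    if not isinstance(providers, dict) or not providers:
        raise ValueError("host_config.providers must be a non-empty dict")

    problems: list[str] = []
    normalized: dict[str, dict[str, Any]] = {}
    for provider_id, entry in providers.items():
        if not isinstance(entry, dict):
            problems.append(f"{provider_id}: entry must be a dict")
            continue
        unknown = sorted(str(k) for k in set(entry) - ALLOWED_KEYS)
        if unknown:
            problems.append(f"{provider_id}: unknown keys {unknown}")
        module = entry.get("module", provider_id)  # defaults to the provider id
        config = entry.get("config", {})
        if not isinstance(module, str):
            problems.append(f"{provider_id}: module must be a string")
        if not isinstance(config, dict):
            problems.append(f"{provider_id}: config must be a dict")
        normalized[provider_id] = {"module": module, "config": config}

    if problems:
        # Report everything at once instead of stopping at the first failure.
        raise ValueError("; ".join(problems))
    return normalized
```

Collecting problems into a list mirrors the boot behavior described above, where the structured error lists every failing provider rather than only the first.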

### Migration

For server-mode users on `<= 0.8.0`: add a `providers` block to your `host_config.json`. Minimum to keep working with just Anthropic:

```json
{
  "providers": {
    "anthropic": {}
  }
}
```

Multi-provider example:

```json
{
  "providers": {
    "anthropic": {},
    "openai": {"config": {"base_url": "https://api.openai.com/v1"}}
  }
}
```

If you don't pass `host_config.providers`, the server will exit at boot with a clear error message rather than running in a broken half-state.

## [0.8.0] — 2026-06-20

Adds an OpenAI-compatible chat-completions HTTP face for embedding amplifier-agent in third-party tools (opencode and similar), a persistent `auth` subcommand for provider credentials, and integrates the model-routing matrix for per-provider model selection. Existing JSON-RPC wire protocol unchanged — no wrapper bump required.
30 changes: 30 additions & 0 deletions src/amplifier_agent_cli/admin/serve.py
@@ -16,17 +16,31 @@

import logging
import os
import signal
from pathlib import Path

import click
import uvicorn

from amplifier_agent_cli.admin.serve_lifecycle import (
    remove_state_file,
    restart_command,
    status_command,
    stop_command,
)


@click.group(name="serve")
def serve_group() -> None:
    """Start a wire face for amplifier-agent."""


# Register lifecycle subcommands on the group.
serve_group.add_command(status_command, name="status")
serve_group.add_command(stop_command, name="stop")
serve_group.add_command(restart_command, name="restart")


@serve_group.command(name="chat-completions")
@click.option(
    "--bind",
@@ -144,6 +158,11 @@ def chat_completions(
raise click.UsageError(f"--config path does not exist or is not a file: {resolved_config_path}")
os.environ["AMPLIFIER_AGENT_HTTP_CONFIG_PATH"] = str(resolved_config_path)

    # Expose host and port via env so load_config() can stash them in the
    # state file (which the lifecycle commands read to know the wire address).
    os.environ["AMPLIFIER_AGENT_HTTP_BIND"] = host
    os.environ["AMPLIFIER_AGENT_HTTP_PORT"] = str(port)

    # Resolve the values that will actually be used, so we can echo them
    # to stderr (handy for opencode.json setup).
    resolved_api_key = os.environ.get("AMPLIFIER_AGENT_HTTP_API_KEY", "local-dev-secret")
@@ -165,6 +184,17 @@ def chat_completions(
    click.echo(f" Workspace: {resolved_workspace}", err=True)
    click.echo(f" Config: {resolved_config}", err=True)

    # Belt-and-suspenders: remove the state file on SIGTERM/SIGINT from the
    # outer process context. uvicorn handles the actual shutdown sequence;
    # the lifespan's finally block is the primary cleanup path. These
    # handlers ensure cleanup even if the lifespan teardown is skipped (e.g.
    # when the server is killed before lifespan has finished setting up).
    def _cleanup_state(_signum: int, _frame: object) -> None:
        remove_state_file()

    signal.signal(signal.SIGTERM, _cleanup_state)
    signal.signal(signal.SIGINT, _cleanup_state)

    uvicorn.run(
        "amplifier_agent_http.app:app",
        host=host,