Add interactive chat harness for LLM service testing with auto-detection#90

Merged
SorraTheOrc merged 13 commits into main from
copilot/build-test-chat-interface
Dec 14, 2025

Conversation

Copilot AI commented Dec 13, 2025

Engineers need a lightweight way to test the /parse_intent and /narrate endpoints, validate prompt changes, observe token usage, and debug latency without writing test code or navigating the FastAPI docs.

Implementation

HTTP Client (src/gengine/echoes/llm/chat_client.py)

  • Async wrapper around httpx.AsyncClient with context manager support
  • Methods for /parse_intent, /narrate, and /healthz endpoints
  • Configurable base URL, timeout, and custom headers

CLI REPL (scripts/echoes_llm_chat.py)

  • Interactive chat loop with two modes:
    • parse: Natural language → intent JSON
    • narrate: Events JSON → narrative text
  • Auto-detection of service URL: when --service-url is omitted, the script discovers the LLM service by trying:
    1. Windows host IP (when running in WSL, parsed from /etc/resolv.conf)
    2. http://localhost:8001
    and displays a helpful error if neither responds
  • Multi-turn history management with configurable limits (default: 10 exchanges)
  • Slash commands: /clear, /save <path>, /quit, /exit
  • Context file loading via --context-file for initial game state
  • Displays latency, confidence, and token metrics per response

Usage

# Auto-detect service (no arguments needed)
uv run python scripts/echoes_llm_chat.py

# Or specify service URL explicitly
uv run python scripts/echoes_llm_chat.py --service-url http://localhost:8001

You: inspect the industrial district
📋 Intents:
[{"type": "inspect", "target": "district"}]
⏱  Latency: 45ms
🎯 Confidence: 0.95

# Narrate mode
uv run python scripts/echoes_llm_chat.py --mode narrate

Events (JSON or text): [{"type": "pollution_increase", "district": "industrial"}]
📖 Narrative:
The industrial district's pollution levels rose sharply...
⏱  Latency: 120ms
📊 Tokens: 45 in / 32 out
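The multi-turn history with its configurable limit (default: 10 exchanges) can be kept with a bounded deque. This is a sketch under that assumption, not the shipped data structure:

```python
# Sketch of client-side history with a configurable limit; the real
# script's data structures and field names may differ.
from collections import deque


class History:
    def __init__(self, limit: int = 10):
        # Each exchange is one user turn plus one service response;
        # maxlen silently drops the oldest exchange past the limit.
        self._exchanges = deque(maxlen=limit)

    def add(self, user_text: str, response: dict) -> None:
        self._exchanges.append({"user": user_text, "response": response})

    def clear(self) -> None:
        # Backs the /clear slash command.
        self._exchanges.clear()

    def as_payload(self) -> list[dict]:
        # Serialized into each request's context payload.
        return list(self._exchanges)
```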

Testing (tests/echoes/test_llm_chat_cli.py)

  • 13 tests covering request formation, history management, transcript export, and auto-detection
  • Uses httpx.MockTransport for HTTP layer mocking
  • Script imports via importlib.util.spec_from_file_location

Documentation

Added "LLM Chat Harness" section to README with:

  • Prerequisites and command-line options
  • Auto-detection behavior and WSL support
  • Sample sessions for both modes
  • Troubleshooting guide (connection failures, auto-detection, TLS errors, authentication)
  • Provider configuration examples (stub, OpenAI, Anthropic, Foundry)

Works with stub provider out of the box (no API keys required). Service URL auto-detection makes it easy to run without configuration—especially useful when working with WSL and Windows host services.

[!WARNING]

Original prompt

This section details the original issue you should resolve.

<issue_title>13.1.1 Build test chat interface</issue_title>
<issue_description>## Context

  • Phase 13 kicks off conversational tooling for the Echoes LLM service, but we currently do not have a developer-facing harness to exercise /parse_intent and /narrate outside of automated tests.
  • Engineers need a lightweight way to chat with the running echoes_llm_service (stub, OpenAI, Anthropic, or Foundry providers) to validate prompt changes, observe token usage, and debug latency before wiring any gameplay endpoints.
  • Providing a simple CLI chat loop will also let PMs and designers run scripted demos against remote environments without digging into FastAPI clients.

Goals

  • Provide a repeatable command (e.g., uv run python scripts/echoes_llm_chat.py --service-url http://localhost:8001) that opens an interactive prompt, accepts user text, and relays it to the configured echoes_llm_service.
  • Maintain basic multi-turn history on the client side so each request can optionally send the prior exchanges as context payload.
  • Surface useful debugging metadata (status, latency, provider/model, token counts) after each response and allow exporting transcripts.
  • Ship minimal documentation so teammates can run the tool locally or point it at a remote base URL.

Implementation Guidance

  1. Add a reusable HTTP client helper (e.g., src/gengine/echoes/llm/chat_client.py) that wraps httpx.AsyncClient and knows how to hit /parse_intent (default) and /narrate when a --mode narrate flag is set. Accept base URL, timeout, and optional API key headers.
  2. Create a CLI entry point under scripts/ (for example scripts/echoes_llm_chat.py) that:
    • uses argparse to capture --service-url, --context-file (JSON), --mode (parse|narrate), --history-limit, and --export transcript.json.
    • supports slash commands like /clear, /save <path>, and /quit for convenience.
    • keeps an in-memory List[Dict[str, str]] history that is serialized into the context payload for /parse_intent (e.g., { "history": [...], "metadata": {...} }).
    • prints structured output: intents (pretty JSON) for parse mode, generated narrative for narrate mode, plus latency/token metrics extracted from response metadata if available.
  3. Add unit tests in tests/echoes (e.g., test_llm_chat_cli.py) that mock the HTTP layer (httpx.MockTransport or respx) to verify:
    • requests are formed with history/context and mode-specific payloads
    • /clear resets the local buffer and /save writes JSON transcripts
    • error responses surface readable messages without crashing the REPL.
  4. Extend README "LLM Service" coverage (or add a short "LLM Chat Harness" subsection) documenting prerequisites, commands, and sample session transcripts. Include guidance for pointing at stub vs. OpenAI/Anthropic providers and how to supply API keys via ECHOES_LLM_* env vars.
  5. Provide a short troubleshooting section covering TLS errors, authentication failures, and how to run against docker compose (http://localhost:8001).

Acceptance Criteria

  • Running uv run python scripts/echoes_llm_chat.py --service-url http://localhost:8001 opens an interactive prompt that can exchange messages with the stub provider out of the box.
  • Users can switch between parse (intent JSON output) and narrate (story text) modes via CLI flag without restarting the service.
  • Conversation history is included in subsequent requests and can be cleared/exported via commands.
  • Errors from the service are handled gracefully with descriptive output and non-zero exit codes where appropriate.
  • Documentation (README or linked doc) explains setup, command options, and sample usage for local + remote endpoints.
  • Automated tests cover request formation, history management, and error handling.

Risks & Mitigations

  • Provider authentication differences: Document environment variables and default to stub provider, so running without API keys still works.
  • Long-running chats may reveal latency: Add per-request timing + token metrics to highlight slowness and provide guidance to switch providers.
  • Transcript storage: Limit history size (--history-limit) and redact API keys when exporting transcripts.

Tracker Reference

See .pm/tracker.md > Phase 13 > Task 13.1.1.
</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 3 commits December 13, 2025 21:43
Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Copilot AI changed the title [WIP] Add test chat interface for echoes LLM service Add interactive chat harness for LLM service testing Dec 13, 2025
Copilot AI requested a review from SorraTheOrc December 13, 2025 21:52
@SorraTheOrc
Member

@copilot rather than require the user to provide the service URL can we please attempt to auto detect it. Start by looking for it on the windows host, assuming the client is running in WSL. This would require us to find the IP number and use that instead of localhost. If that fails then try localhost and if that fails return an error.

Also allow /exit as an alias for /quit

Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Author

Copilot AI commented Dec 13, 2025

@copilot rather than require the user to provide the service URL can we please attempt to auto detect it. Start by looking for it on the windows host, assuming the client is running in WSL. This would...

Done in commit 5ca8ec1.

Auto-detection: The client now automatically detects the service URL when --service-url is not specified. It tries:

  1. Windows host IP (when running in WSL, parsed from /etc/resolv.conf)
  2. http://localhost:8001

If both fail, it displays a helpful error message.

Example with auto-detection:

Auto-detecting LLM service...
✓ Detected service at http://localhost:8001

/exit alias: Added /exit as an alias for /quit. Both commands now work identically to exit the chat interface.

The slash commands help text has been updated to show: /clear, /save <path>, /quit, /exit

Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Copilot AI changed the title Add interactive chat harness for LLM service testing Add interactive chat harness for LLM service testing with auto-detection Dec 13, 2025
@SorraTheOrc SorraTheOrc merged commit 0177b4b into main Dec 14, 2025
@SorraTheOrc SorraTheOrc deleted the copilot/build-test-chat-interface branch December 14, 2025 06:41

Development

Successfully merging this pull request may close these issues.

13.1.1 Build test chat interface
