Harness scaffold#5

Open
Rhovian wants to merge 5 commits into main from harness-scaffold

Rhovian commented Mar 31, 2026

No description provided.

Rhovian and others added 5 commits March 29, 2026 15:48
Split inference/src/ into inference/core/, server/, cli/, shaders/.
Move tests/ into inference/tests/.

Rewrite server.m for OpenAI API conformance:
- Proper SSE chunks with model, created, system_fingerprint, role delta
- Non-streaming mode (stream: false)
- Tool-calling: parse tools array, detect <tool_call> blocks, emit as
  OpenAI tool_calls in both streaming and non-streaming responses
- Tool-result round-trips: handle tool role messages and assistant
  tool_calls in conversation history
- /v1/models lists all three models (35B-A3B, 27B, 9B) with loaded indicator
- /health returns model metadata, uptime, engine config
- Usage (prompt_tokens, completion_tokens) and timing (prefill_ms,
  decode_ms, tokens_per_sec) in final chunk and non-streaming response
- OpenAI-format error objects for 400/404/500
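
The tool-call detection described above could be sketched as follows. This is a minimal illustration and not the code in server.m: it assumes the model emits literal `<tool_call>` and `</tool_call>` tags that do not nest, and it leaves the block body unparsed because the commit message does not specify its format. The function name `split_tool_calls` is hypothetical.

```rust
/// Splits model output into plain assistant text and the raw bodies of
/// `<tool_call>` blocks.
///
/// Assumptions (not confirmed by the PR): tags are literal, blocks do not
/// nest, and the body format is left to the caller to parse.
pub fn split_tool_calls(output: &str) -> (String, Vec<String>) {
    const OPEN: &str = "<tool_call>";
    const CLOSE: &str = "</tool_call>";

    let mut text = String::new();
    let mut calls = Vec::new();
    let mut rest = output;

    while let Some(start) = rest.find(OPEN) {
        // Everything before the opening tag is ordinary assistant text.
        text.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Store the trimmed block body and continue after the close tag.
                calls.push(after_open[..end].trim().to_string());
                rest = &after_open[end + CLOSE.len()..];
            }
            None => {
                // Unterminated block: keep it as text rather than guess.
                text.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    text.push_str(rest);
    (text, calls)
}
```

A server following this approach would then map each extracted body to an OpenAI-style `tool_calls` entry; that mapping is omitted here because it depends on the body format.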

Add scripts/smoke_test.sh, which runs the build, server API conformance
checks, a tok/s benchmark, and a quality check for every model present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Create harness/ Rust crate (tokio, axum, serde) with the core type
system for multi-model orchestration:

- roles: 8 canonical roles (Planner, TaskManager, Worker, Reviewer,
  Referee, Tiebreaker, Summarizer, Searcher) with scratchpad access
  rules per role
- tasks: TaskSpec, TaskResult, TaskGrade (dimensional scores),
  TaskTelemetry, TaskState lifecycle, Task aggregate
- tasks/scratchpad: per-task directory + append-only manifest
- bus: Envelope (hub-routed, task_id as single source of truth,
  from_role/to_role on envelope), Participant (identity only),
  9 structured Message variants
- providers: async Provider trait (chat, chat_stream via
  Pin<Box<dyn Stream>>, load_model, unload_model, health_check),
  normalized ChatRequest/ChatResponse/StreamEvent types
- scheduler: LocalCapacity, CloudCapacity, LoadedModel,
  SchedulerDecision types
- tools: ToolName, ToolSpec, ToolResult
- config: HarnessConfig, RoleBinding, ProviderConfig
- telemetry: ModelPerformanceRecord
- types: DurationMs, TimestampMs shared newtypes
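
To make the roles and scratchpad items above concrete, here is a rough sketch of how per-role access rules and an append-only manifest might fit together. The eight role names come from the commit message; everything else (the `ScratchpadAccess` levels, which role gets which level, and the `Manifest` API) is an assumption made for illustration and may not match the crate.

```rust
/// The eight canonical roles named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Planner, TaskManager, Worker, Reviewer,
    Referee, Tiebreaker, Summarizer, Searcher,
}

/// Hypothetical access levels; the real rules are not shown in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScratchpadAccess { None, ReadOnly, ReadWrite }

impl Role {
    pub const ALL: [Role; 8] = [
        Role::Planner, Role::TaskManager, Role::Worker, Role::Reviewer,
        Role::Referee, Role::Tiebreaker, Role::Summarizer, Role::Searcher,
    ];

    /// Assumed mapping: only workers write, judging roles read.
    pub fn scratchpad_access(self) -> ScratchpadAccess {
        match self {
            Role::Worker => ScratchpadAccess::ReadWrite,
            Role::Reviewer | Role::Referee | Role::Tiebreaker | Role::Summarizer => {
                ScratchpadAccess::ReadOnly
            }
            Role::Planner | Role::TaskManager | Role::Searcher => ScratchpadAccess::None,
        }
    }
}

/// Append-only manifest: entries can be added but never edited or removed.
#[derive(Debug, Default)]
pub struct Manifest { entries: Vec<String> }

impl Manifest {
    pub fn new() -> Self { Self::default() }

    /// Appends an entry if the role may write; returns the entry's index.
    pub fn append(&mut self, role: Role, entry: &str) -> Result<usize, String> {
        if role.scratchpad_access() != ScratchpadAccess::ReadWrite {
            return Err(format!("{:?} may not write to the scratchpad", role));
        }
        self.entries.push(entry.to_string());
        Ok(self.entries.len() - 1)
    }

    /// Read-only view; no mutable access to existing entries is exposed.
    pub fn entries(&self) -> &[String] { &self.entries }
}
```

Keeping `entries` private and exposing only `append` and a shared slice is one simple way to get append-only behavior from the type system.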

Add Rust target/ to .gitignore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… dispatch

The harness now has two provider interfaces reflecting two fundamentally
different relationships:
- InferenceProvider: turn-level, harness drives the tool loop (orome-local)
- CliProvider: task-level, CLI drives its own tool loop (claude, codex)
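
The difference between the two interfaces can be sketched as below. The trait names come from the commit message, but the method signatures, the `Turn` type, `drive_tool_loop`, and the `ScriptedModel` stub are all hypothetical, and the sketch is synchronous for brevity where the real crate is described as async.

```rust
/// One model turn: a final answer or a request to run a tool.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub enum Turn {
    Final(String),
    ToolCall { name: String, args: String },
}

/// Turn-level: the harness calls the model once per turn and runs tools itself.
pub trait InferenceProvider {
    fn chat(&mut self, transcript: &[String]) -> Turn;
}

/// Task-level: the CLI receives the whole task and runs its own tool loop.
pub trait CliProvider {
    fn run_task(&mut self, prompt: &str) -> String;
}

/// Harness-driven tool loop for a turn-level provider.
/// Returns `None` if the model has not finished within `max_turns`.
pub fn drive_tool_loop<P: InferenceProvider>(
    provider: &mut P,
    prompt: &str,
    run_tool: impl Fn(&str, &str) -> String,
    max_turns: usize,
) -> Option<String> {
    let mut transcript = vec![format!("user: {}", prompt)];
    for _ in 0..max_turns {
        match provider.chat(&transcript) {
            Turn::Final(text) => return Some(text),
            Turn::ToolCall { name, args } => {
                // The harness executes the tool and feeds the result back.
                let result = run_tool(&name, &args);
                transcript.push(format!("assistant: tool_call {}({})", name, args));
                transcript.push(format!("tool: {}", result));
            }
        }
    }
    None
}

/// Stub provider that replays a fixed script of turns (for illustration).
pub struct ScriptedModel { pub turns: Vec<Turn> }

impl InferenceProvider for ScriptedModel {
    fn chat(&mut self, _transcript: &[String]) -> Turn {
        if self.turns.is_empty() {
            Turn::Final(String::new())
        } else {
            self.turns.remove(0)
        }
    }
}
```

With a `CliProvider`, the harness would call `run_task` once and never see individual tool calls, which is why the two relationships get separate traits in this sketch.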

Adds plan file schema (.orome/plan.yaml) and boot manifest (orome.yaml)
as the contracts between planning and execution phases. Config updated
to reflect CLI-based providers (no API keys, no hardcoded capacity).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…plates

Implements the full dispatch pipeline:
- Dispatcher routes tasks to CLI providers (claude/codex) or local
  inference (orome), driving the tool loop for local models
- `orome plan --claude` launches interactive planning with injected
  system prompt (project context, schemas, model roster)
- `orome run --claude` reads .orome/plan.yaml, validates the DAG,
  and executes the worker -> reviewer -> retry loop
- Provider adapters: claude (JSON output), codex (JSONL), orome-local
  (OpenAI-compatible HTTP/SSE with x_orome timing)
- Prompt templates for planner, worker, reviewer, and tiebreaker roles
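
The worker -> reviewer -> retry loop mentioned above might look roughly like this. It is a sketch under assumptions: `Verdict`, `Outcome`, and `run_with_review` are hypothetical names, the worker and reviewer are plain closures standing in for provider calls, and the idea that reviewer feedback is passed into the next attempt is a guess, not something the PR states.

```rust
/// Reviewer decision for one worker attempt (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Approved,
    Rejected(String),
}

/// Final result of the loop (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Accepted { output: String, attempts: u32 },
    Exhausted { last_feedback: String },
}

/// Runs the worker, asks the reviewer, and retries with the reviewer's
/// feedback until the output is approved or `max_attempts` is reached.
pub fn run_with_review(
    mut worker: impl FnMut(Option<&str>) -> String,
    mut reviewer: impl FnMut(&str) -> Verdict,
    max_attempts: u32,
) -> Outcome {
    let mut feedback: Option<String> = None;
    for attempt in 1..=max_attempts {
        // The first attempt gets no feedback; later attempts see the last rejection.
        let output = worker(feedback.as_deref());
        match reviewer(&output) {
            Verdict::Approved => return Outcome::Accepted { output, attempts: attempt },
            Verdict::Rejected(reason) => feedback = Some(reason),
        }
    }
    Outcome::Exhausted { last_feedback: feedback.unwrap_or_default() }
}
```

Bounding the loop with `max_attempts` keeps a persistently failing task from consuming provider capacity indefinitely; what happens after exhaustion (for example, escalation to a tiebreaker) is left out of this sketch.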

A dry run confirms that plan parsing and the topological sort work correctly.
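
For readers unfamiliar with DAG validation, the following sketch shows one common way to do it, Kahn's algorithm. It is not taken from the PR: the plan is assumed to reduce to a map from task id to the ids it depends on, the `.orome/plan.yaml` schema itself is not reproduced, and `topo_order` is a hypothetical name.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Returns the tasks in an order where every task follows its dependencies,
/// or an error if a dependency is unknown or the graph has a cycle.
///
/// `deps` maps each task id to the ids it depends on (assumed plan shape).
pub fn topo_order(deps: &BTreeMap<String, Vec<String>>) -> Result<Vec<String>, String> {
    let mut indegree: BTreeMap<&str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();

    for (task, needs) in deps {
        indegree.entry(task.as_str()).or_insert(0);
        for need in needs {
            if !deps.contains_key(need) {
                return Err(format!("task '{}' depends on unknown task '{}'", task, need));
            }
            // One incoming edge per dependency.
            *indegree.entry(task.as_str()).or_insert(0) += 1;
            dependents.entry(need.as_str()).or_default().push(task.as_str());
        }
    }

    // Start with tasks that have no unmet dependencies.
    let mut ready: VecDeque<&str> = indegree
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(t, _)| *t)
        .collect();

    let mut order = Vec::with_capacity(deps.len());
    while let Some(task) = ready.pop_front() {
        order.push(task.to_string());
        if let Some(nexts) = dependents.get(task) {
            for &next in nexts {
                let n = indegree.get_mut(next).expect("dependent has an indegree entry");
                *n -= 1;
                if *n == 0 {
                    ready.push_back(next);
                }
            }
        }
    }

    // Any task left unscheduled is part of (or blocked by) a cycle.
    if order.len() == deps.len() {
        Ok(order)
    } else {
        Err("plan contains a dependency cycle".to_string())
    }
}
```

Using `BTreeMap` makes the resulting order deterministic for a given plan, which helps when comparing dry-run output between runs.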

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>