WIP: feat: base multi-agent collaboration framework & shared context by Leeaandrob · Pull Request #423 · sipeed/picoclaw

Leeaandrob · 2026-02-18T14:55:57Z

📝 Description

WIP — Base multi-agent collaboration framework with shared context pool, agent handoff, and discovery tools.

Builds on top of the merged PRs #213 (provider protocol refactor) and #131 (model fallback chain + multi-agent routing) to add:

Blackboard — Thread-safe shared key-value context pool (pkg/multiagent/blackboard.go) for inter-agent data sharing with scoped entries (author, scope, timestamp)
BlackboardTool — LLM-callable tool for read/write/list/delete on the shared context
Agent Handoff — Synchronous delegation between agents via ExecuteHandoff() with automatic context propagation through the blackboard
HandoffTool — LLM-callable tool that resolves target agent, writes context, builds system prompt, and delegates via RunToolLoop
ListAgentsTool — Discovery tool listing all registered agents with ID/Name/Role
AgentResolver interface — Decouples pkg/multiagent from pkg/agent to avoid circular imports
AgentLoop integration — Blackboard snapshot injection into system messages, per-session blackboards via sync.Map, tools auto-registered when >1 agent configured
Config extensions — Role and SystemPrompt fields on AgentConfig and AgentInstance
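
A minimal sketch of what a thread-safe blackboard with authorship metadata could look like. This is not the code in pkg/multiagent/blackboard.go: the Entry fields and the Set/Get/Delete/List method names are assumptions made for illustration. Only the "shared" default scope and the author/scope/timestamp metadata come from the description.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Entry is a hypothetical record shape: the value plus the metadata
// the description mentions (author, scope, timestamp).
type Entry struct {
	Value     string
	Author    string
	Scope     string
	Timestamp time.Time
}

// Blackboard is a key-value store guarded by a read-write mutex.
type Blackboard struct {
	mu      sync.RWMutex
	entries map[string]Entry
}

func NewBlackboard() *Blackboard {
	return &Blackboard{entries: make(map[string]Entry)}
}

// Set writes a value; an empty scope falls back to "shared".
func (b *Blackboard) Set(key, value, author, scope string) {
	if scope == "" {
		scope = "shared"
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	b.entries[key] = Entry{Value: value, Author: author, Scope: scope, Timestamp: time.Now()}
}

// Get returns the entry for key and whether it exists.
func (b *Blackboard) Get(key string) (Entry, bool) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	e, ok := b.entries[key]
	return e, ok
}

// Delete removes key and reports whether it was present.
func (b *Blackboard) Delete(key string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	_, ok := b.entries[key]
	delete(b.entries, key)
	return ok
}

// List returns all keys in sorted order.
func (b *Blackboard) List() []string {
	b.mu.RLock()
	defer b.mu.RUnlock()
	keys := make([]string, 0, len(b.entries))
	for k := range b.entries {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	bb := NewBlackboard()
	bb.Set("findings", "API returns 429 under load", "researcher", "")
	e, _ := bb.Get("findings")
	fmt.Println(e.Author, e.Scope, bb.List())
}
```

A read-write mutex suits this access pattern if reads (snapshot injection, tool reads) outnumber writes, which is an assumption about the workload.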

Architecture decisions

Zero overhead for single-agent setups (multi-agent tools only registered when len(registry.ListAgentIDs()) > 1)
registryResolver adapter bridges AgentRegistry → multiagent.AgentResolver at the integration boundary
Blackboard entries carry authorship metadata for audit/debugging

🗣️ Type of Change

🐞 Bug fix (non-breaking change which fixes an issue)
✨ New feature (non-breaking change which adds functionality)
📖 Documentation update
⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

🤖 Fully AI-generated (100% AI, 0% Human)
🛠️ Mostly AI-generated (AI draft, Human verified/modified)
👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

Closes #294

📚 Technical Context (Skip for Docs)

Reference URL: Feature: Base Multi-agent Collaboration Framework & Shared Context (#294)
Reasoning: Foundation layer for multi-agent collaboration. The blackboard pattern enables shared state without tight coupling, and the handoff mechanism provides synchronous delegation with context propagation. Built on merged PR #213, "Refactor providers by protocol family (discussion #122)" (protocol refactor), and PR #131, "feat: model fallback chain + multi-agent routing" (routing + fallback chain).

🧪 Test Environment

Hardware: Linux x86_64
OS: Ubuntu (kernel 6.11.0-29-generic)
Model/Provider: Claude CLI (claude-code), Codex CLI (codex-code) — both integration tested
Channels: CLI direct mode

📸 Evidence (Optional)

Unit tests: 17 packages, all PASS (including pkg/multiagent with 28 tests)
Integration tests: Claude CLI (3/3 PASS), Codex CLI (3/3 PASS)
Lint: go vet ./... clean, go build ./... clean

New test files:

pkg/multiagent/blackboard_test.go — 18 tests (CRUD, concurrency, JSON roundtrip, tool actions)
pkg/multiagent/handoff_test.go — 10 tests (handoff execution, tool behavior, agent discovery)

☑️ Checklist

My code/docs follow the style of this project.
I have performed a self-review of my own changes.
I have updated the documentation accordingly.

Merge feat/multi-agent-routing into framework branch.

Resolve conflicts:
- types.go: keep protocoltypes aliases + add FailoverError/ModelConfig
- loop.go: take registry-based AgentLoop rewrite from PR sipeed#131

Implements the base multi-agent collaboration framework:

Phase 1 - Config extension:
- Add Role and SystemPrompt fields to AgentConfig
- Wire them into AgentInstance for per-agent identity

Phase 2 - Blackboard shared context pool:
- pkg/multiagent/blackboard.go: thread-safe key-value store with authorship tracking, scope metadata, and JSON serialization
- pkg/multiagent/blackboard_tool.go: LLM tool for read/write/list/delete

Phase 3 - Handoff mechanism and agent discovery:
- pkg/multiagent/handoff.go: ExecuteHandoff delegates tasks between agents via RunToolLoop, injecting blackboard context
- pkg/multiagent/handoff_tool.go: LLM tool with dynamic agent listing
- pkg/multiagent/list_agents_tool.go: discovery tool for LLM

Phase 4 - AgentLoop integration:
- registryResolver adapter bridges AgentRegistry to multiagent.AgentResolver
- blackboard/handoff/list_agents tools registered for multi-agent configs
- Per-session blackboard via sync.Map, snapshot injected into system prompt
- Handoff tool context propagation for channel/chatID

Design decisions:
- String values only (natural language agent communication)
- Scope field defaults to "shared", extensible for sipeed#119 identity model
- Author field tracks which agent wrote (maps to future S-id)
- Multi-agent tools only registered when >1 agent configured (zero overhead)
- ~2.3KB per session memory budget

Closes: sipeed#294

…amework

Mermaid-based C4 model documentation covering:
- C1 System Context: picoclaw in its ecosystem
- C2 Container: runtime containers and responsibilities
- C3 Component: multi-agent internals (blackboard, handoff, routing, fallback, provider protocol)
- C4 Code Detail: interfaces, tool loop flow, blackboard lifecycle, fallback decision tree
- Sequence Diagrams: handoff, blackboard sync, fallback chain, route resolution, config lifecycle
- Roadmap: phased plan with dependency graph and status tracking

Relates to sipeed#294, sipeed#283

…ackages

Fixes identified by running golangci-lint v2.10.1 (PR sipeed#304 config) with govet, staticcheck, errcheck, revive, gosec enabled:
- Replace interface{} with any (revive: use-any)
- Replace WriteString(fmt.Sprintf(...)) with fmt.Fprintf (staticcheck: QF1012)
- Add doc comments on all exported methods (revive: exported)
- Safe type assertions with ok-check pattern (errcheck/revive)
- Use strings.EqualFold instead of double ToLower (staticcheck: SA6005)
- Add default cases to switch statements (revive)
- Suppress gosec G117 false positive on SessionKey field
- Test improvements: range int, unused params

Zero issues remaining in pkg/multiagent, pkg/agent, pkg/routing. All 47 tests pass.

Relates to sipeed#304, sipeed#294

- Add multi-agent hardening PRP with 4-phase plan based on OpenClaw gap analysis (foundation fix, tool policy, resilience, async)
- Update roadmap with hardening phases and dependency graph
- Update C3 component diagram with planned components and known bugs
- Add 5 new sequence diagrams for planned features (guardrails, tool policy, loop detection, async spawn, cascade stop)
- Document blackboard split-brain bug with fix approach
- Add OpenClaw comparison table and reference map

Add Capabilities []string to the full agent chain (config → instance → resolver → multiagent) as required by issue sipeed#294 spec: "Define a standard Agent interface that includes Capabilities."
- AgentConfig: new capabilities JSON field
- AgentInstance: new Capabilities field, populated from config
- registryResolver: map Capabilities in GetAgentInfo and ListAgents
- AgentInfo: new Capabilities field
- FindAgentsByCapability: helper to query agents by capability tag
- 4 new tests covering match, multi-match, empty, and nil safety

The LLM can now delegate tasks by capability instead of requiring a specific agent_id. The handoff tool resolves the first matching agent via FindAgentsByCapability.
- HandoffTool.Execute: accept "capability" as alternative to "agent_id"
- HandoffTool.Description: display agent capabilities in tool listing
- HandoffTool.Parameters: "task" is the only required field now
- 4 new tests (route by capability, not found, no target, description)
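
To illustrate the routing rule described above, here is a hedged sketch of capability lookup with a first-match fallback. The AgentInfo shape, the FindAgentsByCapability signature, and the resolveTarget helper are assumptions. Preferring an explicit agent_id when both are supplied, and exact (case-sensitive) tag matching, are also assumptions that the commit message does not confirm.

```go
package main

import (
	"errors"
	"fmt"
)

// AgentInfo is a reduced, hypothetical view of a registered agent.
type AgentInfo struct {
	ID           string
	Capabilities []string
}

// FindAgentsByCapability returns, in input order, the agents that
// advertise the given capability tag. A nil slice is safe to pass.
func FindAgentsByCapability(agents []AgentInfo, capability string) []AgentInfo {
	var out []AgentInfo
	for _, a := range agents {
		for _, c := range a.Capabilities {
			if c == capability {
				out = append(out, a)
				break
			}
		}
	}
	return out
}

// resolveTarget prefers an explicit agent ID and otherwise falls back
// to the first agent matching the capability.
func resolveTarget(agents []AgentInfo, agentID, capability string) (string, error) {
	if agentID != "" {
		return agentID, nil
	}
	if capability == "" {
		return "", errors.New("either agent_id or capability is required")
	}
	matches := FindAgentsByCapability(agents, capability)
	if len(matches) == 0 {
		return "", fmt.Errorf("no agent with capability %q", capability)
	}
	return matches[0].ID, nil
}

func main() {
	agents := []AgentInfo{
		{ID: "coder", Capabilities: []string{"go", "review"}},
		{ID: "writer", Capabilities: []string{"docs", "review"}},
	}
	id, err := resolveTarget(agents, "", "docs")
	fmt.Println(id, err)
}
```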

feat: add Capabilities field for capability-based agent routing (sipeed#294)

Tools were bound to a static board at registration time while the system prompt injected data from a separate per-session board. Add BoardAware interface and SetBoard() methods so tools receive the correct session blackboard before each execution cycle.

Prevent infinite handoff loops (A->B->A) and unbounded depth chains. HandoffRequest now carries Depth/Visited/MaxDepth fields that are propagated to target agents. Default max depth is 3.
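
The guard above can be sketched as a pure function over the request. The Depth, Visited, and MaxDepth field names and the default of 3 come from the commit message. The Propagate function, the Task field, and the exact comparison rules are assumptions made for illustration.

```go
package main

import "fmt"

const defaultMaxDepth = 3

// HandoffRequest carries the guard state alongside the task.
type HandoffRequest struct {
	Task     string
	Depth    int      // handoffs already made on this chain
	MaxDepth int      // 0 means "use defaultMaxDepth"
	Visited  []string // agents that have already delegated on this chain
}

// Propagate checks whether agent `from` may hand req off to `target`
// and, if so, returns the request the target should receive.
func Propagate(req HandoffRequest, from, target string) (HandoffRequest, error) {
	maxDepth := req.MaxDepth
	if maxDepth <= 0 {
		maxDepth = defaultMaxDepth
	}
	if req.Depth >= maxDepth {
		return HandoffRequest{}, fmt.Errorf("handoff depth limit %d reached", maxDepth)
	}
	if target == from {
		return HandoffRequest{}, fmt.Errorf("agent %q cannot hand off to itself", from)
	}
	for _, v := range req.Visited {
		if v == target {
			return HandoffRequest{}, fmt.Errorf("cycle detected: %q already delegated on this chain", target)
		}
	}
	// Copy Visited so sibling handoffs never share a backing array.
	visited := append(append([]string(nil), req.Visited...), from)
	return HandoffRequest{Task: req.Task, Depth: req.Depth + 1, MaxDepth: maxDepth, Visited: visited}, nil
}

func main() {
	first, _ := Propagate(HandoffRequest{Task: "summarize"}, "a", "b")
	_, err := Propagate(first, "b", "a") // A->B->A is rejected
	fmt.Println(first.Depth, first.Visited, err)
}
```

Because the state travels with the request and is copied per call, concurrent handoffs do not share mutable guard fields in this sketch.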

Add AllowlistChecker interface and wire it into HandoffTool to control which agents can delegate to which. Default behavior is open (allow all) when no subagents config is present; enforces allow_agents list when configured via CanSpawnSubagent.

Introduce ToolHook with BeforeExecute/AfterExecute lifecycle methods in ToolRegistry. Hooks enable policy enforcement and loop detection pipelines without modifying individual tools. BeforeExecute can block execution; AfterExecute always runs for observability.
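
A minimal sketch of the hook lifecycle described above, assuming hypothetical signatures for ToolHook, Tool, and ToolRegistry.Execute (the real ones are not shown in this PR text). The denyTool hook exists only for the example. The sketch shows a BeforeExecute error skipping the tool while AfterExecute still runs.

```go
package main

import "fmt"

// Tool is a hypothetical tool signature.
type Tool func(args map[string]any) (string, error)

// ToolHook mirrors the two lifecycle methods named in the commit.
type ToolHook interface {
	BeforeExecute(tool string, args map[string]any) error
	AfterExecute(tool string, args map[string]any, result string, err error)
}

type ToolRegistry struct {
	tools map[string]Tool
	hooks []ToolHook
}

// Execute runs BeforeExecute hooks in order; the first error blocks the
// tool. AfterExecute hooks run for every found tool, blocked or not.
func (r *ToolRegistry) Execute(name string, args map[string]any) (string, error) {
	tool, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("tool %q not found", name)
	}
	var result string
	var err error
	for _, h := range r.hooks {
		if err = h.BeforeExecute(name, args); err != nil {
			break
		}
	}
	if err == nil {
		result, err = tool(args)
	}
	for _, h := range r.hooks {
		h.AfterExecute(name, args, result, err)
	}
	return result, err
}

// denyTool is an example hook: it blocks one tool by name and records
// what AfterExecute observes.
type denyTool struct {
	name string
	seen []string
}

func (d *denyTool) BeforeExecute(tool string, _ map[string]any) error {
	if tool == d.name {
		return fmt.Errorf("tool %q blocked by policy", tool)
	}
	return nil
}

func (d *denyTool) AfterExecute(tool string, _ map[string]any, _ string, err error) {
	status := "ok"
	if err != nil {
		status = "error"
	}
	d.seen = append(d.seen, tool+":"+status)
}

func main() {
	hook := &denyTool{name: "rm"}
	reg := &ToolRegistry{
		tools: map[string]Tool{
			"echo": func(map[string]any) (string, error) { return "hi", nil },
			"rm":   func(map[string]any) (string, error) { return "deleted", nil },
		},
		hooks: []ToolHook{hook},
	}
	fmt.Println(reg.Execute("echo", nil))
	fmt.Println(reg.Execute("rm", nil))
	fmt.Println(hook.seen)
}
```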

Cover SetBoard switching, BoardAware interface compliance, depth limit, cycle detection, depth propagation, allowlist block/permit/default-open, AllowlistCheckerFunc adapter, and ToolHook before/after/block/chain behavior.

Add boundary tests for depth limit, self-handoff cycle detection, provider error propagation, JSON unmarshal invalid data, empty blackboard list, hook observability on block, and no-hook/not-found tool guard paths.

Define DefaultToolGroups mapping group references (e.g. "group:fs", "group:web") to concrete tool names, and ResolveToolNames to expand group refs into deduplicated tool name lists. This is the foundation for per-agent tool policies in Phase 2.
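
As a rough illustration of group expansion with deduplication: the DefaultToolGroups and ResolveToolNames names and the "group:fs" and "group:web" references come from the commit message. The tool names inside each group and the handling of unknown groups (silently skipped here) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// DefaultToolGroups maps group references to tool names.
// The member tool names are placeholders, not the project's real list.
var DefaultToolGroups = map[string][]string{
	"group:fs":  {"read_file", "write_file", "list_dir"},
	"group:web": {"web_search", "web_fetch"},
}

// ResolveToolNames expands group refs and returns a deduplicated list
// that preserves first-seen order. Unknown groups expand to nothing
// (an assumption; the real behavior may differ).
func ResolveToolNames(refs []string) []string {
	seen := make(map[string]bool)
	var out []string
	add := func(name string) {
		if !seen[name] {
			seen[name] = true
			out = append(out, name)
		}
	}
	for _, ref := range refs {
		if strings.HasPrefix(ref, "group:") {
			for _, name := range DefaultToolGroups[ref] {
				add(name)
			}
			continue
		}
		add(ref)
	}
	return out
}

func main() {
	fmt.Println(ResolveToolNames([]string{"group:web", "web_fetch", "exec"}))
}
```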

Add ToolPolicyConfig struct with Allow/Deny string slices supporting both individual tool names and group refs (e.g. "group:web"). Add ToolPolicy field to AgentConfig. Backward compatible: nil = full access.

New policy.go with ApplyPolicy (allow/deny filtering) and DepthDenyList (leaf agents lose spawn/handoff/list_agents). Add Clone() for shallow registry copy and Remove() for tool unregistration to ToolRegistry.
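
A sketch of how allow/deny filtering could compose with a shallow registry clone. The ApplyPolicy, DepthDenyList, Clone, and Remove names come from the commit message. Their signatures, the Has helper, the rule that deny wins over allow, and the literal tool names in DepthDenyList are assumptions, and group references are not expanded here.

```go
package main

import "fmt"

// Tool is a hypothetical tool signature.
type Tool func(args map[string]any) (string, error)

type ToolRegistry struct{ tools map[string]Tool }

// Clone copies only the map; the Tool values themselves are shared.
func (r *ToolRegistry) Clone() *ToolRegistry {
	c := &ToolRegistry{tools: make(map[string]Tool, len(r.tools))}
	for name, t := range r.tools {
		c.tools[name] = t
	}
	return c
}

// Remove unregisters a tool; removing an unknown name is a no-op.
func (r *ToolRegistry) Remove(name string) { delete(r.tools, name) }

// Has reports whether a tool is registered (helper for the example).
func (r *ToolRegistry) Has(name string) bool {
	_, ok := r.tools[name]
	return ok
}

// DepthDenyList names the tools a leaf agent loses (names assumed).
var DepthDenyList = []string{"spawn", "handoff", "list_agents"}

// ApplyPolicy treats a non-empty allow list as a whitelist, then
// removes everything in deny, so deny wins on conflict.
func ApplyPolicy(r *ToolRegistry, allow, deny []string) {
	if len(allow) > 0 {
		allowed := make(map[string]bool, len(allow))
		for _, n := range allow {
			allowed[n] = true
		}
		for name := range r.tools {
			if !allowed[name] {
				r.Remove(name)
			}
		}
	}
	for _, n := range deny {
		r.Remove(n)
	}
}

func main() {
	base := &ToolRegistry{tools: map[string]Tool{"read_file": nil, "handoff": nil, "spawn": nil}}
	leaf := base.Clone()
	ApplyPolicy(leaf, nil, DepthDenyList)
	fmt.Println(leaf.Has("handoff"), base.Has("handoff"))
}
```

Filtering a clone rather than the shared registry keeps the parent agent's tool set intact while the leaf loses its delegation tools.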

Apply per-agent tool policy (allow/deny) at startup in registerSharedTools so denied tools are removed before the LLM sees them. Apply depth-based policy in ExecuteHandoff: at max depth, leaf agents lose spawn/handoff/ list_agents to prevent further chaining. Clone is lightweight — shares tool instances, only copies the map.

- groups_test.go: 6 tests for ResolveToolNames (group expansion, individual, mixed, dedup, unknown group, empty)
- policy_test.go: 11 tests for ApplyPolicy, DepthDenyList, Clone, Remove, and policy composition pipeline
- handoff_test.go: 2 depth policy tests (leaf loses spawn/handoff, mid-chain retains all tools)

The retry loop only handled context/token errors, letting 429s and rate-limit responses fail immediately. Add detection for 429, rate_limit, resource_exhausted, overloaded, quota, and too_many_requests errors with exponential backoff (5s, 10s, 20s). Works regardless of whether fallback candidates are configured.
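
The retry behavior described above could look roughly like this. The marker strings and the 5s, 10s, 20s schedule come from the commit message. The function names, the substring-based classification, the sleepLog helper, and the demo error are assumptions. Substring matching on "429" can produce false positives, so real code would likely prefer a typed error or status code where one is available.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

var rateLimitMarkers = []string{
	"429", "rate_limit", "resource_exhausted", "overloaded", "quota", "too_many_requests",
}

// isRateLimited classifies an error by looking for known markers in
// its lower-cased message.
func isRateLimited(err error) bool {
	if err == nil {
		return false
	}
	msg := strings.ToLower(err.Error())
	for _, m := range rateLimitMarkers {
		if strings.Contains(msg, m) {
			return true
		}
	}
	return false
}

// backoff returns 5s, 10s, 20s for attempts 0, 1, 2.
func backoff(attempt int) time.Duration {
	return 5 * time.Second * time.Duration(1<<attempt)
}

// callWithRetry retries rate-limited calls up to three times. The
// sleep function is injected so callers and tests control waiting.
func callWithRetry(call func() error, sleep func(time.Duration)) error {
	const maxRetries = 3
	for attempt := 0; ; attempt++ {
		err := call()
		if err == nil || !isRateLimited(err) || attempt >= maxRetries {
			return err
		}
		sleep(backoff(attempt))
	}
}

// sleepLog records requested sleeps instead of blocking (demo helper).
type sleepLog struct {
	calls int
	total time.Duration
}

func (s *sleepLog) Sleep(d time.Duration) { s.calls++; s.total += d }

// errDemoRateLimit is a made-up provider error used by the example.
var errDemoRateLimit = errors.New("HTTP 429: rate_limit exceeded")

func main() {
	var log sleepLog
	calls := 0
	err := callWithRetry(func() error {
		calls++
		if calls < 3 {
			return errDemoRateLimit
		}
		return nil
	}, log.Sleep)
	fmt.Println(err, calls, log.total)
}
```

In production the sleep argument would be time.Sleep, or a context-aware wait so cancellation interrupts the backoff.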

Implements LoopDetector as a ToolHook with four detection engines:
- Generic repeat: blocks after N identical tool+args calls (default 20)
- Ping-pong: detects A,B,A,B alternation with no-progress evidence
- No-progress: tracks result hashes to distinguish stuck from progressing
- Circuit breaker: emergency stop for any tool with identical outcomes

Per-session isolation via context key, sliding window (default 30), configurable thresholds matching OpenClaw production values.
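
Only the first engine (generic repeat) is sketched below, for a single session and without locking. The RepeatDetector type, the Verdict values, and the hashing scheme are assumptions. The window of 30 and the warn/block thresholds of 10 and 20 are the defaults quoted in this PR. Whether the real detector counts identical calls anywhere in the window or only consecutive ones is not stated, so counting within the window is a guess.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

type Verdict int

const (
	Allow Verdict = iota
	Warn
	Block
)

// RepeatDetector counts identical tool+args calls in a sliding window.
type RepeatDetector struct {
	Window, WarnAt, BlockAt int
	recent                  []string
}

// callKey hashes the tool name and arguments. encoding/json sorts map
// keys, so equal argument maps produce equal hashes. A marshal error
// leaves b empty, and the key then depends on the tool name alone.
func callKey(tool string, args map[string]any) string {
	b, _ := json.Marshal(args)
	sum := sha256.Sum256(append([]byte(tool+"\x00"), b...))
	return hex.EncodeToString(sum[:])
}

// Observe records a call and returns the verdict for it.
func (d *RepeatDetector) Observe(tool string, args map[string]any) Verdict {
	key := callKey(tool, args)
	d.recent = append(d.recent, key)
	if len(d.recent) > d.Window {
		d.recent = d.recent[len(d.recent)-d.Window:] // evict oldest
	}
	n := 0
	for _, k := range d.recent {
		if k == key {
			n++
		}
	}
	switch {
	case n >= d.BlockAt:
		return Block
	case n >= d.WarnAt:
		return Warn
	default:
		return Allow
	}
}

func main() {
	d := &RepeatDetector{Window: 30, WarnAt: 10, BlockAt: 20}
	var last Verdict
	for i := 0; i < 20; i++ {
		last = d.Observe("read_file", map[string]any{"path": "/tmp/x"})
	}
	fmt.Println(last == Block)
}
```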

- Inject session key into context before LLM iteration loop
- Register LoopDetector as ToolHook per agent in registerSharedTools
- Uses production defaults: warn@10, block@20, circuit breaker@30

Coverage for all four detection engines:
- Generic repeat: below/at/above warning and critical thresholds
- Ping-pong: alternation with and without progress evidence
- Circuit breaker: no-progress streak and progress-resets-streak
- Session isolation, reset, sliding window eviction
- Registry integration, context key, hash determinism

RunRegistry tracks active handoff/spawn runs with parent-child relationships. CascadeStop recursively cancels a run and all its descendants with cycle-safe seen-set protection. Supports: Register, Deregister, CascadeStop, StopAll, GetChildren.
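
A minimal sketch of cascade cancellation over a parent-child run table. The RunRegistry, Register, Deregister, and CascadeStop names come from the commit message. The signatures, the string run keys, and the returned count are assumptions. The seen set is what keeps a malformed A→B→A parent chain from recursing forever.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

type run struct {
	parent string
	cancel context.CancelFunc
}

// RunRegistry tracks active runs and who spawned them.
type RunRegistry struct {
	mu   sync.Mutex
	runs map[string]run
}

func NewRunRegistry() *RunRegistry {
	return &RunRegistry{runs: make(map[string]run)}
}

func (r *RunRegistry) Register(key, parent string, cancel context.CancelFunc) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.runs[key] = run{parent: parent, cancel: cancel}
}

func (r *RunRegistry) Deregister(key string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.runs, key)
}

// CascadeStop cancels key and all of its descendants and returns how
// many runs were stopped.
func (r *RunRegistry) CascadeStop(key string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.stopLocked(key, make(map[string]bool))
}

// stopLocked must be called with r.mu held.
func (r *RunRegistry) stopLocked(key string, seen map[string]bool) int {
	if seen[key] {
		return 0 // already handled: guards against parent cycles
	}
	seen[key] = true
	target, ok := r.runs[key]
	if !ok {
		return 0
	}
	var children []string
	for k, c := range r.runs {
		if c.parent == key {
			children = append(children, k)
		}
	}
	stopped := 1
	for _, child := range children {
		stopped += r.stopLocked(child, seen)
	}
	target.cancel()
	delete(r.runs, key)
	return stopped
}

func main() {
	reg := NewRunRegistry()
	parentCtx, cancelParent := context.WithCancel(context.Background())
	childCtx, cancelChild := context.WithCancel(parentCtx)
	reg.Register("parent", "", cancelParent)
	reg.Register("child", "parent", cancelChild)
	fmt.Println(reg.CascadeStop("parent"), parentCtx.Err(), childCtx.Err())
}
```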

- HandoffTool wraps context with cancel, registers run in RunRegistry
- Deregisters on completion (normal or error) via defer
- Propagates ParentRunKey to nested handoffs for correct tree structure
- AgentLoop creates shared RunRegistry, passes to all HandoffTools

Coverage for: single run, parent-child chain, mid-chain stop, multiple siblings, cycle protection (A→B→A), non-existent key, StopAll, GetChildren, Go context tree propagation.

Replace single-strategy context overflow handling with a tiered cascade:
- Tier 1: Truncate oversized tool results (>8000 chars) with newline-aware boundary detection. Cheapest recovery: no messages dropped.
- Tier 2: Drop oldest 50% of messages (existing forceCompression).
- Tier 3: Both truncation + compression for maximum space reclamation.

Each retry escalates to the next tier, giving the LLM progressively more aggressive context reduction before giving up.
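
Tier 1 can be illustrated with a small newline-aware truncation helper. The function name, the footer text, and the rule "prefer a newline only if it falls in the second half of the limit" are assumptions. The sketch measures bytes rather than characters and backs up to a UTF-8 rune boundary so it never splits a multi-byte character.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// truncateToolResult shortens s to at most limit bytes of content and
// appends a footer. It prefers to cut at the last newline before the
// limit when that newline is past the halfway point.
// limit is assumed to be positive.
func truncateToolResult(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	cut := limit
	// Back up so the cut does not land inside a multi-byte rune.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	if i := strings.LastIndexByte(s[:cut], '\n'); i > limit/2 {
		cut = i
	}
	return fmt.Sprintf("%s\n[truncated %d of %d bytes]", s[:cut], len(s)-cut, len(s))
}

func main() {
	long := strings.Repeat("line of tool output\n", 1000)
	out := truncateToolResult(long, 8000)
	fmt.Println(len(long), len(out))
}
```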

Cover 5 scenarios: oversized result truncated with footer, small result untouched, non-tool messages preserved, newline boundary preference, and multiple tool results in single session.

AuthRotator manages round-robin selection across multiple API keys per provider. Each key tracks its own cooldown state via CooldownTracker (2-track: transient 1min→1hr exponential, billing 5h→24h). AuthRotatingProvider wraps multiple LLM providers and delegates to the best available key on each request. On retriable failures, the failing key is put in cooldown and subsequent requests use the next available key.
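
A hedged sketch of round-robin key selection with per-key cooldown. Only the transient track is modelled (1 minute growing to a 1 hour cap, as stated above). The doubling factor, the method names, and folding the cooldown tracking into the rotator itself are assumptions. The billing track and the provider wrapper are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// AuthRotator hands out API keys round-robin, skipping keys in cooldown.
type AuthRotator struct {
	mu       sync.Mutex
	keys     []string
	next     int
	failures map[string]int
	until    map[string]time.Time
}

func NewAuthRotator(keys ...string) *AuthRotator {
	return &AuthRotator{
		keys:     keys,
		failures: make(map[string]int),
		until:    make(map[string]time.Time),
	}
}

// Next returns the next key that is not cooling down.
func (r *AuthRotator) Next() (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	now := time.Now()
	for i := 0; i < len(r.keys); i++ {
		idx := (r.next + i) % len(r.keys)
		if now.Before(r.until[r.keys[idx]]) {
			continue // still in cooldown
		}
		r.next = (idx + 1) % len(r.keys)
		return r.keys[idx], nil
	}
	return "", errors.New("all API keys are in cooldown")
}

// MarkFailure puts key in cooldown: 1m, 2m, 4m, ... capped at 1h.
func (r *AuthRotator) MarkFailure(key string) time.Duration {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.failures[key]++
	d := time.Hour
	if n := r.failures[key]; n <= 6 {
		d = time.Minute << (n - 1)
	}
	r.until[key] = time.Now().Add(d)
	return d
}

// MarkSuccess clears the failure streak and cooldown for key.
func (r *AuthRotator) MarkSuccess(key string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.failures, key)
	delete(r.until, key)
}

func main() {
	r := NewAuthRotator("key-a", "key-b")
	first, _ := r.Next()
	fmt.Println(first, r.MarkFailure(first))
	second, _ := r.Next()
	fmt.Println(second)
}
```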

Add api_keys array to ProviderConfig for multi-key support. ResolveAPIKeys() returns api_keys if set, otherwise wraps api_key. Factory detects multiple keys and creates AuthRotatingProvider. Backward compatible: single api_key works unchanged.

12 tests covering: round-robin selection, cooldown skipping, all-in-cooldown, success reset, available count, profile builder, rotating provider success rotation, failure marking, all-cooldown error, single key, concurrent access, and billing vs standard cooldown durations.

Async spawn (fire-and-forget) with per-parent semaphore-based concurrency limiting and buffered announcement channels for result delivery.

Includes:
- SpawnManager: goroutine-based async agent invocation with configurable per-parent concurrency limits and timeouts
- Announcer: per-session buffered channels with back-pressure (drops oldest on overflow) for spawn result delivery
- SpawnTool: LLM-callable tool for async agent spawning with allowlist enforcement and capability-based routing
- Config: MaxChildrenPerAgent + SpawnTimeoutSec in SubagentsConfig
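
Two of the mechanisms above, sketched under assumptions: a channel-based semaphore with non-blocking acquire, and a buffered announcer that drops the oldest message on overflow. All type and method names are hypothetical, the announcer handles a single session, and messages are plain strings for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// semaphore limits concurrent children; capacity is the limit.
type semaphore chan struct{}

// tryAcquire takes a slot without blocking and reports success.
func (s semaphore) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot; call it once per successful tryAcquire.
func (s semaphore) release() { <-s }

// Announcer buffers spawn results for one session.
type Announcer struct {
	mu sync.Mutex // serializes producers so drop-then-send is atomic
	ch chan string
}

func NewAnnouncer(size int) *Announcer {
	if size < 1 {
		size = 1 // an unbuffered channel could never accept a message here
	}
	return &Announcer{ch: make(chan string, size)}
}

// Deliver never blocks: when the buffer is full it discards the oldest
// message to make room for the new one.
func (a *Announcer) Deliver(msg string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for {
		select {
		case a.ch <- msg:
			return
		default:
			select {
			case <-a.ch: // drop oldest
			default: // a consumer drained it first; just retry the send
			}
		}
	}
}

// Drain returns everything currently buffered without blocking.
func (a *Announcer) Drain() []string {
	var out []string
	for {
		select {
		case m := <-a.ch:
			out = append(out, m)
		default:
			return out
		}
	}
}

func main() {
	sem := make(semaphore, 2)
	fmt.Println(sem.tryAcquire(), sem.tryAcquire(), sem.tryAcquire())

	a := NewAnnouncer(2)
	for _, m := range []string{"r1", "r2", "r3"} {
		a.Deliver(m)
	}
	fmt.Println(a.Drain())
}
```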

ProcessScope tracks PIDs per session key for namespace-like isolation. Agents can only see and kill their own processes.
- Register/Deregister/Owns for session-PID binding
- ListPIDs filters dead processes via Unix signal 0
- KillAll sends SIGTERM to all session-owned processes
- Cleanup removes all tracking for a session

DedupCache provides idempotent execution guarantees for spawn and announce operations using deterministic keys with TTL-based expiry and periodic background sweep.
- Check/CheckWithResult for duplicate detection
- BuildSpawnKey: deterministic hash of (from, to, task)
- BuildAnnounceKey: deterministic key for announcement dedup
- Configurable TTL with automatic expired entry cleanup
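
A minimal sketch of TTL-based deduplication with a deterministic spawn key. The DedupCache, Check, and BuildSpawnKey names come from the commit message. Their signatures, the SHA-256 key format, and the explicit Sweep method (standing in for the periodic background sweep) are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

// BuildSpawnKey hashes (from, to, task). The NUL separators keep
// ("ab", "", x) and ("a", "b", x) from colliding.
func BuildSpawnKey(from, to, task string) string {
	sum := sha256.Sum256([]byte(from + "\x00" + to + "\x00" + task))
	return "spawn:" + hex.EncodeToString(sum[:])
}

// DedupCache remembers keys for a fixed TTL.
type DedupCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	expires map[string]time.Time
}

func NewDedupCache(ttl time.Duration) *DedupCache {
	return &DedupCache{ttl: ttl, expires: make(map[string]time.Time)}
}

// Check reports whether key was already seen within the TTL. A first
// (or expired) sighting records the key and returns false.
func (c *DedupCache) Check(key string) bool {
	now := time.Now()
	c.mu.Lock()
	defer c.mu.Unlock()
	if exp, ok := c.expires[key]; ok && now.Before(exp) {
		return true
	}
	c.expires[key] = now.Add(c.ttl)
	return false
}

// Sweep deletes expired entries and returns how many were removed.
// A background goroutine could call this on a ticker.
func (c *DedupCache) Sweep() int {
	now := time.Now()
	c.mu.Lock()
	defer c.mu.Unlock()
	removed := 0
	for k, exp := range c.expires {
		if !now.Before(exp) {
			delete(c.expires, k)
			removed++
		}
	}
	return removed
}

func main() {
	cache := NewDedupCache(time.Minute)
	key := BuildSpawnKey("planner", "coder", "write tests")
	fmt.Println(cache.Check(key), cache.Check(key))
}
```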

Integrates Phase 4 components into the core agent orchestrator:
- AgentLoop creates Announcer and SpawnManager at startup
- SpawnTool registered alongside existing multi-agent tools
- Spawn announcements drained between LLM iterations and injected as system messages for LLM context
- Tool contexts wired for spawn_agent (board + channel)
- Config-driven spawn limits (MaxChildrenPerAgent, SpawnTimeoutSec)

27 tests covering Phase 4 components:
- Spawn (10): accepted, concurrency limit, cascade stop, context timeout, parallel fan-out, announcer deliver/drain, pending, back-pressure, concurrent delivery, cleanup
- Dedup (10): first call, duplicate, different keys, expired entry, check with result, size, concurrent access (100 goroutines), sweep, spawn key deterministic, announce key format
- Process scope (7): register/owns, deregister, cross-session isolation, list PIDs filters dead, kill all, cleanup, empty session

nikolasdehor

This is an impressive and well-structured multi-agent framework. The architecture is sound (blackboard pattern, AgentResolver interface to avoid circular imports, depth-based recursion guards). I have some concerns that should be addressed before this leaves WIP:

Correctness / Safety

1. Goroutine leak in AsyncSpawn: If the parent context is cancelled before the spawned goroutine completes, ExecuteHandoff may block indefinitely if the provider does not respect context cancellation. The context.WithTimeout mitigates this, but if the underlying HTTP call ignores the context (some providers do), the goroutine leaks. Consider adding a select with a force-kill timer as a last resort.
2. Semaphore count() race in error message: In AsyncSpawn, the rejection error message calls sem.count() after acquire() returned false. At that point, other goroutines may have released slots, so the count is misleading. Minor, but the error message implies the count is accurate when it is not.
3. HandoffTool state mutation is not thread-safe: ExecuteHandoff mutates ht.depth, ht.visited, ht.maxDepth directly on the tool. If two handoffs are executing concurrently using the same tool instance (which is possible with spawns), they will race on these fields. These should be passed through context or cloned per-invocation.
4. LoopDetector session state unbounded growth: The sessions sync.Map grows without bound -- one entry per session, never cleaned up. Long-running gateways will accumulate stale sessions. Add a TTL-based eviction or hook into session cleanup.

Design

Auth rotation belongs in a separate PR: pkg/providers/auth_rotation.go (185 lines + 343 lines of tests) is unrelated to multi-agent collaboration. Mixing it here makes the PR harder to review and bisect.
Excessive external references in comments: Comments like 'inspired by NVIDIA CUDA stream scheduling', 'Google MapReduce uses similar fan-out caps', 'Microsoft Azure Functions uses similar timeout patterns' add noise without value. The patterns speak for themselves -- the comments should explain why the code does what it does, not draw parallels to unrelated systems.
tools.DepthDenyList approach: Stripping tools at max depth is clever for preventing infinite chaining, but it changes the agent capability set mid-conversation. This could confuse the LLM if it planned to use a tool that was available in previous turns but is now gone. Consider returning a clear error from the tool instead of removing it entirely.

Testing

Good test coverage overall (28 tests for blackboard, cycle detection, depth limits). The mockProvider approach is clean. Would like to see a test for the concurrent mutation issue in point 3.

Overall: solid foundation, but needs the thread-safety fix in point 3 and the session leak in point 4 before it is ready for merge. The auth rotation should be split out.

Leeaandrob added 3 commits February 18, 2026 11:10

merge: incorporate multi-agent routing (PR sipeed#131)

587ef4d

Merge remote-tracking branch 'upstream/main' into feat/multi-agent-fr…

c9a5769

…amework

Leeaandrob added the type: enhancement New feature or request label Feb 18, 2026

Leeaandrob added 2 commits February 18, 2026 12:04

edouard-claude mentioned this pull request Feb 18, 2026

feat: Base multi-agent collaboration framework #409

Closed

10 tasks

Leeaandrob and others added 23 commits February 18, 2026 13:50

Merge pull request #1 from edouard-claude/feat/multi-agent-enhancements

c5145f1

feat: add Capabilities field for capability-based agent routing (sipeed#294)

feat: add recursion guard with depth limit and cycle detection

eb707ff

test: add edge-case coverage for phase 1 hardening

660a70c

feat: add ToolPolicyConfig for per-agent tool filtering

f3a838f

feat: add policy engine with registry Clone and Remove

0b251e7

fix: align struct fields and method signatures for goimports

09061d8

feat: wire loop detector into agent tool execution pipeline

076f526

test: add comprehensive cascade stop tests

94675e0

Leeaandrob added 10 commits February 19, 2026 13:26

test: add tool result truncation tests

372dfa0

nikolasdehor reviewed Feb 20, 2026

nikolasdehor mentioned this pull request Feb 23, 2026

Introducing Swarm Mode: Multi-instance Collaboration for PicoClaw #284

Open

is-Xiaoen mentioned this pull request Feb 24, 2026

[Feature] JSONL-backed session persistence with Store interface #711

Open

3 tasks

Leeaandrob wants to merge 38 commits into sipeed:main from Leeaandrob:feat/multi-agent-framework
