🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉
OpenClaw 2026.4.1.2 has been successfully built from source and is running locally on Ubuntu 22.04.
OpenClaw 2026.4.1-beta.1 has been successfully built from source and is running locally on Ubuntu 22.04. Build stack: Node v22.22.0 · pnpm v10.32.1 (Corepack) · rolldown v1.0.0-rc.12 · TypeScript
Chinese README · Chinese Specification · Contributing · Code of Conduct · Good First Issues
- Dev.9: integrate the hermes-agent into openclaw project. #10
- https://github.com/gungwang/claude-into-openclaw/pull/10/changes
- YouTube Video: Claude Code, Hermes Agent, OpenClaw Integration (Trinity)
Analyze the hermes-agent repository as an open-source AI coding agent framework and extract practical improvements for OpenClaw that are compatible with OpenClaw's current architecture. The analysis identifies hermes-agent features that are absent from OpenClaw or significantly superior to its existing capabilities.
Builds on prior work in SPEC_OPENCLOW_IMPROVEMENTS_FROM_CLAW_CODE_ANALYSIS_V2.md. Assumes Tracks A–G from that spec are implemented or in progress. The track lettering in this document (A–G) is independent and specific to hermes-agent–derived improvements.
The hermes-agent codebase is a production-grade Python AI agent framework with a shared agent/tool substrate exposed through multiple surfaces (CLI, gateway, ACP, MCP, batch, RL). Key architectural strengths:
- Session persistence with SQLite WAL + FTS5, schema migration, write-contention handling, and session lineage tracking.
- Training & evaluation pipeline including batch trajectory generation, RL training CLI, toolset distributions, SWE benchmark runners, and multi-backend environment management.
- Supply-chain security with threat-pattern scanning (20+ categories), manifest-based skill synchronization, URL safety validation, and OSV vulnerability checking.
- Rich tool ecosystem spanning browser automation (10 tools, 3 provider backends), Mixture of Agents, voice/TTS, image generation, background process monitoring, and checkpoint management.
- Plugin architecture with pre/post tool+LLM hooks, context engine replacement, live message injection, and full CLI lifecycle management.
- Gateway platform coverage for 15+ messaging platforms including Chinese enterprise platforms (WeCom, DingTalk, Feishu, WeChat).
This specification identifies 7 improvement tracks that would bring hermes-agent–caliber capabilities to OpenClaw while preserving OpenClaw's existing TypeScript architecture, plugin system, and channel framework.
- SPEC_OPENCLOW_IMPROVEMENTS_FROM_HERMES_AGENT_ANALYSIS.md
- SPEC_OPENCLOW_IMPROVEMENTS_FROM_HERMES_AGENT_ANALYSIS_zh.md
- PLAN-hermesAgentAnalysis.prompt.md
- PLAN-hermesAgentAnalysis_zh.prompt.md
- HERMES_OPENCLAW_ADRS.md
- HERMES_OPENCLAW_CONTRIBUTOR_GUIDE.md
- HERMES_OPENCLAW_EXECUTION_PLANS.md
- HERMES_OPENCLAW_TECHNICAL_REFERENCE.md
- HERMES_TEST_FIXES_TECHNICAL_REPORT.md
- AB#dev.2: copy my version into a new repo for research and demo. #1
This repository serves as a bridge between Claude Code's architectural insights and OpenClaw's agent platform. By analyzing Claude Code's tool/command inventory, agent harness patterns, and runtime structures, we aim to enhance OpenClaw with:
- Improved Security: Canonical identity layers, policy decision traceability, and skill vetting with runtime trust labels
- Enhanced Power: Adapter maturity frameworks, mode contract testing, and deterministic routing with explainability
- Greater Intelligence: Route quality benchmarking, session event journals for replay/debug, and collision-safe tool resolution
- Token Efficiency: Better compaction strategies informed by harness lifecycle patterns and context management techniques
This work combines Claude Code features, functionality, and architectural patterns with OpenClaw's existing strengths (agent loop, streaming lifecycle, multi-agent delegation, transcript hygiene) to create migration-grade observability and adapter ergonomics.
📋 For detailed improvement specifications, see:
- README.md
- SPEC_OPENCLOW_IMPROVEMENTS_FROM_CLAW_CODE_ANALYSIS_V2.md
- Chinese version: specification
- CLAUDE_OPENCLOW_EXECUTION_PLANS.md
- CLAUDE_OPENCLAW_TECHNICAL_REFERENCE.md
- README_zh.md
- CODE_OF_CONDUCT.md
- CODE_OF_CONDUCT_zh.md
- CONTRIBUTING.md
- CONTRIBUTING_zh.md
- GOOD_FIRST_ISSUES.md
- GOOD_FIRST_ISSUES_zh.md
Plan/specification only. No implementation changes included.
Analyze the claw-code repository as a Claude-code-style harness mirror and extract practical improvements for OpenClaw (features, skills, functionality, agent architecture) that are compatible with OpenClaw’s current documented design.
The claw-code codebase currently functions as a high-fidelity inventory + simulation scaffold:
- Broad mirrored command/tool surfaces via snapshots (207 command entries, 184 tool entries).
- Good CLI exploration/reporting scaffolding.
- Limited real runtime semantics (many placeholder/simulated handlers).
This is useful for OpenClaw because it highlights what a large harness inventory needs beyond baseline functionality:
- canonical identity and deduping for huge command/tool surfaces
- deterministic routing and explainability
- strict parity governance (metadata → dry-run → active runtime)
- mode contract testing (remote/ssh/teleport/etc.)
- richer adapter lifecycle and policy visibility
OpenClaw already has many mature primitives (agent loop, streaming lifecycle, transcript hygiene, compaction, hooks, multi-agent/delegation). The opportunity is to add migration-grade observability and adapter ergonomics so OpenClaw can absorb larger tool ecosystems with less ambiguity and better safety posture.
- `commands_snapshot.json` and `tools_snapshot.json` drive command/tool catalogs.
- Command/tool execution shims frequently return “mirrored ... would handle ...” messages.
- Many subsystem packages are placeholder metadata wrappers.
OpenClaw can benefit from a stronger “inventory governance” layer whenever importing third-party skills/tools or mirroring external ecosystems.
Observed from snapshots:
- Commands: 207 total, 141 unique names (high duplicate display-name rate).
- Tools: 184 total, 94 unique names; heavy repetition of generic names (`prompt`, `UI`, `constants`).
As tool/plugin ecosystems scale, name collisions become common. Name-only routing/lookup quickly gets brittle.
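To make the collision rate concrete: 207 − 141 = 66 command entries and 184 − 94 = 90 tool entries reuse a display name that is already taken, which is roughly 32% and 49% of each catalog. The sketch below shows one way such a check could be run over any list of names. `summarizeNames` and `NameSummary` are hypothetical names introduced for illustration; they are not part of claw-code or OpenClaw.

```typescript
// Hypothetical helper: summarizes how often display names repeat in a catalog.
interface NameSummary {
  total: number;
  unique: number;
  duplicateRate: number; // share of entries whose name is already used by another entry
  repeated: string[];    // names that occur more than once, sorted
}

function summarizeNames(names: string[]): NameSummary {
  const counts = new Map<string, number>();
  for (const name of names) {
    counts.set(name, (counts.get(name) ?? 0) + 1);
  }
  const repeated = Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([name]) => name)
    .sort();
  const total = names.length;
  const unique = counts.size;
  const duplicateRate = total === 0 ? 0 : (total - unique) / total;
  return { total, unique, duplicateRate, repeated };
}

// Tiny illustrative input, not taken from the real snapshots.
const sampleSummary = summarizeNames(["prompt", "UI", "prompt", "constants", "UI", "prompt"]);
```

Run against the real snapshot files, the same function would reproduce the totals quoted above, assuming the display name is the field being compared.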
Runtime mode handlers in claw-code (remote/ssh/teleport/direct/deep-link) are mostly placeholders.
OpenClaw already has real agent-loop machinery and runtime queues. Codifying mode contracts and diagnostics can prevent future regressions and improve operator confidence.
claw-code has parity audit concepts but weak fallback behavior when local archive is missing.
OpenClaw can adopt the parity-level pattern for optional features/skills/providers, turning “supported/not supported” into measurable maturity bands.
OpenClaw documentation indicates these strong foundations already exist:
- Serialized agent loop + lifecycle streams + wait semantics.
- Queue lanes and per-session consistency guarantees.
- Transcript hygiene and provider-specific sanitization rules.
- Session compaction + pre-compaction memory flush.
- Multi-agent/delegate architecture with policy boundaries.
- Internal/plugin hooks at key lifecycle points.
Therefore this spec does not propose replacing core OpenClaw architecture; it proposes additive improvements on top.
Human-readable names are not globally unique in large ecosystems.
Add canonical identity metadata for command/tool registry entries:
- `id` (stable unique, namespaced)
- `displayName`
- `namespace` (core/plugin/skill/provider/local)
- `version` or source digest
- `capabilityClass` (read, write, execute, network, messaging, scheduling)
- deterministic lookup
- collision-safe routing
- better audit trails
- Registry rejects identity collisions on `id`.
- Routing, status, and diagnostics surfaces expose canonical IDs.
- Legacy name-based lookup remains available but warns on ambiguity.
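A minimal sketch of how the canonical identity layer could look, assuming an in-memory registry. The field names come from the proposal above; the class `ToolRegistry`, its methods, the ID format, and the tie-breaking rule (lowest ID wins) are assumptions made for illustration and do not describe OpenClaw's actual registry.

```typescript
// Illustrative only: ToolRegistry and its methods are hypothetical.
type Namespace = "core" | "plugin" | "skill" | "provider" | "local";
type CapabilityClass = "read" | "write" | "execute" | "network" | "messaging" | "scheduling";

interface RegistryEntry {
  id: string;          // stable, unique, namespaced (format assumed here)
  displayName: string; // human-readable, not guaranteed unique
  namespace: Namespace;
  version: string;     // version or source digest
  capabilityClass: CapabilityClass;
}

interface LookupResult {
  entry: RegistryEntry | undefined;
  warnings: string[];
}

class ToolRegistry {
  private byId = new Map<string, RegistryEntry>();
  private byName = new Map<string, RegistryEntry[]>();

  // Rejects identity collisions on id; duplicate display names are allowed.
  register(entry: RegistryEntry): void {
    if (this.byId.has(entry.id)) {
      throw new Error(`identity collision on id "${entry.id}"`);
    }
    this.byId.set(entry.id, entry);
    const sameName = this.byName.get(entry.displayName) ?? [];
    sameName.push(entry);
    this.byName.set(entry.displayName, sameName);
  }

  getById(id: string): RegistryEntry | undefined {
    return this.byId.get(id);
  }

  // Legacy name-based lookup: still resolves, but warns when the name is ambiguous.
  lookupByName(displayName: string): LookupResult {
    const candidates = this.byName.get(displayName) ?? [];
    if (candidates.length <= 1) {
      return { entry: candidates[0], warnings: [] };
    }
    // Deterministic choice (assumption): the lexicographically smallest id wins.
    const ids = candidates.map((c) => c.id).sort();
    const entry = candidates.find((c) => c.id === ids[0]);
    return { entry, warnings: [`ambiguous name "${displayName}" matches: ${ids.join(", ")}`] };
  }
}

const registry = new ToolRegistry();
registry.register({ id: "core/prompt", displayName: "prompt", namespace: "core", version: "1.0.0", capabilityClass: "read" });
registry.register({ id: "plugin.acme/prompt", displayName: "prompt", namespace: "plugin", version: "0.3.1", capabilityClass: "network" });
const legacyLookup = registry.lookupByName("prompt");
```

Here `legacyLookup.entry` resolves to `core/prompt` and carries one ambiguity warning, while `getById` stays unambiguous for both entries.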
When tool surfaces grow, misrouting is expensive and hard to debug.
Introduce a route explainability format and benchmark set:
- exact-match / alias / semantic / policy-prior signals
- per-candidate score breakdown
- top-k with rationale
- offline benchmark suite for regression testing
- easier debugging
- measurable route quality over releases
- `route --explain`-style output in internal diagnostics.
- Stable benchmark corpus committed in docs/test assets.
- Route quality gates in CI for critical intents.
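The explainability format could be sketched as a per-candidate score breakdown over the four signals named above. The weights, the type names, and the `explainRoute` function are assumptions chosen for illustration; the spec does not define how signals are combined.

```typescript
// Illustrative only: weights and names are assumptions, not OpenClaw's router.
interface RouteSignals {
  exactMatch: number;  // each signal is assumed to lie in [0, 1]
  alias: number;
  semantic: number;
  policyPrior: number;
}

interface RouteCandidate { id: string; signals: RouteSignals; }

interface ExplainedRoute {
  id: string;
  total: number;
  breakdown: RouteSignals; // weighted contribution of each signal
  rationale: string;
}

const SIGNAL_KEYS: (keyof RouteSignals)[] = ["exactMatch", "alias", "semantic", "policyPrior"];
const WEIGHTS: RouteSignals = { exactMatch: 4, alias: 3, semantic: 2, policyPrior: 1 };

function explainRoute(candidates: RouteCandidate[], k: number): ExplainedRoute[] {
  const explained = candidates.map((candidate): ExplainedRoute => {
    const breakdown: RouteSignals = { exactMatch: 0, alias: 0, semantic: 0, policyPrior: 0 };
    let total = 0;
    let strongest: keyof RouteSignals = "exactMatch";
    for (const key of SIGNAL_KEYS) {
      breakdown[key] = candidate.signals[key] * WEIGHTS[key];
      total += breakdown[key];
      if (breakdown[key] > breakdown[strongest]) strongest = key;
    }
    return { id: candidate.id, total, breakdown, rationale: `strongest signal: ${strongest}` };
  });
  // Highest score first; ties broken by id so the ordering stays deterministic.
  explained.sort((a, b) => b.total - a.total || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  return explained.slice(0, k);
}

const top2 = explainRoute(
  [
    { id: "skill.web/find", signals: { exactMatch: 0, alias: 0, semantic: 0.5, policyPrior: 1 } },
    { id: "plugin.acme/search", signals: { exactMatch: 0, alias: 1, semantic: 0.25, policyPrior: 0 } },
    { id: "core/search", signals: { exactMatch: 1, alias: 0, semantic: 0, policyPrior: 0 } },
  ],
  2,
);
```

With these assumed weights, `core/search` ranks first on its exact match and `plugin.acme/search` second on its alias, and each result carries the breakdown needed to explain why.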
Binary “exists vs works” hides real maturity.
Adopt parity/maturity levels for tools/commands/skills:
- L0: discoverable metadata
- L1: schema-validated + listed
- L2: dry-run semantics + policy checks
- L3: active runtime support in controlled scope
- L4: production-hardened (telemetry + replay confidence)
- honest capability reporting
- clearer roadmap for contributors
- machine-readable maturity report artifact
- docs-generated capability tables from artifact
- every non-experimental tool tagged with maturity level
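One possible shape for the machine-readable maturity report is sketched below. The level names follow the L0 to L4 bands above; `ToolMaturity`, `MaturityReport`, and `buildMaturityReport` are hypothetical names, and the report layout is an assumption rather than a defined schema.

```typescript
// Illustrative only: the report schema is an assumption.
type MaturityLevel = "L0" | "L1" | "L2" | "L3" | "L4";

interface ToolMaturity {
  id: string;
  experimental: boolean;
  level: MaturityLevel | null; // null means no maturity tag yet
}

interface MaturityReport {
  counts: Record<MaturityLevel, number>;
  untagged: string[]; // non-experimental tools missing a tag (violates the criterion above)
}

function buildMaturityReport(tools: ToolMaturity[]): MaturityReport {
  const counts: Record<MaturityLevel, number> = { L0: 0, L1: 0, L2: 0, L3: 0, L4: 0 };
  const untagged: string[] = [];
  for (const tool of tools) {
    if (tool.level !== null) {
      counts[tool.level] += 1;
    } else if (!tool.experimental) {
      untagged.push(tool.id);
    }
  }
  return { counts, untagged };
}

const maturityReport = buildMaturityReport([
  { id: "core/read-file", experimental: false, level: "L4" },
  { id: "plugin.acme/deploy", experimental: false, level: "L2" },
  { id: "skill.web/scrape", experimental: false, level: null },
  { id: "local/prototype", experimental: true, level: null },
]);

// JSON form that a docs generator or CI gate could consume.
const maturityArtifact = JSON.stringify(maturityReport, null, 2);
```

A CI check could fail the build whenever `untagged` is non-empty, which is one way to keep the labels from going stale.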
Users and operators need “why blocked/allowed” answers with reproducible logic.
Extend policy decision logging with structured reason codes:
- capability denied
- namespace denied
- risk-tier denied
- missing approval context
- channel-policy conflict
- easier compliance reviews
- faster support/debug
- every blocked tool call includes reason code + policy source pointer
- lifecycle stream can emit policy decision events in verbose/debug mode
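The requirement that every blocked call carries a reason code and a policy source pointer can be encoded in the type itself. The sketch below assumes a simple rule list; the reason-code spellings, `PolicyRule`, and `evaluatePolicy` are illustrative, and only three of the five reason codes listed above are exercised.

```typescript
// Illustrative only: rule shape and evaluation order are assumptions.
type DenyReason =
  | "capability_denied"
  | "namespace_denied"
  | "risk_tier_denied"
  | "missing_approval_context"
  | "channel_policy_conflict";

// A denied decision cannot be constructed without a reason code and a source pointer.
type PolicyDecision =
  | { allowed: true; toolId: string }
  | { allowed: false; toolId: string; reasonCode: DenyReason; policySource: string };

interface ToolCall {
  toolId: string;
  namespace: string;
  capabilityClass: string;
  hasApproval: boolean;
}

interface PolicyRule {
  source: string; // pointer to where the rule is defined (hypothetical format)
  deniedNamespaces: string[];
  deniedCapabilities: string[];
  approvalRequiredFor: string[];
}

function evaluatePolicy(call: ToolCall, rules: PolicyRule[]): PolicyDecision {
  for (const rule of rules) {
    const deny = (reasonCode: DenyReason): PolicyDecision => ({
      allowed: false,
      toolId: call.toolId,
      reasonCode,
      policySource: rule.source,
    });
    if (rule.deniedNamespaces.indexOf(call.namespace) !== -1) return deny("namespace_denied");
    if (rule.deniedCapabilities.indexOf(call.capabilityClass) !== -1) return deny("capability_denied");
    if (rule.approvalRequiredFor.indexOf(call.capabilityClass) !== -1 && !call.hasApproval) {
      return deny("missing_approval_context");
    }
  }
  return { allowed: true, toolId: call.toolId };
}

const demoRules: PolicyRule[] = [
  { source: "policy.example#channel-default", deniedNamespaces: [], deniedCapabilities: ["execute"], approvalRequiredFor: ["write"] },
];
const blockedCall = evaluatePolicy(
  { toolId: "plugin.acme/shell", namespace: "plugin", capabilityClass: "execute", hasApproval: true },
  demoRules,
);
```

Emitting the decision from a single function like this mirrors the mitigation listed under risks: the reason comes from the enforcement path itself, not from a wrapper.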
Mode complexity (direct/remote/node/acp/session orchestration) risks drift without explicit contracts.
Define mode contracts and required test cases:
- connect/auth/health/teardown states
- timeout/retry behavior
- error taxonomy (auth, network, policy, runtime)
- deterministic user-facing failure messages
- higher reliability across environments
- easier incident triage
- contract tests per mode path
- standardized failure envelope used by CLI + chat-facing surfaces
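A standardized failure envelope might look like the following sketch. The state and error-class names come from the lists above; the `FailureEnvelope` shape, the message template, and the rule that only network errors are retryable are assumptions for illustration.

```typescript
// Illustrative only: envelope fields and retry rule are assumptions.
type ModeState = "connect" | "auth" | "health" | "teardown";
type ErrorClass = "auth" | "network" | "policy" | "runtime";

interface FailureEnvelope {
  mode: string;        // e.g. "direct", "remote"
  state: ModeState;    // lifecycle state in which the failure occurred
  errorClass: ErrorClass;
  retryable: boolean;
  message: string;     // deterministic user-facing text
}

function toFailureEnvelope(mode: string, state: ModeState, errorClass: ErrorClass): FailureEnvelope {
  return {
    mode,
    state,
    errorClass,
    // Assumption: only transient network failures are worth retrying.
    retryable: errorClass === "network",
    // Same inputs always produce the same message, so CLI and chat surfaces agree.
    message: `${mode} mode failed during ${state}: ${errorClass} error`,
  };
}

const remoteFailure = toFailureEnvelope("remote", "connect", "network");
```

A contract test for each mode path could then assert on the envelope fields instead of matching free-form error strings.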
Open skill ecosystems need safety transparency and runtime trust context.
Integrate trust labels for skill/tool origin and vetting state:
- source: core | first-party | community | local
- vetting: unreviewed | reviewed | verified
- requested capabilities summary
- safer install/use workflows
- clearer operator decisions
- install/enable flow surfaces trust label + capability scope
- policy can require reviewed/verified for certain capability classes
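The trust-label check could be sketched as a comparison between a skill's vetting state and the minimum vetting that policy requires per capability class. The label values come from the proposal above; `TrustLabel`, `VettingRequirements`, `checkEnable`, and the ordering unreviewed < reviewed < verified are assumptions.

```typescript
// Illustrative only: names and the vetting ordering are assumptions.
type SkillSource = "core" | "first-party" | "community" | "local";
type VettingState = "unreviewed" | "reviewed" | "verified";
type Capability = "read" | "write" | "execute" | "network" | "messaging" | "scheduling";

interface TrustLabel {
  source: SkillSource;
  vetting: VettingState;
  requestedCapabilities: Capability[];
}

// Minimum vetting state required per capability class; unlisted classes are unrestricted.
type VettingRequirements = Partial<Record<Capability, VettingState>>;

const VETTING_RANK: Record<VettingState, number> = { unreviewed: 0, reviewed: 1, verified: 2 };

function checkEnable(label: TrustLabel, required: VettingRequirements): { ok: boolean; blockedBy: Capability[] } {
  const blockedBy = label.requestedCapabilities.filter((capability) => {
    const minimum = required[capability];
    return minimum !== undefined && VETTING_RANK[label.vetting] < VETTING_RANK[minimum];
  });
  return { ok: blockedBy.length === 0, blockedBy };
}

const requirements: VettingRequirements = { execute: "verified", network: "reviewed" };
const communitySkill: TrustLabel = {
  source: "community",
  vetting: "reviewed",
  requestedCapabilities: ["read", "network", "execute"],
};
const enableCheck = checkEnable(communitySkill, requirements);
```

In this example the skill passes the `network` requirement but is blocked on `execute`, which is the kind of detail an install flow could surface next to the trust label.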
Complex runs benefit from a concise event timeline separate from raw transcript details.
Add optional normalized event-journal export for diagnostics:
- message_in
- route_selected
- tool_call_start/end
- policy_decision
- compaction_start/end
- memory_flush
- easier replay/debug
- better observability dashboards
- export endpoint/CLI path for journal view
- correlation IDs tie journal events to transcript entries
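A normalized journal could be as small as an append-only list of typed events exported as JSON Lines. The event names follow the list above, with the start/end pairs split into separate types; `EventJournal`, its methods, and the JSON Lines export format are assumptions made for this sketch.

```typescript
// Illustrative only: class, fields, and export format are assumptions.
type JournalEventType =
  | "message_in"
  | "route_selected"
  | "tool_call_start"
  | "tool_call_end"
  | "policy_decision"
  | "compaction_start"
  | "compaction_end"
  | "memory_flush";

interface JournalEvent {
  seq: number;                      // monotonically increasing position in the journal
  type: JournalEventType;
  correlationId: string;            // groups events that belong to the same run
  transcriptEntryId: string | null; // link back to the raw transcript, when one exists
}

class EventJournal {
  private events: JournalEvent[] = [];

  append(type: JournalEventType, correlationId: string, transcriptEntryId: string | null = null): JournalEvent {
    const event: JournalEvent = { seq: this.events.length + 1, type, correlationId, transcriptEntryId };
    this.events.push(event);
    return event;
  }

  byCorrelation(correlationId: string): JournalEvent[] {
    return this.events.filter((event) => event.correlationId === correlationId);
  }

  // One JSON object per line, in append order.
  exportJsonl(): string {
    return this.events.map((event) => JSON.stringify(event)).join("\n");
  }
}

const journal = new EventJournal();
journal.append("message_in", "run-1", "transcript-17");
journal.append("route_selected", "run-1");
journal.append("tool_call_start", "run-1");
journal.append("message_in", "run-2", "transcript-18");
const journalExport = journal.exportJsonl();
```

Timestamps are left out here to keep the example deterministic; a real journal would likely add them alongside `seq`.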
- Canonical ID + ambiguity warning layer for command/tool registries.
- Routing explainability diagnostics with score decomposition.
- Maturity report artifact for tools/skills/features in docs/CI.
- Policy reason-code surfacing in debug/verbose streams.
These four deliver high operational value without destabilizing existing loop/runtime design.
- Risk: Added metadata complexity burdens maintainers. Mitigation: auto-generate most fields where possible; require minimal mandatory fields.
- Risk: Explainability data leaks internals by default. Mitigation: gate detailed traces behind debug/verbose and redact sensitive values.
- Risk: Maturity labels become stale. Mitigation: tie labels to CI checks and contract-test pass criteria.
- Risk: Policy reason codes diverge from actual enforcement path. Mitigation: reason emitted only from enforcement engine, not wrappers.
- canonical IDs (internal registry)
- ambiguity detection/warnings
- policy reason code schema
- routing explainability
- routing benchmark harness
- mode contract matrix spec
- maturity-level reporting artifacts
- trust labels for skills/tools
- docs and contributor templates
- % registry entries with canonical IDs
- route benchmark top-1/top-3 accuracy trend
- % denied calls with structured reason codes
- mode-contract test pass rate
- % skills/tools with trust labels + maturity levels
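The route benchmark metric in this list can be computed with a few lines once each benchmark case records the expected tool and the router's ranked output. `BenchmarkCase` and `topKAccuracy` are hypothetical names, and the corpus format is an assumption.

```typescript
// Illustrative only: the benchmark case format is an assumption.
interface BenchmarkCase {
  intent: string;
  expectedId: string;  // canonical ID the router should select
  rankedIds: string[]; // router output, best candidate first
}

// Fraction of cases whose expected ID appears among the first k ranked candidates.
function topKAccuracy(cases: BenchmarkCase[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) => c.rankedIds.slice(0, k).indexOf(c.expectedId) !== -1).length;
  return hits / cases.length;
}

const benchmark: BenchmarkCase[] = [
  { intent: "read a file", expectedId: "core/read-file", rankedIds: ["core/read-file", "core/list-dir"] },
  { intent: "search the web", expectedId: "skill.web/find", rankedIds: ["core/search", "skill.web/find"] },
  { intent: "send a message", expectedId: "core/message-send", rankedIds: ["core/message-send"] },
  { intent: "schedule a job", expectedId: "core/cron", rankedIds: ["core/shell", "core/timer", "core/notify", "core/cron"] },
];

const top1 = topKAccuracy(benchmark, 1);
const top3 = topKAccuracy(benchmark, 3);
```

Tracking `top1` and `top3` per release gives the trend line named above, and a CI gate could compare them against a stored baseline.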
- ADR: canonical identity schema for commands/tools.
- ADR: routing explainability and benchmark protocol.
- ADR: maturity rubric and report schema.
- ADR: policy reason code taxonomy.
- Test-plan document for mode contract matrix.
- Contributor guide for adding new tool/skill entries with IDs + trust metadata.
==============================================================
Runtime: Node 24 (recommended) or Node 22.16+.
git clone https://github.com/gungwang/claude-into-openclaw.git
cd claude-into-openclaw/openclaw
pnpm install
pnpm ui:build # auto-installs UI deps on first run
pnpm build
npm install -g .
openclaw onboard --install-daemon
# Dev loop (auto-reload on source/config changes)
pnpm gateway:watch
OpenClaw Onboard installs the Gateway daemon (launchd/systemd user service) so it stays running.
Runtime: Node 24 (recommended) or Node 22.16+.
openclaw onboard --install-daemon
openclaw gateway --port 18789 --verbose
# Send a message
openclaw message send --to +1234567890 --message "Hello from OpenClaw"
# Talk to the assistant (optionally deliver back to any connected channel: WhatsApp/Telegram/Slack/Discord/Google Chat/Signal/iMessage/BlueBubbles/IRC/Microsoft Teams/Matrix/Feishu/LINE/Mattermost/Nextcloud Talk/Nostr/Synology Chat/Tlon/Twitch/Zalo/Zalo Personal/WeChat/WebChat)
openclaw agent --message "Ship checklist" --thinking high
- stable: tagged releases (`vYYYY.M.D` or `vYYYY.M.D-<patch>`), npm dist-tag `latest`.
- beta: prerelease tags (`vYYYY.M.D-beta.N`), npm dist-tag `beta` (macOS app may be missing).
- dev: moving head of `main`, npm dist-tag `dev` (when published).
Switch channels (git + npm): openclaw update --channel stable|beta|dev.
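The tag formats above can be told apart mechanically. The sketch below is a hypothetical helper, not part of the `openclaw` CLI, and it assumes that `<patch>` is a plain number; the `dev` channel is not covered because it tracks the head of `main` and has no tag.

```typescript
// Illustrative only: channelForTag is hypothetical and assumes a numeric <patch>.
type TaggedChannel = "stable" | "beta";

const STABLE_TAG = /^v\d{4}\.\d{1,2}\.\d{1,2}(-\d+)?$/;  // vYYYY.M.D or vYYYY.M.D-<patch>
const BETA_TAG = /^v\d{4}\.\d{1,2}\.\d{1,2}-beta\.\d+$/; // vYYYY.M.D-beta.N

function channelForTag(tag: string): TaggedChannel | undefined {
  if (BETA_TAG.test(tag)) return "beta";
  if (STABLE_TAG.test(tag)) return "stable";
  return undefined; // not a release tag in either format
}

const betaChannel = channelForTag("v2026.4.1-beta.1");
const stableChannel = channelForTag("v2026.4.1");
```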
Here are eight quick screenshots demonstrating the upgraded experience.







