Let ClawSweeper judge real behavior proof by pashpashpash · Pull Request #48 · openclaw/clawsweeper

pashpashpash · 2026-05-05T22:55:43Z

ClawSweeper already records a structured realBehaviorProof judgement and uses it to block pass/automerge markers, but until now it did not own the positive label that tells maintainers the evidence was convincing.

This makes the proof review explicitly agent-led. The review prompt tells Codex to inspect PR bodies, comments, screenshots, videos, logs, terminal output, and links with its own tools and best judgement. Review runs now give Codex a scratch directory plus a separate read-only inspection token, while the deterministic wrapper keeps the write token for comments and labels. During apply, ClawSweeper syncs the proof: sufficient label: it adds the label when the structured judgement is sufficient and removes it when the evidence is missing, weak, mock-only, or no longer applicable.
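
The label sync described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `ProofStatus` enum, its member names, and the `sync_proof_label` helper are hypothetical, and the real apply step presumably calls the GitHub API with the write token rather than returning a set.

```python
from enum import Enum

# Label name taken from the PR description.
PROOF_LABEL = "proof: sufficient"


class ProofStatus(Enum):
    """Hypothetical statuses for the structured realBehaviorProof judgement."""

    SUFFICIENT = "sufficient"
    MISSING = "missing"
    WEAK = "weak"
    MOCK_ONLY = "mock_only"
    NOT_APPLICABLE = "not_applicable"


def sync_proof_label(status: ProofStatus, labels: set[str]) -> set[str]:
    """Return the label set a PR should carry after the apply step.

    The label is present only when the judgement is sufficient; every
    other status removes it, so a stale label cannot outlive its evidence.
    """
    updated = set(labels)  # copy so the caller's set is not mutated
    if status is ProofStatus.SUFFICIENT:
        updated.add(PROOF_LABEL)
    else:
        updated.discard(PROOF_LABEL)  # no error if the label is absent
    return updated
```

Computing the desired label set as a pure function keeps the decision deterministic and easy to test, which fits the split described above: the agent only produces the judgement, and the wrapper alone performs writes.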

pashpashpash added 2 commits May 5, 2026 15:54

feat: judge real behavior proof evidence

90dd661

fix: tell contributors how to refresh proof reviews

37cbd19

pashpashpash merged commit 0fefca2 into main May 5, 2026
5 checks passed

pashpashpash deleted the codex/agent-led-proof-judgement branch May 5, 2026 23:10
