What metric actually matters for coding-agent loops? #47

Keesan12 (Maintainer) · 2026-05-18T17:40:26Z

query: What metric are people actually using for coding agents right now?

The pattern I keep seeing is that teams optimize one local number and miss the loop as a whole.

Examples:

Those all help, but they do not answer the operational question:

Did the next attempt deserve to run?

The metric that seems to hold up better is cost per verified outcome:
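To make that concrete, here is a minimal sketch of how cost per verified outcome could be computed from a log of attempts. Everything in it is an assumption for illustration: the `Attempt` record, its field names, and the choice to count each task at most once are hypothetical and are not MartinLoop's API. The idea is that the spend on failed attempts stays in the numerator, so retries that never verify make the number worse instead of disappearing.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Attempt:
    """One run of the agent loop (hypothetical record, not a MartinLoop type)."""
    task_id: str
    cost_usd: float   # model + tool spend for this attempt
    verified: bool    # True if an independent check (tests, review) passed


def cost_per_verified_outcome(attempts: Iterable[Attempt]) -> Optional[float]:
    """Total spend across all attempts divided by the number of distinct verified tasks.

    Returns None when nothing was verified, since the ratio is undefined.
    """
    attempts = list(attempts)
    total_cost = sum(a.cost_usd for a in attempts)
    # Count each task once, even if several attempts on it were verified.
    verified_tasks = {a.task_id for a in attempts if a.verified}
    if not verified_tasks:
        return None
    return total_cost / len(verified_tasks)


attempts = [
    Attempt("fix-login", 0.50, False),  # failed attempt still costs money
    Attempt("fix-login", 0.75, True),   # retry that passed verification
    Attempt("add-cache", 1.25, False),  # never verified
]
print(cost_per_verified_outcome(attempts))  # 2.5
```

In this example only one task was verified, so all 2.50 USD of spend is charged against that single outcome.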

That is the seam MartinLoop is built around:

If you're running Claude Code, Codex, OpenCode, or your own loop, I'm curious what you're actually measuring today:

If you want to kick the tires, the OSS repo is here:
https://github.com/Keesan12/martin-loop

If that runtime layer is useful, a star helps a lot right now. If you try it, I care more about what breaks.