Skip to content
This repository was archived by the owner on Feb 1, 2026. It is now read-only.
This repository was archived by the owner on Feb 1, 2026. It is now read-only.

feat(core,prebuilt): add validated LLM step w/ repair function + llm_retry/agent_retry helpers #1

@AlexXLi12

Description

@AlexXLi12

Background

llm_step(...) currently supports a parser, but Coevolved doesn’t have a first-class pattern for structured output validation + automatic repair/retry when parsing/validation fails.

We want an “Instructor-style” workflow: attempt to parse/validate, and if it fails, turn the failure into additional context/prompting and retry the LLM call (bounded by attempts).

Reference: Instructor project (for ideas on UX/patterns): https://github.com/instructor-ai/instructor

Goals

  • Add more sophisticated validation for LLM steps (Pydantic-first), with an extensible “repair” hook.
  • Keep core API surface clean: the retry/repair logic should live in a dedicated class under coevolved/core/.
  • Provide prebuilt helpers under coevolved/prebuilt/:
    • llm_retry(...): wraps an LLM step with validate→repair→retry behavior.
    • agent_retry(...): a higher-level retry step for agent workflows (either retry a planner step or a whole agent callable).

Proposed design (core)

Add a new class under coevolved/src/coevolved/core/ (name TBD; suggestions below):

  • LLMRepairPolicy
  • ValidatedLLMStep
  • LLMValidationRepair

It should accept a custom function that transforms failure output into new input for the next attempt:

  • failure_to_input(failure, *, state, prompt_payload, response, attempt) -> (new_state | new_prompt_payload | prompt_delta)

Where “failure” is typically:

  • a pydantic.ValidationError (schema mismatch)
  • a parsing exception (bad JSON, invalid tool args, etc.)

Also include:

  • max_attempts: int
  • retryable exception filtering (default: ValidationError + parsing errors)
  • optional backoff / jitter (nice-to-have)

Proposed design (prebuilt)

Under coevolved/src/coevolved/prebuilt/:

  • llm_retry.py
    • llm_retry(step: Step, *, repair_policy: ..., max_attempts: int = ..., ...) -> Step
    • or a factory that resembles llm_step(...) but adds validation/repair hooks.
  • agent_retry.py (or llm_retry.py with both exports)
    • wraps a step/agent callable with bounded retries
    • supports custom predicate: should_retry(exc, state, attempt) -> bool

Scope / Tasks

  • Core
    • Add the new class in coevolved/core/ implementing:
      • attempt LLM call
      • parse/validate response
      • on failure: call failure_to_input(...), update input, retry
      • on success: return validated output and/or attach to state
    • Emit tracing events that make retries debuggable (at minimum, annotate attempts in annotations or add a dedicated event; keep this lightweight).
  • Prebuilt
    • Implement llm_retry(...)
    • Implement agent_retry(...)
  • Docs
    • Add a short guide snippet showing:
      • a Pydantic output model
      • a repair function that appends the validation error to the prompt/messages
      • bounded retries
  • Tests
    • success on first attempt
    • failure then success after repair
    • failure exhausting attempts raises a clear error with last failure included

Acceptance criteria

  • A new core abstraction exists under coevolved/core/ that:
    • accepts a custom failure_to_input function
    • retries the LLM call up to max_attempts
    • cleanly returns/attaches the validated result
  • llm_retry(...) and agent_retry(...) exist under coevolved/prebuilt/ and have docstrings + basic examples.
  • Unit tests cover the three core scenarios (first-try success; repair success; exhaust attempts).

Notes / Decisions needed

  • Decide whether to add instructor as an optional dependency vs. implementing the pattern ourselves without additional deps.
  • Decide whether retries should create a new trace “attempt id” or simply annotate events with attempt metadata.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions