
feat(sdk): add wall-clock-anchored cron primitive (cron_at) for scheduled-at-time agents #22

Description

@ankitdas-volgapartners

Problem

PilotSwarm's only durable-timer primitive is cron(seconds, reason), which is purely interval-based. Agents that need to fire at a specific wall-clock time (e.g., a nightly compliance auditor at 02:00 UTC, a weekly customer SLA report on Mondays at 09:00 in the customer's time zone, a monthly billing reconciliation on the 1st at 04:00) must work around the missing primitive with a "wake every N minutes, check whether it's HH:MM, sleep again" loop.

The durable-timers SKILL.md teaches the interval pattern as the only recurring primitive, with no example of computing variable-interval cron for wall-clock targets. An agent that follows the skill literally for "every day at 02:00 UTC" picks the maximally-literal interpretation: wake every ~15 min, time-check, sleep. That costs ~96 LLM turns per day for a job that fires once.

Concrete production cost shape

A nightly compliance/audit agent that runs once at 02:00 UTC daily:

| Pattern | Wakes/day | LLM turns/day | Annualized |
| --- | --- | --- | --- |
| Naive cron(900) + time-check guard | ~96 | ~96 (each wake = full turn for fact-read + clock-check + sleep) | ~35,000 turns/year for ONE daily run |
| Wall-clock anchored cron_at | 1 | 1 | 365 turns/year |

Per-tenant agents in a fleet multiply this directly. Token cost is dominated by skill + system prompt being re-served on every wake.
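
The figures in the table can be reproduced with a few lines of arithmetic. This is only a sketch: `wakesPerDay` and `turnsPerYear` are hypothetical helper names, it assumes one LLM turn per wake as the table does, and the 50-tenant fleet size is an invented illustration.

```ts
const SECONDS_PER_DAY = 24 * 60 * 60;

// Number of wakes a fixed-interval cron(seconds) produces in one day.
function wakesPerDay(intervalSeconds: number): number {
  if (!(intervalSeconds > 0)) throw new RangeError("interval must be positive");
  return Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// Assumes one LLM turn per wake, multiplied across a fleet of identical agents.
function turnsPerYear(wakes: number, fleetSize: number = 1): number {
  return wakes * 365 * fleetSize;
}

console.log(wakesPerDay(900));                   // 96
console.log(turnsPerYear(wakesPerDay(900)));     // 35040
console.log(turnsPerYear(1));                    // 365
console.log(turnsPerYear(wakesPerDay(900), 50)); // 1752000 (hypothetical 50-tenant fleet)
```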

Proposed fix

Add a sibling tool cron_at that accepts wall-clock anchor fields. The SDK orchestration computes the next-fire-ms (with TZ + DST handling) under the hood and uses the existing durable-timer machinery.

API

```ts
// Argument shape for the proposed tool, called as cron_at(args).
// Numeric ranges in the comments are enforced at validation time.
type CronAtArgs = {
  minute: number;        // 0-59, required: the wall-clock anchor
  hour?: number;         // 0-23; omit ⇒ hourly recurrence
  day_of_week?: number;  // 0-6, 0 = Sunday; weekly (mutually exclusive with day_of_month)
  day_of_month?: number; // 1-31; monthly (mutually exclusive with day_of_week)
  tz: string;            // IANA zone (e.g. "America/Los_Angeles"), mandatory
  max_fires?: number;    // optional cap on total fires; omit ⇒ fire forever
  reason: string;
};
```
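
To make the validation rules concrete, here is a minimal plain-TypeScript sketch of the checks this proposal lists (field ranges, the mutual-exclusion and hour-required rules from the edge cases, mandatory tz, and max_fires >= 1). The real implementation is expected to use a Zod schema; `CronAtInput`, `validateCronAt`, and `isKnownTimeZone` are hypothetical names, and requiring max_fires to be an integer is an assumption on my part.

```ts
// Hypothetical input shape mirroring the fields listed above.
interface CronAtInput {
  minute: number;
  hour?: number;
  day_of_week?: number;
  day_of_month?: number;
  tz: string;
  max_fires?: number;
  reason: string;
}

const isIntIn = (v: number, lo: number, hi: number): boolean =>
  Number.isInteger(v) && v >= lo && v <= hi;

// Intl.DateTimeFormat throws a RangeError for unknown time zone names.
function isKnownTimeZone(tz: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Returns a list of problems; an empty list means the arguments are acceptable.
function validateCronAt(a: CronAtInput): string[] {
  const errors: string[] = [];
  const hasDow = a.day_of_week !== undefined;
  const hasDom = a.day_of_month !== undefined;
  if (!isIntIn(a.minute, 0, 59)) errors.push("minute must be an integer in 0-59");
  if (a.hour !== undefined && !isIntIn(a.hour, 0, 23)) errors.push("hour must be an integer in 0-23");
  if (a.day_of_week !== undefined && !isIntIn(a.day_of_week, 0, 6)) errors.push("day_of_week must be an integer in 0-6");
  if (a.day_of_month !== undefined && !isIntIn(a.day_of_month, 1, 31)) errors.push("day_of_month must be an integer in 1-31");
  if (hasDow && hasDom) errors.push("day_of_week and day_of_month are mutually exclusive");
  if ((hasDow || hasDom) && a.hour === undefined) errors.push("day_of_week/day_of_month require hour");
  if (a.tz === "" || !isKnownTimeZone(a.tz)) errors.push("tz must be a valid IANA zone");
  if (a.max_fires !== undefined && !(Number.isInteger(a.max_fires) && a.max_fires >= 1)) {
    errors.push("max_fires must be a positive integer"); // integer requirement is an assumption
  }
  return errors;
}

console.log(validateCronAt({ minute: 0, hour: 2, tz: "UTC", reason: "nightly audit" })); // []
console.log(validateCronAt({ minute: 0, day_of_week: 1, tz: "UTC", reason: "weekly" }));
```

Returning a list of problems rather than throwing on the first one lets the tool report every invalid field to the agent in a single turn.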

Recurrence inferred from anchor fields

| Set fields | Recurrence | Example |
| --- | --- | --- |
| minute | hourly | {minute: 5, tz: "UTC"} → every hour at HH:05 (anomaly detection sweeps) |
| minute + hour | daily | {minute: 0, hour: 2, tz: "UTC"} → 02:00 UTC nightly compliance audit |
| minute + hour + day_of_week | weekly | {minute: 0, hour: 9, day_of_week: 1, tz: "America/New_York"} → Mondays 09:00 ET SLA report |
| minute + hour + day_of_month | monthly | {minute: 0, hour: 4, day_of_month: 1, tz: "UTC"} → 1st of month 04:00 UTC billing reconciliation |
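
The inference rule in the table above can be sketched as a small pure function. `inferRecurrence`, `AnchorFields`, and `Recurrence` are hypothetical names used only for illustration; the rejected combinations follow the validation rules stated in this proposal.

```ts
type Recurrence = "hourly" | "daily" | "weekly" | "monthly";

// Hypothetical subset of the cron_at arguments that drives inference.
interface AnchorFields {
  minute: number;
  hour?: number;
  day_of_week?: number;
  day_of_month?: number;
}

function inferRecurrence(f: AnchorFields): Recurrence {
  const hasDow = f.day_of_week !== undefined;
  const hasDom = f.day_of_month !== undefined;
  if (hasDow && hasDom) {
    throw new Error("day_of_week and day_of_month are mutually exclusive");
  }
  if (f.hour === undefined) {
    // Without an hour the period would be ambiguous for weekly/monthly anchors.
    if (hasDow || hasDom) throw new Error("day_of_week/day_of_month require hour");
    return "hourly";
  }
  if (hasDow) return "weekly";
  if (hasDom) return "monthly";
  return "daily";
}

console.log(inferRecurrence({ minute: 5 }));                           // hourly
console.log(inferRecurrence({ minute: 0, hour: 2 }));                  // daily
console.log(inferRecurrence({ minute: 0, hour: 9, day_of_week: 1 }));  // weekly
console.log(inferRecurrence({ minute: 0, hour: 4, day_of_month: 1 })); // monthly
```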

max_fires semantics

| Value | Meaning |
| --- | --- |
| omitted (default) | recurrent forever |
| 1 | one-shot scheduled action: fires at the next anchor time, then stops. Wall-clock-anchored counterpart to wait(seconds) |
| n > 1 | fires n times at consecutive anchor instants, then stops |
| 0 or negative | reject at validation |

After the last fire, the SDK stops scheduling new wakes and emits a cron.completed event so the agent can finalize.
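
One way to picture the counter is as a pure state transition, which is what keeps it deterministic under replay. The sketch below is an assumption about shape, not the SDK's actual orchestration code: `FireState`, `initFireState`, and `recordFire` are hypothetical names, and the `completed` flag stands in for the point at which cron.completed would be emitted.

```ts
// remaining === null means "no cap": the schedule recurs forever.
interface FireState {
  remaining: number | null;
}

function initFireState(maxFires?: number): FireState {
  if (maxFires !== undefined && !(Number.isInteger(maxFires) && maxFires >= 1)) {
    throw new RangeError("max_fires must be a positive integer");
  }
  return { remaining: maxFires ?? null };
}

// Pure transition: the same input state always yields the same result,
// so replaying the orchestration history reproduces the same counter values.
function recordFire(state: FireState): { next: FireState; completed: boolean } {
  if (state.remaining === null) return { next: state, completed: false };
  if (state.remaining <= 0) throw new Error("schedule already completed");
  const remaining = state.remaining - 1;
  return { next: { remaining }, completed: remaining === 0 };
}

// max_fires: 2 fires twice; the second fire reports completion.
const first = recordFire(initFireState(2));
const second = recordFire(first.next);
console.log(first.completed, second.completed); // false true
```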

Edge cases (locked at design time)

  • DOM=31 in short months: standard cron behavior — skip that month. ("Last day of month" is a future feature, not v1.)
  • day_of_week + day_of_month both set: reject at validation.
  • day_of_week or day_of_month set without hour: reject (period inference ambiguous).
  • tz mandatory: no silent UTC default. Forces explicit choice for cross-tenant agents.
  • DST: SDK recomputes next-fire on every wake from current UTC + IANA zone — handles spring-forward / fall-back without agent involvement.
  • Replay safety: max_fires counter is deterministic per orchestration replay — stored in orchestration state, decremented atomically with each fire.
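
The next-fire computation described in the DST and DOM=31 bullets could look roughly like the sketch below, which uses only the built-in Intl.DateTimeFormat. All names (`Anchor`, `localParts`, `offsetMs`, `localToUtc`, `nextFireMs`) are hypothetical. Two behaviors are assumptions the proposal does not specify: a wall-clock time that falls inside a spring-forward gap is shifted forward to the first valid instant, and an ambiguous fall-back time fires once.

```ts
// Hypothetical anchor shape; field names follow the cron_at proposal.
interface Anchor { minute: number; hour?: number; day_of_week?: number; day_of_month?: number; tz: string; }
interface LocalParts { year: number; month: number; day: number; hour: number; minute: number; }

const MINUTE_MS = 60000;

// Wall-clock fields of a UTC instant as seen in the given IANA zone.
function localParts(utcMs: number, tz: string): LocalParts {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: tz, hour12: false,
    year: "numeric", month: "numeric", day: "numeric", hour: "numeric", minute: "numeric",
  }).formatToParts(new Date(utcMs));
  const get = (type: string): number => Number(parts.find((p) => p.type === type)?.value);
  // Some engines render midnight as "24" when hour12 is false, hence the modulo.
  return { year: get("year"), month: get("month"), day: get("day"), hour: get("hour") % 24, minute: get("minute") };
}

// Zone offset (local minus UTC) in ms at a given instant, at minute precision.
function offsetMs(utcMs: number, tz: string): number {
  const p = localParts(utcMs, tz);
  return Date.UTC(p.year, p.month - 1, p.day, p.hour, p.minute) - Math.floor(utcMs / MINUTE_MS) * MINUTE_MS;
}

// Convert a local wall-clock time to a UTC instant using two offset guesses.
function localToUtc(y: number, mo: number, d: number, h: number, mi: number, tz: string): number {
  const wall = Date.UTC(y, mo - 1, d, h, mi);
  const first = wall - offsetMs(wall, tz);
  const second = wall - offsetMs(first, tz);
  for (const c of [second, first]) {
    const p = localParts(c, tz);
    if (p.day === d && p.hour === h && p.minute === mi) return c;
  }
  // Assumption: the time does not exist (spring-forward gap), so shift forward.
  return Math.max(first, second);
}

// First anchor instant strictly after nowMs.
function nextFireMs(a: Anchor, nowMs: number): number {
  if (a.hour === undefined) {
    // Hourly: scan upcoming whole minutes until the local minute matches.
    let t = Math.floor(nowMs / MINUTE_MS) * MINUTE_MS + MINUTE_MS;
    for (let i = 0; i < 180; i++, t += MINUTE_MS) {
      if (localParts(t, a.tz).minute === a.minute) return t;
    }
    throw new Error("no hourly slot found");
  }
  const today = localParts(nowMs, a.tz);
  // Walk real calendar days, so day_of_month 31 simply skips short months.
  for (let i = 0; i < 400; i++) {
    const day = new Date(Date.UTC(today.year, today.month - 1, today.day + i));
    if (a.day_of_week !== undefined && day.getUTCDay() !== a.day_of_week) continue;
    if (a.day_of_month !== undefined && day.getUTCDate() !== a.day_of_month) continue;
    const fire = localToUtc(day.getUTCFullYear(), day.getUTCMonth() + 1, day.getUTCDate(), a.hour, a.minute, a.tz);
    if (fire > nowMs) return fire;
  }
  throw new Error("no matching day within search window");
}

// Mondays 09:00 New York, asked on Friday 2024-03-08: DST starts on 2024-03-10,
// so the next fire is 2024-03-11 09:00 EDT, i.e. 13:00 UTC.
const next = nextFireMs({ minute: 0, hour: 9, day_of_week: 1, tz: "America/New_York" }, Date.UTC(2024, 2, 8, 12, 0));
console.log(new Date(next).toISOString()); // 2024-03-11T13:00:00.000Z
```

Because the function takes the current instant as an argument and recomputes from scratch, calling it on every wake picks up offset changes without the agent tracking DST state.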

Existing cron(seconds, reason) stays untouched

Static-interval workloads (sweeper, resourcemgr) keep their current API. Only wall-clock-anchored use cases use cron_at.

Skill update

packages/sdk/plugins/system/skills/durable-timers/SKILL.md adds a "Wall-Clock Anchored Recurring" pattern that teaches agents to use cron_at and explicitly forbids the wake-and-check anti-pattern. It also adds a "One-Shot Scheduled at Wall-Clock" sub-pattern showing max_fires: 1.

Why field-named over cron-expression syntax

LLMs fumble cron-expression positional grammar (0 9 * * *). Named fields with Zod-validated ranges are easier to generate correctly and easier to read in skill docs.
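
For comparison, the sketch below maps the named fields onto the standard five-field cron order (minute, hour, day-of-month, month, day-of-week). It is illustrative only: `toCronExpression` is a hypothetical helper, not part of the proposal, and a cron expression has no slot for the time zone, so tz is not represented.

```ts
// Hypothetical subset of the cron_at anchor fields.
interface AnchorFields {
  minute: number;
  hour?: number;
  day_of_week?: number;
  day_of_month?: number;
}

// Five positional slots: minute hour day-of-month month day-of-week.
function toCronExpression(f: AnchorFields): string {
  const slot = (v?: number): string => (v === undefined ? "*" : String(v));
  return [slot(f.minute), slot(f.hour), slot(f.day_of_month), "*", slot(f.day_of_week)].join(" ");
}

console.log(toCronExpression({ minute: 0, hour: 9 }));                  // 0 9 * * *
console.log(toCronExpression({ minute: 0, hour: 9, day_of_week: 1 }));  // 0 9 * * 1
console.log(toCronExpression({ minute: 0, hour: 4, day_of_month: 1 })); // 0 4 1 * *
```

The weekly and monthly outputs differ only in which positional slot holds the 1, which is the kind of detail that named fields make explicit.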

Out of scope (for follow-ups)

  • Full cron-expression support.
  • Per-day distinct times (e.g., 09:00 weekdays + 11:00 weekends).
  • "Last day of month" / "last weekday" semantics.
  • start_at / end_at absolute-time bounds (current max_fires covers most "stop after N" cases).

Tracking

Will land as a separate PR against main. Skill update is part of the same PR. Estimated diff: ~250 lines (orchestration handler + tool registration + Zod schema + skill update + tests).
