Summary
When using the Kubernetes runtime with the batchsandbox workload provider, a sandbox can already be usable through execd health checks while the lifecycle API still reports status.state = "Allocated" instead of "Running".
Observed behavior
In our real Kubernetes E2E flow:
- Sandbox.create() / CodeInterpreter.create() succeeds
- execd health checks already pass
- sandbox endpoints are reachable and usable
- but GET /sandboxes/{id} may still return:

{
  "status": {
    "state": "Allocated"
  }
}

This creates a mismatch between "sandbox is already usable" and "lifecycle state is not Running yet".
Why this is problematic
From the public lifecycle schema and SDK model docs, the documented states are:
- Pending
- Running
- Pausing
- Paused
- Stopping
- Terminated
- Failed
Allocated does not appear to be part of the documented public lifecycle contract, but it is observable from the Kubernetes runtime implementation.
As a result, clients and E2E tests that correctly expect Running after readiness/health checks may still observe Allocated.
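
To illustrate the failure mode, here is a minimal sketch of the kind of strict wait loop a client or E2E test might use. It is not the SDK's actual implementation: `wait_for_running`, `get_state`, and `DOCUMENTED_STATES` are hypothetical names, and `get_state` stands in for whatever call reads `status.state` from GET /sandboxes/{id}.

```python
import time
from typing import Callable

# States listed in the public lifecycle schema, as quoted in this issue.
DOCUMENTED_STATES = {
    "Pending", "Running", "Pausing", "Paused",
    "Stopping", "Terminated", "Failed",
}


def is_documented(state: str) -> bool:
    """Return True if the state is part of the documented public contract."""
    return state in DOCUMENTED_STATES


def wait_for_running(
    get_state: Callable[[], str],  # hypothetical: returns status.state
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the lifecycle state is exactly "Running".

    A client written against the documented states only accepts "Running",
    so a sandbox that stays in "Allocated" times out here even though it
    may already be usable.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "Running":
            return state
        if state in ("Failed", "Terminated"):
            raise RuntimeError(f"sandbox entered terminal state {state!r}")
        if clock() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

With this loop, a sandbox that reports Allocated indefinitely ends in a TimeoutError, which matches the E2E symptom described above.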
Expected behavior
The Kubernetes runtime should not surface Allocated to clients once the sandbox is considered usable/ready; it should return Running instead.
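
One possible shape for this, sketched under assumptions: a small normalization step applied before the state is returned by the lifecycle API. The function name `public_state` and the `ready` flag are hypothetical and do not refer to existing code in the runtime. Mapping a not-yet-ready Allocated sandbox to Pending is also an assumption made here only so that every returned value stays within the documented set.

```python
def public_state(internal_state: str, ready: bool) -> str:
    """Map an internal workload state to a documented public lifecycle state.

    internal_state: state as seen by the Kubernetes runtime (hypothetical input).
    ready: whether the sandbox is considered usable, e.g. execd health
           checks pass (hypothetical flag).
    """
    if internal_state == "Allocated":
        # Allocated is not a documented public state: report Running once
        # the sandbox is usable, otherwise fall back to Pending (assumption).
        return "Running" if ready else "Pending"
    # All other states pass through unchanged.
    return internal_state
```

For example, `public_state("Allocated", ready=True)` yields "Running", while documented states such as "Paused" are returned as-is.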