feat(pool): add reusable pod reset policy with task-executor sidecar support #521
fengcone wants to merge 7 commits into alibaba:main
- Introduce PodRecyclePolicy with options Delete and Reuse for pooled BatchSandbox pods
- Add ResetSpec for reset configuration when using Reuse policy (container restart, clean dirs, timeout)
- Implement controller logic to handle pod disposal and resetting before BatchSandbox deletion
- Add finalizer management for pod disposal lifecycle in BatchSandbox reconciler
- Integrate calls to task-executor Reset API for pod reset status and control in controller
- Track pod recycle states (Resetting, ResetSucceeded, ResetFailed) via pod labels and update accordingly
- Implement task-executor sidecar Reset API to support asynchronous pod reset operations
- Block task creation and sync during reset in task-executor to avoid race conditions
- Support cleaning of task data and user-specified directories and main container restart during reset
- Add new CLI flags for configuring task-executor image and resources for pod reset support
- Update kustomize deploy to substitute TASK_EXECUTOR_IMAGE_PLACEHOLDER with configured image
- Add new API types and deepcopy methods for reset configuration and responses
- Add new status field to Pool to track number of pods resetting for observability
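To make the recycle flow in the list above concrete, here is a minimal Go sketch of how a controller might pick its next step from the policy and the recycle state recorded on the pod. Only the names PodRecyclePolicy, Delete, Reuse, ResetSpec, Resetting, ResetSucceeded and ResetFailed come from the commit message; the ResetSpec field names, the Action type and nextAction are hypothetical, and falling back to deletion after a failed reset is an assumption rather than something the PR states.

```go
package main

import (
	"fmt"
	"time"
)

// PodRecyclePolicy mirrors the two options named in the commit message.
type PodRecyclePolicy string

const (
	PodRecyclePolicyDelete PodRecyclePolicy = "Delete"
	PodRecyclePolicyReuse  PodRecyclePolicy = "Reuse"
)

// ResetSpec carries the reset settings used with the Reuse policy.
// Field names are hypothetical; the PR only lists what is configurable.
type ResetSpec struct {
	RestartMainContainer bool
	CleanDirs            []string
	Timeout              time.Duration
}

// RecycleState is the value tracked on the pod via a label.
type RecycleState string

const (
	StateNone           RecycleState = ""
	StateResetting      RecycleState = "Resetting"
	StateResetSucceeded RecycleState = "ResetSucceeded"
	StateResetFailed    RecycleState = "ResetFailed"
)

// Action is a hypothetical name for the controller's next step.
type Action string

const (
	ActionDeletePod    Action = "DeletePod"
	ActionStartReset   Action = "StartReset"
	ActionWaitForReset Action = "WaitForReset"
	ActionReturnToPool Action = "ReturnToPool"
)

// nextAction decides what to do with a pod whose BatchSandbox is being deleted.
func nextAction(policy PodRecyclePolicy, state RecycleState) Action {
	if policy != PodRecyclePolicyReuse {
		return ActionDeletePod
	}
	switch state {
	case StateNone:
		return ActionStartReset
	case StateResetting:
		return ActionWaitForReset
	case StateResetSucceeded:
		return ActionReturnToPool
	default:
		// Assumption: a failed or unknown state falls back to deletion.
		return ActionDeletePod
	}
}

func main() {
	spec := ResetSpec{RestartMainContainer: true, CleanDirs: []string{"/tmp"}, Timeout: 30 * time.Second}
	fmt.Println(spec.Timeout, nextAction(PodRecyclePolicyReuse, StateNone))
}
```

Keeping the decision in a pure function like this makes it easy to unit-test separately from the reconciler, which would own the label updates and the finalizer handling.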
# Conflicts:
#	kubernetes/Makefile
#	kubernetes/cmd/controller/main.go
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7eb332017
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…deployment
- Change Resetting field in Pool status from optional to required
- Add resetting field to the CRD's required properties
- Replace TASK_EXECUTOR_IMAGE_PLACEHOLDER with actual image in controller deployment manifest
- Update e2e tests to use utils for TaskExecutorImage and SandboxImage values
- Add Version field to ResetRequest using BatchSandbox UID for idempotency
- Implement version-aware state machine to distinguish retries from new requests
- Add waitForNewContainer to ensure new container readiness after restart
- Consolidate kustomize patches into single manager_args_patch.yaml
- Refactor e2e test setup to suite-level BeforeSuite/AfterSuite

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
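The version-aware idempotency described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: ResetRequest and its Version field are named in the commit message, while Resetter, Handle, Finish and the state constants are hypothetical. The sketch also assumes a request with a new version simply supersedes any reset still in progress.

```go
package main

import (
	"fmt"
	"sync"
)

// ResetState is a hypothetical set of states for the sidecar's reset machine.
type ResetState string

const (
	ResetIdle      ResetState = ""
	ResetRunning   ResetState = "Resetting"
	ResetSucceeded ResetState = "ResetSucceeded"
	ResetFailed    ResetState = "ResetFailed"
)

// ResetRequest carries the BatchSandbox UID as its Version.
type ResetRequest struct {
	Version string
}

// Resetter remembers which version it last acted on.
type Resetter struct {
	mu      sync.Mutex
	version string
	state   ResetState
}

// Handle returns the current state and whether a new reset was started.
// A request repeating the known version is a retry and only reports status.
func (r *Resetter) Handle(req ResetRequest) (ResetState, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.state != ResetIdle && req.Version == r.version {
		return r.state, false
	}
	// Assumption: a different version supersedes whatever was recorded before.
	r.version = req.Version
	r.state = ResetRunning
	return r.state, true
}

// Finish records the outcome of the asynchronous reset work.
// Results for a stale version are ignored.
func (r *Resetter) Finish(version string, ok bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if version != r.version || r.state != ResetRunning {
		return
	}
	if ok {
		r.state = ResetSucceeded
	} else {
		r.state = ResetFailed
	}
}

func main() {
	r := &Resetter{}
	state, started := r.Handle(ResetRequest{Version: "uid-1"})
	fmt.Println(state, started)
	state, started = r.Handle(ResetRequest{Version: "uid-1"}) // retry
	fmt.Println(state, started)
}
```

Keying on the BatchSandbox UID means a controller that retries after a timeout or restart observes the outcome of the reset it already requested instead of triggering a second container restart.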
681274c to 16b9b7f
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5d2851c787
5d2851c to 31568a3
Summary
#452 support pod disposal policy for pooled BatchSandbox deletion
Testing
Breaking Changes
Reuse to Delete)
Checklist