Skip to content

feat(osbase): sandbox install runc — zero-parameter full-stack install#1143

Open
WeissonHan wants to merge 1 commit into
alibaba:mainfrom
WeissonHan:runc
Open

feat(osbase): sandbox install runc — zero-parameter full-stack install#1143
WeissonHan wants to merge 1 commit into
alibaba:mainfrom
WeissonHan:runc

Conversation

@WeissonHan

@WeissonHan WeissonHan commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

After running:
anolisa osbase sandbox install runc

the user gets a working Docker environment end-to-end:
docker info ✓

What this delivers:

  • Full-stack install: runc + containerd + docker + docker-client installed as a unit; services automatically enabled and started.
  • Post-install verification: confirms runc binary, containerd service, docker daemon, and a live container health check.
  • Optional package hints: informs user about available extras (e.g. nerdctl) without forcing installation.
  • Manifest-driven: scenario definitions live in sandbox.toml; adding a new scenario requires only a TOML stanza, no code changes.

Description

Implement end-to-end anolisa osbase sandbox install runc command that
delivers a working Docker environment with zero parameters. The install
pipeline reads scenario definitions from sandbox.toml and executes five
phases: preflight → packages → services → verify → state.

Previously the install path was a 3-step stub (preflight → packages → hint).
This PR upgrades it to a full pipeline that actually enables services and
runs a health check to confirm the environment works.

Related Issue

no-issue: new feature

closes #

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional change)
  • Performance improvement
  • CI/CD or build changes

Scope

  • cosh (copilot-shell)
  • sec-core (agent-sec-core)
  • skill (os-skills)
  • sight (agentsight)
  • tokenless (tokenless)
  • ckpt (ws-ckpt)
  • memory (agent-memory)
  • anolisa (anolisa-cli)
  • skillfs (SkillFS)
  • Multiple / Project-wide

Checklist

  • I have read the Contributing Guide
  • My code follows the project's code style
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the documentation accordingly
  • For cosh: Lint passes, type check passes, and tests pass
  • For sec-core (Rust): cargo clippy -- -D warnings and cargo fmt --check pass
  • For sec-core (Python): Ruff format and pytest pass
  • For skill: Skill directory structure is valid and shell scripts pass syntax check
  • For sight: cargo clippy -- -D warnings and cargo fmt --check pass
  • For tokenless: cargo clippy -- -D warnings and cargo fmt --check pass
  • For memory (Linux only): cargo clippy --all-targets -- -D warnings, cargo fmt --check, and cargo test pass
  • For anolisa: cargo clippy --all-targets --locked -- -D warnings, cargo fmt --all --check, and cargo test --locked pass
  • For skillfs: cargo fmt --all --check, cargo clippy --workspace --all-targets -- -D warnings, and cargo test --workspace pass
  • Lock files are up to date (package-lock.json / Cargo.lock)

Testing

  • cargo test --locked — 948 tests passed, 0 failed
  • Unit tests cover: manifest parsing (including services field), preflight
    checks, request validation, and phase result aggregation
  • Acceptance criteria: on Alinux 4, anolisa osbase sandbox install runc
    followed by docker info succeeds

Additional Notes

The services field is a new addition to sandbox.toml scenario schema.
Existing scenarios without services default to an empty list (phase is
skipped), so this is fully backward-compatible.

}

// ─── Phase 5: State (stub) ───────────────────────────────────────────
eprintln!(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] The state phase is marked successful and prints that the sandbox was recorded as installed, but the implementation is still a stub and does not write any anolisa state. That is misleading for users and automation that expect status/list/uninstall to see this install. Please either implement the real state write here or mark this phase as skipped/stub with an honest message.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still present in the latest revision. The code still emits state: recorded sandbox-... as installed and records a successful state phase while the phase is explicitly a stub. Please either implement the real state write, or change the user-visible output/phase status so it does not claim that anolisa state was recorded.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installation result will be updated in /var/lib/anolisa/installed.toml.

@@ -268,7 +268,10 @@ fn handle_sandbox_install(
if outcome.exit_code != 0 {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] This still returns CliError::Runtime for any non-zero outcome, including the new degraded verify path. That makes the command fail while the message says "install completed with warnings" and the core code treats verify failure as non-fatal. Please align the semantics either way: return success for degraded warnings, or make verify failure a real failed phase instead of PhaseStatus::Degraded.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phase Core behavior on failure CLI behavior
Preflight return Err(PhaseFailed{phase:"preflight", …}) Err(CliError) → non-zero exit
Packages (dnf) return Err(PhaseFailed{phase:"packages", …}) Err(CliError) → non-zero exit
Services (systemctl) return Err(PhaseFailed{phase:"services", …}) Err(CliError) → non-zero exit
Verify (docker info) warnings stored in outcome, Ok(outcome) Ok(()) → exit 0 + print warning count

new behaviors are as above.


// L4: smoke test
if !skip_verify {
eprintln!("[osbase] verify: docker run --rm alpine echo hello");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] This smoke test may require pulling alpine from the network on a fresh machine, so a mirror/network/rate-limit issue can look like an install failure even when runc/containerd/docker are installed correctly. If this remains part of default verification, it should stay warning-only and must not cause the CLI to report the install command as failed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated as daemon health check

@WeissonHan WeissonHan force-pushed the runc branch 2 times, most recently from 974124d to 90f9396 Compare June 26, 2026 01:50
let output = Command::new(cmd)
.args(args)
.output()
.map_err(|e| {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] CI is currently failing at cargo fmt --all --check on this block. Running cargo fmt --all should collapse this map_err into the single-line form expected by rustfmt.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

let stderr = String::from_utf8_lossy(&output.stderr);
let hint = stderr.lines().next().unwrap_or("").trim();
Err(format!(
"{label} failed (exit {}). Try: systemctl status {cmd}. {hint}",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] This generic hint is inaccurate for several verification commands. For example, a runc --version failure suggests systemctl status runc, and a systemctl is-active containerd failure suggests systemctl status systemctl. Please either make the hint command-specific or drop the fixed systemctl status {cmd} hint and surface stderr only.

@WeissonHan WeissonHan Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hints appears as:

- runc --version failed (exit 127): No such file or directory
- containerd active failed (exit 3): inactive
- docker info failed (exit 1): Cannot connect to the Docker daemon

non-sense messages like systemctl status runc are removed.

@kongche-jbw

Copy link
Copy Markdown
Collaborator

One PR-description note: the implementation now verifies Docker with docker info, but the PR body and Testing section still say docker run --rm alpine echo hello / live container smoke test. Please update the description so reviewers do not think the default verification still depends on pulling the alpine image.

let layout = FsLayout::system(None);
let state_path = layout.state_dir.join("installed.toml");
let state_result = (|| -> Result<String, String> {
let mut state =

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] This state write path loads and saves /var/lib/anolisa/installed.toml without holding InstallLock. The repository lock contract says every writer touching installed.toml must hold the install lock; otherwise concurrent install/update/adapter operations can lose each other’s state updates. Please acquire layout.lock_file before loading state and keep the load/upsert/save sequence inside the lock.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

acquires InstallLock before touching installed.toml, wrapping the entire load/upsert/save sequence inside the lock:

  • Lock acquired before state load — mirrors the pattern in lifecycle.rs and install.rs
  • Entire load → upsert → save runs under the lock — _lock drops (releases) when the closure returns
  • Lock contention degrades gracefully — returns a clear message ("install lock at … is held by another process") and the phase reports Degraded (non-fatal, exit 2)
  • Non-blocking semantics preserved — try_lock_exclusive fails immediately, never blocks the pipeline

})
} else {
Ok(())
// exit_code 2 = degraded (non-fatal warnings already

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] This degraded-success behavior is only applied on the direct execution path. The helper path still goes through send_helper_request, which treats any HelperResponse::Success with exit_code != 0 as CliError::Runtime; meanwhile the helper server returns outcome.exit_code unchanged. As a result, the same degraded verify/state outcome succeeds when run directly as root but fails when routed through system-helper. Please apply the same degraded-success handling to the helper response path.

@WeissonHan WeissonHan Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

     match resp {
         HelperResponse::Success { message, exit_code } => {
-            if exit_code == 0 {
+            if exit_code == 0 || exit_code == 2 {
+                // exit_code 0 = full success, 2 = degraded (non-fatal
+                // verify/state warnings). Both are reported as success;
+                // the core already printed diagnostics to stderr.
                 for line in message.lines() {
                     eprintln!("[osbase] {line}");
                 }
+                if exit_code == 2 {
+                    eprintln!("[osbase] install completed with degraded status (non-fatal)");
+                }
                 Ok(())
             } else {
                 Err(CliError::Runtime {

The fix treats exit_code == 2 (degraded) the same as exit_code == 0 in the helper response path:

  • Both print the helper's message lines to stderr and return Ok(())
  • Degraded additionally prints a "non-fatal" notice so the user knows warnings occurred
  • Only exit_code >= 3 (or 1) is now treated as a hard failure via CliError::Runtime
    This makes the helper path consistent with the direct execution path where verify/state degradation never fails the CLI.

@WeissonHan

Copy link
Copy Markdown
Contributor Author

One PR-description note: the implementation now verifies Docker with docker info, but the PR body and Testing section still say docker run --rm alpine echo hello / live container smoke test. Please update the description so reviewers do not think the default verification still depends on pulling the alpine image.

fixed

@WeissonHan WeissonHan force-pushed the runc branch 2 times, most recently from d3ce5a2 to 39a4ba2 Compare June 26, 2026 03:00
@kongche-jbw

Copy link
Copy Markdown
Collaborator

Thanks for the latest round of fixes. I did one more full pass on 39a4ba2; most previous comments are addressed (helper degraded handling, misleading verify hint, fmt, and docker info verification). A few remaining points:

  1. State metadata should be system-scoped

    The new osbase state write stores into /var/lib/anolisa/installed.toml, but a fresh InstalledState::load() defaults to install_mode = User and an empty prefix. Existing install paths explicitly set state.install_mode = System and state.prefix = layout.prefix before saving system state. Please do the same before upsert_object.

  2. The install lock only covers the state phase

    dnf install and systemctl enable --now still run before InstallLock is acquired. That means concurrent mutating anolisa operations can still overlap on package/service changes and only compete at the final state write. If this command is an anolisa-managed install pipeline, the lock should cover at least the package/service/state mutation window, not only installed.toml.

  3. State persistence failure is still treated as non-fatal success

    If state load/save/lock fails after packages and services are changed, the code marks the phase as Degraded; CLI/helper now treat exit_code = 2 as success. That can leave a machine with Docker installed/enabled but no anolisa state record. If this is intentional, please call it out as a limitation in the PR; otherwise state persistence failure should probably be a hard failure rather than printing installed successfully.

  4. PR description still has stale smoke-test wording

    The top example now says docker info, but the body still says “live container smoke test” / “runs a smoke test”. The implementation now performs daemon health verification, not a container run, so please sync that wording too.

@WeissonHan

Copy link
Copy Markdown
Contributor Author

Thanks for the latest round of fixes. I did one more full pass on 39a4ba2; most previous comments are addressed (helper degraded handling, misleading verify hint, fmt, and docker info verification). A few remaining points:

  1. State metadata should be system-scoped
    The new osbase state write stores into /var/lib/anolisa/installed.toml, but a fresh InstalledState::load() defaults to install_mode = User and an empty prefix. Existing install paths explicitly set state.install_mode = System and state.prefix = layout.prefix before saving system state. Please do the same before upsert_object.

1. State metadata is now system-scoped

InstalledState::load() defaults to install_mode = User and an empty prefix. The previous code never overrode these fields, so the osbase state record was incorrectly tagged as user-scope.Fix: Before calling upsert_object, explicitly set system scope — matching the pattern used in tier1/install.rs:

state.install_mode = StateInstallMode::System;
state.prefix = layout.prefix.clone();
  1. The install lock only covers the state phase
    dnf install and systemctl enable --now still run before InstallLock is acquired. That means concurrent mutating anolisa operations can still overlap on package/service changes and only compete at the final state write. If this command is an anolisa-managed install pipeline, the lock should cover at least the package/service/state mutation window, not only installed.toml.

2. InstallLock now covers the full mutation window

Previously, the lock was acquired only inside the Phase 5 closure (state write). This meant concurrent anolisa processes could overlap on dnf install and systemctl enable --now, racing on package/service mutations.Fix: Moved InstallLock::acquire() to immediately after Preflight (which is read-only) and before Phase 2 (Packages). The _lock binding lives until run_manifest_install() returns, so the lock spans packages → services → verify → state:

// ─── Acquire InstallLock ─────────────────────────────────────────────
// Lock covers the full mutation window: packages → services → state.
// Held until the function returns (drop releases the lock).
let _lock = InstallLock::acquire(&layout.lock_file).map_err(|e| ...)?;

Phase 5 now simply says // Lock is already held (acquired before Phase 2).

  1. State persistence failure is still treated as non-fatal success
    If state load/save/lock fails after packages and services are changed, the code marks the phase as Degraded; CLI/helper now treat exit_code = 2 as success. That can leave a machine with Docker installed/enabled but no anolisa state record. If this is intentional, please call it out as a limitation in the PR; otherwise state persistence failure should probably be a hard failure rather than printing installed successfully.

3. State persistence failure is a hard error

If state save fails after packages are installed and services started, the machine is modified with no anolisa record — an unrecoverable inconsistency. The previous code treated this as Degraded (exit_code = 2 = success to the CLI).Fix: Changed from PhaseStatus::Degraded (non-fatal warning) to PhaseStatus::Failed + return Err(...):

// State persistence failure after packages/services were mutated
// is a hard error: the machine has changed but we have no record.
return Err(OsbaseInstallError::PhaseFailed {
    phase: "state".to_string(),
    message: format!(
        "state persistence failed after packages/services were modified: {reason}"
    ),
});
  1. PR description still has stale smoke-test wording
    The top example now says docker info, but the body still says “live container smoke test” / “runs a smoke test”. The implementation now performs daemon health verification, not a container run, so please sync that wording too.

4. "Smoke test" wording replaced with "daemon health verification"

The implementation runs docker info (daemon reachability check), not docker run (container lifecycle). Updated all references:
Module doc: binary checks + docker daemon health verification
Design note JSON: "checks": ["runc --version", "containerd active", "docker version", "docker info"]
Design note table: docker info (daemon health) instead of smoke test

@kongche-jbw

Copy link
Copy Markdown
Collaborator

Latest pass on 755cd709: the main correctness issues from the previous round look addressed (lock coverage, system-scoped state metadata, hard failure on state persistence errors, helper degraded handling, and Docker health check wording).

One remaining UX/consistency point:

system-helper output still does not surface the full five-phase result.

The non-privileged path confirmed in the screenshot goes through anolisa-system-helper, but the helper response is still hand-built in daemon_server.rs and only prints preflight/packages/installed/optional package lines. It does not show whether the services, verify, or state phases succeeded. The same applies to helper dry-run, which still has the older package-only wording.

Since execute_install() already returns outcome.phases, it would be better for the helper response to render from those phases instead of reconstructing a partial message. That keeps the helper path aligned with the real core pipeline and avoids future drift. A small helper-path test would also be useful to assert that runc install output includes services/verify/state information.

@ikunkun-sys

Copy link
Copy Markdown
Collaborator

Latest pass on 755cd709, a few remaining issues:

  1. Verify is still hard-coded to the runc/Docker stack while the install path is shared by every sandbox scenario.

    run_manifest_install() is used for rund, firecracker, gvisor, and landlock as well, but the new verify phase always runs runc --version, systemctl is-active containerd, docker version, and docker info. Those non-runc installs can now report degraded verification for components unrelated to the requested scenario.

  2. system-helper output still does not reflect the actual five-phase outcome.

    The helper success message is reconstructed from manifest metadata and package assumptions, so non-root installs through anolisa-system-helper only show preflight/packages/installed/optional lines. The actual services, verify, and state phase results from outcome.phases are still omitted from the helper response.

  3. Optional package hints are still counted as warnings.

    outcome.warnings now mixes real degraded verification with optional package availability. A successful runc install that only reports the nerdctl optional hint still makes the direct CLI print install completed with 1 warning(s), even though nothing degraded.

@WeissonHan

WeissonHan commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

1. Verify Hard-Coded to runc/Docker Stack

  1. Verify is still hard-coded to the runc/Docker stack while the install path is shared by every sandbox scenario.
    run_manifest_install() is used for rund, firecracker, gvisor, and landlock as well, but the new verify phase always runs runc --version, systemctl is-active containerd, docker version, and docker info. Those non-runc installs can now report degraded verification for components unrelated to the requested scenario.

Approach: Verification commands should be declared by the scenario itself, not hard-coded in Rust.

  • Added a verify_commands: Vec field to ScenarioConfig with #[serde(default)] so existing manifests remain backward-compatible.
  • Populated sandbox.toml with per-scenario verify commands — runc checks docker info, gvisor checks only runsc --version, firecracker checks firecracker --version and jailer --version.
  • Rewrote run_post_verify() to accept &ScenarioConfig. It splits each verify_commands entry on whitespace and executes it via run_verify_cmd. If the field is empty, it falls back to systemctl is-active for each declared service. If neither is defined (e.g. landlock), the verify phase completes immediately with "no verify commands defined; skipped".

The result: each scenario only verifies its own components, eliminating false degraded reports for unrelated binaries.

2. Helper Response Does Not Reflect Actual Five-Phase Outcome

  1. system-helper output still does not reflect the actual five-phase outcome.
    The helper success message is reconstructed from manifest metadata and package assumptions, so non-root installs through anolisa-system-helper only show preflight/packages/installed/optional lines. The actual services, verify, and state phase results from outcome.phases are still omitted from the helper response.

Approach: Since outcome.phases already captures every phase's real execution status, the helper should serialize it directly rather than fabricate a parallel narrative.

  • Removed the old logic that loaded SandboxManifest a second time to reconstruct progress lines.
  • Iterates outcome.phases, formatting each as "{phase.name}: ✓/skipped/degraded/✗ {phase.message}".
  • Appends genuine warnings with a warning:` prefix, and informational hints with ahint:prefix.
  • The message now faithfully represents what actually happened during execution, making the helper path fully consistent with the direct-execution path.

3. Optional Package Hints Counted as Warnings

  1. Optional package hints are still counted as warnings.
    outcome.warnings now mixes real degraded verification with optional package availability. A successful runc install that only reports the nerdctl optional hint still makes the direct CLI print install completed with 1 warning(s), even though nothing degraded.

Approach: Introduce a semantic separation between warnings (real anomalies that may warrant attention) and hints (purely informational messages that do not indicate any degradation).

  • Added a hints: Vec field to OsbaseInstallOutcome, documented as not affecting exit_code.
  • Moved the optional-packages message from warnings.push(hint) to hints.push(hint).
  • In the CLI direct-execution path, the "N warning(s)" line is now printed only when warnings is non-empty. Hints are printed separately as [osbase] hint: ....
  • In the helper response, hints are likewise emitted with a hint:` prefix, clearly distinguished fromwarning:lines.

The effect: a fully successful runc install no longer displays any warning count — the user only sees an informational hint about nerdctl at the end of the output.

@WeissonHan

Copy link
Copy Markdown
Contributor Author

Latest pass on 755cd709: the main correctness issues from the previous round look addressed (lock coverage, system-scoped state metadata, hard failure on state persistence errors, helper degraded handling, and Docker health check wording).

One remaining UX/consistency point:

system-helper output still does not surface the full five-phase result.

The non-privileged path confirmed in the screenshot goes through anolisa-system-helper, but the helper response is still hand-built in daemon_server.rs and only prints preflight/packages/installed/optional package lines. It does not show whether the services, verify, or state phases succeeded. The same applies to helper dry-run, which still has the older package-only wording.

Since execute_install() already returns outcome.phases, it would be better for the helper response to render from those phases instead of reconstructing a partial message. That keeps the helper path aligned with the real core pipeline and avoids future drift. A small helper-path test would also be useful to assert that runc install output includes services/verify/state information.

this was already addressed in the initial commit (859c253) where dispatch_osbase_install now renders from outcome.phases rather than reconstructing from manifest metadata. The follow-up here extracts the formatting into a standalone testable function and adds the two helper-path unit tests you requested.

1. Extract format_outcome_response() as a standalone functionThe inline formatting block is now a named function, making the logic testable without requiring root (the daemon's execute_install validates uid=0):

     match execute_install(&request, &env) {
-        Ok(outcome) => {
-            // Build formatted progress lines from actual phase outcomes.
-            use crate::osbase_install::PhaseStatus;
-            let mut lines = Vec::new();
-            for phase in &outcome.phases { ... }
-            ...
-        }
+        Ok(outcome) => format_outcome_response(outcome),
         Err(e) => HelperResponse::Error { ... },
     }
 }

+/// Renders every phase from `outcome.phases` so the non-root CLI path
+/// sees the full five-phase pipeline result (preflight, packages,
+/// services, verify, state) rather than a partial reconstruction.
+fn format_outcome_response(outcome: OsbaseInstallOutcome) -> HelperResponse {
+    use crate::osbase_install::PhaseStatus;
+    let mut lines = Vec::new();
+    for phase in &outcome.phases {
+        let status_str = match phase.status {
+            PhaseStatus::Success => "✓",
+            PhaseStatus::Skipped => "skipped",
+            PhaseStatus::Degraded => "degraded",
+            PhaseStatus::Failed => "✗",
+        };
+        let msg = phase.message.as_deref().unwrap_or("");
+        lines.push(format!("{}: {} {}", phase.name, status_str, msg));
+    }
+    for w in &outcome.warnings { lines.push(format!("warning: {w}")); }
+    for h in &outcome.hints { lines.push(format!("hint: {h}")); }
+    HelperResponse::Success { message: lines.join("\n"), exit_code: outcome.exit_code }

2. Test: successful runc install surfaces all five phases + hints

#[test]
fn helper_install_dryrun_surfaces_all_phases() {
    let outcome = OsbaseInstallOutcome {
        phases: vec![
            PhaseResult { name: "preflight", status: Success, ... },
            PhaseResult { name: "packages",  status: Success, ... },
            PhaseResult { name: "services",  status: Success, ... },
            PhaseResult { name: "verify",    status: Success, ... },
            PhaseResult { name: "state",     status: Success, ... },
        ],
        exit_code: 0,
        warnings: vec![],
        hints: vec!["optional packages available: nerdctl"],
    };
    let resp = format_outcome_response(outcome);
    // Asserts: message contains "preflight:", "packages:", "services:",
    //          "verify:", "state:", "hint: optional packages..."
    // Asserts: NO "warning:" line in a clean install
}

3. Test: degraded verify produces warning + exit_code 2

#[test]
fn helper_degraded_verify_shows_warning() {
    let outcome = OsbaseInstallOutcome {
        phases: vec![...verify with PhaseStatus::Degraded...],
        exit_code: 2,
        warnings: vec!["verify degraded: docker info failed (exit 1)"],
        hints: vec![],
    };
    let resp = format_outcome_response(outcome);
    // Asserts: exit_code == 2
    // Asserts: message contains "verify: degraded" and "warning: verify degraded"
}

Test results: all 6 daemon_server tests pass (including the 2 new ones), full anolisa-core + anolisa-cli test suites green.

@ikunkun-sys ikunkun-sys left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest pass on 0684284b, two remaining issues:

  1. Dry-run is still reported as a warning.

    build_dry_run_outcome() returns warnings: ["dry-run mode: no changes made"], so both direct execution and the system-helper path report dry-run as a warning. This keeps producing install completed with 1 warning(s) / warning: dry-run mode... even though dry-run is an intentional mode, not a degraded or anomalous outcome.

  2. A skipped verify path is recorded as verify success.

    For scenarios with no verify_commands and no services, run_post_verify() returns the message no verify commands defined; skipped, but the caller records the phase as PhaseStatus::Success. For example, landlock now reports a successful verify phase while the phase message says it was skipped, so the structured phase status and user-facing message disagree.

@WeissonHan

Copy link
Copy Markdown
Contributor Author

1. Dry-run no longer a warning

  1. Dry-run is still reported as a warning.
    build_dry_run_outcome() returns warnings: ["dry-run mode: no changes made"], so both direct execution and the system-helper path report dry-run as a warning. This keeps producing install completed with 1 warning(s) / warning: dry-run mode... even though dry-run is an intentional mode, not a degraded or anomalous outcome.
-        warnings: vec!["dry-run mode: no changes made".to_string()],
-        hints: vec![],
+        warnings: vec![],
+        hints: vec!["dry-run mode: no changes made".to_string()],

Dry-run is an intentional mode, not an anomaly. Moved to hints so the CLI won't print "install completed with 1 warning(s)" for dry-run invocations.

2. Verify "nothing to verify" records as Skipped, not Success

> 2. **A skipped verify path is recorded as verify success.**
>    For scenarios with no `verify_commands` and no `services`, `run_post_verify()` returns the message `no verify commands defined; skipped`, but the caller records the phase as `PhaseStatus::Success`. For example, `landlock` now reports a successful verify phase while the phase message says it was skipped, so the structured phase status and user-facing message disagree.

For scenarios like landlock where neither verify_commands nor services are defined, run_post_verify() returns "no verify commands defined; skipped". The phase is now recorded as PhaseStatus::Skipped — making the structured status consistent with the user-facing message.

After running:
    anolisa osbase sandbox install runc

the user gets a working Docker environment end-to-end:
    docker info  ✓

What this delivers:
- Full-stack install: runc + containerd + docker + docker-client installed
  as a unit; services automatically enabled and started.
- Post-install verification: confirms runc binary, containerd service,
  docker daemon, and a live container health check.
- Optional package hints: informs user about available extras (e.g. nerdctl)
  without forcing installation.
- Manifest-driven: scenario definitions live in sandbox.toml; adding a new
  scenario requires only a TOML stanza, no code changes.

Signed-off-by: Weisson <Weisson@linux.alibaba.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants