[bug] governor: swap_llm_model ignores systemctl exit status — memory-relief model swap silently no-ops (OOM risk) #148

### Summary

`ServiceCtl::swap_llm_model` runs `systemctl daemon-reload` and `systemctl restart <unit>` via `.output().await?` but never checks `output.status.success()` — unlike every other state-changing method in `service_ctl.rs`. If the restart is denied (polkit policy, masked unit, bad override), the governor's mode-driven LLM model swap silently no-ops while reporting success, so the heavier model keeps running and the device can OOM.

### Steps to reproduce

1. Run `genie-governor` on the device with mode transitions enabled (Day / NightA / NightB / Media).
2. Arrange for the LLM unit restart to fail — e.g. the governor process lacks polkit rights to `systemctl restart`, the unit is masked, or the systemd override the function just wrote is rejected on reload.
3. Trigger a mode transition that swaps the model, e.g. `Day/NightA -> NightB` (the memory-relief transition to a smaller model), `Media -> *`, or `NightB -> Day` (`governor.rs:204-223`).
4. Observe: `swap_llm_model` returns `Ok(())`. The governor logs only the pre-action `"swapping LLM model"` info line. The model never actually changes.

### Expected behavior

`swap_llm_model` should check `status.success()` on both the `daemon-reload` and `restart` commands, log the captured `stderr` on failure, and return `Err(...)`. This matches `start` and `docker_start` in the same file, and the status check that `enable_zram` already performs. A failed model swap during a memory-pressure transition must be observable (logged and surfaced to the caller), not reported as success.
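A minimal sketch of the expected behavior, under stated assumptions: it uses `std::process::Command` and a `String` error so that it compiles standalone, whereas the real code uses the async `Command` with `.await` and would presumably `bail!` and log through the crate's logger. `run_checked` and `reload_and_restart` are hypothetical names, not functions that exist in `service_ctl.rs`.

```rust
use std::process::Command;

/// Hypothetical helper: runs a command and converts a non-zero exit
/// into an error that carries the captured stderr.
fn run_checked(program: &str, args: &[&str]) -> Result<(), String> {
    let output = Command::new(program)
        .args(args)
        .output()
        // This branch only covers "process could not be spawned".
        .map_err(|e| format!("failed to spawn {program}: {e}"))?;

    // This is the check that swap_llm_model is currently missing.
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(format!(
            "{program} {} failed ({}): {}",
            args.join(" "),
            output.status,
            stderr.trim()
        ));
    }
    Ok(())
}

/// Hypothetical shape of the tail of swap_llm_model after the fix:
/// either command failing aborts the swap with an error.
#[allow(dead_code)]
fn reload_and_restart(unit: &str) -> Result<(), String> {
    run_checked("systemctl", &["daemon-reload"])?;
    run_checked("systemctl", &["restart", unit])?;
    Ok(())
}
```

With this shape, a denied or rejected `systemctl restart` produces an `Err` that includes systemd's own message instead of an unconditional `Ok(())`.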

### Actual behavior

```rust
// crates/genie-governor/src/service_ctl.rs:109-120
// Reload systemd and restart the LLM service.
Command::new("systemctl")
    .args(["daemon-reload"])
    .output()
    .await?;          // <- exit status discarded

Command::new("systemctl")
    .args(["restart", &unit])
    .output()
    .await?;          // <- exit status discarded

Ok(())                // <- always reports success
```

`.output().await?` only propagates an error if the *process fails to spawn*. A non-zero `systemctl` exit (permission denied, masked unit, reload error) is swallowed. Compare with the sibling methods, which all branch on `output.status.success()`:

- `start` (`service_ctl.rs:19-23`) — checks status, logs stderr, bails.
- `docker_start` (`service_ctl.rs:81-85`) — checks status, logs stderr, bails.
- `enable_zram` (`service_ctl.rs:136-139`) — checks status, logs stderr.

`swap_llm_model` is the only state-changing method that ignores the exit status. The impact is worst on the `-> NightB` memory-relief swap: if it silently fails, the larger daytime model stays resident on the 8 GB Orin overnight, defeating the governor's purpose and risking OOM. The callers at `governor.rs:207/215/222` also discard the result with `let _ = ...await`, so the return value should be logged at each call site as well. That only helps once the function itself is *capable* of reporting failure.
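The spawn-versus-exit distinction described above can be reproduced without systemd. The sketch below uses the synchronous `std::process::Command` as a stand-in for the async one used in the crate (an assumption made so that it runs standalone); `run_unchecked` and `exited_ok` are illustrative names only.

```rust
use std::io;
use std::process::Command;

/// Mirrors the current pattern: `?` only sees spawn errors,
/// so a non-zero exit still yields Ok(()).
fn run_unchecked(program: &str, args: &[&str]) -> io::Result<()> {
    Command::new(program).args(args).output()?; // exit status discarded
    Ok(())
}

/// Reports whether the command actually exited successfully.
fn exited_ok(program: &str, args: &[&str]) -> io::Result<bool> {
    let output = Command::new(program).args(args).output()?;
    Ok(output.status.success())
}
```

Running `sh -c "exit 1"` through `run_unchecked` returns `Ok(())`, while `exited_ok` returns `Ok(false)` for the same command. Only a program that cannot be spawned at all makes `run_unchecked` return `Err`.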

### Hardware

Jetson Orin Nano Super 8 GB

### JetPack / L4T version

_No response_

### GenieClaw version / commit

`main` — `crates/genie-governor/src/service_ctl.rs:91-121`

### Relevant logs

```shell
# All you ever see is the pre-action line; no error even when the restart was denied:
INFO genie_governor::service_ctl: swapping LLM model unit=genie-ai-runtime.service model=/opt/geniepod/models/<nightb-model>
# (no failure log, no error returned — model never actually swapped)
```

### Additional context

- Not a duplicate. Searched open + closed issues for `governor` / `swap_llm_model` / `daemon-reload` / `model swap` / `service_ctl` — the matches are config-resolution / default-model / deploy-pipeline topics (#40, #52, #54, #107, …); none describe the dropped exit status. Related but distinct from #107 (context size destabilizes the stack).
- Fix is small and local: capture `output`, branch on `status.success()`, log `stderr`, and `bail!` on failure for both commands — then log the result at the `governor.rs` call sites instead of `let _ =`.
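The call-site half of the fix could look roughly like the sketch below. Everything here is a stand-in: `swap_llm_model` is a stub with a `simulate_failure` flag rather than the real `ServiceCtl` method, `apply_swap` is a hypothetical wrapper, and `eprintln!` takes the place of whatever logging macros `governor.rs` actually uses.

```rust
/// Stub standing in for ServiceCtl::swap_llm_model (hypothetical signature).
fn swap_llm_model(model: &str, simulate_failure: bool) -> Result<(), String> {
    if simulate_failure {
        Err(format!("systemctl restart failed while swapping to {model}"))
    } else {
        Ok(())
    }
}

/// Replaces the `let _ = ...` pattern: the result is inspected and logged,
/// and the caller learns whether the swap actually happened.
fn apply_swap(model: &str, simulate_failure: bool) -> bool {
    match swap_llm_model(model, simulate_failure) {
        Ok(()) => {
            eprintln!("INFO LLM model swapped model={model}");
            true
        }
        Err(err) => {
            eprintln!("ERROR LLM model swap failed model={model} error={err}");
            false
        }
    }
}
```

Returning a flag (or propagating the error) lets the governor decide what to do when a memory-relief swap fails, for example retrying or staying in the previous mode. That policy is outside the scope of this issue.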
