Draft
24 commits
1691501
feat(control-plane): plan-mode HITL gate (storage + API + sandbox pre…
bleleve May 22, 2026
a2cad10
fix(control-plane,sandbox,shared): address CodeRabbit feedback on #671
bleleve May 22, 2026
2f613bb
fix(control-plane): address CodeRabbit follow-up review on #671
bleleve May 22, 2026
ca76bda
fix(control-plane): guard non-object JSON in approval-body parsing
bleleve May 22, 2026
2338207
fix(control-plane,sandbox): address CodeRabbit second follow-up on #671
bleleve May 22, 2026
7b2d37b
feat(control-plane): dispatch implementation prompt on plan approval
bleleve May 22, 2026
63c67e5
refactor(control-plane): drop redundant onPlanApproved callback
bleleve May 22, 2026
610126f
test(control-plane): isolate plan-approval integration test via clean…
bleleve May 22, 2026
55576fc
fix(sandbox-runtime): satisfy ruff E402 + I001 on bridge.py
bleleve May 26, 2026
7736e7f
feat(control-plane): fire cross-channel plan-verdict callback on appr…
bleleve May 26, 2026
9d8136d
feat(control-plane): fire cross-channel notification on session archi…
bleleve Jun 1, 2026
f51a7e9
fix(control-plane): plumb actor display name through cross-channel ca…
bleleve Jun 1, 2026
cd74065
feat: deployment-wide default model and default plan model
bleleve May 22, 2026
f2a8496
fix(shared,web,terraform): address CodeRabbit feedback on #672
bleleve May 22, 2026
2f9e6e2
fix(control-plane): reconcile resolved defaults with enabledModels in…
bleleve May 22, 2026
a6ee816
fix(shared): normalize env-var fallbacks in fetchModelDefaults
bleleve May 22, 2026
33ee598
feat(slack-bot): plan-mode (App Home + approval modal + auto-classifier)
bleleve May 22, 2026
7e832ec
fix(slack-bot): keep slack-notify guard on follow-up prompts and log …
bleleve May 22, 2026
f4a365c
fix(slack-bot): split channel-info warn logs by failure mode (review …
bleleve May 22, 2026
d2626fe
feat(slack-bot): chat.update awaiting message on cross-channel plan v…
bleleve May 26, 2026
abbb74c
refactor(slack-bot): shorten plan-awaiting KV TTL to 24h + clear on m…
bleleve May 26, 2026
29d9b00
refactor(slack-bot): bump plan-awaiting KV TTL to 3 days
bleleve Jun 1, 2026
6c61c41
feat(slack-bot): post archive/unarchive notification in the originati…
bleleve Jun 1, 2026
e9663e8
fix(slack-bot): render actor display name in cross-channel callbacks
bleleve Jun 1, 2026
61 changes: 40 additions & 21 deletions docs/GETTING_STARTED.md
@@ -318,21 +318,8 @@ Save these values somewhere secure—you'll need them in the next step.
 ```bash
 cd terraform/environments/production

-# Copy the example files
+# Copy the example file and fill in values
 cp terraform.tfvars.example terraform.tfvars
-cp backend.tfvars.example backend.tfvars
-```
-
-### Configure `backend.tfvars`
-
-Fill in your R2 credentials:
-
-```hcl
-access_key = "your-r2-access-key-id"
-secret_key = "your-r2-secret-access-key"
-endpoints = {
-s3 = "https://YOUR_CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"
-}
 ```

 ### Configure `terraform.tfvars`
@@ -454,8 +441,11 @@ Then run:
 ```bash
 cd terraform/environments/production

-# Initialize Terraform with backend config
-terraform init -backend-config=backend.tfvars
+# Initialize Terraform with R2 backend credentials
+terraform init \
+-backend-config="access_key=YOUR_R2_ACCESS_KEY_ID" \
+-backend-config="secret_key=YOUR_R2_SECRET_ACCESS_KEY" \
+-backend-config='endpoints={s3="https://YOUR_CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"}'

 # Deploy (phase 1 - creates workers without bindings)
 terraform apply
@@ -645,13 +635,38 @@ curl -I https://open-inspect-web-{deployment_name}.YOUR-SUBDOMAIN.workers.dev
 3. Create a new session with a repository
 4. Send a prompt and verify the sandbox starts

+### Configure Default Models
+
+The web UI exposes a **Settings → Models → Default Models** section that controls which build and
+plan models the deployment uses by default. Bots (Linear, GitHub, Slack) read these values at
+session-creation time, so changes propagate without a Terraform redeploy.
+
+1. Open **Settings → Models** in the web UI.
+2. Under **Default Models**, pick:
+- **Default model** — the build model used when no per-request override is in play.
+- **Default plan model** — the model that runs the planning turn when plan mode is enabled.
+3. Save. The values are stored in D1; bots fall back to the worker's `DEFAULT_MODEL` /
+`DEFAULT_PLAN_MODEL` env var only when the control plane is unreachable, then to a shared
+constant.
+
+Disabling a model that is the current default is blocked inline — pick a new default first.
+
+See [PLAN_MODE.md](PLAN_MODE.md) for the full plan-mode workflow these defaults feed into.
+
 ---

 ## Step 10: Set Up CI/CD (Optional)

-Enable automatic deployments when you push to main by adding GitHub Secrets.
+Enable automatic deployments by configuring GitHub Environments and secrets:
+
+- **`main` branch** → deploys **staging** (`terraform/environments/staging`)
+- **`stable` branch** → deploys **production** (`terraform/environments/production`)
+
+Create `staging` and `production` environments under Settings → Environments. Add secrets to each
+environment (or use repository-level secrets shared by both). Pull requests to `main` run
+`terraform plan` against staging.

-Go to your fork's Settings → Secrets and variables → Actions, and add:
+Go to your fork's Settings → Secrets and variables → Actions (or per-environment secrets), and add:

 | Secret Name | Value |
 | ----------------------------- | ----------------------------------------------------------------------------- |
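Note on the "Configure Default Models" section added in the hunk above: the fallback order it describes (value stored in D1, then the worker's env var, then a shared constant) can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's `fetchModelDefaults`. The names `pickModelDefaults`, `ModelDefaults`, `WorkerEnv`, and the placeholder model strings are hypothetical, and passing `null` for `stored` is an assumed way to represent an unreachable control plane.

```typescript
// Hypothetical shapes for illustration; none of these names come from the PR.
interface ModelDefaults {
  defaultModel: string;
  defaultPlanModel: string;
}

interface WorkerEnv {
  DEFAULT_MODEL?: string;
  DEFAULT_PLAN_MODEL?: string;
}

// Placeholder values standing in for the shared constant mentioned in the docs.
const SHARED_FALLBACK: ModelDefaults = {
  defaultModel: "placeholder-build-model",
  defaultPlanModel: "placeholder-plan-model",
};

// Treat empty or whitespace-only env values as unset.
function normalize(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// stored === null models "control plane unreachable" (an assumption).
function pickModelDefaults(stored: ModelDefaults | null, env: WorkerEnv): ModelDefaults {
  // 1. Values saved through the web UI win whenever they could be read.
  if (stored !== null) return stored;
  // 2. Otherwise use the worker env var, 3. then the shared constant.
  return {
    defaultModel: normalize(env.DEFAULT_MODEL) ?? SHARED_FALLBACK.defaultModel,
    defaultPlanModel: normalize(env.DEFAULT_PLAN_MODEL) ?? SHARED_FALLBACK.defaultPlanModel,
  };
}
```

Keeping the selection logic pure, with the network call done by the caller, makes the three-level order easy to test without a control plane.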
@@ -719,8 +734,9 @@ gh secret set GH_APP_PRIVATE_KEY < private-key-pkcs8.pem

 Once configured, the GitHub Actions workflow will:

-- Run `terraform plan` on pull requests (with PR comment)
-- Run `terraform apply` when merged to main
+- Run `terraform plan` on pull requests to `main` (staging, with PR comment)
+- Run `terraform apply` on pushes to `main` (staging)
+- Run `terraform apply` on pushes to `stable` (production)

 ---

@@ -749,7 +765,10 @@ terraform apply
 Re-run init with backend config:

 ```bash
-terraform init -backend-config=backend.tfvars
+terraform init \
+-backend-config="access_key=YOUR_R2_ACCESS_KEY_ID" \
+-backend-config="secret_key=YOUR_R2_SECRET_ACCESS_KEY" \
+-backend-config='endpoints={s3="https://YOUR_CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com"}'
 ```

 ### GitHub App authentication fails
47 changes: 34 additions & 13 deletions docs/HOW_IT_WORKS.md
@@ -57,13 +57,14 @@ if needed.

 ### What's Stored in a Session

-| Data | Description |
-| ------------- | ------------------------------------------------- |
-| Messages | Prompts you've sent and their metadata |
-| Events | Tool calls, token streams, status updates |
-| Artifacts | PRs created, screenshots captured |
-| Participants | Users who have joined the session |
-| Sandbox state | Reference to the current sandbox and its snapshot |
+| Data | Description |
+| ------------- | --------------------------------------------------------------------------------------------------------------------------- |
+| Messages | Prompts you've sent and their metadata |
+| Events | Tool calls, token streams, status updates |
+| Artifacts | PRs created, screenshots captured |
+| Participants | Users who have joined the session |
+| Sandbox state | Reference to the current sandbox and its snapshot |
+| Plans | Versioned markdown plans + approval status (`awaiting_approval`, `approved`, `rejected`) — see [PLAN_MODE.md](PLAN_MODE.md) |

 Each session gets its own SQLite database in a Cloudflare Durable Object, ensuring isolation and
 high performance even with hundreds of concurrent sessions.
@@ -186,13 +187,13 @@ When you create a session for a repo without an existing snapshot:
 └─────────┘ └──────────┘ └─────────────┘ └─────────────┘ └─────────────┘ └───────┘
 │ │
 ▼ ▼
-.openinspect/setup.sh .openinspect/start.sh
+scripts/.openinspect/setup.sh scripts/.openinspect/start.sh
 ```

 1. **Sandbox created**: Modal spins up a new container from the base image
 2. **Git sync**: Clones your repository using GitHub App credentials
-3. **Setup script**: Runs `.openinspect/setup.sh` for provisioning (if present)
-4. **Start script**: Runs `.openinspect/start.sh` for runtime startup (if present)
+3. **Setup script**: Runs `scripts/.openinspect/setup.sh` for provisioning (if present)
+4. **Start script**: Runs `scripts/.openinspect/start.sh` for runtime startup (if present)
 5. **Agent start**: OpenCode server starts and connects back to the control plane
 6. **Ready**: Sandbox accepts prompts

@@ -209,7 +210,7 @@ When restoring from a previous snapshot:

 1. **Restore snapshot**: Modal restores the filesystem from a saved image
 2. **Quick sync**: Pulls latest changes (usually just a few commits)
-3. **Start script**: Runs `.openinspect/start.sh` for runtime startup (if present)
+3. **Start script**: Runs `scripts/.openinspect/start.sh` for runtime startup (if present)
 4. **Ready**: Sandbox is ready almost instantly

 Snapshots include installed dependencies, built artifacts, and workspace state. This is why
@@ -220,8 +221,8 @@ follow-up prompts in an existing session are much faster than the first prompt.
 When starting from a pre-built repo image:

 1. **Incremental git sync**: Fast fetch + hard reset to latest branch head
-2. **Setup skipped**: `.openinspect/setup.sh` already ran when the image was built
-3. **Start script runs**: `.openinspect/start.sh` executes for per-session runtime startup
+2. **Setup skipped**: `scripts/.openinspect/setup.sh` already ran when the image was built
+3. **Start script runs**: `scripts/.openinspect/start.sh` executes for per-session runtime startup
 4. **Ready**: Agent starts once runtime hook succeeds

 If `start.sh` exists and fails, startup fails fast instead of continuing with a broken runtime.
@@ -320,6 +321,26 @@ This lets you send follow-up thoughts while the agent works. Prompts are process

 You can also stop the current execution if the agent is going down the wrong path.

+### Plan-Mode Gate
+
+When a session is in plan mode the message queue is not blocked — what changes is **how** each
+prompt is dispatched. While `plan_mode = 1` and `plan_approval_status = "awaiting_approval"` (or
+unset, pre-plan), every dispatched prompt runs as a planning turn (`planMode: true` in the command),
+so a follow-up sent before you approve is treated as an amendment and produces plan v2 — not
+blocked. Approve or reject flips `isPlanningTurn` to false; the next prompt then runs as a normal
+build turn. The full workflow (triggers, approval UIs, amendments, plan vs build model split) lives
+in [PLAN_MODE.md](PLAN_MODE.md).
+
+### Prompt-Safety Wrapping
+
+Bot-assembled prompts can contain untrusted text (a Linear issue body, a PR description, a Slack
+thread). To stop prompt injection across the trusted/untrusted boundary, the bots wrap each piece of
+user-supplied content in `<user_content>` blocks via `buildUntrustedUserContentBlock` from
+`@open-inspect/shared` (HTML-escapes attributes, neutralizes literal `</user_content>` inside the
+body). The sandbox bridge then wraps the whole prompt in `<user_message>` when a plan or resume
+preamble is prepended, and also neutralizes literal `</user_message>` so a user can't close the
+outer wrapper from inside.
+
 ---

 ## The Agent
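Note on the "Plan-Mode Gate" section added in the hunk above: the rule it states (a prompt is a planning turn while plan mode is on and the plan is either not yet produced or still awaiting approval) reduces to a small predicate. The sketch below is an illustration only. The `SessionPlanState` shape and its camel-cased field names are assumptions mirroring `plan_mode` and `plan_approval_status`, and this `isPlanningTurn` function is not claimed to match the control plane's implementation.

```typescript
// Status values are taken from the "Plans" row in HOW_IT_WORKS.md.
type PlanApprovalStatus = "awaiting_approval" | "approved" | "rejected";

// Hypothetical in-memory view of the session columns described in the docs.
interface SessionPlanState {
  planMode: 0 | 1; // mirrors plan_mode
  planApprovalStatus: PlanApprovalStatus | null; // null = unset (no plan yet)
}

// True when the next dispatched prompt should run as a planning turn.
function isPlanningTurn(state: SessionPlanState): boolean {
  if (state.planMode !== 1) return false;
  return (
    state.planApprovalStatus === null ||
    state.planApprovalStatus === "awaiting_approval"
  );
}

// The queue is never blocked: each prompt is dispatched either way,
// only the planMode flag on the command changes.
function buildCommand(prompt: string, state: SessionPlanState) {
  return { prompt, planMode: isPlanningTurn(state) };
}
```

Under this reading, a follow-up sent while the status is `awaiting_approval` still yields `planMode: true`, which is what lets it act as an amendment rather than being rejected.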
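Note on the "Prompt-Safety Wrapping" section added to docs/HOW_IT_WORKS.md: a rough sketch of the inner wrapping step, to make the two protections concrete (escaped attributes, neutralized closing tag). This is not the source of `buildUntrustedUserContentBlock`, whose signature the diff does not show. The function name `wrapUntrusted`, the `source` attribute, and the choice to neutralize by entity-escaping the angle brackets are all assumptions.

```typescript
// Escape characters that could break out of a double-quoted attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Rewrite any literal closing tag in the body so it can no longer end the
// wrapper. Entity-escaping is an assumed strategy, not the confirmed one.
function neutralizeClosingTag(body: string, tag: string): string {
  const pattern = new RegExp(`<\\s*/\\s*${tag}\\s*>`, "gi");
  return body.replace(pattern, (match) =>
    match.replace(/</g, "&lt;").replace(/>/g, "&gt;"),
  );
}

// Hypothetical stand-in for the shared helper: wraps untrusted text in a
// <user_content> block labeled with where it came from.
function wrapUntrusted(body: string, source: string): string {
  const safeBody = neutralizeClosingTag(body, "user_content");
  return `<user_content source="${escapeAttr(source)}">\n${safeBody}\n</user_content>`;
}
```

The same `neutralizeClosingTag` idea would apply to the outer `<user_message>` wrapper that the sandbox bridge adds, with `user_message` passed as the tag.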