Add safe same-tab macro execution for common browser action loops

## Background

OpenChrome has `batch_execute` and HTTP JSON-RPC batch, but it lacks a standard macro that runs a common same-tab browser action sequence such as `click -> wait -> read_page(delta)` as a single safe tool call. By comparison, `agent-browser batch` runs multiple commands in one invocation to reduce the overhead of repeated calls.

## Goal

To reduce LLM/MCP round trips and wasted intermediate tokens, add a structured `steps` input mode to the existing `act` tool that runs a safe action sequence within a single tab. No new top-level tool is created.

## Implementation scope

- Extend the input schema of the existing `act` tool so that it can accept a `steps` array instead of `instruction`. Existing `instruction`-based behavior is preserved.
- In `steps` mode, the initial version restricts the allowed step types to the whitelist below.
  - `wait_for`
  - `read_page`
  - `click`: only `ref_N`, `node_N`, or backend node id targets are allowed
  - `fill`: only a `ref_N`, `node_N`, or backend node id target plus a value is allowed
  - `navigate`: allowed only in steps that explicitly specify a `url` field
- Options:
  - `failFast?: boolean`, default `true`
  - `returnMode?: "final" | "all" | "delta"`, default `"final"`
  - `perStepTimeoutMs?: number`
  - `maxSteps` hard cap, e.g. 10
- Each step result must include success/failure, duration, and the final tab/url state.
- `returnMode="delta"` uses the last `read_page compression="delta"` where possible.
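
The scope above could be expressed as a typed input plus an up-front validator. The following is a minimal sketch, not the actual `act` schema: the type names (`MacroStep`, `ActStepsInput`), the helper `validateSteps`, the exact target pattern, and the cap of 10 are all assumptions made for illustration. Only the step types and option names come from this issue.

```typescript
// Hypothetical names throughout; only the step types and option names come from this issue.
type StepTarget = string | number; // "ref_N", "node_N", or a numeric backend node id

type MacroStep =
  | { type: "wait_for"; text: string; timeoutMs?: number }
  | { type: "read_page"; mode?: string; compression?: string }
  | { type: "click"; target: StepTarget }
  | { type: "fill"; target: StepTarget; value: string }
  | { type: "navigate"; url: string };

interface ActStepsInput {
  tabId: string;
  steps: MacroStep[];
  failFast?: boolean; // default true
  returnMode?: "final" | "all" | "delta"; // default "final"
  perStepTimeoutMs?: number;
}

const MAX_STEPS = 10; // the issue offers 10 only as an example cap
const ALLOWED_STEP_TYPES = ["wait_for", "read_page", "click", "fill", "navigate"];

// Assumed target format: "ref_<digits>", "node_<digits>", or a non-negative integer id.
function isAllowedTarget(target: unknown): boolean {
  if (typeof target === "number") {
    return isFinite(target) && target >= 0 && Math.floor(target) === target;
  }
  return typeof target === "string" && /^(ref|node)_\d+$/.test(target);
}

// Returns human-readable problems; an empty array means the steps passed.
function validateSteps(steps: unknown): string[] {
  if (!Array.isArray(steps) || steps.length === 0) {
    return ["steps must be a non-empty array"];
  }
  const errors: string[] = [];
  if (steps.length > MAX_STEPS) {
    errors.push("too many steps: " + steps.length + " > " + MAX_STEPS);
  }
  steps.forEach((raw: unknown, index: number) => {
    if (typeof raw !== "object" || raw === null) {
      errors.push("step " + index + ": must be an object");
      return;
    }
    const step = raw as Record<string, unknown>;
    const type = step["type"];
    if (typeof type !== "string" || ALLOWED_STEP_TYPES.indexOf(type) === -1) {
      errors.push("step " + index + ": step type not allowed: " + String(type));
      return;
    }
    if ((type === "click" || type === "fill") && !isAllowedTarget(step["target"])) {
      errors.push("step " + index + ": target must be ref_N, node_N, or a backend node id");
    }
    if (type === "fill" && typeof step["value"] !== "string") {
      errors.push("step " + index + ": fill requires a string value");
    }
    if (type === "navigate" && typeof step["url"] !== "string") {
      errors.push("step " + index + ": navigate requires an explicit url");
    }
  });
  return errors;
}

// Usage: a click -> wait_for -> read_page sequence with placeholder values.
const exampleInput: ActStepsInput = {
  tabId: "<tab>",
  returnMode: "final",
  steps: [
    { type: "click", target: "node_123" },
    { type: "wait_for", text: "Clicked", timeoutMs: 2000 },
    { type: "read_page", mode: "dom", compression: "delta" },
  ],
};
const exampleErrors = validateSteps(exampleInput.steps);
```

Validating the whole array before executing anything means a disallowed step type or an oversized macro is rejected without touching the page, which keeps partial execution out of the rejection path.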

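## Out of scope

- Allowing arbitrary `javascript_tool` steps
- Parallel execution across multiple tabs. This remains the responsibility of the existing `batch_execute`/workflow layer.
- Bypassing auth/tenant/rate-limit controls
- Bypassing irreversible action confirmation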

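## Success criteria

- The common `click -> wait -> read_page` loop can be performed through the `steps` mode of a single `act` call.
- On failure, the response clearly identifies which step failed.
- The validation/timeout/session lookup policies of the existing tool handlers are reused.
- In modes that do not return every intermediate result, the response payload is smaller.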
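
One way to meet the failure-reporting and payload criteria is a sequential runner that records a result per step and trims outputs according to `returnMode`. This is a sketch under stated assumptions: `StepResult`, `MacroResult`, `StepHandler`, `runSteps`, and `shapeForReturnMode` are hypothetical names, and the handlers are synchronous only to keep the example short (real tool handlers would presumably be asynchronous).

```typescript
interface StepResult {
  index: number;
  type: string;
  ok: boolean;
  durationMs: number;
  url: string; // last known tab URL after this step
  output: string | null;
  error: string | null;
}

interface MacroResult {
  ok: boolean;
  failedStepIndex: number | null; // index of the first failed step, if any
  results: StepResult[];
}

// Hypothetical handler shape; synchronous only for brevity.
type StepHandler = (step: { type: string }) => { url: string; output?: string };

function runSteps(
  steps: { type: string }[],
  handlers: Record<string, StepHandler>,
  failFast: boolean = true
): MacroResult {
  const results: StepResult[] = [];
  let failedStepIndex: number | null = null;
  let lastUrl = "";
  let index = -1;
  for (const step of steps) {
    index += 1;
    const startedAt = Date.now();
    try {
      const handler = handlers[step.type];
      if (!handler) throw new Error("no handler for step type: " + step.type);
      const outcome = handler(step);
      lastUrl = outcome.url;
      results.push({
        index, type: step.type, ok: true,
        durationMs: Date.now() - startedAt, url: lastUrl,
        output: outcome.output === undefined ? null : outcome.output,
        error: null,
      });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({
        index, type: step.type, ok: false,
        durationMs: Date.now() - startedAt, url: lastUrl,
        output: null, error: message,
      });
      if (failedStepIndex === null) failedStepIndex = index;
      if (failFast) break; // stop at the first failure
    }
  }
  return { ok: failedStepIndex === null, failedStepIndex, results };
}

// "final" and "delta" keep only the last step's output. For "delta" the last
// read_page would also be requested with compression="delta"; not modeled here.
function shapeForReturnMode(
  result: MacroResult,
  mode: "final" | "all" | "delta"
): MacroResult {
  if (mode === "all") return result;
  const lastIndex = result.results.length - 1;
  const results = result.results.map((r, i) =>
    i === lastIndex ? r : { ...r, output: null }
  );
  return { ok: result.ok, failedStepIndex: result.failedStepIndex, results };
}
```

Keeping the per-step summary (status, duration, URL) even when outputs are trimmed preserves the step-index diagnostic while still shrinking the payload.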

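## Test plan

### Unit/integration tests

- Success sequence: the DOM changes after a click, and only the final `read_page` is returned.
- Failure sequence: clicking a nonexistent ref/node returns the step index and the error.
- Verify the policy for handling subsequent safe steps when `failFast=false`.
- Verify rejection when `maxSteps` is exceeded, when the step type is unknown, and when a disallowed tool step is used.
- No regressions in the existing `act`, `batch_execute`, and HTTP batch tests.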

### OpenChrome live verification

1. Prepare a page in the local fixture that changes to `<p id="status">Clicked</p>` when a button is clicked.
2. After `navigate`, obtain the button's node/ref with `read_page mode="dom" filter="interactive"`.
3. Example `act` structured steps call:
   ```json
   {
     "tabId":"<tab>",
     "returnMode":"final",
     "steps":[
       {"type":"click","target":"node_123"},
       {"type":"wait_for","text":"Clicked","timeoutMs":2000},
       {"type":"read_page","mode":"dom","compression":"delta"}
     ]
   }
   ```
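4. Confirm that the result of the single `act` tool call contains the final state `Clicked`.
5. Compare the same task against three separate tool calls, and record whether the number of MCP responses and the total payload decreased.
6. Re-run with a failing target and confirm that the step index/error is reported clearly.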
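
The comparison in step 5 could be recorded with a small helper like the one below. It is only a sketch: `compareRuns` and `RunComparison` are hypothetical names, and size is measured in serialized JSON characters, which approximates rather than equals bytes on the wire.

```typescript
interface RunComparison {
  separateCalls: number; // number of individual tool responses
  macroCalls: number; // always 1 for a single act call
  separateChars: number;
  macroChars: number;
  savedChars: number; // positive when the macro response is smaller
}

// Size in JSON characters; values that cannot be serialized count as 0.
function jsonChars(value: unknown): number {
  const text = JSON.stringify(value);
  return typeof text === "string" ? text.length : 0;
}

// Compares the responses of N separate tool calls with one macro response.
function compareRuns(separateResponses: unknown[], macroResponse: unknown): RunComparison {
  let separateChars = 0;
  for (const response of separateResponses) {
    separateChars += jsonChars(response);
  }
  const macroChars = jsonChars(macroResponse);
  return {
    separateCalls: separateResponses.length,
    macroCalls: 1,
    separateChars,
    macroChars,
    savedChars: separateChars - macroChars,
  };
}
```

Passing the three captured responses from the separate `click`, `wait_for`, and `read_page` calls together with the single `act` response yields a record that can be pasted into the merge notes.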

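## Risks and mitigations

- Risk: the macro may become too powerful, making it hard to debug.
  - Mitigation: apply the whitelist, `maxSteps`, a per-step result summary, and `failFast` by default.
- Risk: the macro may bypass the existing tool decorator/hint/session policies.
  - Mitigation: internally call the existing handlers or a shared execution path, and minimize direct CDP bypasses.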
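
The second mitigation can be illustrated by routing macro steps through the same wrapped handlers that direct tool calls use. The sketch below is an assumption-heavy illustration: `withPolicy` stands in for whatever decorator/session policy OpenChrome actually applies, and `registry`, `MACRO_WHITELIST`, and `dispatchMacroStep` are hypothetical names.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => string;

// Stand-in for the existing decorator/session policy; the real checks are
// not specified in this issue, so a tabId check is used as a placeholder.
function withPolicy(toolName: string, handler: ToolHandler, log: string[]): ToolHandler {
  return (args: ToolArgs) => {
    if (typeof args["tabId"] !== "string") {
      throw new Error(toolName + ": tabId is required");
    }
    log.push("policy:" + toolName);
    return handler(args);
  };
}

const policyLog: string[] = [];

// Every tool is wrapped exactly once; direct calls and macro steps share it.
const registry: Record<string, ToolHandler> = {
  click: withPolicy("click", () => "clicked", policyLog),
  javascript_tool: withPolicy("javascript_tool", () => "evaluated", policyLog),
};

const MACRO_WHITELIST = ["wait_for", "read_page", "click", "fill", "navigate"];

function dispatchMacroStep(type: string, args: ToolArgs): string {
  // A tool being registered is not enough; it must also be whitelisted for macros.
  if (MACRO_WHITELIST.indexOf(type) === -1) {
    throw new Error("step type not allowed in macro: " + type);
  }
  const handler = registry[type];
  if (!handler) {
    throw new Error("no handler registered for: " + type);
  }
  return handler(args); // same wrapped handler a direct tool call would use
}
```

Because the macro never holds an unwrapped handler, any policy added to the wrapper later applies to macro steps without further changes.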



## Curated scope, overlap handling, and verification checklist

### Scope classification
- **Canonical lane:** same-tab structured macro execution.
- **Primary deliverable:** structured `steps` mode on existing `act` for safe same-tab action loops.
- **Open PR:** #1098 (`feat/969-act-structured-steps`). Continue there; do not duplicate the PR.
- **Non-goal:** new top-level tool, cross-tab workflow engine, arbitrary JavaScript, broad irreversible actions, or changing instruction-based `act` behavior.

### Overlap and conflict resolution
- [ ] Preserve existing `act` instruction mode; structured steps are additive.
- [ ] Coordinate with #1003 irreversible-action hook for risky steps but do not implement hook policy here.
- [ ] Coordinate with #1062 normalizer only for action shape validation; normalization is not execution policy.

### Implementation checklist
- [ ] Extend `act` input schema to accept either instruction or whitelisted same-tab `steps` array.
- [ ] Whitelist initial steps: wait_for, read_page, click/fill by stable refs/backend node IDs, and explicit navigate URL where allowed.
- [ ] Implement failFast and returnMode final/all/delta behavior with bounded outputs.
- [ ] Validate refs/tab consistency before executing each step and stop safely on stale/invalid refs.
- [ ] Add tests for schema compatibility, safe step sequence, invalid/stale ref, failFast behavior, return modes, and no behavior change for instruction mode.

### Success criteria
- [ ] Common click/wait/read_page loops can run in one safe call with less MCP/LLM roundtrip overhead.
- [ ] Structured steps are constrained to same-tab safe operations and stable targets.
- [ ] Existing `act` callers are backward-compatible.
- [ ] Failures report which step failed and do not continue unsafely when failFast is true.

### Post-merge OpenChrome live verification checklist
- [ ] Run a local fixture macro: read_page -> click -> wait_for -> read_page(delta) and verify final output.
- [ ] Run invalid/stale ref macro and verify failFast stops with step-index diagnostic.
- [ ] Run existing instruction-based `act` smoke to verify compatibility.
- [ ] Record roundtrip/output comparison and step result JSON in merge notes.
