
Add an in-place screenshot markup tool to the browser pane#6335

Open
1shiharat wants to merge 8 commits into
stablyai:mainfrom
1shiharat:feat/browser-screenshot-markup

Conversation

@1shiharat

@1shiharat 1shiharat commented Jun 25, 2026


What this adds

A screenshot-markup tool for the in-app browser pane. A new Draw button freezes the visible page into a still image and overlays a drawing surface for pen, highlighter, arrow, rectangle, ellipse, and text (color, thickness, font size, undo/redo, plus a select tool to move/restyle/re-edit shapes). Copy markup then puts the annotated PNG on the clipboard to paste into your agent.

It's fully opt-in and additive — nothing changes unless you click the new button, and no existing flow is touched.

Why

When steering an agent against a live page, "make this look like X" is far faster to express by drawing on a screenshot than by typing it out. This turns the browser pane's existing "grab element → feedback" idea into a freehand markup pass, delivered through the same clipboard path the terminal already uses for pasted screenshots — so it works for local and remote/SSH agents alike.

A note on scope (sorry for the surprise)

This is a sizable, unsolicited contribution and I didn't open an issue first — apologies for dropping it without a heads-up. It's deliberately self-contained (all renderer-side, new modules under browser-pane/markup/, one small wiring change in BrowserPane.tsx) and opt-in, so it should be low-risk to evaluate or set aside. I'm very happy to split it into smaller PRs, rework the UX, or close it if it doesn't fit the roadmap — just say the word.

How it works

  • The drawing layer is a renderer-side <canvas> overlay — it never touches the page. Only the base image is environment-specific: local panes use webview.capturePage(), remote panes snapshot the displayed screencast <img>. Compose + delivery are shared.
  • The composite is always a PNG within the clipboard handler's limits, copied via the existing clipboard:writeImage; pasting into the agent reuses Orca's clipboard-screenshot flow (which writes the temp file on the correct host for local or remote agents).
  • Drawing model, compositing, shape rendering, hit-testing, and text metrics are isolated, unit-tested modules; the editor logic is split into focused hooks.
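
As a rough illustration of the "within the clipboard handler's limits" constraint above, the sketch below downscales a composite only when it exceeds a budget. It assumes the budget is expressed in pixels; the handler's actual limit and unit are not stated in this PR, so `ASSUMED_MAX_PIXELS` and `fitWithinPixelBudget` are hypothetical names, not the code in `markup-screenshot-compose.ts`.

```typescript
// Hypothetical budget: the real clipboard handler limit is not given in this PR.
const ASSUMED_MAX_PIXELS = 16_000_000;

interface Size {
  width: number;
  height: number;
}

// Returns the largest size with the same aspect ratio that fits the pixel budget.
// Sizes already within the budget are returned unchanged, so typical viewports
// keep their full resolution.
function fitWithinPixelBudget(size: Size, maxPixels: number = ASSUMED_MAX_PIXELS): Size {
  const pixels = size.width * size.height;
  if (pixels <= maxPixels) {
    return { width: size.width, height: size.height };
  }
  // Scaling both sides by sqrt(budget / pixels) scales the area by budget / pixels.
  const scale = Math.sqrt(maxPixels / pixels);
  return {
    width: Math.max(1, Math.floor(size.width * scale)),
    height: Math.max(1, Math.floor(size.height * scale)),
  };
}
```

With this sketch a 1920x1080 capture passes through untouched, while an 8000x8000 capture against a 16,000,000-pixel budget is halved on each side.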

Testing

  • Unit tests for the drawing model (incl. undo/redo snapshots), compositing budget, shape rendering, hit-testing, and text metrics.
  • pnpm lint, pnpm typecheck, and pnpm test pass locally.
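
For reviewers unfamiliar with snapshot-based undo/redo, here is a minimal sketch of the idea the drawing-model tests exercise: every edit stores the whole shape list, so moves and restyles are as reversible as additions. The `History` type and the `commit`/`undo`/`redo` helpers are illustrative assumptions, not the API of `markup-drawing-model.ts`.

```typescript
// Hypothetical shape record; the PR's real shape type is not shown here.
interface Shape {
  id: string;
  kind: string;
  color: string;
}

interface History<T> {
  past: T[][];
  present: T[];
  future: T[][];
}

function createHistory<T>(initial: T[] = []): History<T> {
  return { past: [], present: initial, future: [] };
}

// Any edit (add, move, restyle, delete) commits a full snapshot and clears redo.
function commit<T>(h: History<T>, next: T[]): History<T> {
  return { past: [...h.past, h.present], present: next, future: [] };
}

function undo<T>(h: History<T>): History<T> {
  const previous = h.past[h.past.length - 1];
  if (previous === undefined) return h; // nothing to undo
  return { past: h.past.slice(0, -1), present: previous, future: [h.present, ...h.future] };
}

function redo<T>(h: History<T>): History<T> {
  const [next, ...rest] = h.future;
  if (next === undefined) return h; // nothing to redo
  return { past: [...h.past, h.present], present: next, future: rest };
}
```

Storing whole lists costs more memory than storing deltas, but for a handful of annotation shapes it keeps every operation trivially reversible.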

Compatibility / review notes

  • Cross-platform (macOS/Linux/Windows): renderer/canvas only; no platform APIs or paths. Undo (Cmd/Ctrl+Z) is platform-guarded via navigator.userAgent.
  • Local / remote / SSH: environment-agnostic by design — only the base-image capture branches (local webview vs remote screencast frame); delivery reuses the terminal's existing local/remote clipboard-image path.
  • Agents / integrations / git providers: provider-neutral; only adds a renderer tool and reuses the clipboard.
  • Performance: the overlay mounts only while active; the frozen base image and canvases are released on exit. No new IPC, polling, or watchers.
  • Security: no new IPC surface; the composite goes through the existing, size-limited PNG-only clipboard:writeImage handler.
  • UI quality: shadcn primitives + design tokens; verified across the draw/select/edit flows in light and dark.
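
The platform guard mentioned for Undo could look roughly like the sketch below. It takes the user-agent string as a parameter (the renderer would pass `navigator.userAgent`) so it can be tested outside a browser. `KeyLike`, `isMacUserAgent`, and `isUndoShortcut` are hypothetical names, and the `/Mac/` test is an assumed heuristic rather than the check used in `useMarkupKeyboardShortcuts.ts`.

```typescript
// Minimal subset of KeyboardEvent used here, so the sketch runs without DOM types.
interface KeyLike {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
  shiftKey: boolean;
}

// Assumed heuristic: treat any user agent mentioning "Mac" as macOS.
function isMacUserAgent(userAgent: string): boolean {
  return /Mac/i.test(userAgent);
}

// Cmd+Z on macOS, Ctrl+Z elsewhere. Shift is excluded so a Shift+Z redo binding
// (if one exists) is not mistaken for undo.
function isUndoShortcut(event: KeyLike, userAgent: string): boolean {
  const primaryModifier = isMacUserAgent(userAgent) ? event.metaKey : event.ctrlKey;
  return primaryModifier && !event.shiftKey && event.key.toLowerCase() === "z";
}
```

Guarding on the platform keeps Ctrl+Z on macOS from triggering undo unexpectedly.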

Screenshots

[4 screenshots, captured 2026-06-25 at 15:42:35, 15:43:12, 15:43:39, and 15:43:47]

X: @1shiharat

Adds a Draw button to the browser toolbar (local and remote/SSH panes) that
freezes the visible viewport into a still image and overlays a drawing canvas.
Users can mark it up with pen, highlighter, arrow, rectangle, ellipse, and text
(color, thickness, undo/redo), then copy the composited PNG to the clipboard to
paste into their agent — reusing the existing clipboard screenshot path, so it
works for local and remote agents alike.

The drawing layer is a renderer-side canvas, so the base image is the only
environment-specific piece: local panes use webview.capturePage(), remote panes
snapshot the already-displayed screencast frame. Drawing model, compositing, and
shape rendering are isolated, unit-tested modules.
- Copy now always produces a PNG (the clipboard handler accepts PNG only;
  a JPEG fallback silently produced an empty clipboard) and targets the
  clipboard size limit so realistic viewports stay full-resolution.
- Flatten the captured base image onto opaque white so transparent page
  backgrounds no longer ghost the live view through the frozen backdrop.
- Keep markup text-input keystrokes local (focus the field explicitly and
  stop propagation) so the browser pane's global handlers can't swallow them.
- Make the Draw toolbar button toggle markup mode off on a second click.
- Add a font-size control (text tool) to the markup toolbar.
- Add a Select tool to re-edit committed shapes: click to select, drag to
  move, change color/width/font-size of the selection, Delete to remove, and
  double-click text to re-edit its content. Undo/redo now snapshots the whole
  shape list so every edit (not just adding) is reversible.
- Fix the text input: transparent ink-colored field (no dark theme box over the
  screenshot) and ignore Enter during IME composition so Japanese conversion no
  longer commits the annotation early.
- Split the grown overlay logic into focused modules (editor hook, pointer +
  keyboard hooks, canvas renderer, document edits, hit-testing) to stay within
  the file-size budget.
- Clicking elsewhere while a text box is open now commits the text and
  switches to the Select tool, instead of discarding it (the input
  unmounted before its blur fired) and opening a new box at the click.
- Move the drawing toolbar to the bottom (stacked above the actions bar)
  and open the color/size popover upward.
- Hide the text shape being re-edited from the canvas so the live input
  isn't doubled by its committed render.
- Drop the input's padding/border and use the same font as the canvas so
  the typing preview sits exactly where the text renders (no shift);
  field-sizing keeps the box hugging the text.
- Estimate text width with full-width (CJK) awareness so the selection box
  and hit bounds track Japanese text instead of clipping it.
- Move the font-size control out of the color popover into its own toolbar
  button showing the current size.
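
A minimal sketch of the full-width-aware width estimate described in the bullets above, for readers who want the idea without opening `markup-hit-test.ts`. The code-point ranges and the 1.0 / 0.5 em advance ratios are assumptions chosen for illustration; the PR's actual ranges and ratios are not shown here.

```typescript
// Assumed advance widths, in em units.
const FULL_WIDTH_EM = 1;
const HALF_WIDTH_EM = 0.5;

// Approximate set of full-width ranges (not a complete East Asian Width table).
function isFullWidth(code: number): boolean {
  return (
    (code >= 0x1100 && code <= 0x115f) || // Hangul Jamo
    (code >= 0x2e80 && code <= 0xa4cf) || // CJK radicals, kana, CJK ideographs, Yi
    (code >= 0xac00 && code <= 0xd7a3) || // Hangul syllables
    (code >= 0xf900 && code <= 0xfaff) || // CJK compatibility ideographs
    (code >= 0xff00 && code <= 0xff60) || // full-width ASCII variants
    (code >= 0xffe0 && code <= 0xffe6) // full-width signs
  );
}

// Estimates rendered width in pixels without measuring on a canvas.
function estimateTextWidth(text: string, fontSize: number): number {
  let em = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isHighSurrogate = code >= 0xd800 && code <= 0xdbff;
    if (isHighSurrogate && i + 1 < text.length) {
      // A surrogate pair is one character; assume it is full-width and skip its second half.
      em += FULL_WIDTH_EM;
      i++;
      continue;
    }
    em += isFullWidth(code) ? FULL_WIDTH_EM : HALF_WIDTH_EM;
  }
  return em * fontSize;
}
```

Under these assumed ratios, three kanji at font size 20 estimate to 60 px, where a Latin-only estimate would give 30 px and clip the selection box.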
Draw the selection box on the canvas even while a text shape is being
re-edited (the shape stays hidden so it isn't doubled), and strip the box
off the text input so it renders only the editable glyphs. The frame is now
identical whether selecting or editing, and the text overlays the same
position with the same font, so there's no shift between the two.
Force the text input's line-height to 1 and zero its height/padding so its
top edge matches the canvas textBaseline:'top' render — the editing text no
longer sits slightly lower than the committed/selected text.
@1shiharat 1shiharat marked this pull request as ready for review June 25, 2026 07:05
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 051527a2-63a0-4537-81e6-9f90e6c1529d

📥 Commits

Reviewing files that changed from the base of the PR and between 8eec63f and 397d6ca.

📒 Files selected for processing (10)
  • src/renderer/src/components/browser-pane/BrowserPane.tsx
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.test.ts
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupEditor.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupMode.ts
  • src/renderer/src/i18n/locales/en.json
  • src/renderer/src/i18n/locales/es.json
  • src/renderer/src/i18n/locales/ja.json
  • src/renderer/src/i18n/locales/ko.json
  • src/renderer/src/i18n/locales/zh.json
✅ Files skipped from review due to trivial changes (3)
  • src/renderer/src/i18n/locales/en.json
  • src/renderer/src/i18n/locales/ja.json
  • src/renderer/src/i18n/locales/zh.json
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.ts
  • src/renderer/src/i18n/locales/ko.json
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.test.ts
  • src/renderer/src/i18n/locales/es.json
  • src/renderer/src/components/browser-pane/BrowserPane.tsx
  • src/renderer/src/components/browser-pane/markup/useMarkupEditor.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupMode.ts

📝 Walkthrough


BrowserPane now supports markup drawing in both remote and local panes. The change adds shared markup components, a markup editor hook, screenshot capture and composition utilities, hit-testing and rendering helpers, and new locale strings for the markup UI. It also wires the new flow into the browser pane toolbar and pane content for both remote and local views.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.87% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding an in-place screenshot markup tool to the browser pane. |
| Description check | ✅ Passed | The description is mostly complete: it covers summary, screenshots, testing, and compatibility notes, though the AI Review Report and Security Audit sections are not explicit. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 5

🧹 Nitpick comments (2)
src/renderer/src/components/browser-pane/markup/markup-hit-test.test.ts (1)

16-55: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a highlight hit-test case matching rendered width.

Please add a test asserting highlight selection uses the widened highlight stroke semantics, so this path is regression-protected.
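
One possible shape for the behaviour such a test would pin down is sketched below: hit-testing a stroke against its rendered width, where highlights are drawn wider than their nominal thickness. The 4x multiplier comes from the follow-up commit notes in this PR; the `Stroke` type, the helper names, and the 2 px tolerance are assumptions for illustration.

```typescript
interface Point {
  x: number;
  y: number;
}

// Hypothetical stroke record; the PR's real shape types are not shown here.
interface Stroke {
  kind: "pen" | "highlight";
  points: Point[];
  thickness: number;
}

const HIGHLIGHT_WIDTH_MULTIPLIER = 4;

function renderedWidth(stroke: Stroke): number {
  return stroke.kind === "highlight"
    ? stroke.thickness * HIGHLIGHT_WIDTH_MULTIPLIER
    : stroke.thickness;
}

// Shortest distance from point p to the segment a-b.
function distanceToSegment(p: Point, a: Point, b: Point): number {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const lengthSq = dx * dx + dy * dy;
  const raw = lengthSq === 0 ? 0 : ((p.x - a.x) * dx + (p.y - a.y) * dy) / lengthSq;
  const t = Math.max(0, Math.min(1, raw)); // clamp the projection onto the segment
  const ex = p.x - (a.x + t * dx);
  const ey = p.y - (a.y + t * dy);
  return Math.sqrt(ex * ex + ey * ey);
}

function hitsStroke(stroke: Stroke, p: Point, tolerance: number = 2): boolean {
  const reach = renderedWidth(stroke) / 2 + tolerance;
  let previous = stroke.points[0];
  if (previous === undefined) return false;
  if (stroke.points.length === 1) return distanceToSegment(p, previous, previous) <= reach;
  for (const current of stroke.points.slice(1)) {
    if (distanceToSegment(p, previous, current) <= reach) return true;
    previous = current;
  }
  return false;
}
```

A regression test in this style would assert that a point 15 px off the centerline hits a highlight of thickness 8 but misses a pen stroke of the same thickness.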

src/renderer/src/components/browser-pane/markup/markup-canvas-render.ts (1)

27-34: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use a shared UI token for the selection outline.

Line 30 hard-codes #3b82f6, which makes the canvas overlay bypass the renderer theme/token system. Please resolve this from an existing token/CSS variable instead of introducing a new color literal here.

As per coding guidelines, “Never invent new color values...” and “All UI work ... must follow docs/STYLEGUIDE.md and use tokens defined in src/renderer/src/assets/main.css.”

Source: Coding guidelines


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0f19723d-9ec7-4b89-8a0f-8881681c24aa

📥 Commits

Reviewing files that changed from the base of the PR and between 151cc05 and 8eec63f.

📒 Files selected for processing (25)
  • src/renderer/src/components/browser-pane/BrowserPane.tsx
  • src/renderer/src/components/browser-pane/markup/MarkupDrawButton.tsx
  • src/renderer/src/components/browser-pane/markup/MarkupOverlay.tsx
  • src/renderer/src/components/browser-pane/markup/MarkupToolbar.tsx
  • src/renderer/src/components/browser-pane/markup/markup-base-image.ts
  • src/renderer/src/components/browser-pane/markup/markup-canvas-render.ts
  • src/renderer/src/components/browser-pane/markup/markup-clipboard-delivery.ts
  • src/renderer/src/components/browser-pane/markup/markup-drawing-model.test.ts
  • src/renderer/src/components/browser-pane/markup/markup-drawing-model.ts
  • src/renderer/src/components/browser-pane/markup/markup-editor-document.ts
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.test.ts
  • src/renderer/src/components/browser-pane/markup/markup-hit-test.ts
  • src/renderer/src/components/browser-pane/markup/markup-screenshot-compose.test.ts
  • src/renderer/src/components/browser-pane/markup/markup-screenshot-compose.ts
  • src/renderer/src/components/browser-pane/markup/markup-shape-render.test.ts
  • src/renderer/src/components/browser-pane/markup/markup-shape-render.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupEditor.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupKeyboardShortcuts.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupMode.ts
  • src/renderer/src/components/browser-pane/markup/useMarkupPointerHandlers.ts
  • src/renderer/src/i18n/locales/en.json
  • src/renderer/src/i18n/locales/es.json
  • src/renderer/src/i18n/locales/ja.json
  • src/renderer/src/i18n/locales/ko.json
  • src/renderer/src/i18n/locales/zh.json

Comment thread src/renderer/src/components/browser-pane/BrowserPane.tsx
Comment thread src/renderer/src/components/browser-pane/markup/markup-hit-test.ts Outdated
Comment thread src/renderer/src/components/browser-pane/markup/useMarkupEditor.ts
Comment thread src/renderer/src/components/browser-pane/markup/useMarkupMode.ts Outdated
Comment thread src/renderer/src/components/browser-pane/markup/useMarkupMode.ts Outdated
- Hit-test highlights against their rendered (4x) thickness so the select
  tool doesn't miss large highlight strokes.
- Reset transient editor state (pending text, selection, in-progress drag) on
  Clear all so a pending input blur can't re-add text and no stale selection
  lingers over the emptied canvas.
- Guard markup completion with the capture token so a stale onDeliver can't
  reset or error a session the user already cancelled or restarted.
- Skip the grab-element keyboard shortcut while the markup overlay is open,
  matching the already-disabled grab toolbar buttons.
- Localize the markup error messages and surface them via a toast (the
  controller's error was otherwise never shown).
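
The capture-token guard from the list above can be sketched as a small state holder: each start or cancel bumps a token, and a completion carrying an old token is ignored. `MarkupSession` and its methods are hypothetical names for illustration, not the actual `useMarkupMode.ts` controller.

```typescript
type SessionStatus = "idle" | "active" | "error";

class MarkupSession {
  private token = 0;
  status: SessionStatus = "idle";

  // Starts a capture and returns the token the eventual completion must present.
  start(): number {
    this.token += 1;
    this.status = "active";
    return this.token;
  }

  // Cancelling invalidates any in-flight completion by bumping the token.
  cancel(): void {
    this.token += 1;
    this.status = "idle";
  }

  // Applies the result only if it belongs to the current session.
  // Returns false for stale completions, which leave the state untouched.
  complete(token: number, succeeded: boolean): boolean {
    if (token !== this.token) return false;
    this.status = succeeded ? "idle" : "error";
    return true;
  }
}
```

In this sketch, a delivery that fails after the user has cancelled and restarted cannot flip the new session into the error state.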
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


2 participants