Merged
22 commits
6d61ad0
feat(core): #894 optional intent label on interaction tools
shaun0927 May 12, 2026
7c23126
Merge remote-tracking branch 'origin/develop' into feat/894-intent-label
shaun0927 May 12, 2026
9605a89
Stabilize merged CI expectations
shaun0927 May 12, 2026
e9756e5
Trigger CI after review fixes
shaun0927 May 12, 2026
13b083c
Merge develop after CI mock drift
shaun0927 May 12, 2026
b8677ff
Keep tab context mock single-sourced
shaun0927 May 12, 2026
e8b81d6
Keep intent interact guidance within budget
shaun0927 May 12, 2026
a7ca90d
Normalize console fixture newlines in CI
shaun0927 May 12, 2026
fab715e
Harden admin key CLI tests against Jest noise
shaun0927 May 12, 2026
b5b1e72
Relax platform timing in process integration tests
shaun0927 May 12, 2026
00646c1
Refresh shared CI fixtures after s2c merge
shaun0927 May 12, 2026
0275395
Refresh admin-key CLI test conflict after progress merge
shaun0927 May 13, 2026
34dec2d
fix(ci): flush stdout before exit in --introspect-tools-list; typeof …
shaun0927 May 13, 2026
3d51a5f
chore: resolve merge conflicts with develop (ref field + chunked stdout)
shaun0927 May 13, 2026
fa0851f
chore(scripts): drop duplicate readStdin helper in lint-tool-schemas
shaun0927 May 13, 2026
2ec45a1
Merge develop into feat/894-intent-label
shaun0927 May 13, 2026
0fb76fa
fix: strip leaked conflict markers
shaun0927 May 13, 2026
fba09a6
Merge develop into feat/894-intent-label
shaun0927 May 13, 2026
4dbd67b
Merge develop into feat/894-intent-label
shaun0927 May 13, 2026
75fa907
fix(915): restore develop's mcp-server.ts hint engine signature
shaun0927 May 13, 2026
6e406e0
fix(915): restore develop's journal + tools/journal.ts
shaun0927 May 13, 2026
61fa8b6
Merge develop into feat/894-intent-label
shaun0927 May 13, 2026
46 changes: 1 addition & 45 deletions src/mcp-server.ts
@@ -329,25 +329,6 @@ export interface MCPServerOptions {
initialToolTier?: ToolTier;
}


export function summarizeMcpResultForJournal(result: MCPResult): string | undefined {
const content = result.content;
if (!Array.isArray(content)) return undefined;
const injectedHint = typeof (result as Record<string, unknown>)._hint === 'string'
? String((result as Record<string, unknown>)._hint).trim()
: undefined;
const text = content
.map((part) => (part && part.type === 'text' ? part.text : ''))
.filter((textPart) => {
if (!textPart) return false;
return injectedHint === undefined || textPart.trim() !== injectedHint;
})
.join(' ')
.replace(/\s+/g, ' ')
.trim();
return text ? text.slice(0, 500) : undefined;
}

export class MCPServer {
P1: Restore summarizeMcpResultForJournal export

This commit removes summarizeMcpResultForJournal from src/mcp-server.ts, but tests/tools/journal.test.ts still imports and executes it (import { MCPServer, summarizeMcpResultForJournal } ... at line 42 and usage at line 68). That leaves the tree in a broken state where type-check/test compilation fails with a missing export, so either the helper must remain exported here or the dependent test/callers must be updated in the same change.
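
For context, the behavior the dependent test relies on can be sketched in isolation. This is a reconstruction from the deleted lines in this hunk, not the project's code: `ContentPart` and `ResultLike` are simplified stand-ins for `MCPResult`, and `summarizeForJournal` is a hypothetical name.

```typescript
// Simplified stand-ins for the project's MCPResult shape (assumption).
type ContentPart = { type: string; text?: string };
type ResultLike = { content?: ContentPart[]; _hint?: unknown };

// Mirrors the deleted helper: join text parts, drop the injected hint,
// collapse whitespace, and cap the summary at 500 characters.
function summarizeForJournal(result: ResultLike): string | undefined {
  const content = result.content;
  if (!Array.isArray(content)) return undefined;
  const hint = result._hint;
  const injectedHint = typeof hint === 'string' ? hint.trim() : undefined;
  const text = content
    .map((part) => (part && part.type === 'text' ? part.text ?? '' : ''))
    .filter((t) => t !== '' && (injectedHint === undefined || t.trim() !== injectedHint))
    .join(' ')
    .replace(/\s+/g, ' ')
    .trim();
  return text ? text.slice(0, 500) : undefined;
}
```

Either this export stays in `src/mcp-server.ts` or the import in the journal test has to go in the same change.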

private tools: Map<string, ToolRegistry> = new Map();
private resources: Map<string, MCPResourceDefinition> = new Map();
@@ -2034,24 +2015,6 @@ export class MCPServer {
// when --secrets was not passed.
const finalResult = redactSecrets(result);
this.recordToolOutputObservability(toolName, finalResult);

// Record to task journal after response redaction so arbitrary literal
// secret values cannot be persisted in journal result summaries.
try {
const journal = getTaskJournal();
const entry = journal.createEntry(
toolName,
sessionId,
toolArgs,
Date.now() - toolStartTime,
!(finalResult as MCPResult).isError,
summarizeMcpResultForJournal(finalResult as MCPResult),
);
journal.record(entry);
} catch {
// Best-effort journal recording
}

return finalResult;
Comment on lines 2016 to 2018
P1: Preserve isError-aware journal write before returning result

Returning immediately here removes the only journal path that derived ok from finalResult.isError; the remaining success-path write still calls journal.createEntry(..., true), so any non-thrown tool failure (MCPResult.isError === true) is now journaled as a success. This affects more than the new intent checks (for example swallowed/soft errors from other tools) and will underreport failures in oc_journal/summary telemetry.
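
A minimal sketch of the isError-aware write, assuming simplified shapes: `ToolResult`, `JournalEntry`, the in-memory `journal` array, and `recordToolResult` are all hypothetical stand-ins for the project's journal API.

```typescript
// Hypothetical, simplified shapes; the real journal API lives in the project.
type ToolResult = { content: { type: string; text: string }[]; isError?: boolean };
type JournalEntry = { tool: string; ok: boolean; durationMs: number };

const journal: JournalEntry[] = [];

// Derive `ok` from the result instead of hard-coding `true`, so a returned
// (non-thrown) failure is not journaled as a success.
function recordToolResult(tool: string, result: ToolResult, startedAt: number): void {
  try {
    journal.push({ tool, ok: result.isError !== true, durationMs: Date.now() - startedAt });
  } catch {
    // Best-effort: journaling must never break the tool response.
  }
}

recordToolResult('drag_drop', { content: [{ type: 'text', text: 'done' }] }, Date.now());
recordToolResult(
  'drag_drop',
  { content: [{ type: 'text', text: 'INVALID_INTENT: intent must not be an empty string' }], isError: true },
  Date.now(),
);
```

The second call is the case this comment is about: it completes without throwing but should still be journaled with `ok: false`.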

} catch (error) {
const message = formatError(error);
@@ -2095,14 +2058,7 @@ export class MCPServer {
// Record to task journal
try {
const journal = getTaskJournal();
const entry = journal.createEntry(
toolName,
sessionId,
telemetryToolArgs,
Date.now() - toolStartTime,
false,
redactedMessage,
);
const entry = journal.createEntry(toolName, sessionId, telemetryToolArgs, Date.now() - toolStartTime, false);
journal.record(entry);
} catch {
// Best-effort journal recording
24 changes: 23 additions & 1 deletion src/tools/drag-drop.ts
@@ -13,7 +13,7 @@ interface Position {

const definition: MCPToolDefinition = {
name: 'drag_drop',
description: 'Drag and drop by selector or coordinates.',
description: 'Drag and drop by selector or coordinates. Pass intent="..." (≤120 chars) to label this action in audit logs.',
inputSchema: {
type: 'object',
properties: {
@@ -53,6 +53,11 @@ const definition: MCPToolDefinition = {
type: 'number',
description: 'Delay in ms between steps. Default: 10',
},
intent: {
type: 'string',
description: 'Human-readable label for this action in audit logs (≤120 chars)',
maxLength: 120,
},
},
required: ['tabId'],
},
@@ -72,6 +77,23 @@ const handler: ToolHandler = async (
const targetY = args.targetY as number | undefined;
const steps = (args.steps as number | undefined) ?? 10;
const delay = (args.delay as number | undefined) ?? 10;
const intent = args.intent as string | undefined;

// Validate intent when provided — use typeof guard for null-safety
if (typeof intent === 'string') {
if (intent === '') {
return {
content: [{ type: 'text', text: 'INVALID_INTENT: intent must not be an empty string' }],
isError: true,
};
Comment on lines +84 to +88
P2: Throw invalid intent errors instead of returning isError

The new INVALID_INTENT branch returns a normal MCPResult with isError: true, but handleToolCall treats all non-thrown handler returns as success and still writes success audit/metrics/journal entries (src/mcp-server.ts success path around logAuditEntry(... status: 'success') and journal.createEntry(..., true)). As a result, malformed intent input (empty or >120 chars) is recorded as a successful drag_drop action, which pollutes operational telemetry and journal state instead of behaving like a rejected request.
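
The thrown-error path could look roughly like the following. This is a sketch under stated assumptions: `InvalidIntentError`, `dragDropHandler`, and `callTool` are hypothetical names, and the handler is synchronous here only to keep the example short (the real handlers are async).

```typescript
// Hypothetical error type; the name is illustrative only.
class InvalidIntentError extends Error {
  constructor(reason: string) {
    super(`INVALID_INTENT: ${reason}`);
    this.name = 'InvalidIntentError';
  }
}

type Outcome = { status: 'success' | 'error'; message: string };

// Stand-in for a handler that rejects bad input by throwing.
function dragDropHandler(args: Record<string, unknown>): string {
  const intent = args.intent;
  if (intent === '') {
    throw new InvalidIntentError('intent must not be an empty string');
  }
  if (typeof intent === 'string' && intent.length > 120) {
    throw new InvalidIntentError(`intent exceeds 120 characters (got ${intent.length})`);
  }
  return 'dragged';
}

// Stand-in for a dispatcher that only treats thrown errors as failures,
// which is the behavior this comment attributes to handleToolCall.
function callTool(args: Record<string, unknown>): Outcome {
  try {
    return { status: 'success', message: dragDropHandler(args) };
  } catch (error) {
    return { status: 'error', message: error instanceof Error ? error.message : String(error) };
  }
}
```

With this shape, a rejected `intent` never reaches the success-side audit, metrics, or journal writes.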

}
if (intent.length > 120) {
return {
content: [{ type: 'text', text: `INVALID_INTENT: intent exceeds 120 characters (got ${intent.length})` }],
Comment on lines +90 to +92
P2: Validate intent type before checking .length

drag_drop treats any defined intent as a string and immediately reads intent.length; with the current MCP preflight only enforcing required fields, a call like intent: null reaches this branch and throws before returning the intended INVALID_INTENT error. Fresh evidence versus the prior interact thread: the same unchecked pattern is now present here (and mirrored in the other newly updated interaction tools), so non-string payloads can still produce internal errors instead of deterministic validation failures.
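
One way to make the check deterministic for any payload is to validate against `unknown` rather than a cast. Note that the diff as shown already wraps the checks in `typeof intent === 'string'`, which avoids the crash by skipping validation for non-strings. The sketch below goes one step further and rejects them; that stricter behavior, the `checkIntent` name, and the "must be a string" message are assumptions, not the project's code.

```typescript
const MAX_INTENT_LENGTH = 120;

// Returns an error message for a bad intent, or undefined when the value is acceptable.
// Rejecting non-string values (rather than ignoring them) is an assumption of this sketch.
function checkIntent(value: unknown): string | undefined {
  if (value === undefined) return undefined; // intent is optional
  if (typeof value !== 'string') {
    const got = value === null ? 'null' : typeof value;
    return `INVALID_INTENT: intent must be a string (got ${got})`;
  }
  if (value === '') return 'INVALID_INTENT: intent must not be an empty string';
  if (value.length > MAX_INTENT_LENGTH) {
    return `INVALID_INTENT: intent exceeds ${MAX_INTENT_LENGTH} characters (got ${value.length})`;
  }
  return undefined;
}
```

Because the parameter is `unknown`, the compiler refuses `.length` until the `typeof` check has run, so `intent: null` cannot reach it.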

isError: true,
};
}
}

const sessionManager = getSessionManager();

24 changes: 23 additions & 1 deletion src/tools/file-upload.ts
@@ -59,7 +59,7 @@ export interface UploadPathValidationResult {

const definition: MCPToolDefinition = {
name: 'file_upload',
description: 'Upload files to a file input element on the page.',
description: 'Upload files to a file input element on the page. Pass intent="..." (≤120 chars) to label this action in audit logs.',
inputSchema: {
type: 'object',
properties: {
@@ -76,6 +76,11 @@ const definition: MCPToolDefinition = {
items: { type: 'string' },
description: 'File paths to upload. Paths must resolve under configured file_upload roots.',
},
intent: {
type: 'string',
description: 'Human-readable label for this action in audit logs (≤120 chars)',
maxLength: 120,
},
},
required: ['tabId', 'selector', 'filePaths'],
},
@@ -244,6 +249,23 @@ const handler: ToolHandler = async (
const tabId = args.tabId as string;
const selector = args.selector as string;
const filePaths = args.filePaths as string[];
const intent = args.intent as string | undefined;

// Validate intent when provided — use typeof guard for null-safety
if (typeof intent === 'string') {
if (intent === '') {
return {
content: [{ type: 'text', text: 'INVALID_INTENT: intent must not be an empty string' }],
isError: true,
};
Comment on lines +257 to +260
P2: Throw INVALID_INTENT instead of returning isError in file_upload

Returning isError: true here does not stop the MCP server from treating the call as a success, so invalid intent payloads are logged and journaled as successful file_upload invocations. This should use the thrown-error path for validation failures to avoid polluting operational telemetry and journal state.

}
if (intent.length > 120) {
P2: Guard intent type before reading .length in file_upload

Here intent is cast and then dereferenced as if it were always a string; because required-field checks do not enforce schema types at runtime, intent: null can arrive and crash at .length, producing a generic internal failure instead of a controlled INVALID_INTENT error. This is user-reachable via malformed MCP payloads to file_upload.

return {
content: [{ type: 'text', text: `INVALID_INTENT: intent exceeds 120 characters (got ${intent.length})` }],
isError: true,
};
}
}

const sessionManager = getSessionManager();

41 changes: 24 additions & 17 deletions src/tools/fill-form.ts
@@ -19,15 +19,10 @@ import { humanType, humanMouseMove } from '../stealth/human-behavior';
import { detectLoginOutcome, LoginDetectResult } from './login-detector';
import { wrapMutatingHandler } from '../utils/snapshot-cache-helper';
import { coerceVerifyMode, runVerify, VERIFY_FIELD_SCHEMA } from '../core/perception/verify';
import {
appendReturnAfterState,
parseReturnAfterState,
RETURN_AFTER_STATE_SCHEMA,
} from './_shared/return-after-state';

const definition: MCPToolDefinition = {
name: 'fill_form',
description: 'Fill form fields and optionally submit.',
description: 'Fill form fields and optionally submit. Pass intent="..." (≤120 chars) to label this action in audit logs.',
inputSchema: {
type: 'object',
properties: {
@@ -72,7 +67,11 @@ const definition: MCPToolDefinition = {
additionalProperties: { type: 'string' },
},
verify: VERIFY_FIELD_SCHEMA,
returnAfterState: RETURN_AFTER_STATE_SCHEMA,
intent: {
type: 'string',
description: 'Human-readable label for this action in audit logs (≤120 chars)',
maxLength: 120,
},
},
required: ['tabId'],
},
@@ -94,7 +93,23 @@ const handler: ToolHandler = async (
const pollInterval = Math.min(Math.max((args.pollInterval as number) || 300, 50), 2000);
const loginCheck: 'auto' | 'off' = (args.loginCheck === 'off') ? 'off' : 'auto';
const verifyMode = coerceVerifyMode(args.verify);
const returnAfterState = parseReturnAfterState(args.returnAfterState);
const intent = args.intent as string | undefined;

// Validate intent when provided — use typeof guard for null-safety
if (typeof intent === 'string') {
if (intent === '') {
return {
content: [{ type: 'text', text: 'INVALID_INTENT: intent must not be an empty string' }],
isError: true,
};
Comment on lines +101 to +104
P2: Throw INVALID_INTENT instead of returning isError in fill_form

This validation branch returns a normal MCP result with isError: true, which the server treats as a successful handler completion (including success audit/journal/metrics side effects). For invalid intent input, fill_form should fail through the error path instead of being recorded as a successful action.

}
if (intent.length > 120) {
P2: Guard intent type before reading .length in fill_form

The new validation path treats any defined intent as string-like; with current call handling, non-string values can still reach this handler, and intent: null will throw at intent.length rather than returning INVALID_INTENT. This means malformed client input can trigger an internal error path instead of the expected validation response for fill_form.

return {
content: [{ type: 'text', text: `INVALID_INTENT: intent exceeds 120 characters (got ${intent.length})` }],
isError: true,
};
}
}

const sessionManager = getSessionManager();

@@ -627,7 +642,7 @@ const handler: ToolHandler = async (
if (detectorFailedLogin) errorReason = 'login_failed';
else if (submitFailed) errorReason = 'submit_failed';

const fillFormResult: MCPResult = {
return {
content: [
{
type: 'text',
Comment on lines +645 to 648
P1: Reattach returnAfterState snapshot in fill_form success path

fill_form now returns its result directly and never appends a post-action state snapshot, so flows that pass returnAfterState lose the chaining contract and must make extra observe calls to recover state. This is a behavior regression for existing automation that depended on fill_form returning immediate post-submit/post-fill state when the call succeeds.
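
The contract being asked for can be sketched as "build the result, then attach a snapshot only when the call succeeded". Everything here is illustrative: `appendSnapshot` is a hypothetical synchronous stand-in for `appendReturnAfterState` (whose real signature takes the page, session, tab, and context), and `finishFillForm` is a made-up wrapper.

```typescript
type Result = { content: { type: string; text: string }[]; isError?: boolean };

// Hypothetical stand-in for appendReturnAfterState: appends a snapshot when requested.
function appendSnapshot(result: Result, wantSnapshot: boolean, takeSnapshot: () => string): void {
  if (!wantSnapshot) return;
  result.content.push({ type: 'text', text: takeSnapshot() });
}

// Build the result first, then attach post-action state on the success path only.
function finishFillForm(isError: boolean, wantSnapshot: boolean, takeSnapshot: () => string): Result {
  const result: Result = { content: [{ type: 'text', text: isError ? 'fill failed' : 'filled' }] };
  if (isError) result.isError = true;
  // Skip the snapshot on failures, matching the comment in the removed code.
  if (!isError) appendSnapshot(result, wantSnapshot, takeSnapshot);
  return result;
}
```

Returning the object literal directly, as the new code does, leaves no place to run the append step before the result goes out.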

@@ -643,14 +658,6 @@ const handler: ToolHandler = async (
: {}),
...(fillVerifyReport ? { verify: fillVerifyReport } : {}),
} as MCPResult;
// Snapshot capture happens after the post-action wait inside withDomDelta
// (and the optional login-detector poll), so the snapshot reflects the
// post-action DOM. Only attach on non-error results — when fill_form
// failed there is no point paying for an extra snapshot.
if (!isError) {
await appendReturnAfterState(fillFormResult, page, sessionId, tabId, returnAfterState, context);
}
return fillFormResult;
} catch (error) {
return {
content: [
32 changes: 24 additions & 8 deletions src/tools/form-input.ts
@@ -7,11 +7,10 @@ import { MCPToolDefinition, MCPResult, ToolHandler, ToolContext, hasBudget } fro
import { getSessionManager } from '../session-manager';
import { getRefIdManager, formatStaleRefError, makeStaleRefError } from '../utils/ref-id-manager';
import { withDomDelta } from '../utils/dom-delta';
import { wrapMutatingHandler } from '../utils/snapshot-cache-helper';

const definition: MCPToolDefinition = {
name: 'form_input',
description: 'Set one form element value by ref.\n\nWhen to use: Filling a single known input, textarea, select, or checkbox by ref.\nWhen NOT to use: Use fill_form({fields:{...}}) for multiple fields or optional submit.',
description: 'Set one form element value by ref. Pass intent="..." (≤120 chars) to label this action in audit logs.\n\nWhen to use: Filling a single known input, textarea, select, or checkbox by ref.\nWhen NOT to use: Use fill_form({fields:{...}}) for multiple fields or optional submit.',
inputSchema: {
type: 'object',
properties: {
@@ -27,6 +26,11 @@ const definition: MCPToolDefinition = {
type: 'string',
description: 'Value to set. Checkboxes: "true"/"false"',
},
intent: {
type: 'string',
description: 'Human-readable label for this action in audit logs (≤120 chars)',
maxLength: 120,
},
},
required: ['ref', 'value', 'tabId'],
},
@@ -40,6 +44,23 @@ const handler: ToolHandler = async (
const tabId = args.tabId as string;
const ref = args.ref as string;
const value = args.value;
const intent = args.intent as string | undefined;

// Validate intent when provided — use typeof guard for null-safety
if (typeof intent === 'string') {
if (intent === '') {
return {
content: [{ type: 'text', text: 'INVALID_INTENT: intent must not be an empty string' }],
isError: true,
};
Comment on lines +52 to +55
P2: Throw INVALID_INTENT instead of returning isError in form_input

Returning isError: true here still goes through the MCP server success path for non-thrown handler results, so malformed intent values are audited/journaled as successful tool calls and counted in success metrics. That breaks the intended "validation failure with no side effects" behavior for form_input; this branch should throw (or otherwise enter the server error path) rather than return a normal result object.

}
if (intent.length > 120) {
P2: Guard intent type before reading .length in form_input

This branch assumes intent is a string, but server preflight only checks required fields (src/mcp-server.ts required-arg gate) and does not enforce runtime types, so a payload like intent: null reaches intent.length and throws a TypeError instead of returning a deterministic INVALID_INTENT result. That converts a user validation error into an internal failure path for this tool.

return {
content: [{ type: 'text', text: `INVALID_INTENT: intent exceeds 120 characters (got ${intent.length})` }],
isError: true,
};
}
}

const sessionManager = getSessionManager();
const refIdManager = getRefIdManager();
@@ -433,10 +454,5 @@ const handler: ToolHandler = async (
};

export function registerFormInputTool(server: MCPServer): void {
// Snapshot-cache (#879): bump docEpoch after every successful set.
const sm = getSessionManager();
const wrapped = wrapMutatingHandler(handler, (sid, tid) =>
tid ? sm.getPage(sid, tid) : Promise.resolve(null),
);
server.registerTool('form_input', wrapped, definition);
server.registerTool('form_input', handler, definition);
}