Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 48 additions & 5 deletions src/core/task/Task.ts
Original file line number Diff line number Diff line change
Expand Up @@ -261,6 +261,8 @@ export class Task extends EventEmitter<TaskEvents> implements TaskLike {
// Tool Use
consecutiveMistakeCount: number = 0
consecutiveMistakeLimit: number
consecutiveMistakeGuidanceCount: number = 0 // Track how many times we've asked for guidance
maxConsecutiveMistakeGuidance: number = 3 // Maximum times to ask for guidance before aborting
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this hardcoded limit of 3 guidance attempts be made configurable? Some complex tasks might legitimately need more attempts before we should give up. Consider making it similar to consecutiveMistakeLimit - perhaps maxConsecutiveMistakeGuidance could be passed in TaskOptions?

consecutiveMistakeCountForApplyDiff: Map<string, number> = new Map()
toolUsage: ToolUsage = {}

Expand Down Expand Up @@ -1725,10 +1727,35 @@ export class Task extends EventEmitter<TaskEvents> implements TaskLike {
}

if (this.consecutiveMistakeLimit > 0 && this.consecutiveMistakeCount >= this.consecutiveMistakeLimit) {
const { response, text, images } = await this.ask(
"mistake_limit_reached",
t("common:errors.mistake_limit_guidance"),
)
// Check if we've asked for guidance too many times
if (this.consecutiveMistakeGuidanceCount >= this.maxConsecutiveMistakeGuidance) {
// We've asked for guidance too many times, abort to prevent token burning
await this.say(
"error",
`I've been unable to proceed despite multiple attempts and guidance. The task appears to be stuck in a loop. To prevent excessive token usage, I'm stopping here. Please review the conversation and consider:\n\n1. Providing more specific instructions\n2. Breaking down the task into smaller steps\n3. Checking if there are any environmental issues preventing progress`,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message mentions "environmental issues" but doesn't provide examples. Could we be more specific? For instance: "3. Checking if there are any environmental issues preventing progress (e.g., missing dependencies, file permissions, network connectivity)"

)

// Track token burning event in telemetry
// Use captureConsecutiveMistakeError with additional logging for now
TelemetryService.instance.captureConsecutiveMistakeError(this.taskId)
console.error(
`[Task#${this.taskId}] Token burning detected - Guidance count: ${this.consecutiveMistakeGuidanceCount}, Mistake count: ${this.consecutiveMistakeCount}`,
)

// Return true to end the task loop
return true
}

// Increment guidance count before asking
this.consecutiveMistakeGuidanceCount++

// Add exponential backoff message if we've asked before
let guidanceMessage = t("common:errors.mistake_limit_guidance")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider extracting this guidance message generation logic into a separate method like generateGuidanceMessage(attemptNumber: number, maxAttempts: number) for better maintainability and testability.

if (this.consecutiveMistakeGuidanceCount > 1) {
guidanceMessage = `${guidanceMessage}\n\n(Attempt ${this.consecutiveMistakeGuidanceCount}/${this.maxConsecutiveMistakeGuidance} - I'm having difficulty making progress)`
}

const { response, text, images } = await this.ask("mistake_limit_reached", guidanceMessage)

if (response === "messageResponse") {
currentUserContent.push(
Expand All @@ -1740,10 +1767,11 @@ export class Task extends EventEmitter<TaskEvents> implements TaskLike {

await this.say("user_feedback", text, images)

// Track consecutive mistake errors in telemetry.
// Track consecutive mistake errors in telemetry
TelemetryService.instance.captureConsecutiveMistakeError(this.taskId)
}

// Reset mistake count but keep guidance count
this.consecutiveMistakeCount = 0
}

Expand Down Expand Up @@ -2304,6 +2332,11 @@ export class Task extends EventEmitter<TaskEvents> implements TaskLike {
if (!didToolUse) {
this.userMessageContent.push({ type: "text", text: formatResponse.noToolsUsed() })
this.consecutiveMistakeCount++

// Log when we're incrementing mistake count for debugging
console.log(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good addition of debug logging here! Consider including the task mode in the log for better debugging context: [Task#${this.taskId}] Mode: ${this._taskMode}, Consecutive mistake count: ...

`[Task#${this.taskId}] Consecutive mistake count: ${this.consecutiveMistakeCount}/${this.consecutiveMistakeLimit}, Guidance count: ${this.consecutiveMistakeGuidanceCount}/${this.maxConsecutiveMistakeGuidance}`,
)
}

if (this.userMessageContent.length > 0) {
Expand Down Expand Up @@ -2932,4 +2965,14 @@ export class Task extends EventEmitter<TaskEvents> implements TaskLike {
console.error(`[Task] Queue processing error:`, e)
}
}

/**
* Reset consecutive mistake tracking counters.
* This should be called when a tool executes successfully to indicate
* that Roo is making progress and not stuck in a loop.
*/
public resetConsecutiveMistakeCounts(): void {
this.consecutiveMistakeCount = 0
this.consecutiveMistakeGuidanceCount = 0
}
}
276 changes: 276 additions & 0 deletions src/core/task/__tests__/Task.tokenBurning.spec.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,276 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest"
import { Task } from "../Task"
import { ClineProvider } from "../../webview/ClineProvider"
import { ApiHandler } from "../../../api"
import { TelemetryService } from "@roo-code/telemetry"
import { DEFAULT_CONSECUTIVE_MISTAKE_LIMIT } from "@roo-code/types"

// Mock TelemetryService before any imports that might use it
vi.mock("@roo-code/telemetry", () => {
const mockTelemetryService = {
captureConsecutiveMistakeError: vi.fn(),
captureEvent: vi.fn(),
captureTaskCreated: vi.fn(),
captureTaskRestarted: vi.fn(),
captureConversationMessage: vi.fn(),
}

return {
TelemetryService: {
instance: mockTelemetryService,
initialize: vi.fn(),
},
}
})

describe("Task - Token Burning Prevention", () => {
let mockProvider: any
let mockApiHandler: any
let task: Task

beforeEach(() => {
// Mock provider
mockProvider = {
context: {
globalStorageUri: { fsPath: "/test/storage" },
extensionUri: { fsPath: "/test/extension" },
},
getState: vi.fn().mockResolvedValue({
mode: "code",
apiConfiguration: {
apiProvider: "anthropic",
apiKey: "test-key",
},
}),
postStateToWebview: vi.fn(),
postMessageToWebview: vi.fn(),
updateTaskHistory: vi.fn(),
log: vi.fn(),
} as any

// Mock API handler
mockApiHandler = {
getModel: vi.fn().mockReturnValue({
id: "claude-3-opus",
info: {
contextWindow: 200000,
supportsComputerUse: false,
},
}),
createMessage: vi.fn(),
} as any
})

afterEach(() => {
vi.clearAllMocks()
})

describe("Consecutive Mistake Guidance Limit", () => {
it("should initialize with default values", () => {
task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
task: "test task",
})

expect(task.consecutiveMistakeCount).toBe(0)
expect(task.consecutiveMistakeLimit).toBe(DEFAULT_CONSECUTIVE_MISTAKE_LIMIT)
expect(task.consecutiveMistakeGuidanceCount).toBe(0)
expect(task.maxConsecutiveMistakeGuidance).toBe(3)
})

it("should reset both counters when resetConsecutiveMistakeCounts is called", () => {
task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
task: "test task",
})

// Set some values
task.consecutiveMistakeCount = 5
task.consecutiveMistakeGuidanceCount = 2

// Reset
task.resetConsecutiveMistakeCounts()

// Both should be reset
expect(task.consecutiveMistakeCount).toBe(0)
expect(task.consecutiveMistakeGuidanceCount).toBe(0)
})

it("should increment guidance count when asking for user guidance", async () => {
task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
consecutiveMistakeLimit: 3,
task: "test task",
})

// Mock the ask method to simulate user providing feedback
task.ask = vi.fn().mockResolvedValue({
response: "messageResponse",
text: "Try a different approach",
images: undefined,
})

// Mock the say method
task.say = vi.fn().mockResolvedValue(undefined)

// Set mistake count to trigger guidance request
task.consecutiveMistakeCount = 3
task.consecutiveMistakeGuidanceCount = 0

// Create a mock recursivelyMakeClineRequests that simulates the guidance flow
const mockUserContent: any[] = []
const stack = [{ userContent: mockUserContent, includeFileDetails: false }]

// Simulate the part of recursivelyMakeClineRequests that handles mistakes
if (task.consecutiveMistakeLimit > 0 && task.consecutiveMistakeCount >= task.consecutiveMistakeLimit) {
if (task.consecutiveMistakeGuidanceCount >= task.maxConsecutiveMistakeGuidance) {
// Should not reach here in this test
expect(true).toBe(false)
}

task.consecutiveMistakeGuidanceCount++

const guidanceMessage =
task.consecutiveMistakeGuidanceCount > 1
? `I've been making too many mistakes. Could you provide some guidance or corrections to help me proceed?\n\n(Attempt ${task.consecutiveMistakeGuidanceCount}/${task.maxConsecutiveMistakeGuidance} - I'm having difficulty making progress)`
: "I've been making too many mistakes. Could you provide some guidance or corrections to help me proceed?"

await task.ask("mistake_limit_reached", guidanceMessage)
task.consecutiveMistakeCount = 0
}

expect(task.consecutiveMistakeGuidanceCount).toBe(1)
expect(task.consecutiveMistakeCount).toBe(0)
})

it("should abort task when guidance limit is exceeded", async () => {
task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
consecutiveMistakeLimit: 3,
task: "test task",
})

// Mock the say method
task.say = vi.fn().mockResolvedValue(undefined)

// Set counters to exceed limit
task.consecutiveMistakeCount = 3
task.consecutiveMistakeGuidanceCount = 3 // Already at max

// Simulate the check in recursivelyMakeClineRequests
let shouldAbort = false
if (task.consecutiveMistakeLimit > 0 && task.consecutiveMistakeCount >= task.consecutiveMistakeLimit) {
if (task.consecutiveMistakeGuidanceCount >= task.maxConsecutiveMistakeGuidance) {
await task.say(
"error",
`I've been unable to proceed despite multiple attempts and guidance. The task appears to be stuck in a loop. To prevent excessive token usage, I'm stopping here. Please review the conversation and consider:\n\n1. Providing more specific instructions\n2. Breaking down the task into smaller steps\n3. Checking if there are any environmental issues preventing progress`,
)

// In the real code, this would capture telemetry
TelemetryService.instance.captureConsecutiveMistakeError(task.taskId)

shouldAbort = true
}
}

expect(shouldAbort).toBe(true)
expect(task.say).toHaveBeenCalledWith(
"error",
expect.stringContaining("unable to proceed despite multiple attempts"),
)
expect(TelemetryService.instance.captureConsecutiveMistakeError).toHaveBeenCalledWith(task.taskId)
})

it("should show attempt count in guidance message after first attempt", async () => {
task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
consecutiveMistakeLimit: 3,
task: "test task",
})

// Mock the ask method
task.ask = vi.fn().mockResolvedValue({
response: "messageResponse",
text: "Try again",
images: undefined,
})

// Set guidance count to simulate second attempt
task.consecutiveMistakeGuidanceCount = 1
task.consecutiveMistakeCount = 3

// Simulate generating the guidance message
let guidanceMessage =
"I've been making too many mistakes. Could you provide some guidance or corrections to help me proceed?"
if (task.consecutiveMistakeGuidanceCount > 0) {
task.consecutiveMistakeGuidanceCount++ // Increment before showing
guidanceMessage = `${guidanceMessage}\n\n(Attempt ${task.consecutiveMistakeGuidanceCount}/${task.maxConsecutiveMistakeGuidance} - I'm having difficulty making progress)`
} else {
task.consecutiveMistakeGuidanceCount++
}

await task.ask("mistake_limit_reached", guidanceMessage)

expect(task.ask).toHaveBeenCalledWith(
"mistake_limit_reached",
expect.stringContaining("(Attempt 2/3 - I'm having difficulty making progress)"),
)
})

it("should log debug information when incrementing mistake count", () => {
const consoleSpy = vi.spyOn(console, "log").mockImplementation(() => {})

task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
consecutiveMistakeLimit: 3,
task: "test task",
})

// Simulate incrementing mistake count with logging
task.consecutiveMistakeCount++
console.log(
`[Task#${task.taskId}] Consecutive mistake count: ${task.consecutiveMistakeCount}/${task.consecutiveMistakeLimit}, Guidance count: ${task.consecutiveMistakeGuidanceCount}/${task.maxConsecutiveMistakeGuidance}`,
)

expect(consoleSpy).toHaveBeenCalledWith(
expect.stringContaining("Consecutive mistake count: 1/3, Guidance count: 0/3"),
)

consoleSpy.mockRestore()
})
})

describe("Token Burning Detection", () => {
it("should log error when token burning is detected", async () => {
const consoleErrorSpy = vi.spyOn(console, "error").mockImplementation(() => {})

task = new Task({
provider: mockProvider as ClineProvider,
apiConfiguration: { apiProvider: "anthropic", apiKey: "test" },
task: "test task",
})

// Set counters to trigger token burning detection
task.consecutiveMistakeGuidanceCount = 3
task.consecutiveMistakeCount = 5

// Simulate token burning detection
TelemetryService.instance.captureConsecutiveMistakeError(task.taskId)
console.error(
`[Task#${task.taskId}] Token burning detected - Guidance count: ${task.consecutiveMistakeGuidanceCount}, Mistake count: ${task.consecutiveMistakeCount}`,
)

expect(consoleErrorSpy).toHaveBeenCalledWith(
expect.stringContaining("Token burning detected - Guidance count: 3, Mistake count: 5"),
)

consoleErrorSpy.mockRestore()
})
})
})
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding an integration test that simulates the full flow: mistakes → guidance request → user provides guidance → more mistakes → multiple guidance attempts → final abort. This would help ensure the entire mechanism works end-to-end.

2 changes: 1 addition & 1 deletion src/core/tools/executeCommandTool.ts
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ export async function executeCommandTool(
return
}

task.consecutiveMistakeCount = 0
task.resetConsecutiveMistakeCounts()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good integration point! The reset is called at the right time - after validation but before the actual tool execution. This ensures we only reset when the tool is actually going to do something useful.


command = unescapeHtmlEntities(command) // Unescape HTML entities.
const didApprove = await askApproval("command", command)
Expand Down
2 changes: 1 addition & 1 deletion src/core/tools/writeToFileTool.ts
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,7 @@ export async function writeToFileTool(
return
}

cline.consecutiveMistakeCount = 0
cline.resetConsecutiveMistakeCounts()

// Check if preventFocusDisruption experiment is enabled
const provider = cline.providerRef.deref()
Expand Down
Loading