
docs(js): Adds a guide for interrupts. #1661

Open · wants to merge 2 commits into `main`
206 changes: 206 additions & 0 deletions docs/interrupts.md
Collaborator:


should this be a section in tools docs instead?

Collaborator (Author):


Since it's already nearly as long as the tool guide and it's only going to get longer once I add support for confirmation, I think it should be its own thing. I'll link out to this from the bottom of the tool guide though to make sure there are paths in both directions.

@@ -0,0 +1,206 @@
# Interrupts

_Interrupts_ are a special kind of [tool](tool-calling) that can pause the
LLM generation and tool-calling loop and return control to you. When
you're ready, you can *resume* generation with *replies* that the LLM will
process for further generation.

The most common uses for interrupts fall into a few categories:

* **Human-in-the-Loop:** Allowing the user of an interactive AI application
to clarify needed information or confirm the LLM's action before it is
completed, providing a measure of safety and confidence.
* **Async Processing:** Starting an asynchronous task that can only be
completed out-of-band, such as sending an approval notification to
a human reviewer or kicking off a long-running background process.
* **Exiting an Autonomous Task:** In workflows that might iterate through
a long series of tool calls, an interrupt can give the model a way
to mark the task as complete.

## Before you begin {:#you-begin}

If you want to run the code examples on this page, first complete the steps in
the [Getting started](get-started) guide. All of the examples assume that you
have already set up a project with Genkit dependencies installed.

This page discusses one of the advanced features of Genkit's model abstraction, so
before you dive too deeply, you should be familiar with the content on the
[Generating content with AI models](models) page. You should also be familiar
with Genkit's system for defining input and output schemas, which is discussed
on the [Flows](flows) page, and the general methods of tool calling discussed
on the [Tool Calling](tool-calling) page.

## Overview of interrupts {:#overview-interrupts}

At a high level, this is what an interrupt looks like when interacting with an LLM:

1. The calling application prompts the LLM with a request and also includes in
the prompt a list of tools, including at least one interrupt tool that the LLM
can use to generate a response.
2. The LLM either generates a complete response or generates a tool call request
in a specific format. To the LLM, an interrupt looks like any other tool call.
3. If the LLM selects an interrupt among the tool calls it generates, the Genkit
library automatically halts generation rather than immediately passing tool
responses back to the model for additional processing.
4. The developer checks whether an interrupt was called and performs whatever task
is needed to collect the information for the interrupt reply.
5. The developer resumes generation by passing an interrupt reply to the model,
returning to step 2.
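
In code, that round trip can be sketched roughly as shown below. This is a minimal
sketch assuming an `askQuestion` interrupt like the one defined later in this guide
and a hypothetical `getAnswerFromUser()` helper; the APIs used here are covered in
the rest of this page.

```ts
// 1. prompt the model with at least one interrupt tool available
let response = await ai.generate({
  prompt: 'Help me plan a weekend trip.',
  tools: [askQuestion],
});

// 3./4. if the model called the interrupt, Genkit halts and you collect a reply
if (response.finishReason === 'interrupted') {
  const interrupt = response.interrupts[0];
  // hypothetical helper that shows the question to the user and returns their answer
  const answer = await getAnswerFromUser(interrupt.toolRequest.input);

  // 5. resume generation with the reply, returning to step 2
  response = await ai.generate({
    messages: response.messages,
    tools: [askQuestion],
    resume: { reply: [askQuestion.reply(interrupt, answer)] },
  });
}
```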

## Triggering interrupts with Genkit {:#triggering-interrupts}

Interrupts can be triggered from within any tool, or you can define a dedicated interrupt using the `defineInterrupt()` method.

### Defining interrupts

The most common kind of interrupt is a tool that allows the LLM to
request clarification from the user, for example by asking a multiple-choice
question.

For this use case, use the Genkit instance's `defineInterrupt()` method:

```ts
import { genkit, z } from 'genkit';
import { gemini15Flash, googleAI } from '@genkit-ai/googleai';

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash,
});

const askQuestion = ai.defineInterrupt({
  name: 'askQuestion',
  description: 'use this to ask the user a clarifying question',
  inputSchema: z.object({
    choices: z.array(z.string()).describe('the choices to display to the user'),
    allowOther: z.boolean().optional().describe('when true, allow write-ins'),
  }),
  replySchema: z.string(),
});
```

Note that interrupts have a `replySchema` instead of an output schema, although
they are treated as equivalent when passing data to the model.
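
The reply schema doesn't have to be a string; any Zod schema works. For example, a
hypothetical confirmation interrupt (not part of the example above) might request a
structured reply:

```ts
const confirmAction = ai.defineInterrupt({
  name: 'confirmAction',
  description: 'ask the user to confirm a potentially risky action',
  inputSchema: z.object({
    action: z.string().describe('a human-readable description of the action'),
  }),
  // the reply can be any Zod schema, not just a string
  replySchema: z.object({
    confirmed: z.boolean(),
    note: z.string().optional().describe('an optional note from the user'),
  }),
});
```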

### Using interrupts

Interrupts are passed into the `tools` array when generating content, just like
other types of tools. You can pass both normal tools and interrupts to the same
generate call:

* {Generate}

```ts
const response = await ai.generate({
  prompt: 'Ask me a movie trivia question.',
  tools: [askQuestion],
});
```

* {definePrompt}

```ts
const triviaPrompt = ai.definePrompt(
  {
    name: 'triviaPrompt',
    tools: [askQuestion],
    input: {
      schema: z.object({ subject: z.string() }),
    },
    prompt: 'Ask me a trivia question about {{subject}}.',
  }
);

const response = await triviaPrompt({ subject: 'computer history' });
```

* {Prompt file}

```none
---
tools: [askQuestion]
input:
  schema:
    partyType: string
---
{{role "system}}
Use the askQuestion tool if you need to clarify something.

{{role "user"}}
Help me plan a {{partyType}} party next week.
```

Then you can execute the prompt in your code as follows:

```ts
// assuming prompt file is named partyPlanner.prompt
const partyPlanner = ai.prompt('partyPlanner');

const response = await partyPlanner({ partyType: 'birthday' });
```

* {Chat}

```ts
const chat = ai.chat({
  system: 'Use the askQuestion tool if you need to clarify something.',
  tools: [askQuestion],
});

const response = await chat.send('make a plan for my birthday party');
```

Genkit will immediately return a response once an interrupt tool is triggered.
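
In any of these forms, regular tools and interrupts can be offered together in the
same `tools` array; the loop runs normally until an interrupt is called. A minimal
sketch, assuming a `getWeather` tool like the one described in the
[Tool Calling](tool-calling) guide:

```ts
const response = await ai.generate({
  prompt: 'Help me plan an outdoor birthday party for next weekend.',
  // getWeather is handled automatically by the tool loop;
  // askQuestion pauses generation and returns control to you
  tools: [getWeather, askQuestion],
});
```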

### Replying to interrupts

If you've passed one or more interrupt tools to your generate call, you will
need to check the response for interrupts so that you can handle them:

```ts
// you can check the 'finishReason' of the response
response.finishReason === 'interrupted'
// or you can check to see if any interrupt requests are on the response
response.interrupts.length > 0
```
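
Each entry in `response.interrupts` is the tool request the model generated, so you
can inspect which interrupt was called and with what input before deciding how to
reply. A short sketch, assuming the `toolRequest` shape used in the example below:

```ts
if (response.finishReason === 'interrupted') {
  for (const interrupt of response.interrupts) {
    // each interrupt carries the tool's name and the input the model supplied
    console.log(interrupt.toolRequest.name, interrupt.toolRequest.input);
  }
}
```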

Replying to an interrupt is done using the `resume` option on a subsequent
generate call, making sure to pass in the existing history. Each interrupt tool
has a `.reply()` method to help construct the reply.

Once resumed, the model re-enters the generation loop, including tool
execution, until it either completes or another interrupt is triggered:

```ts
let response = await ai.generate({
  tools: [askQuestion],
  system: 'ask clarifying questions until you have a complete solution',
  prompt: 'help me plan a backyard BBQ',
});

while (response.interrupts.length) {
  const answers = [];
  // multiple interrupts can be called at once, so we handle them all
  for (const question of response.interrupts) {
    answers.push(
      // use the 'reply' method on our tool to populate answers
      askQuestion.reply(
        question,
        // send the tool request input to the user to respond
        await askUser(question.toolRequest.input)
      )
    );
  }

  response = await ai.generate({
    tools: [askQuestion],
    messages: response.messages,
    resume: {
      reply: answers,
    },
  });
}

// no more interrupts, we can see the final response
console.log(response.text);
```
16 changes: 14 additions & 2 deletions docs/tool-calling.md
@@ -196,10 +196,22 @@ Include defined tools in your prompts to generate content.
Genkit will automatically handle the tool call if the LLM needs to use the
`getWeather` tool to answer the prompt.

### Pausing the tool loop with interrupts

By default, Genkit repeatedly calls the LLM until every tool call has been
resolved. In some situations you may wish to conditionally pause execution to:

- ask the user a question or display UI
- confirm a potentially risky action with the user
- request out-of-band approval for an action

**Interrupts** are special tools that can halt the loop and return control
to your code so that you can handle more advanced scenarios. Visit the
[interrupts guide](interrupts) to learn how to use them.

### Explicitly handling tool calls

If you want full control over this tool calling loop, for example to
apply more complicated logic, set the `returnToolRequests` parameter to `true`.
Now it's your responsibility to ensure all of the tool requests are fulfilled:
