Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AsChatClient for OpenAI's AssistantClient #5852

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

stephentoub
Copy link
Member

@stephentoub stephentoub commented Feb 6, 2025

  • Add ChatOptions.ConversationId.
  • Add FunctionCallContext/FunctionResultContent.Context opaque property.
  • Adds an AsChatClient extension method for creating an IChatClient from an AssistantClient.
Microsoft Reviewers: Open in CodeFlow

- Add ChatOptions.ConversationId.
- Add FunctionCallContext/FunctionResultContent.Context opaque property.
- Adds an AsChatClient extension method for creating an IChatClient from an AssistantClient.
@@ -71,6 +74,7 @@ public virtual ChatOptions Clone()
{
ChatOptions options = new()
{
ConversationId = ConversationId,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to other naming suggestions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like OpenAI's term is "thread ID", but perhaps you didn't use that because of the clash with .NET's unrelated notion of thread IDs

From a quick look, I don't see any equivalent in Gemini, Mistral, or Anthropic. Context.ai does have its own equivalent and calls it a "thread" (but not this is an eval too, not an LLM itself).

Is there any prior art for the term "conversation ID"? If not and we wanted to avoid making up novel terminology, we could go with either ThreadId or ChatThreadId to avoid ambiguity with System.Threading.

Comment on lines +92 to +112
if (assistantId is null)
{
await foreach (var assistant in _assistantClient.GetAssistantsAsync(cancellationToken: cancellationToken).ConfigureAwait(false))
{
assistantId = assistant.Id;
break;
}

if (assistantId is null)
{
if (options?.ModelId is string model)
{
var assistant = await _assistantClient.CreateAssistantAsync(model, cancellationToken: cancellationToken).ConfigureAwait(false);
assistantId = assistant.Value.Id;
}
else
{
Throw.ArgumentException(nameof(options), "No assistant was available, and no model was supplied for which to create an assistant.");
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior may not be intuitive. Not sure what would be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I'd be happy to bypass the problem and make assistantId be required, so that developers understand they have to think about how the assistant is defined and configured (e.g., giving it instructions, tools, etc.).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I had initially, I can go back to it.


runOptions.AdditionalMessages.Clear();

updates = _assistantClient.CreateThreadAndRunStreamingAsync(assistantId, creationOptions, runOptions, cancellationToken: cancellationToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here. It's creating a new thread and storing the ID back into ChatOptions. Not sure what would be better.

Copy link
Member

@SteveSandersonMS SteveSandersonMS Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit quirky. Maybe I've forgotten but is this the first case where we'd mutate ChatOptions automatically? It might not compose well with middleware since middleware may use options.Clone() and hence just loses the ID.

The alternative that follows our existing patterns would be to put the ID onto the resulting ChatCompletion/StreamingChatCompletionUpdate. Developers would then assign this to their ChatOptions manually, which is the equivalent to appending the resulting message to a List<ChatMessage>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative that follows our existing patterns would be to put the ID onto the resulting ChatCompletion/StreamingChatCompletionUpdate

That doesn't work well with middleware like function invocation unless we somehow standardize that pattern.

As with assistant ID, I could also revert here back to what I had initially, which was requiring a thread I'd to be specified.


// Now that we've gathered everything we need from the chat messages, clear it out. We don't want
// the same messages resubmitted in the next request.
chatMessages.Clear();
Copy link
Member Author

@stephentoub stephentoub Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might also be unexpected, but not sure how else to make the model work well.

@dotnet-comment-bot
Copy link
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Caching.Hybrid Line 86 82.77 🔻
Microsoft.Extensions.AI.OpenAI Line 77 71.76 🔻
Microsoft.Extensions.AI.OpenAI Branch 77 53.47 🔻
Microsoft.Gen.MetadataExtractor Line 98 57.35 🔻
Microsoft.Gen.MetadataExtractor Branch 98 62.5 🔻
Microsoft.Extensions.AI.Ollama Line 80 78.25 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Extensions.AI 88 89
Microsoft.Extensions.AI.Abstractions 83 84

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=943532&view=codecoverage-tab

/// If not <see langword="null"/>, the value of this property should be stored into any <see cref="FunctionResultContent"/>
/// created to represent the result of this function call.
/// </remarks>
public object? Context { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit weird because you might ask why not define CallId as type object, and then the provider can bundle any amount of ID/state in there that it likes, and the FRC creator would not need to write any extra code since they already know they have to provide the same ID on the result.

But then the call ID would typically get lost when round-tripping through serialization, and perhaps that's a problem we need to avoid, and is solved by having CallId be string. So my remaining question is: is it OK for Context to be lost when round-tripping through serialization? And if so, how can we communicate that to someone who's dealing with chat history serialization?

Copy link
Member

@eiriktsarpalis eiriktsarpalis Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second that thought, but I should point out that roundtripping object instances is problematic unless you use unsafe/deprecated serializers like BinaryFormatter (it certainly won't work as expected for arbitrary objects using STJ). Could Context be a concrete type like string? And if so could we just piggyback on the existing CallId property?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So bake the run id into the call I'd? I can try that. Seems like it should work.

/// <param name="assistantClient">The underlying client.</param>
/// <param name="assistantId">The ID of the assistant to use.</param>
/// <param name="threadId">The ID of the thread to use.</param>
public OpenAIAssistantClient(AssistantClient assistantClient, string? assistantId, string? threadId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will developers create/obtain a threadId in typical use cases?

Does one get created automatically if you try to send a message without any thread ID, and if so should we be exposing it on the ChatCompletion/StreamingChatCompletionUpdate?

Alternatively we could expose some abstractions over conversation thread management. I'd totally understand if we don't want to take that on right now while there aren't even multiple "assistant" providers to abstract over.

If the only way to use this right now is in conjunction with concrete use of the OpenAI client to manage threads, that's OK. Presuambly our argument for why we want to add this feature to MEAI.Abstractions right now is that we want to be sure we can extend to support this and it's only a minimally invasive change to MEAI.Abstractions if we do it now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've read the code below now which clarifies how this works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presuambly our argument for why we want to add this feature to MEAI.Abstractions right now is that we want to be sure we can extend to support this

Yes

/// <inheritdoc />
public Task<ChatCompletion> CompleteAsync(
IList<ChatMessage> chatMessages, ChatOptions? options = null, CancellationToken cancellationToken = default) =>
CompleteStreamingAsync(chatMessages, options, cancellationToken).ToChatCompletionAsync(coalesceContent: true, cancellationToken);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we could change this to use the nonstreaming APIs in the future if we wanted, and wouldn't consider that a breaking change.

Just asking because historically there have been behavioral/capability differences between streaming and nonstreaming, leading to practical reasons why the nonstreaming calls need to use the nonstreaming APIs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we could change this to use the nonstreaming APIs in the future if we wanted, and wouldn't consider that a breaking change.

Yes

// Ensure we have an assistant, finding one if none was specified, and creating one if none was available.
if (assistantId is null)
{
await foreach (var assistant in _assistantClient.GetAssistantsAsync(cancellationToken: cancellationToken).ConfigureAwait(false))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel right - it's just picking an arbitrary assistant. But if I understand correctly, assistants have existing configuration (e.g., "instructions" (system prompt), tools, vector store, files). So in general a developer wouldn't do well to pick one arbitrarily, since in any practical case they need to have one correctly configured for their scenario.

I think we should make assistantId be a required parameter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative would be to drop lines 94-98 so that, if assistantId is not specified, we always create a new one. But I don't think that's necessarily what people want either as then the OpenAI account will end up with a vast number of randomly named assistants without a good way for the developer to clean them up later.

Comment on lines +121 to +124
else if (options is null)
{
Throw.ArgumentException(nameof(options), "No thread ID was provided, and no options was supplied to store a new thread's ID into.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could avoid this restriction as per https://github.com/dotnet/extensions/pull/5852/files#r1946310023

@@ -9,6 +9,9 @@ namespace Microsoft.Extensions.AI;
/// <summary>Represents the options for a chat request.</summary>
public class ChatOptions
{
/// <summary>Gets or sets an optional identifier used to associate a request with an existing conversation.</summary>
public string? ConversationId { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our recent offline conversation, this feels to me like information that could be encapsulated by a particular IChatClient implementation. I know that our approach so far has been to offload all context minus connection strings to the message log and options, but it makes me wonder if the new assistants API is signal that we might want to consider encapsulating information pertaining to a specific session/thread/conversation.

private (string? AssistantId, string? ThreadId) GetIds(ChatOptions? options)
{
// Get the assistant ID to use.
string? assistantId = options?.AdditionalProperties?.TryGetValue(AssistantIdKey, out string? s) is true ? s : _assistantId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally prefer it if there was only one way of specifying these id's, ideally from the constructor. The ChatOptions could have been deserialized from the wire, so in a hypothetical agent implementation these could be overridden by callers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already allow things like model and dimensions to be specified in multiple places, with one baked into the instance and options used for ovverides or if one wasn't specified to construction. And in practice, there's nothing we can do to avoid that... anyone can define an IChatClient with any defaults built into it for things otherwise specified by ChatOptions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants