-
Notifications
You must be signed in to change notification settings - Fork 767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AsChatClient for OpenAI's AssistantClient #5852
base: main
Are you sure you want to change the base?
Conversation
- Add ChatOptions.ConversationId. - Add FunctionCallContext/FunctionResultContent.Context opaque property. - Adds an AsChatClient extension method for creating an IChatClient from an AssistantClient.
@@ -71,6 +74,7 @@ public virtual ChatOptions Clone() | |||
{ | |||
ChatOptions options = new() | |||
{ | |||
ConversationId = ConversationId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to other naming suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like OpenAI's term is "thread ID", but perhaps you didn't use that because of the clash with .NET's unrelated notion of thread IDs
From a quick look, I don't see any equivalent in Gemini, Mistral, or Anthropic. Context.ai does have its own equivalent and calls it a "thread" (but not this is an eval too, not an LLM itself).
Is there any prior art for the term "conversation ID"? If not and we wanted to avoid making up novel terminology, we could go with either ThreadId
or ChatThreadId
to avoid ambiguity with System.Threading
.
if (assistantId is null) | ||
{ | ||
await foreach (var assistant in _assistantClient.GetAssistantsAsync(cancellationToken: cancellationToken).ConfigureAwait(false)) | ||
{ | ||
assistantId = assistant.Id; | ||
break; | ||
} | ||
|
||
if (assistantId is null) | ||
{ | ||
if (options?.ModelId is string model) | ||
{ | ||
var assistant = await _assistantClient.CreateAssistantAsync(model, cancellationToken: cancellationToken).ConfigureAwait(false); | ||
assistantId = assistant.Value.Id; | ||
} | ||
else | ||
{ | ||
Throw.ArgumentException(nameof(options), "No assistant was available, and no model was supplied for which to create an assistant."); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior may not be intuitive. Not sure what would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I'd be happy to bypass the problem and make assistantId
be required, so that developers understand they have to think about how the assistant is defined and configured (e.g., giving it instructions, tools, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I had initially, I can go back to it.
|
||
runOptions.AdditionalMessages.Clear(); | ||
|
||
updates = _assistantClient.CreateThreadAndRunStreamingAsync(assistantId, creationOptions, runOptions, cancellationToken: cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here. It's creating a new thread and storing the ID back into ChatOptions. Not sure what would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit quirky. Maybe I've forgotten but is this the first case where we'd mutate ChatOptions
automatically? It might not compose well with middleware since middleware may use options.Clone()
and hence just loses the ID.
The alternative that follows our existing patterns would be to put the ID onto the resulting ChatCompletion
/StreamingChatCompletionUpdate
. Developers would then assign this to their ChatOptions
manually, which is the equivalent to appending the resulting message to a List<ChatMessage>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative that follows our existing patterns would be to put the ID onto the resulting ChatCompletion/StreamingChatCompletionUpdate
That doesn't work well with middleware like function invocation unless we somehow standardize that pattern.
As with assistant ID, I could also revert here back to what I had initially, which was requiring a thread I'd to be specified.
|
||
// Now that we've gathered everything we need from the chat messages, clear it out. We don't want | ||
// the same messages resubmitted in the next request. | ||
chatMessages.Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might also be unexpected, but not sure how else to make the model work well.
🎉 Good job! The coverage increased 🎉
Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=943532&view=codecoverage-tab |
/// If not <see langword="null"/>, the value of this property should be stored into any <see cref="FunctionResultContent"/> | ||
/// created to represent the result of this function call. | ||
/// </remarks> | ||
public object? Context { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit weird because you might ask why not define CallId
as type object
, and then the provider can bundle any amount of ID/state in there that it likes, and the FRC creator would not need to write any extra code since they already know they have to provide the same ID on the result.
But then the call ID would typically get lost when round-tripping through serialization, and perhaps that's a problem we need to avoid, and is solved by having CallId
be string
. So my remaining question is: is it OK for Context
to be lost when round-tripping through serialization? And if so, how can we communicate that to someone who's dealing with chat history serialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second that thought, but I should point out that roundtripping object
instances is problematic unless you use unsafe/deprecated serializers like BinaryFormatter (it certainly won't work as expected for arbitrary objects using STJ). Could Context
be a concrete type like string? And if so could we just piggyback on the existing CallId
property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So bake the run id into the call I'd? I can try that. Seems like it should work.
/// <param name="assistantClient">The underlying client.</param> | ||
/// <param name="assistantId">The ID of the assistant to use.</param> | ||
/// <param name="threadId">The ID of the thread to use.</param> | ||
public OpenAIAssistantClient(AssistantClient assistantClient, string? assistantId, string? threadId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will developers create/obtain a threadId
in typical use cases?
Does one get created automatically if you try to send a message without any thread ID, and if so should we be exposing it on the ChatCompletion
/StreamingChatCompletionUpdate
?
Alternatively we could expose some abstractions over conversation thread management. I'd totally understand if we don't want to take that on right now while there aren't even multiple "assistant" providers to abstract over.
If the only way to use this right now is in conjunction with concrete use of the OpenAI client to manage threads, that's OK. Presuambly our argument for why we want to add this feature to MEAI.Abstractions right now is that we want to be sure we can extend to support this and it's only a minimally invasive change to MEAI.Abstractions if we do it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've read the code below now which clarifies how this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presuambly our argument for why we want to add this feature to MEAI.Abstractions right now is that we want to be sure we can extend to support this
Yes
/// <inheritdoc /> | ||
public Task<ChatCompletion> CompleteAsync( | ||
IList<ChatMessage> chatMessages, ChatOptions? options = null, CancellationToken cancellationToken = default) => | ||
CompleteStreamingAsync(chatMessages, options, cancellationToken).ToChatCompletionAsync(coalesceContent: true, cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably we could change this to use the nonstreaming APIs in the future if we wanted, and wouldn't consider that a breaking change.
Just asking because historically there have been behavioral/capability differences between streaming and nonstreaming, leading to practical reasons why the nonstreaming calls need to use the nonstreaming APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably we could change this to use the nonstreaming APIs in the future if we wanted, and wouldn't consider that a breaking change.
Yes
// Ensure we have an assistant, finding one if none was specified, and creating one if none was available. | ||
if (assistantId is null) | ||
{ | ||
await foreach (var assistant in _assistantClient.GetAssistantsAsync(cancellationToken: cancellationToken).ConfigureAwait(false)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't feel right - it's just picking an arbitrary assistant. But if I understand correctly, assistants have existing configuration (e.g., "instructions" (system prompt), tools, vector store, files). So in general a developer wouldn't do well to pick one arbitrarily, since in any practical case they need to have one correctly configured for their scenario.
I think we should make assistantId
be a required parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative would be to drop lines 94-98 so that, if assistantId
is not specified, we always create a new one. But I don't think that's necessarily what people want either as then the OpenAI account will end up with a vast number of randomly named assistants without a good way for the developer to clean them up later.
else if (options is null) | ||
{ | ||
Throw.ArgumentException(nameof(options), "No thread ID was provided, and no options was supplied to store a new thread's ID into."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could avoid this restriction as per https://github.com/dotnet/extensions/pull/5852/files#r1946310023
@@ -9,6 +9,9 @@ namespace Microsoft.Extensions.AI; | |||
/// <summary>Represents the options for a chat request.</summary> | |||
public class ChatOptions | |||
{ | |||
/// <summary>Gets or sets an optional identifier used to associate a request with an existing conversation.</summary> | |||
public string? ConversationId { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per our recent offline conversation, this feels to me like information that could be encapsulated by a particular IChatClient
implementation. I know that our approach so far has been to offload all context minus connection strings to the message log and options, but it makes me wonder if the new assistants API is signal that we might want to consider encapsulating information pertaining to a specific session/thread/conversation.
private (string? AssistantId, string? ThreadId) GetIds(ChatOptions? options) | ||
{ | ||
// Get the assistant ID to use. | ||
string? assistantId = options?.AdditionalProperties?.TryGetValue(AssistantIdKey, out string? s) is true ? s : _assistantId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would personally prefer it if there was only one way of specifying these id's, ideally from the constructor. The ChatOptions
could have been deserialized from the wire, so in a hypothetical agent implementation these could be overridden by callers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already allow things like model and dimensions to be specified in multiple places, with one baked into the instance and options used for ovverides or if one wasn't specified to construction. And in practice, there's nothing we can do to avoid that... anyone can define an IChatClient with any defaults built into it for things otherwise specified by ChatOptions.
Microsoft Reviewers: Open in CodeFlow