✨ Added continuation support for Amazon Bedrock #675
base: main
Conversation
Signed-off-by: JonahSussman <[email protected]>
Force-pushed from 313ae17 to fbc3408
This seems reasonable. Does using a model that requires continuation break the caching?
```python
case SupportedModelProviders.CHAT_DEEP_SEEK:
    return ModelProviderChatDeepSeek(config, demo_mode, cache)
case _:
    assert_never(config.provider)
```
This should be a typed exception rather than an assertion, unless it actually can't be reached. I'm assuming we can trigger this with bad configuration, though, right?
This is a new thing I recently found out about type checking in Python. If a type checker thinks that something can't happen, it will be assigned the type `Never`. This allows you to do exhaustiveness checking and get a static error if you don't handle every case.
We shouldn't ever have an issue with config because the Pydantic model makes sure it's one of the enum values.
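A minimal, self-contained sketch of the pattern (the members besides `CHAT_DEEP_SEEK` and the return values are illustrative, not our actual config types):

```python
from enum import Enum
from typing import assert_never  # Python 3.11+; use typing_extensions on older versions


class SupportedModelProviders(Enum):
    CHAT_OPENAI = "ChatOpenAI"  # illustrative member
    CHAT_DEEP_SEEK = "ChatDeepSeek"


def provider_name(provider: SupportedModelProviders) -> str:
    match provider:
        case SupportedModelProviders.CHAT_OPENAI:
            return "openai"
        case SupportedModelProviders.CHAT_DEEP_SEEK:
            return "deepseek"
        case _:
            # If a new enum member is added without a matching case, the type
            # checker can no longer narrow `provider` to Never here, so this
            # call becomes a static type error -- exhaustiveness checking.
            assert_never(provider)
```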
@fabianvf It shouldn't, since we cache
Signed-off-by: JonahSussman <[email protected]>
I did hit it. Saw this error when processing "Stateless EJBs can be converted to a CDI bean by replacing the `@Stateless` annotation with a scope, e.g. `@ApplicationScoped`" at Medium Effort.
Signed-off-by: JonahSussman <[email protected]>
Signed-off-by: JonahSussman <[email protected]>
Closes #136
Closes #350
I refactored `model_provider.py`, putting each `ModelProvider` into its own class. This will allow you to override certain model-specific functionality, for example a custom LLM invoke for Bedrock (the main reason I refactored).

Bedrock will now continue to generate tokens even if `max_tokens` is reached. Look at the following Python script and attached logs: tester.log
Ugly LangChain color codes aside, you can see how we do two requests now, resulting in one continuous poem at the end.
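For anyone skimming the log, the loop is conceptually something like this rough sketch (hypothetical names; the `"max_tokens"` stop-reason check is an assumption about Bedrock's response metadata, and the real logic lives in the refactored Bedrock provider class):

```python
from langchain_core.messages import AIMessage, BaseMessage


def invoke_with_continuation(llm, messages: list[BaseMessage], max_rounds: int = 5) -> AIMessage:
    """Re-invoke the model while it stops because of the token limit,
    stitching the partial responses into one continuous answer."""
    parts: list[str] = []
    for _ in range(max_rounds):
        response = llm.invoke(messages)
        parts.append(response.content)
        # Assumption: the provider reports why generation stopped in
        # response_metadata; "max_tokens" means the output was truncated.
        if response.response_metadata.get("stop_reason") != "max_tokens":
            break
        # Feed the partial answer back so the model continues where it left off.
        messages = messages + [response]
    return AIMessage(content="".join(parts))
```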
I tested locally with Amazon Bedrock and OpenAI, but I would like some assistance testing other providers to make sure I didn't mess anything else up.
Separately, we should probably think about how to integrate streaming into the project. It should be fairly straightforward now with the `Chatter` class. We can create a message with a separate `messageToken` and update it as we get more chunks. WDYT?
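As a rough sketch of that idea (the `Chatter` methods and the `message_token` plumbing here are hypothetical; only `llm.stream()` is the standard LangChain streaming API):

```python
def stream_to_chat(chatter, llm, messages) -> None:
    # Create a placeholder message up front and remember its token.
    message_token = chatter.create_message(content="")  # hypothetical API
    buffer = ""
    for chunk in llm.stream(messages):  # yields AIMessageChunk objects
        buffer += chunk.content
        # Update the same message in place as chunks arrive.
        chatter.update_message(message_token, content=buffer)  # hypothetical API
```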