Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUESTION] Why classes use the final modifier #211

Open
tryvin opened this issue Feb 11, 2025 · 7 comments
Open

[QUESTION] Why classes use the final modifier #211

tryvin opened this issue Feb 11, 2025 · 7 comments
Labels
question Further information is requested

Comments

@tryvin
Copy link

tryvin commented Feb 11, 2025

Hey Guys,

I'm trying to use llm-chain in a project here, and one thing I keep hitting is the final modifier in classes. Every class here has the final modifier, which makes some things difficult.

For example, right now I'm trying to implement provider usage metadata in the TextResponse class, so I can store how much this request was charged for, and the same applies to basically all *Response classes.

Usually, I would just make my class return an extended version of the base class, for example, I have a Bedrock ResponseConverter class, so, in my convert method, I could just return something like new TextResponseWithUsageMetadata($choices[0]->getContent(), $usageInfo), but unfortunately I can't, as the PhpLlm\LlmChain\Model\Response\TextResponse is marked with final, so I would need to copy the PhpLlm\LlmChain\Model\Response\TextResponse code to my own TextResponseWithUsageMetadata just to add a single property.

The same thing applies to the ChainProcessor in the ToolBox, I needed to change just the processOutput method and had to copy the whole class code to add two lines of code to the processOutput method.

So, my main question is, do we have a strong reason to keep the final modifier in every class? IMHO I see final useful for the "final" user of the code, where you are in total control of your code and you can decide if an XYZ class will remain final or not as the project evolves, but using final in a library which will be used embedded into other projects make the final user life kinda "difficult", as we have to workaround just to add some small features that our projects might need.

Don't get me wrong, I love what you guys are doing here.

@OskarStark
Copy link
Contributor

Hey @tryvin,

thanks for your feedback and your kind words – we really appreciate it!

The reason why we consistently use final is to ensure API stability and to uphold our BC promise. If we make everything extendable, we lose control over how the library is used, and it becomes much harder to guarantee backward compatibility.

Moreover, keeping classes final encourages users to share their use cases with us. This way, we can evaluate whether it makes sense to remove the final modifier in specific cases or if the library should be extended with a new feature to accommodate these needs.

If you have a concrete proposal for a change that would help your use case, feel free to open an issue or start a discussion – we’re always open to improving the library in a way that benefits the community.

@OskarStark OskarStark added the question Further information is requested label Feb 12, 2025
@chr-hertel
Copy link
Member

Hi @tryvin,

Thanks for reaching out and bringing your use case! Oskar already summarized it quite well and I just want to add that we want to be quite explicit about extension points - and I know that this is quite defensive. But on the other hand this is exactly the kind of feedback loop and ideas that are coming from that and the library itself can become better through your input.

In your example case I would argue that it is totally valid and we should extend the base implementation itself.

Maybe all response classes just should have a metadata bag and/or the raw response - so not everything is abstracted away. I think that is basically the same solution that would help with #207.

Do you think the idea of PR #175 would help your issue?

@tryvin
Copy link
Author

tryvin commented Feb 14, 2025

Hey @chr-hertel and @OskarStark thanks for all the responses, maybe we can discuss how to improve things from an external developer perspective?

So, lemme explain my use case, which I would say it's pretty "standard" from a "business" perspective.

We are building a pretty usual chatbot, the typical you set the personality, you can add XYZ tools to it, and a user can chat with it while we store the conversation so we can rehydrate the MessageBag for the model to have context.

But as with every business, you do need to have some things in place

  • We needed to get all the messages that went back and forth in a Chain, so we have our own MessageBagChainProcessor which is a copy of the v0.14.1 ChainProcessor, with the addition of two callbacks, the onMessageArrived and onToolCallsMessage, I see now that you have this EventDispatcher which dispatches a ToolCallsExecuted event, which is nice, but we need to have all the messages that were sent back and forth, so using this same logic, maybe adding more events here would be useful, like MessageReceived, ToolCallMessageSent, ToolCallMessageReceived, ToolCallBeingExecuted (this can be inside the ToolBox).
  • We don't care about MessageReceived events outside of the Chain as the usual sendRequest, getMessage already have the message, still, maybe we can add this MessageReceived to the platform, so all messages received from the platform would trigger this event, regardless if they are being called from a chain or not.
  • Also, one other reason we care about all the messages that are being sent back and forth is that we need to calculate the token usage per message, we did a "hack" here to get this token usage per message, even though it is not 100%, it does its job in our use case, as we don't use streamed responses, it was easier to grab this information, but I think the same principle applies to a streamed message, as you only care about the tokens at the end, or if the underlying platform does provide it, maybe we can stream messages in a "Chunk" class, which would have a TokenUsage property, so you can track tokens as they arrive and then have it summed at the end of the stream.
  • Still, I do agree with the addition of the Metadata, that would help a lot when you need to get specific vendor data, if you know which vendors you work with, you know that XYZ data will be there, and the underlying class doesn't need to conform with every vendor particularities.

Now, on the final and read-only

I think that a framework (library if you will) might not want to have everything under a final and read-only modifier, I do get it, you guys want to have some sense of "we made this work that way, so it will continue working that way", but the way libraries work is that we as users of this library (the developers), can modify small parts of the behavior to make it work with our use cases, and we take full responsibility if something breaks because of our changes, as we are well aware that we were the ones messing around at the first place.

Still, if you want to continue with the final and read-only modifiers, maybe you are open to removing them from everything in the src/Models folder, as these are basic data structures, and there are cases where we might want to extend them, for example:

Amazon Bedrock has cache identifiers, which you can use to save on some tokens, some of their models support more than one cache identifier, while others support only the system message cache, so, say I'm developing my model provider for Bedrock, and I want to add support for this cache identifier thing (which today the library doesn't provide), I as a developer could maybe just extend the SystemMessage class, and maybe add something like public string $cacheIdentifier, and then, in my ModelClient at the request method, I could just check if the input is instanceof NewSystemMessageWithCacheIdentifier and if so, format my bedrock request to add this specific field.

The same thing also applies to basic "models" everywhere, for example, speaking of the Chain class, I had to copy it because I couldn't add the token usage to the message, hence had to add the token usage to the output, which in turn I couldn't add a new property to the Output class as it was final, so I had to copy the Output class, add my new property, then I also had to copy the Chain class, because the Chain class expected Output instead of some OutputInterface.

TLDR

Sorry if this sounds like a rant or something, it's just that I spent too much time adding this feature to my project because I had to go back and forth, copying files, adding properties, and dealing with some exceptions because I just wanted to track how much tokens were used and track all the messages that were going back and forth in a chain.

@tryvin
Copy link
Author

tryvin commented Feb 14, 2025

For reference, this is my LLM folder, anything that you see with the same name from php-llm/llm-chain is exactly that, a copy because I had to add some property or chance some method

Image

@chr-hertel
Copy link
Member

Even if i don't really agree on your generalized takes on how a library should be, i still value your input - and like i wrote before - i think you have valid points here when it comes to the data classes in Model namespace.

and, about the nature of this project, it's basically a few guys sharing some code with MIT license that we need in our projects - also meaning, you are very much welcome not only to add requirements but also contribute solutions :)

and please don't get me wrong, this is an honest invitation since we're def open for your thoughts, use-cases and contributions

so let's break it down into action items and potential PRs:

  1. open up response and message classes - even switch to inheritance
  2. add raw response and metadata to response classes
  3. toolbox events
  4. token usage

WDYT?

@tryvin
Copy link
Author

tryvin commented Feb 14, 2025

Sure, that totally works for me, I was about to do a pull request, but I was in a tight time-line and couldn't do the proper thing to this code to only then import it to my project.

If I have the time over the weekend I think I can come up with a PR with what you wrapped up in your comment :)

Also, I think that specially data classes (things under /Models), and some other across the source code are the most likely to be open and available for inheritance

@OskarStark
Copy link
Contributor

Please keep in mind to structure the PRs as reviewable / small as possible 🙋‍♂️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants