PR #777 added support for DecodeSpecialTokens. In some of my testing, I wanted to toggle this feature (right now it appears to default to false), but because of the way StreamingTokenDecoder is used (at least in StatelessExecutor), I am unable to do so.
This seems like an implementation bug rather than a feature request, since the feature exists but is not exposed. Additionally, I cannot write my own StatelessExecutor implementation as a workaround, because it relies on methods that are internal to NativeApi.
The practical effect is that models like Devstral, whose GGUF form classifies [TOOL_CALLS] as a special token emitted before any tool-call-formatted output, do generate this token, but the executor never streams it out to the caller.
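The symptom can be illustrated with a small toy model. This is a language-agnostic sketch in Python, not LLamaSharp code: `ToyStreamingDecoder`, `VOCAB`, `stream`, and the token ids are all hypothetical, and the assumption that a special token decodes to nothing when the flag is off is a simplification of the behavior described above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical vocabulary: token id -> (text, is_special).
# These are NOT real Devstral token ids.
VOCAB = {
    1: ("[TOOL_CALLS]", True),
    2: ('[{"name": "get_weather"', False),
    3: (', "arguments": {}}]', False),
}


@dataclass
class ToyStreamingDecoder:
    """Toy stand-in for a streaming token decoder (not the LLamaSharp class)."""

    decode_special_tokens: bool = False

    def decode(self, token_ids: Iterable[int]) -> Iterator[str]:
        for token_id in token_ids:
            text, is_special = VOCAB[token_id]
            # Assumed behavior: special tokens produce no text unless enabled.
            if is_special and not self.decode_special_tokens:
                continue
            yield text


def stream(token_ids: Iterable[int], decode_special_tokens: bool = False) -> str:
    """Toy executor: the caller can choose the flag instead of it being fixed."""
    decoder = ToyStreamingDecoder(decode_special_tokens=decode_special_tokens)
    return "".join(decoder.decode(token_ids))


generated = [1, 2, 3]  # the model did generate the special token
print(stream(generated))                              # marker is dropped
print(stream(generated, decode_special_tokens=True))  # marker is visible
```

With the flag off, the caller only sees the JSON-like payload and cannot tell that a tool call was signalled. With it on, the `[TOOL_CALLS]` marker comes through, which is the behavior the requested toggle would make reachable.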
Reproduction Steps
Run a Devstral GGUF from LlamaSharp source with an appropriate tool call prompt, then compare the output against, for example, Ollama in raw mode, where the [TOOL_CALLS] token is visible.