[BUG]: DecodeSpecialTokens is exposed by the StreamingTokenDecoder but is not accessible when using an ILlamaExecutor #1201

@jacob-mink-1996

Description

PR #777 added support for DecodeSpecialTokens. During testing I wanted to toggle this feature (it currently appears to default to false), but because of the way StreamingTokenDecoder is constructed inside the executors (at least in StatelessExecutor), there is no way to toggle it from the caller's side.
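
For illustration, the flag is reachable when a StreamingTokenDecoder is built by hand, just not through the executor API. This is a minimal sketch assuming the StreamingTokenDecoder(LLamaContext) constructor and its Add/Read methods; the context and token variables are placeholders:

```csharp
using LLama;

// Hypothetical illustration: setting the flag on a hand-built decoder.
// The built-in executors construct their own decoder internally and never
// expose this property, which is exactly the problem described above.
var decoder = new StreamingTokenDecoder(context)
{
    DecodeSpecialTokens = true // what I would like to be able to toggle
};

decoder.Add(token);           // feed a token produced during inference
string text = decoder.Read(); // special tokens would now survive decoding
```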

This looks like an implementation bug rather than a feature request, since the feature already exists but is simply not exposed. Additionally, I cannot write my own StatelessExecutor implementation, because the existing one relies on methods that are internal to NativeApi.

The practical effect is that models like Devstral, whose GGUF metadata classifies [TOOL_CALLS] as a special token to be emitted before any tool-call-formatted output, will generate this token, but the executor never streams it out to the caller.

Reproduction Steps

Run a Devstral GGUF from LLamaSharp source with an appropriate tool-call prompt, then compare the output with, for example, Ollama in raw mode, where the [TOOL_CALLS] token does appear.
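
A minimal repro sketch, assuming the usual LLamaSharp stateless-executor pattern; the model path and the prompt are placeholders:

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("devstral.gguf");         // placeholder path
using var weights = LLamaWeights.LoadFromFile(parameters);
var executor = new StatelessExecutor(weights, parameters);

var prompt = "...";  // an appropriate tool-call prompt, elided here
await foreach (var text in executor.InferAsync(prompt, new InferenceParams { MaxTokens = 512 }))
{
    // [TOOL_CALLS] never shows up in this stream, because the executor's
    // internal StreamingTokenDecoder leaves DecodeSpecialTokens = false.
    Console.Write(text);
}
```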

Environment & Configuration

  • Operating system:
  • .NET runtime version:
  • LLamaSharp version: 0.24.0
  • CUDA version (if you are using cuda backend):
  • CPU & GPU device:

Known Workarounds

No response
