Show model thinking in streaming responses #47

@Federaffo

Summary

Flue supports streaming responses, but the stream does not currently appear to include model thinking or reasoning output.

What I Expected

When using a model that supports thinking or reasoning, I expected the streaming response to include those thinking updates, similar to how it streams normal text output and tool events.

What Happens Today

The stream includes normal assistant text and tool activity, but not thinking or reasoning updates.

This makes it hard to build UIs that show what the agent is doing while it is reasoning, especially for longer-running agent tasks.

Why This Would Be Useful

Showing thinking or reasoning updates would help users understand:

  • whether the agent is still actively working
  • what phase of reasoning it is in
  • why a long-running response is taking time
  • what is happening before tools are called or final text is produced

Request

Please expose thinking or reasoning events in the streaming API when the underlying model or provider supports them.

It would also be useful to have a public option for enabling thinking and configuring the thinking level, for example: off, minimal, low, medium, or high.
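To make the request concrete, here is a rough sketch of what consuming such events could look like on the UI side. Every name in it (`StreamEvent`, `thinking_delta`, `ThinkingLevel`, `StreamOptions`, `applyEvent`) is hypothetical and not taken from Flue's current API, and treating a missing level as off is also just an assumption. It only illustrates the shape I have in mind: thinking updates arriving as their own event type next to text and tool events, plus a single option for the thinking level.

```typescript
// Hypothetical shapes for illustration only; none of these names exist in Flue today.
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high";

// Assumed request option for choosing the thinking level.
interface StreamOptions {
  thinking?: ThinkingLevel;
}

// Assumed stream events: thinking updates sit alongside text and tool events.
type StreamEvent =
  | { type: "thinking_delta"; text: string }
  | { type: "text_delta"; text: string }
  | { type: "tool_call"; name: string }
  | { type: "done" };

// What a UI might track while a long-running response streams in.
interface UiState {
  phase: "idle" | "thinking" | "calling_tool" | "answering" | "finished";
  thinking: string;
  answer: string;
  activeTool?: string;
}

const initialState: UiState = { phase: "idle", thinking: "", answer: "" };

// Pure reducer: fold one stream event into the UI state.
function applyEvent(state: UiState, event: StreamEvent): UiState {
  switch (event.type) {
    case "thinking_delta":
      return { ...state, phase: "thinking", thinking: state.thinking + event.text };
    case "tool_call":
      return { ...state, phase: "calling_tool", activeTool: event.name };
    case "text_delta":
      return { ...state, phase: "answering", answer: state.answer + event.text };
    case "done":
      return { ...state, phase: "finished" };
  }
}

// A UI could use this to decide whether to render a thinking panel at all.
// Assumption: leaving the option unset behaves like "off".
function shouldExpectThinking(options: StreamOptions): boolean {
  return (options.thinking ?? "off") !== "off";
}
```

With events like these, a UI could show a "thinking" phase before the first tool call or text token, which covers the cases listed under "Why This Would Be Useful".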
