
Conversation

chenmoneygithub (Collaborator):

This is a follow-up to #8764. The problem is harder than it looks because:

  • We need ChainOfThought to work for both reasoning and non-reasoning models, and if a reasoning model is used, we should avoid having the model generate the reasoning field itself.
  • Without an explicit reasoning field among the output fields, dspy.Predict should not return reasoning. Basically, we need to keep the signature faithful.

That basically means we need a way to identify when users specify "reasoning" in their signature, which fits our definition of a predefined type. Users are not forbidden from using a reasoning field of string type, but it won't use the native reasoning.
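
A minimal sketch of that detection, assuming the dspy.Reasoning type introduced later in this PR (the helper name has_native_reasoning_field is illustrative, not part of the PR):

from dspy.adapters.types.reasoning import Reasoning

def has_native_reasoning_field(signature) -> bool:
    # A field counts as native reasoning only when annotated with the
    # predefined Reasoning type; a plain `reasoning: str` output field
    # stays a normal string field and skips native reasoning.
    return any(
        field.annotation is Reasoning
        for field in signature.output_fields.values()
    )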

@chenmoneygithub chenmoneygithub marked this pull request as draft September 18, 2025 01:47
@chenmoneygithub chenmoneygithub changed the title from "[WIP] Allow DSPy to use the native reasoning from models" to "Allow DSPy to use the native reasoning from models" on Sep 18, 2025
@chenmoneygithub chenmoneygithub marked this pull request as ready for review September 18, 2025 20:41
if not litellm.supports_reasoning(lm.model):
    return False

if "reasoning_effort" in lm_kwargs:
TomeHirata (Collaborator), Sep 19, 2025:

We don't need to check lm_kwargs; instead, we should probably overwrite reasoning_effort if native reasoning is on.

chenmoneygithub (Author):

So reasoning_effort has multiple valid values: low, medium, and high. I am setting the default to low, but if users specify one of the other values we want to keep it unchanged.

Collaborator:

The default reasoning_effort on OpenAI reasoning models is medium, so I'd be careful setting it differently here.
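
A sketch of the defaulting rule under discussion (the helper name apply_default_effort is hypothetical): inject a default only when the user specified nothing, so an explicit "medium" or "high" survives.

def apply_default_effort(kwargs: dict, default: str = "low") -> dict:
    # Valid values are "low", "medium", and "high". Only inject the default
    # when the key is absent; a user-provided value is never overwritten.
    if "reasoning_effort" not in kwargs:
        return {**kwargs, "reasoning_effort": default}
    return kwargs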

def __init__(
    self,
    signature: str | type[Signature],
    rationale_field: FieldInfo | None = None,
Collaborator:

Should we show deprecation warnings if these arguments are used?

chenmoneygithub (Author):

sg, let me add that
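
A minimal sketch of such a warning, using the standard warnings module (the message text and the **config catch-all are illustrative, not the PR's exact code):

import warnings

def __init__(self, signature, rationale_field=None, **config):
    if rationale_field is not None:
        warnings.warn(
            "`rationale_field` is deprecated and will be removed in a future "
            "release; ChainOfThought now manages the reasoning field itself.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this frame
        )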


def _process_completion(self, response, merged_kwargs):
    """Process the response of OpenAI chat completion API and extract outputs.
TomeHirata (Collaborator), Sep 19, 2025:

Is this auto-formatting? Shall we add a lint rule so that linting does not change lines unrelated to a PR's scope?

chenmoneygithub (Author):

That's my plugin for trailing whitespace. I didn't expect this PR to get this big in the first place...

tool_calls = []
reasoning_contents = []

for output_item in response.output:
Collaborator:

Sorry, I'm confused. What's the meaning of multiple items in the output field? I thought it'd be the same as the choices of a chat completion.

chenmoneygithub (Author):

This is definitely a weird point of the Responses API: all of these items in output belong to one answer. Basically, the Responses API doesn't have the n arg that controls the number of answers to be generated.

Collaborator:

Talked offline. It seems all output items represent one choice.
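
A sketch of what that shape implies (attribute names follow the Responses API reference; the variable names are illustrative): with no n argument, every item in output contributes to the single answer, and the item type tells reasoning blocks apart from message blocks.

text_parts = []
reasoning_items = []

for output_item in response.output:
    if output_item.type == "reasoning":
        # A reasoning block: still part of the same single answer.
        reasoning_items.append(output_item)
    elif output_item.type == "message":
        # A message block: carries the visible answer text.
        for content_item in output_item.content:
            text_parts.append(content_item.text)

answer = "".join(text_parts)  # all output items together form one choice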

return formatted

@classmethod
def adapt_to_native_lm_feature(cls, lm, lm_kwargs) -> bool:
TomeHirata (Collaborator), Sep 22, 2025:

For the naming, how about is_native_lm_feature_available?


if "reasoning_effort" in lm_kwargs:
# `lm_kwargs` overrides `lm.kwargs`
reasoning_effort = lm_kwargs["reasoning_effort"]
Collaborator:

nit: we can simplify this to reasoning_effort = lm_kwargs.get("reasoning_effort") or lm.kwargs.get("reasoning_effort")

chenmoneygithub (Author):

Ohh, good catch! The structure is right, but the code was actually buggy: if users explicitly turn off native reasoning, we should respect that setting, which is why I am not using get to capture the value. Updated the code.
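
A sketch of the difference (the values are illustrative): with or, an explicit opt-out such as reasoning_effort=None is silently replaced, while a membership check preserves it.

lm_kwargs = {"reasoning_effort": None}        # user explicitly disabled native reasoning
lm_defaults = {"reasoning_effort": "medium"}  # per-LM default

# Buggy: `or` treats the explicit None as "unset" and falls through.
buggy = lm_kwargs.get("reasoning_effort") or lm_defaults.get("reasoning_effort")
assert buggy == "medium"  # the user's opt-out is lost

# Correct: a membership check keeps the explicit None.
if "reasoning_effort" in lm_kwargs:
    reasoning_effort = lm_kwargs["reasoning_effort"]
else:
    reasoning_effort = lm_defaults.get("reasoning_effort")
assert reasoning_effort is None  # opt-out preserved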


from dspy.adapters.types.reasoning import Reasoning

extended_signature = signature.prepend(name="reasoning", field=OutputField(), type_=Reasoning)
Collaborator:

Should we continue to return a string in the reasoning field for backward compatibility?

chenmoneygithub (Author):

dspy.Reasoning is a str-like type. Basically:

  • Printing a dspy.Reasoning prints out the string part, i.e., content.
  • Users can directly compare a dspy.Reasoning to a string.

We need this specific type to modify the signature inside the Adapter. I don't feel this is a perfectly clean solution, but it's the most robust way I can think of. I would like to hear your thoughts!
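
A minimal sketch of the str-like behavior being described (an assumption about the mechanics; the actual dspy.Reasoning implementation may differ):

class Reasoning(str):
    """A str subclass: printing and comparison behave like a plain string,
    while isinstance checks let the Adapter recognize the field as native
    reasoning."""

r = Reasoning("First, add the two numbers...")
print(r)                                      # prints the content string
assert r == "First, add the two numbers..."  # compares equal to a plain str
assert isinstance(r, str)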

not self._warned_zero_temp_rollout
and rollout_id is not None
and (temperature is None or temperature == 0)
# Normalize reasoning_effort to get reasoning summaries (for OpenAI reasoning models which don't expose
Collaborator:

Shall we move this logic to _convert_chat_request_to_responses_request?

chenmoneygithub (Author):

nice catch, done
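
A sketch of where that normalization could now live (only the function name _convert_chat_request_to_responses_request comes from the PR; the body is illustrative of the Responses API's reasoning={"effort": ..., "summary": ...} parameter):

def _convert_chat_request_to_responses_request(request: dict) -> dict:
    responses_request = dict(request)
    # Fold the chat-style `reasoning_effort` into the Responses API's
    # `reasoning` object, requesting summaries so reasoning is surfaced.
    if "reasoning_effort" in responses_request:
        effort = responses_request.pop("reasoning_effort")
        responses_request["reasoning"] = {"effort": effort, "summary": "auto"}
    return responses_request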

    for content_item in output_item.content:
        reasoning_contents.append(content_item.text)
elif getattr(output_item, "summary", None) and len(output_item.summary) > 0:
    for summary_item in output_item.summary:
Collaborator:

When is this summary field used?

chenmoneygithub (Author):

This is another weird spot of the Responses API: according to the API documentation, the reasoning content can be contained in either the content or the summary of the reasoning block: https://platform.openai.com/docs/api-reference/responses/object. While we treat content as the golden source, the API itself suggests that the reasoning may be contained in summary.
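
A sketch of the resulting fallback (the helper name extract_reasoning_text is illustrative; the attribute names follow the linked API reference):

def extract_reasoning_text(output_item) -> str:
    # Prefer the reasoning block's `content`; fall back to `summary`,
    # since the API allows the reasoning to appear in either field.
    if getattr(output_item, "content", None):
        return "".join(part.text for part in output_item.content)
    if getattr(output_item, "summary", None):
        return "".join(part.text for part in output_item.summary)
    return ""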
