[Bug] JSONAdapter's parse method regex pattern bug #8759

@hufeng

What happened?

I use dspy.ChainOfThought to execute a task signature, as shown in the following example:

import dspy
from pydantic import BaseModel


class Issue(BaseModel):
    issue_type: str
    severity_level: str
    problem_code_snippet: str


class FindIssue(dspy.Signature):
    code = dspy.InputField(description="check code")
    issue_list: list[Issue] = dspy.OutputField(description="issue list")


class FindIssueModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.find_issue = dspy.ChainOfThought(FindIssue)

    def forward(self, code: str):
        return self.find_issue(code=code)

fim = FindIssueModule()
fim(code="...code...")

LLM output

'{\n  "reasoning": "think step by step",\n  "issue_list": [\n    {\n  "issue_type": "some type",\n      "severity_level": "fatal",\n  "problem_code_snippet": "if (user) {"\n    }\n  ]\n}'

error message

Adapter JSONAdapter failed to parse the LM response.

{
  "issue_type": "some type",
  "severity_level": "fatal",
  "problem_code_snippet": "if (user) {"
    }
  ]
}

Expected to find output fields in the LM response: [reasoning, issue_list]

Problem diagnosis

I checked the code and the output and found that the problem lies in how the regular expression handles nested curly braces. The recursive pattern counts every curly brace in the text, including braces that appear inside string values, such as the code snippet "if (user) {" in the response above. That unbalanced brace is incorrectly treated as part of the JSON structure, so the outermost object never balances and the search instead matches from the inner object's opening brace to the end of the text.

The parsing error is in https://github.com/stanfordnlp/dspy/blob/main/dspy/adapters/json_adapter.py#L150.
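To illustrate the diagnosis above, here is a minimal standard-library sketch that counts braces in the reported response twice: once naively, and once while skipping the contents of JSON string literals. The helper name `count_braces` is hypothetical and is not part of DSPy; it only shows why a brace-balancing pattern cannot match the outermost object.

```python
text = '{\n  "reasoning": "think step by step",\n  "issue_list": [\n    {\n  "issue_type": "some type",\n      "severity_level": "fatal",\n  "problem_code_snippet": "if (user) {"\n    }\n  ]\n}'


def count_braces(s: str, skip_strings: bool) -> tuple[int, int]:
    """Return (opening, closing) brace counts.

    Hypothetical helper for illustration only. With skip_strings=True,
    characters inside double-quoted JSON strings are ignored.
    """
    opens = closes = 0
    in_string = False
    escaped = False
    for ch in s:
        if skip_strings and in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote of the string
            continue
        if skip_strings and ch == '"':
            in_string = True         # opening quote of a string
        elif ch == "{":
            opens += 1
        elif ch == "}":
            closes += 1
    return opens, closes


naive = count_braces(text, skip_strings=False)
aware = count_braces(text, skip_strings=True)
print(naive)  # (3, 2): the brace inside "if (user) {" is counted as structure
print(aware)  # (2, 2): balanced once string contents are skipped
```

With three opening braces and only two closing ones, the outermost object can never be balanced by a pattern that looks at braces alone.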

Steps to reproduce

test code

import regex  # third-party "regex" package, needed for the recursive (?R) syntax

text = '{\n  "reasoning": "think step by step",\n  "issue_list": [\n    {\n  "issue_type": "some type",\n      "severity_level": "fatal",\n  "problem_code_snippet": "if (user) {"\n    }\n  ]\n}'


pattern = r"\{(?:[^{}]|(?R))*\}"
match = regex.search(pattern, text, regex.DOTALL)
if match:
    text = match.group(0)

print(text)

output:

{
  "issue_type": "some type",
      "severity_level": "fatal",
  "problem_code_snippet": "if (user) {"
    }
  ]
}
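One possible workaround, sketched here as an assumption and not as DSPy's actual or planned fix, is to let a real JSON parser find the object boundaries instead of balancing braces with a regular expression. The standard library's `json.JSONDecoder.raw_decode` parses a value starting at a given index and understands string literals, so a brace inside a string does not affect it. The helper name `extract_first_json_object` is hypothetical.

```python
import json

text = '{\n  "reasoning": "think step by step",\n  "issue_list": [\n    {\n  "issue_type": "some type",\n      "severity_level": "fatal",\n  "problem_code_snippet": "if (user) {"\n    }\n  ]\n}'


def extract_first_json_object(s: str):
    """Return the first JSON object that decodes successfully, or None.

    Hypothetical helper: tries each "{" position in order and lets the
    JSON decoder decide where the object ends.
    """
    decoder = json.JSONDecoder()
    start = s.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(s, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object at this position; try the next "{".
            start = s.find("{", start + 1)
    return None


result = extract_first_json_object(text)
print(sorted(result))  # ['issue_list', 'reasoning']
print(result["issue_list"][0]["problem_code_snippet"])  # if (user) {
```

Because the decoder starts at the first opening brace, it also tolerates prose before or after the JSON object, which is the case the regular expression appears intended to handle.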

DSPy version

3.0.3
