Skip to content

Conversation

lovets18
Copy link
Contributor

Changed
Modified instruction for translating promtps.

Reason
I have encountered an error trying to use adapt_prompts for Faithfulness with LLM Gigachat:

ValueError: The number of statements in the output (4) does not match the number of statements in the input (1). Translation failed.

The instruction of the metric contains: "Break down each sentence into one or more fully understandable statements.". Model decided to execute this, not only translate, the translated prompt was splitted, so the shape mismatch occured

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR addresses a critical translation issue in the TranslateStatements prompt within ragas/src/ragas/prompt/pydantic_prompt.py. The problem occurred when using adapt_prompts for metrics like Faithfulness with certain LLMs (specifically Gigachat), where the translation process would fail due to instruction misinterpretation.

The core issue was that the original translation instruction was too brief and ambiguous. When translating prompts that contained instructions like "Break down each sentence into one or more fully understandable statements," the LLM would interpret these as commands to execute rather than text to translate. This caused the model to actually break down the translated text into multiple statements instead of maintaining a 1:1 mapping, triggering a validation error in the process_output method that expects the same number of input and output statements.

The fix replaces the original short instruction with a comprehensive, defensive prompt that explicitly establishes the role as a "TRANSLATOR, not an instruction executor." It includes critical rules that prevent instruction execution during translation and emphasizes maintaining the exact structure and count of statements. This change integrates with Ragas' existing prompt customization framework, where prompts are treated as hyperparameters that can be optimized for specific domains and LLMs.

Confidence score: 4/5

• This is a well-targeted fix for a specific translation failure scenario that should resolve the reported issue without breaking existing functionality.
• The score reflects high confidence in the fix's effectiveness, though there's minimal risk that the more verbose instruction could affect performance with some LLMs or edge cases in translation scenarios.
• The critical file ragas/src/ragas/prompt/pydantic_prompt.py should be tested thoroughly with various LLMs and prompt translation scenarios.

1 file reviewed, no comments

Edit Code Review Bot Settings | Greptile

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Jul 23, 2025
@anistark
Copy link
Contributor

Thanks for the PR @lovets18

Could you please rebase and fix the ci issues?

Try running make run-ci locally for detailed output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:S This PR changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants