Howie/validate sample by agent #44345
Conversation
Pull request overview
This PR refactors the sample testing infrastructure to validate sample outputs with an AI agent. The main change replaces pattern-based output validation with an LLM-powered validation approach using Azure OpenAI.
Key Changes:
- Converted `SampleExecutor` from a simple helper class to a decorator pattern with context manager support
- Introduced agent-based validation to check whether sample outputs indicate success or failure
- Moved environment variable mapping to a separate function and refactored the execution flow
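The decorator-plus-context-manager shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; the class body, attribute names, and the sample function are assumptions.

```python
import functools


class SampleExecutor:
    """Sketch: usable both as a decorator and as a context manager."""

    def __init__(self, sample_name):
        self.sample_name = sample_name
        self.captured = []  # print output collected during execution

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # In the real PR, output validation would be triggered here.
        return False  # do not suppress exceptions from the sample

    def __call__(self, func):
        # Decorator usage simply wraps the sample in the context manager.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)
        return wrapper


@SampleExecutor("sample_hello")
def run_sample():
    print("==> Result: ok")


run_sample()
```

Supporting both forms lets existing `with SampleExecutor(...)` call sites keep working while new tests opt into the shorter decorator syntax.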
Big change, don't forget to run 'black' tool. Thanks!
The existing code already collects print call content and validates whether the text after `==> Result` meets certain criteria. This change replaces that pattern-based validation: all print contents are submitted to `response.create` and validated by the AI.
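The agent-based check described above can be sketched like this. The function and stub names are assumptions; the real model call would go through `response.create`, which is stubbed out here so the example is self-contained.

```python
def validate_with_agent(print_lines, create_response):
    """Ask the model whether the captured stdout indicates success.

    create_response is a stand-in for the real response.create call.
    """
    prompt = (
        "Below is the stdout of a sample run. Reply with exactly PASS if it "
        "indicates success, otherwise FAIL.\n\n" + "\n".join(print_lines)
    )
    verdict = create_response(prompt).strip().upper()
    return verdict == "PASS"


# Stubbed model call for illustration only:
def fake_model(prompt):
    return "PASS" if "==> Result" in prompt and "error" not in prompt.lower() else "FAIL"


assert validate_with_agent(["==> Result: created agent"], fake_model)
assert not validate_with_agent(["Traceback: error"], fake_model)
```

The key design point is that the validator sees the full print output rather than matching a fixed pattern after `==> Result`, so samples can change their wording without breaking the test.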
The `response.create` call is recorded, but its input (the print contents) is sanitized. So if you modify a print statement in a sample, you don't need to re-record; the existing recording can still be replayed with the assertion passing.
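One way to picture the sanitization is a normalization pass over the recorded request body, so that edited print statements still match the recording on replay. The regex and the `"Sanitized"` placeholder below are assumptions for illustration, not the repository's actual sanitizer.

```python
import re


def sanitize_request_body(body):
    """Sketch: replace the request's input field with a fixed placeholder
    so recordings match regardless of the actual print contents."""
    return re.sub(r'"input":\s*".*?"', '"input": "Sanitized"', body)


recorded = sanitize_request_body('{"input": "==> Result: agent created"}')
edited = sanitize_request_body('{"input": "==> Result: agent created v2"}')
assert recorded == edited  # replay matches despite changed print output
```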
Also, when the response says the test failed, it is hard to see what the print calls contained, so I write that content to a temp file.
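Persisting the captured output on failure can be as small as the sketch below; the function name and filename prefix are assumptions.

```python
import os
import tempfile


def dump_output_for_debugging(print_lines):
    """On a FAIL verdict, persist the captured print output for inspection."""
    fd, path = tempfile.mkstemp(prefix="sample_output_", suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(print_lines))
    return path


path = dump_output_for_debugging(["==> Result: something went wrong"])
print(f"Captured output written to {path}")
```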