Improve multi-tool agent tests for robustness #44354
Pull request overview
This PR improves the robustness of multi-tool agent tests by ensuring that all configured tools are actually used and verified during test execution. Previously, some tests would only verify that an agent was created with multiple tools but wouldn't confirm the tools were actually invoked. The updated tests now use non-trivial calculations that require actual tool execution, add explicit assertions to verify function calls are made, and validate computed values in function arguments.
Key changes:
- Enhanced test assertions to verify actual tool usage rather than just agent creation
- Replaced trivial calculations with complex ones that require code execution (e.g., 17^4, fibonacci(15), averaging 15 sensor readings)
- Added function call verification and argument validation logic
- Replaced Unicode checkmarks with `[PASS]` for cross-platform compatibility
- Removed `test_four_tools_combination`, which only tested agent creation, not tool usage
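As a rough illustration of the function call verification and argument validation described above, the sketch below checks a list of response output items. The item shapes (plain dicts with `type`, `name`, and `arguments` keys), the `save_result` function name, and both helper functions are hypothetical stand-ins, not the SDK's actual objects or the PR's actual code.

```python
import json

# Hypothetical stand-in for the output items of an agent response.
# The real SDK returns typed objects; plain dicts are an assumption here.
sample_output = [
    {"type": "code_interpreter_call", "id": "ci_1"},
    {
        "type": "function_call",
        "name": "save_result",  # hypothetical function tool name
        "arguments": json.dumps({"calculation": "17^4", "result": 17 ** 4}),
    },
]


def find_function_calls(output_items, name):
    """Return the parsed JSON arguments of every function call named `name`."""
    return [
        json.loads(item["arguments"])
        for item in output_items
        if item.get("type") == "function_call" and item.get("name") == name
    ]


def tool_was_used(output_items, tool_type):
    """Return True if at least one output item has the given type."""
    return any(item.get("type") == tool_type for item in output_items)


# Verify that both tools were invoked, not merely configured on the agent.
assert tool_was_used(sample_output, "code_interpreter_call")
calls = find_function_calls(sample_output, "save_result")
assert len(calls) == 1, "expected exactly one save_result call"

# Validate the computed value passed in the function arguments: 17^4 = 83521.
assert calls[0]["result"] == 83521
print("[PASS] both tools used and computed value verified")
```

Asserting on the parsed arguments rather than on the agent's final text ties the check to the value the model actually computed and passed to the function.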
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| test_multitool_with_conversations.py | Replaced Unicode checkmarks with [PASS] for Windows compatibility |
| test_agent_file_search_code_interpreter_function.py | CRITICAL ISSUE: Added function call verification with JSON parsing, but the required imports are missing; removed test_four_tools_combination; enhanced test_complete_analysis_workflow to validate computed statistics |
| test_agent_file_search_and_function.py | Enhanced test_python_code_file_search to verify both File Search and Function Tool usage; replaced Unicode checkmarks with [PASS] |
| test_agent_file_search_and_code_interpreter.py | Enhanced both tests with non-trivial data requiring actual computation; added validation of calculated results, though it relies on brittle string matching |
| test_agent_code_interpreter_and_function.py | CRITICAL ISSUE: Added function call verification and argument validation, but the required imports are missing (json, ResponseInputParam, FunctionCallOutput) |
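One possible way to soften the brittle string matching flagged in the table is to extract numbers from the response text and compare them with a tolerance. This is only a sketch: `extract_numbers` and `contains_value` are hypothetical helpers, not part of the PR, and the sample strings are made up for illustration.

```python
import math
import re


def extract_numbers(text):
    """Return every number in `text` as a float, ignoring thousands separators."""
    # Assumption: commas only appear as thousands separators ("83,521").
    # A comma-separated list without spaces ("1,2") would be misread as 12.
    cleaned = text.replace(",", "")
    return [float(match) for match in re.findall(r"-?\d+(?:\.\d+)?", cleaned)]


def contains_value(text, expected, rel_tol=1e-3):
    """Return True if any number in `text` is within `rel_tol` of `expected`."""
    return any(
        math.isclose(number, expected, rel_tol=rel_tol)
        for number in extract_numbers(text)
    )


# Formatting differences no longer break the check.
assert contains_value("The result is 83,521.", 83521)
assert contains_value("17^4 equals 83521", 83521)
assert not contains_value("The result is 83520", 83521, rel_tol=1e-6)
```

The tolerance is a design choice: a loose `rel_tol` accepts rounded output such as two decimal places, while a tight one suits exact integer results.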
cbe7070 to 865b294
865b294 to f69f956
Description
Tests now properly verify that all configured tools are actually used.
Changes:
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines