Hotfix/concurrent load core sweep#60
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR introduces two new GitHub Actions workflows for testing and validating Ansible playbooks through matrix-driven concurrent load tests and syntax checks. It refactors the benchmark playbook to support both core-sweep and single-core execution paths, replaces task iteration with direct shell script invocation, and enhances variable tracking with provenance information. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
514a436 to
d170391
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@automation/test-execution/ansible/llm-benchmark-concurrent-load.yml`:
- Around line 51-60: The current "Validate required parameters are provided"
task checks only test_model and base_workload; add a check that ensures at least
one of core_sweep_counts or requested_cores is defined (e.g., add a condition
like "core_sweep_counts is defined or requested_cores is defined" to the
ansible.builtin.assert 'that' list) and update the fail_msg to mention that one
of these core configuration parameters must be provided so the playbook doesn't
silently skip all phase execution blocks when neither is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 93807736-594f-46f7-bb59-44448f1745b0
📒 Files selected for processing (6)
.github/workflows/concurrent-load-test-matrix.yml.github/workflows/playbook-syntax-check.ymlautomation/test-execution/ansible/llm-benchmark-concurrent-load.ymlautomation/test-execution/ansible/llm-core-sweep-auto.ymlautomation/test-execution/ansible/roles/vllm_server/tasks/start-llm.ymlhack/test-concurrent-load-params.sh
| - name: Validate required parameters are provided | ||
| ansible.builtin.assert: | ||
| that: | ||
| - test_model is defined | ||
| - base_workload is defined | ||
| fail_msg: | | ||
| Missing required parameters. Please provide: | ||
| -e "test_model=<model>" | ||
| -e "base_workload=<chat|rag|code|summarization|short_codegen>" | ||
|
|
There was a problem hiding this comment.
Missing validation for core configuration parameters.
The validation ensures test_model and base_workload are defined, but doesn't validate that at least one of core_sweep_counts or requested_cores is provided. If neither is specified, all phase execution blocks will be silently skipped due to their when conditions, resulting in no benchmarks running without any error message.
Proposed fix
- name: Validate required parameters are provided
ansible.builtin.assert:
that:
- test_model is defined
- base_workload is defined
+ - (core_sweep_counts is defined) or (requested_cores is defined)
fail_msg: |
Missing required parameters. Please provide:
-e "test_model=<model>"
-e "base_workload=<chat|rag|code|summarization|short_codegen>"
+ -e "core_sweep_counts=[16,32]" OR -e "requested_cores=16"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Validate required parameters are provided | |
| ansible.builtin.assert: | |
| that: | |
| - test_model is defined | |
| - base_workload is defined | |
| fail_msg: | | |
| Missing required parameters. Please provide: | |
| -e "test_model=<model>" | |
| -e "base_workload=<chat|rag|code|summarization|short_codegen>" | |
| - name: Validate required parameters are provided | |
| ansible.builtin.assert: | |
| that: | |
| - test_model is defined | |
| - base_workload is defined | |
| - (core_sweep_counts is defined) or (requested_cores is defined) | |
| fail_msg: | | |
| Missing required parameters. Please provide: | |
| -e "test_model=<model>" | |
| -e "base_workload=<chat|rag|code|summarization|short_codegen>" | |
| -e "core_sweep_counts=[16,32]" OR -e "requested_cores=16" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@automation/test-execution/ansible/llm-benchmark-concurrent-load.yml` around
lines 51 - 60, The current "Validate required parameters are provided" task
checks only test_model and base_workload; add a check that ensures at least one
of core_sweep_counts or requested_cores is defined (e.g., add a condition like
"core_sweep_counts is defined or requested_cores is defined" to the
ansible.builtin.assert 'that' list) and update the fail_msg to mention that one
of these core configuration parameters must be provided so the playbook doesn't
silently skip all phase execution blocks when neither is present.
… testing The concurrent load playbook was broken because it tried to pass core_sweep_counts to llm-benchmark-auto.yml, which doesn't accept it. Changes: - Split each phase into two separate playbook calls - Core sweep mode: calls llm-core-sweep-auto.yml with requested_cores_list - Single core mode: calls llm-benchmark-auto.yml with requested_cores - Prioritizes core_sweep_counts if both are provided - Applied to all 3 phases (baseline, realistic, production) Usage examples: # Core sweep (multiple cores) -e "core_sweep_counts=[16,32]" # Single core -e "requested_cores=16" Fixes the error: 'requested_cores is defined' assertion failed Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
When passing -e 'requested_cores_list=[16,32]' from the command line, Ansible treats it as a string not a list, causing loop errors. Added conversion task to parse string as YAML list if needed. Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Adds automated syntax checking for Ansible playbooks to catch errors early. Features: - Runs on PR and push to main - Validates syntax for all major playbooks - Uses dummy inventory to avoid needing real infrastructure - Tests common parameter combinations - Lists available tasks and tags for documentation This will catch issues like: - YAML syntax errors - Undefined variables in assertions - Missing required parameters - Type mismatches (string vs list) Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Tests 11 different parameter combinations to ensure playbook handles: - Single core vs core sweep modes - All workload types (chat, code, rag) - Phase skip combinations (phase 1, 2, 3) - Custom rate and duration parameters - Variable workload support Each test validates: - Syntax correctness - Parameter parsing (especially list vs string) - Task execution planning - Conditional logic This catches issues like: - String/list type mismatches - Missing conditional branches - Invalid parameter combinations - Undefined variable references Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Provides quick local validation of all concurrent load test configurations: - 11 different parameter combinations - Tests both single core and core sweep modes - Validates syntax and task planning - No infrastructure required (uses dummy inventory) Run with: cd automation/test-execution/ansible ./test-parameter-combinations.sh All 11 tests currently passing ✓ Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
The llm-core-sweep-auto.yml playbook was trying to use include_tasks with delegate_to to orchestrate tests across multiple host groups (dut, load_generator), which doesn't work in Ansible. Task files can't use import_playbook, and delegation with include_role has limitations. Instead, refactor to use the existing run-core-sweep.sh script which properly calls llm-benchmark-auto.yml multiple times. The script already handles all the orchestration and result collection. Also fix validation in llm-benchmark-concurrent-load.yml to use assert instead of mandatory filter, and update test script path resolution. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
d170391 to
c47755f
Compare
Summary
This PR enhances the concurrent load testing playbook with improved parameter handling, validation, and CI testing. The changes enable the playbook to work correctly in both single-core and core-sweep modes while adding comprehensive test coverage.
Key Changes
requested_coresandcore_sweep_countsparametersrun-core-sweep.shscripthack/test-concurrent-load-params.shfor local dry-run testingChanges
Fixed Issues
Recursive template loop (llm-benchmark-concurrent-load.yml:48-49)
test_modelandbase_workloadself-referential variable definitionsCore sweep vs single core mode handling
requested_cores_listtollm-core-sweep-auto.ymlfor sweep moderequested_corestollm-benchmark-auto.ymlfor single-core modeString list parsing
"[16,32]"and[16,32]input formatsNew Features
CI Testing Workflows
Local Testing Script
Files Changed
automation/test-execution/ansible/llm-benchmark-concurrent-load.yml- Core fixes and validationautomation/test-execution/ansible/llm-core-sweep-auto.yml- Improved results path handling, delegated executionautomation/test-execution/ansible/roles/vllm_server/tasks/start-llm.yml- KV cache normalization.github/workflows/concurrent-load-test-matrix.yml- New CI test matrix.github/workflows/playbook-syntax-check.yml- New syntax validationhack/test-concurrent-load-params.sh- New local testing scriptTest Plan
ansible-playbook ... -e "requested_cores=16"ansible-playbook ... -e "core_sweep_counts=[16,32]"-e "skip_phase_2=true" -e "skip_phase_3=true"-e "guidellm_max_seconds=100" -e "guidellm_rate=[1]"Breaking Changes
None - this is backward compatible with existing usage patterns.
Related Issues
base_workload=allsupport remains as future enhancementSummary by CodeRabbit
New Features
Chores