Hotfix/concurrent load core sweep by maryamtahhan · Pull Request #60 · redhat-et/vllm-cpu-perf-eval

maryamtahhan · 2026-03-23T10:04:08Z

Summary

This PR enhances the concurrent load testing playbook with improved parameter handling, validation, and CI testing. The changes enable the playbook to work correctly in both single-core and core-sweep modes while adding comprehensive test coverage.

Key Changes

- Fix recursive template error: Removed circular variable definitions that caused "maximum recursion depth exceeded" errors
- Unified parameter validation: Added proper validation for requested_cores and core_sweep_counts parameters
- CI test coverage: Added GitHub Actions workflows to test various parameter combinations
- Core sweep delegation: Refactored to properly delegate core sweep execution to the run-core-sweep.sh script
- Local testing support: Added hack/test-concurrent-load-params.sh for local dry-run testing
- AWX compatibility: Added detection for AWX execution environment with appropriate results path handling

Changes

Fixed Issues

- Recursive template loop (llm-benchmark-concurrent-load.yml:48-49)
  - Removed test_model and base_workload self-referential variable definitions
  - Added proper parameter validation in tasks instead
- Core sweep vs single core mode handling
  - Properly pass requested_cores_list to llm-core-sweep-auto.yml for sweep mode
  - Properly pass requested_cores to llm-benchmark-auto.yml for single-core mode
  - Added validation to require at least one parameter
- String list parsing
  - Convert string-formatted lists from command line to proper YAML lists
  - Handle both "[16,32]" and [16,32] input formats
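As a rough illustration of the string list parsing described above, a conversion task along these lines would accept both input forms. This is a sketch under assumptions, not the task from the PR: the task names and the `sweep_cores` fact are hypothetical. It writes to a new fact rather than back to `core_sweep_counts` because values passed with `-e` take precedence over `set_fact`, so overwriting the same name would have no visible effect.

```yaml
# Sketch only: task names and the sweep_cores fact are hypothetical.
- name: Normalize core_sweep_counts into a real list
  ansible.builtin.set_fact:
    # from_yaml turns the string "[16,32]" into the list [16, 32];
    # a value that is already a list is passed through unchanged.
    sweep_cores: >-
      {{ (core_sweep_counts | from_yaml)
         if core_sweep_counts is string
         else core_sweep_counts }}
  when: core_sweep_counts is defined

- name: Fail early if the value still is not a list
  ansible.builtin.assert:
    that:
      - sweep_cores is not string
      - sweep_cores is iterable
    fail_msg: 'core_sweep_counts must be a list such as [16,32]'
  when: core_sweep_counts is defined
```

The second task guards against inputs such as `16,32` (no brackets), which `from_yaml` would leave as a plain string.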

New Features

- CI Testing Workflows
  - .github/workflows/concurrent-load-test-matrix.yml - Tests multiple parameter combinations
  - .github/workflows/playbook-syntax-check.yml - Validates playbook syntax
- Local Testing Script
  - hack/test-concurrent-load-params.sh - Dry-run script for testing parameter combinations locally

Files Changed

- automation/test-execution/ansible/llm-benchmark-concurrent-load.yml - Core fixes and validation
- automation/test-execution/ansible/llm-core-sweep-auto.yml - Improved results path handling, delegated execution
- automation/test-execution/ansible/roles/vllm_server/tasks/start-llm.yml - KV cache normalization
- .github/workflows/concurrent-load-test-matrix.yml - New CI test matrix
- .github/workflows/playbook-syntax-check.yml - New syntax validation
- hack/test-concurrent-load-params.sh - New local testing script

Test Plan

- Test single core mode: `ansible-playbook ... -e "requested_cores=16"`
- Test core sweep mode: `ansible-playbook ... -e "core_sweep_counts=[16,32]"`
- Test with phase skipping: `-e "skip_phase_2=true" -e "skip_phase_3=true"`
- Test with GuideLLM parameters: `-e "guidellm_max_seconds=100" -e "guidellm_rate=[1]"`
- Verify CI workflows pass
- Test local dry-run script works

Breaking Changes

None - this is backward compatible with existing usage patterns.

Related Issues

Fixes the recursive template error reported in testing
Partially addresses #61 ("Support base_workload=all to run all workload types"); base_workload=all support remains a future enhancement

Summary by CodeRabbit

New Features
- Added automated concurrent load testing with parameterized test matrix execution.
- Enhanced test validation with syntax-checking workflows for benchmark playbooks.
Chores
- Improved testing infrastructure with new validation scripts and CI/CD workflows for benchmark automation.
- Refactored test execution logic for better parameter handling and core-sweep testing.

coderabbitai · 2026-03-23T10:04:15Z

Review skipped: draft detected.


Walkthrough

This PR introduces two new GitHub Actions workflows for testing and validating Ansible playbooks through matrix-driven concurrent load tests and syntax checks. It refactors the benchmark playbook to support both core-sweep and single-core execution paths, replaces task iteration with direct shell script invocation, and enhances variable tracking with provenance information.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GitHub Actions Workflows: `.github/workflows/concurrent-load-test-matrix.yml`, `.github/workflows/playbook-syntax-check.yml` | New CI/CD workflows for testing Ansible playbooks via matrix runs and syntax validation. The concurrent-load-test workflow iterates through parameter combinations with `--syntax-check` and `--list-tasks`, while playbook-syntax-check validates specific playbooks and performs introspection on tags and tasks. |
| Playbook Core Logic: `automation/test-execution/ansible/llm-benchmark-concurrent-load.yml` | Refactored phase execution to support dual paths: core-sweep mode (via `llm-core-sweep-auto.yml`) when `core_sweep_counts` is defined, and single-core mode (via `llm-benchmark-auto.yml`) otherwise. Added explicit parameter validation using `ansible.builtin.assert` for `test_model` and `base_workload`. |
| Core Sweep Execution: `automation/test-execution/ansible/llm-core-sweep-auto.yml` | Restructured iteration mechanism: replaced `include_tasks` with direct shell script invocation (`run-core-sweep.sh`), added string-to-list normalization for `requested_cores_list`, and removed result collection play (now delegated to the shell script). Updated debug output to report results collection completion. |
| Variable Tracking Enhancement: `automation/test-execution/ansible/roles/vllm_server/tasks/start-llm.yml` | Added normalization for `kv_cache_override` and introduced provenance tracking with `model_kv_cache_source` and `model_dtype_source` facts to track which configuration path was taken. Updated KV cache and dtype assignment logic accordingly. |
| Testing Helper Script: `hack/test-concurrent-load-params.sh` | New Bash utility for validating concurrent load test parameter combinations, executing `--syntax-check` and `--list-tasks` for each named test case and reporting pass/fail summary. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Hotfix/concurrent load core sweep' is vague and uses a format (Hotfix/) that describes the branch type rather than the actual changeset. It does not clearly convey the primary changes: refactoring playbook logic, fixing parameter validation, and enabling core-sweep testing modes. | Use a more descriptive title that summarizes the main technical change, such as 'Refactor concurrent load playbook to support core-sweep and single-core modes' or 'Fix concurrent load parameter validation and enable core-sweep execution'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@automation/test-execution/ansible/llm-benchmark-concurrent-load.yml`:
- Around line 51-60: The current "Validate required parameters are provided"
task checks only test_model and base_workload; add a check that ensures at least
one of core_sweep_counts or requested_cores is defined (e.g., add a condition
like "core_sweep_counts is defined or requested_cores is defined" to the
ansible.builtin.assert 'that' list) and update the fail_msg to mention that one
of these core configuration parameters must be provided so the playbook doesn't
silently skip all phase execution blocks when neither is present.


📥 Commits

Reviewing files that changed from the base of the PR and between 3072070 and d170391.

📒 Files selected for processing (6)

.github/workflows/concurrent-load-test-matrix.yml
.github/workflows/playbook-syntax-check.yml
automation/test-execution/ansible/llm-benchmark-concurrent-load.yml
automation/test-execution/ansible/llm-core-sweep-auto.yml
automation/test-execution/ansible/roles/vllm_server/tasks/start-llm.yml
hack/test-concurrent-load-params.sh

coderabbitai · 2026-03-23T11:04:16Z

+    - name: Validate required parameters are provided
+      ansible.builtin.assert:
+        that:
+          - test_model is defined
+          - base_workload is defined
+        fail_msg: |
+          Missing required parameters. Please provide:
+          -e "test_model=<model>"
+          -e "base_workload=<chat|rag|code|summarization|short_codegen>"
+


⚠️ Potential issue | 🟠 Major

Missing validation for core configuration parameters.

The validation ensures test_model and base_workload are defined, but doesn't validate that at least one of core_sweep_counts or requested_cores is provided. If neither is specified, all phase execution blocks will be silently skipped due to their when conditions, resulting in no benchmarks running without any error message.

Proposed fix

```diff
     - name: Validate required parameters are provided
       ansible.builtin.assert:
         that:
           - test_model is defined
           - base_workload is defined
+          - (core_sweep_counts is defined) or (requested_cores is defined)
         fail_msg: |
           Missing required parameters. Please provide:
           -e "test_model=<model>"
           -e "base_workload=<chat|rag|code|summarization|short_codegen>"
+          -e "core_sweep_counts=[16,32]" OR -e "requested_cores=16"
```


… testing

The concurrent load playbook was broken because it tried to pass core_sweep_counts to llm-benchmark-auto.yml, which doesn't accept it.

Changes:
- Split each phase into two separate playbook calls
- Core sweep mode: calls llm-core-sweep-auto.yml with requested_cores_list
- Single core mode: calls llm-benchmark-auto.yml with requested_cores
- Prioritizes core_sweep_counts if both are provided
- Applied to all 3 phases (baseline, realistic, production)

Usage examples:

    # Core sweep (multiple cores)
    -e "core_sweep_counts=[16,32]"
    # Single core
    -e "requested_cores=16"

Fixes the error: 'requested_cores is defined' assertion failed

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
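The split into two playbook calls per phase could look roughly like the following for one phase. This is a sketch based only on the commit message above, not the PR's actual code: the play names are hypothetical, and the real playbook presumably also honors the phase-skip variables.

```yaml
# Sketch only: play names are hypothetical; phase-skip handling is omitted.
- name: Phase 1 baseline (core sweep mode)
  ansible.builtin.import_playbook: llm-core-sweep-auto.yml
  vars:
    # The sweep playbook expects the list under a different name.
    requested_cores_list: "{{ core_sweep_counts }}"
  when: core_sweep_counts is defined

- name: Phase 1 baseline (single core mode)
  ansible.builtin.import_playbook: llm-benchmark-auto.yml
  when:
    # core_sweep_counts takes priority when both parameters are given.
    - core_sweep_counts is not defined
    - requested_cores is defined
```

Because the two `when` conditions are mutually exclusive, at most one of the calls runs per phase. If neither parameter is supplied, both are skipped, which is why an up-front assertion on these parameters is useful.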

When passing -e 'requested_cores_list=[16,32]' from the command line, Ansible treats it as a string, not a list, causing loop errors. Added a conversion task to parse the string as a YAML list if needed.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

Adds automated syntax checking for Ansible playbooks to catch errors early.

Features:
- Runs on PR and push to main
- Validates syntax for all major playbooks
- Uses dummy inventory to avoid needing real infrastructure
- Tests common parameter combinations
- Lists available tasks and tags for documentation

This will catch issues like:
- YAML syntax errors
- Undefined variables in assertions
- Missing required parameters
- Type mismatches (string vs list)

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
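For orientation, a workflow of the kind described in this commit might be shaped like the sketch below. It is an illustration under assumptions rather than the contents of playbook-syntax-check.yml: the workflow and job names, the Python version, the `dummy-model` value, and the choice of `ansible-core` are all placeholders, and the real playbooks may need additional collections installed before they parse.

```yaml
# Sketch only: names, versions, and parameter values are placeholders.
name: Playbook syntax check
on:
  pull_request:
  push:
    branches: [main]

jobs:
  syntax-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install Ansible
        run: pip install ansible-core
      - name: Syntax-check the concurrent load playbook
        working-directory: automation/test-execution/ansible
        # "localhost," is an inline inventory, so no real hosts are needed.
        run: |
          ansible-playbook -i "localhost," --syntax-check \
            llm-benchmark-concurrent-load.yml \
            -e "test_model=dummy-model" \
            -e "base_workload=chat" \
            -e "requested_cores=16"
```

Note that `--syntax-check` parses the playbooks without executing any tasks, so checks that depend on runtime values, such as `assert` conditions, are not exercised by this step.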

Tests 11 different parameter combinations to ensure playbook handles:
- Single core vs core sweep modes
- All workload types (chat, code, rag)
- Phase skip combinations (phase 1, 2, 3)
- Custom rate and duration parameters
- Variable workload support

Each test validates:
- Syntax correctness
- Parameter parsing (especially list vs string)
- Task execution planning
- Conditional logic

This catches issues like:
- String/list type mismatches
- Missing conditional branches
- Invalid parameter combinations
- Undefined variable references

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

Provides quick local validation of all concurrent load test configurations:
- 11 different parameter combinations
- Tests both single core and core sweep modes
- Validates syntax and task planning
- No infrastructure required (uses dummy inventory)

Run with:

    cd automation/test-execution/ansible
    ./test-parameter-combinations.sh

All 11 tests currently passing ✓

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

The llm-core-sweep-auto.yml playbook was trying to use include_tasks with delegate_to to orchestrate tests across multiple host groups (dut, load_generator), which doesn't work in Ansible. Task files can't use import_playbook, and delegation with include_role has limitations.

Instead, refactor to use the existing run-core-sweep.sh script which properly calls llm-benchmark-auto.yml multiple times. The script already handles all the orchestration and result collection.

Also fix validation in llm-benchmark-concurrent-load.yml to use assert instead of mandatory filter, and update test script path resolution.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

maryamtahhan force-pushed the hotfix/concurrent-load-core-sweep branch from 514a436 to d170391 Compare March 23, 2026 10:54

maryamtahhan marked this pull request as ready for review March 23, 2026 10:58

maryamtahhan marked this pull request as draft March 23, 2026 11:00

coderabbitai Bot reviewed Mar 23, 2026


maryamtahhan and others added 6 commits March 31, 2026 16:25

maryamtahhan force-pushed the hotfix/concurrent-load-core-sweep branch from d170391 to c47755f Compare April 1, 2026 11:26

maryamtahhan wants to merge 6 commits into redhat-et:main from maryamtahhan:hotfix/concurrent-load-core-sweep
