Add post-processing pipeline for exporting benchmark results to OpenSearch #334

acalhounRH wants to merge 7 commits into main from
Conversation
PullHero Review
Provider: generativelanguage.googleapis.com | Model: gemini-2.0-flash

Okay, I've reviewed the changes for the post-processing functionality being added to Zathras. This is a significant addition, enabling automated results analysis and export to external systems like OpenSearch and Horreum. The changes are extensive, so I'll focus on the overall design, key implementation details, and potential areas for improvement.

High-Level Assessment

The changes introduce a well-structured post-processing pipeline with clear separation of concerns. The use of abstract base classes and benchmark-specific processors promotes modularity and extensibility. The addition of exporters enables integration with external data warehouses. The attention to detail regarding schema design, duplicate prevention, and error handling is commendable.

Key Strengths
Areas for Improvement
Specific Feedback
Vote: 0
PullHero Review
Provider: generativelanguage.googleapis.com | Model: gemini-2.0-flash

Okay, I've reviewed the changes for the post-processing module. This is a significant addition to the Zathras framework, enabling automated extraction, transformation, and export of benchmark results to OpenSearch and Horreum. The changes include new files for schema definitions, result processors for various benchmarks, and exporters for OpenSearch and Horreum.

General Observations
Specific Feedback

1. .gitignore Files
2. post_processing/README.md
3. post_processing/config/apply_opensearch_templates.sh
4. post_processing/exporters/opensearch_exporter.py
5. post_processing/processors/base_processor.py
6. post_processing/processors/pyperf_processor.py
7. General
Code Snippets with Suggested Changes

````diff
--- a/post_processing/README.md
+++ b/post_processing/README.md
@@ -10,6 +10,8 @@
 Process entire result directories automatically:
+**Note: Before running, ensure you have Zathras installed on your controller system (see [Installation](../README.md#installation))**
+
 ```bash
 # 1. Configure credentials
 cp post_processing/config/export_config_example.yml post_processing/config/export_config.yml
@@ -20,6 +22,8 @@
   --input /path/to/results \
   --config post_processing/config/export_config.yml \
   --opensearch
+
+# See [Documentation](docs/post_processing/README.md) for detailed instructions
 # Or just generate JSON files
 python3 -
````
**Vote**: +1
**[PullHero](https://github.com/pullhero/)**
PullHero Review
Provider: generativelanguage.googleapis.com | Model: gemini-2.0-flash

Okay, I've reviewed the changes and here's my assessment.

Overview

This PR introduces a comprehensive post-processing framework for Zathras benchmark results. It includes new modules for:
This is a significant addition to Zathras, enabling centralized analysis and long-term performance tracking.

Code Quality and Maintainability
Potential Bugs and Security Issues
Adherence to Project Conventions
Documentation Completeness
Specific Feedback
Improvements and suggestions
Conclusion

This is an excellent addition to Zathras. The code is

Vote: 0
If we have multiple runs of the same benchmark (it does happen), which run will this use? It should be the last run of the benchmark.
```python
req = urllib.request.Request(
    url,
    data=request_data,
    headers=headers,
    method=method
)
```
I think @acalhounRH had said one of his goals was to make this data sink independent, such that we could switch to e.g. Horreum with relatively minimal interruption.
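Sink independence usually comes down to a small abstract interface that both exporters implement. A rough sketch of what that could look like (the class and method names here are hypothetical, not the PR's actual code):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ResultExporter(ABC):
    """Common interface so the pipeline can swap data sinks
    (OpenSearch, Horreum, ...) without touching the processors."""

    @abstractmethod
    def export(self, documents: List[Dict[str, Any]]) -> int:
        """Send documents to the backing store; return the count exported."""


class InMemoryExporter(ResultExporter):
    """Trivial sink used here only to demonstrate the interface."""

    def __init__(self) -> None:
        self.store: List[Dict[str, Any]] = []

    def export(self, documents: List[Dict[str, Any]]) -> int:
        self.store.extend(documents)
        return len(documents)


exporter: ResultExporter = InMemoryExporter()
count = exporter.export([{"benchmark": "coremark", "score": 12345.6}])
print(count)  # 1
```

With that in place, the driver would only ever hold a `ResultExporter` reference, and switching to Horreum would mean adding one subclass rather than changing call sites.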
Archive Handler Utility

Handles extraction of Zathras result archives:

- results_{test}.zip → results_{test}_.tar → test result files
@dvalinrh did we go back to a regular tarball instead of a zip -> tar -> test results?
```python
if field == 'Architecture':
    cpu_info['architecture'] = data
elif field == 'Vendor ID':
    cpu_info['vendor'] = data
elif field == 'Model name':
    cpu_info['model'] = data
elif field == 'CPU(s)':
    cpu_info['cores'] = int(data) if data.isdigit() else None
```
We may want to consider how to simplify this... cpu_info[field] = data would be nice if possible and reduce the length of this if/elif chain.
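One way to shorten the chain while keeping the key renames and the `int` conversion is a lookup table mapping lscpu field names to (key, converter) pairs. Purely illustrative; `FIELD_MAP` and `parse_lscpu_line` are hypothetical names, not from the PR:

```python
# Map lscpu field names to (cpu_info key, value converter).
FIELD_MAP = {
    'Architecture': ('architecture', str),
    'Vendor ID':    ('vendor', str),
    'Model name':   ('model', str),
    'CPU(s)':       ('cores', lambda d: int(d) if d.isdigit() else None),
}


def parse_lscpu_line(cpu_info, field, data):
    """Store one lscpu field into cpu_info, if we care about it."""
    if field in FIELD_MAP:
        key, convert = FIELD_MAP[field]
        cpu_info[key] = convert(data)


cpu_info = {}
parse_lscpu_line(cpu_info, 'Architecture', 'x86_64')
parse_lscpu_line(cpu_info, 'CPU(s)', '8')
print(cpu_info)  # {'architecture': 'x86_64', 'cores': 8}
```

A plain `cpu_info[field] = data` would be even shorter, but loses the renaming (`CPU(s)` → `cores`) and type conversion, so the table is a middle ground.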
```python
if 'vendor_id' in proc_data:
    cpu_info['vendor'] = proc_data['vendor_id']
if 'model_name' in proc_data:
    cpu_info['model'] = proc_data['model_name']
if 'cpu_cores' in proc_data:
    cpu_info['cores'] = int(proc_data['cpu_cores'])
```
```python
# Convert to GB
if unit == 'K':
    return round(value / 1024 / 1024)
elif unit == 'M':
    return round(value / 1024)
elif unit == 'G':
    return round(value)
elif unit == 'T':
    return round(value * 1024)
```
Are we getting IEC base-2 amounts or SI base-10 amounts?
Mixing and matching them would probably be close enough:tm:, but could cause headaches.
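One way to avoid that ambiguity is to make the base an explicit parameter of the converter. A hedged sketch, not the PR's code; the names (`_IEC`, `_SI`, `to_gb`) are illustrative:

```python
# Unit prefix factors for the two conventions.
_IEC = {'K': 1024**1, 'M': 1024**2, 'G': 1024**3, 'T': 1024**4}
_SI = {'K': 1000**1, 'M': 1000**2, 'G': 1000**3, 'T': 1000**4}


def to_gb(value: float, unit: str, iec: bool = True) -> int:
    """Convert a value with unit prefix to whole GiB (iec=True) or GB."""
    factors = _IEC if iec else _SI
    giga = 1024**3 if iec else 1000**3
    return round(value * factors[unit] / giga)


print(to_gb(16384, 'M'))             # 16 (16384 MiB -> 16 GiB)
print(to_gb(16000, 'M', iec=False))  # 16 (16000 MB -> 16 GB)
```

Since `/proc/meminfo` reports KiB while cloud instance specs are usually quoted in decimal GB, keeping the choice explicit at each call site would document which convention each data source uses.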
Let's move some discussion here. I think that putting the extraction scripts for each benchmark in Zathras is against how we have this set up. If a wrapper gets out of sync with Zathras (in either direction), that can cause problems. This is not a simple change, since the wrappers get downloaded to the remote system and the controller never sees or cares about the wrappers presently. Some discussion around this would be helpful.
Force-pushed ec544d8 to c51fc7c (Compare)
This changeset includes thousands of test results under alex_95/ - can we remove this directory from the PR?
…earch and Horreum

- Implement 12 benchmark processors (CoreMark, STREAMS, FIO, SPEC CPU 2017, etc.)
- Object-based schema with full metadata extraction (CPU, memory, NUMA, OS config)
- Two-index design: zathras-results (summary) + zathras-timeseries (individual points)
- Content-based deduplication using SHA256 hashing
- OpenSearch and Horreum exporters with automatic retry logic
- Batch processing with recursive directory discovery
- Zero flake8 errors (1,340 issues fixed)

Tested with 34 Azure instances, 78 benchmarks successfully exported.

AI-assisted implementation using Claude Sonnet 4.5 (Anthropic) via Cursor IDE.
Force-pushed c51fc7c to 97691e1 (Compare)
The processing will treat each run separately, with common metadata to support the association, so all runs will (should) be exported to the data store.
Code Review: Post-Processing Pipeline

Overview

This PR adds a significant post-processing pipeline (~10,500 lines of code across 34 files) for exporting Zathras benchmark results to OpenSearch and Horreum. The implementation is well-structured and follows good Python practices.

Summary Assessment
Strengths1. Well-Designed Architecture
2. Excellent Modularity
3. Good Error Handling
4. Thoughtful Design Decisions
Issues & Recommendations

Security Concerns

1. SSL Certificate Verification Disabled by Default

File:

```yaml
verify_ssl: false  # Set to true for production
```

Risk: Users may copy this example and leave SSL verification disabled, exposing credentials to MITM attacks.

Recommendation: Default to:

```yaml
verify_ssl: true  # Only set to false for development with self-signed certs
```

2. Unverified SSL Context Creation

File:

```python
if not self.verify_ssl and url.startswith('https'):
    context = ssl._create_unverified_context()
```

Risk: Using

Recommendation: Consider logging a warning when SSL verification is disabled:

```python
if not self.verify_ssl:
    logger.warning("SSL verification disabled - credentials may be exposed")
```

3. Archive Extraction Without Path Validation

File:

```python
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(temp_dir)
```

Risk: Zip Slip vulnerability - malicious archives could write files outside the temp directory using

Recommendation: Validate extracted paths:

```python
for member in zip_ref.namelist():
    member_path = os.path.join(temp_dir, member)
    if not os.path.commonpath([temp_dir, member_path]) == temp_dir:
        raise ArchiveExtractionError(f"Path traversal detected: {member}")
```

Code Style Issues

1. Unused Import in requirements.txt

File:

Issue: The code uses

2. Hardcoded Placeholder Timestamp

File:

```python
base_time = datetime(2025, 11, 6, 5, 9, 45)  # Placeholder
```

Recommendation: Use current time or document the limitation more clearly.

3. Magic Numbers

File:

```python
"limit": 5000  # Higher limit for dynamic fields
```

Recommendation: Define as a constant at module level for clarity.

Modularity Improvements

1. Duplicate HTTP Logic

Both

Recommendation: Extract to a common

2. Statistics Calculation Duplication

Similar statistics calculations appear in multiple processors (

Recommendation: Add helper function in

```python
def calculate_summary_stats(values: List[float]) -> TimeSeriesSummary:
    ...
```

Best Practices

1. Missing
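The shared statistics helper suggested above could take a shape like the following; this sketch returns a plain dict rather than the PR's `TimeSeriesSummary` schema type, and the field names are illustrative:

```python
import statistics
from typing import Dict, List


def calculate_summary_stats(values: List[float]) -> Dict[str, float]:
    """Min/max/mean/stddev summary shared across processors.

    The real implementation might build a TimeSeriesSummary schema
    object instead of a dict (hypothetical here).
    """
    if not values:
        raise ValueError("cannot summarise an empty series")
    return {
        'min': min(values),
        'max': max(values),
        'mean': statistics.fmean(values),
        'stddev': statistics.stdev(values) if len(values) > 1 else 0.0,
        'count': len(values),
    }


stats = calculate_summary_stats([1.0, 2.0, 3.0, 4.0])
print(stats['mean'], stats['count'])  # 2.5 4
```

Centralising this also guarantees every processor handles the empty and single-point edge cases the same way.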
```python
duration = summary.get('total_time_secs') if summary else None

# Estimate timestamps (we don't have exact timestamps in CoreMark)
base_time = datetime(2025, 11, 6, 5, 9, 45)  # Placeholder
```
Hardcoded timestamps will cause problems.
- Parse CoreMark results CSV with comma delimiter and optional comment lines (skip_comments in parser_utils.parse_csv_timeseries).
- Use Start_Date/End_Date from the CSV for run and time-series timestamps when present.
- Require valid timestamps: raise ProcessorError instead of synthetic timestamps when the CSV lacks Start_Date/End_Date or when a row has missing/invalid Start_Date.
- Drop legacy colon-delimited, no-timestamps format support.

- Parse Start_Date and End_Date from comma-delimited results_streams.csv and use them for run timeseries instead of datetime.now().
- Require the CSV header to include Start_Date and End_Date as the last two columns; raise ProcessorError with a clear message when timestamps are missing, empty, or not valid ISO 8601.
- Support direct CSV path via files['results_streams_csv'] and allow results_streams.csv in extracted_path (no streams_* subdir required), matching the pattern used under tmp/coremark.

- Use timestamps from benchmark output: require comma-delimited CSV with Start_Date and End_Date (ISO 8601); use them for run start_time/end_time and a single timeseries point. No datetime.now()/utcnow().
- Require timestamps and fail clearly: when timestamps are missing or malformed, raise ProcessorError with a short message (expected format T/V,N,NB,P,Q,Time,Gflops,Start_Date,End_Date).
- Add ISO 8601 validation helper and reject legacy colon-delimited format without timestamps.

- Use timestamps from benchmark output: require comma-delimited CSV with Start_Date and End_Date (ISO 8601). Use run-level start/end from the results and per-workload timestamps from each row's Start_Date for timeseries. No datetime.now()/utcnow().
- Require timestamps and fail clearly: when timestamps are missing from the header or malformed in any row, raise ProcessorError with a short, descriptive message (expected format Test,Multi_iterations,Single_iterations,Scaling,Start_Date,End_Date). Add ISO 8601 validation helper; reject legacy colon-delimited format without timestamps.
- Match tmp/coremark path layout: allow direct results path via extracted_result['files']['results_csv'] when provided (demos pass single file path). When using extracted_path, allow results.csv directly in that directory or in a coremark_pro_* subdirectory.
- Demo: add tmp/coremark_pro/ CSVs (valid, no_timestamps, invalid and empty timestamp rows) and post_processing/demo_coremark_pro_timestamps.py using the same files-based extracted_result pattern as the CoreMark demo; run valid (success), no timestamps (ProcessorError), invalid and empty timestamp (ProcessorError) cases with short summary and error message output.

…d path

- Use timestamps from benchmark output: parse CSV with Start_Date/End_Date (ISO 8601) per row and use them for timeseries points instead of datetime.now().
- Require timestamps and fail clearly: when CSV lacks Start_Date/End_Date or any value is missing/malformed, raise ProcessorError with a short message (expected format and what went wrong). Reject legacy colon-delimited Warehouses:Bops format without timestamps.
- Add _validate_iso8601_timestamp() and _ISO8601_PATTERN for validation.
- Allow direct results path via extracted_result['files']['results_specjbb_csv'] when provided; when using extracted_path, look for results_specjbb.csv in that directory or via _find_csv_file (rglob).

Assisted-by: Cursor

…path

- Use timestamps from benchmark output: parse CSV with Start_Date/End_Date (ISO 8601) per row for single-CSV path; for net_results layout use run_metadata.csv run-level Start_Date/End_Date and interpolate timestamps for each data point. No datetime.now()/utcnow().
- Require timestamps and fail clearly: when CSV lacks Start_Date/End_Date or any value is missing/malformed, raise ProcessorError with a short message (expected format and what went wrong).
- Add _validate_iso8601_timestamp() and _ISO8601_PATTERN for validation.
- Allow direct results path via extracted_result['files']['results_uperf_csv'] when provided. When using extracted_path, look for results_uperf.csv in that directory or in a uperf_* subdir; if not found, use net_results/ and require run_metadata.csv with Start_Date,End_Date.

Assisted-by: Cursor
Summary
This PR adds a post-processing pipeline that converts Zathras benchmark results into structured JSON documents and exports them to OpenSearch for analysis and performance tracking.
What's Included
12 Benchmark Processors
Key Features
- `zathras-results` (summary) + `zathras-timeseries` (individual points)

Usage
```bash
python3 -m post_processing.run_postprocessing \
    --input /path/to/results/ \
    --config export_config.yml \
    --opensearch
```
Files Changed
`run_postprocessing.py`, `schema.py`, 12 processors; new feature addition only.
Dependencies
- pyyaml>=6.0
- opensearch-py>=2.0.0
- requests>=2.28.0
- python-dateutil>=2.8.0

AI Assistance
This PR was developed with AI assistance using Claude Sonnet 4.5 (Anthropic) via Cursor IDE. The implementation was collaboratively developed through iterative design discussions, code generation, testing, and refinement.