Add special codegen outline benchmark #16
Draft
Overview
This PR implements the special codegen benchmark requested in CG-18529. The benchmark uses the provided codegen.com Dockerfile as a base image and tests the outline repository workflow with E2B and Daytona providers.
Features Added
🚀 New CLI Command
🔧 Comprehensive Benchmark Workflow
https://github.com/codegen-sh/outline.git
📊 Rich Reporting
🐳 Custom Dockerfile
benchmarks/dockerfiles/codegen-base.dockerfile with the provided Dockerfile
📚 Documentation
benchmarks/CODEGEN_BENCHMARK.md
Usage Examples
Key Implementation Details
Files Added/Modified
New Files
grainchain/cli/codegen_benchmark.py - Core benchmark implementation
benchmarks/dockerfiles/codegen-base.dockerfile - Codegen base image
benchmarks/configs/codegen_outline.json - Benchmark configuration
benchmarks/CODEGEN_BENCHMARK.md - Comprehensive documentation
Modified Files
grainchain/cli/main.py - Added --codegen flag
grainchain/cli/benchmark.py - Added codegen benchmark routing
README.md - Updated with new benchmark examples
Testing
✅ CLI Integration: New --codegen flag works correctly
✅ Backward Compatibility: Existing benchmarks still work
✅ Error Handling: Graceful failure when API keys are missing
✅ Report Generation: JSON and Markdown reports generated correctly
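The error-handling and report checks above can be illustrated with a minimal sketch. It is not the code in this PR: the function names `available_providers` and `render_reports`, the `REQUIRED_KEYS` mapping, and the report layout are all assumptions made for illustration. Only the two environment variable names come from the Requirements section.

```python
import json
import os

# Assumed mapping of provider name to the environment variable holding its key.
REQUIRED_KEYS = {"e2b": "E2B_API_KEY", "daytona": "DAYTONA_API_KEY"}


def available_providers(env=None):
    """Split providers into those with an API key set and those without."""
    env = os.environ if env is None else env
    ready, missing = [], []
    for provider, var in REQUIRED_KEYS.items():
        (ready if env.get(var) else missing).append(provider)
    return ready, missing


def render_reports(ready, missing):
    """Return the same summary as a JSON string and a Markdown string."""
    summary = {"ready": ready, "skipped": missing}
    lines = ["# Codegen benchmark providers", ""]
    lines += [f"- {p}: ready" for p in ready]
    lines += [f"- {p}: skipped ({REQUIRED_KEYS[p]} not set)" for p in missing]
    return json.dumps(summary, indent=2), "\n".join(lines)


if __name__ == "__main__":
    ready, missing = available_providers()
    json_report, md_report = render_reports(ready, missing)
    print(md_report)
    # Fail with a readable message instead of a traceback when no key is set.
    if not ready:
        raise SystemExit("No provider API keys found; nothing to benchmark.")
```

Passing the environment as a parameter keeps the check testable without touching real credentials. A provider with a missing key is reported as skipped rather than raising an exception.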
Requirements
E2B_API_KEY environment variable
DAYTONA_API_KEY environment variable
Future Enhancements
This implementation provides a solid foundation for:
Resolves CG-18529