codegen-sh bot commented May 31, 2025

Overview

This PR implements the special codegen benchmark requested in CG-18529. The benchmark uses the provided codegen.com Dockerfile as a base image and tests the outline repository workflow with E2B and Daytona providers.

Features Added

🚀 New CLI Command

grainchain benchmark --codegen outline

🔧 Comprehensive Benchmark Workflow

  1. Clone outline repository from https://github.com/codegen-sh/outline.git
  2. Make a trivial modification (append a timestamped comment to README.md)
  3. Create snapshots (if supported by provider)
  4. Verify modifications persist
  5. Test snapshot reboot capabilities
  6. Compare E2B vs Daytona performance
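
The six steps above amount to an ordered list of timed actions in which an unsupported step (such as snapshotting) is skipped rather than treated as a failure. The following is a minimal sketch of that pattern, not the code in grainchain/cli/codegen_benchmark.py; `StepNotSupported`, `StepResult`, and `run_workflow` are hypothetical names introduced for illustration, and the step callables are assumed to wrap whatever the provider SDK offers.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class StepNotSupported(Exception):
    """Hypothetical signal: the current provider cannot perform this step."""


@dataclass
class StepResult:
    name: str
    status: str  # "ok", "skipped", or "failed"
    seconds: float
    detail: Optional[str] = None


def run_workflow(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run steps in order, timing each one.

    An unsupported step is recorded as "skipped" and the workflow continues;
    any other exception is recorded as "failed" and stops the workflow.
    """
    results: List[StepResult] = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            status, detail = "ok", None
        except StepNotSupported as exc:
            status, detail = "skipped", str(exc)
        except Exception as exc:  # any other error ends this provider's run
            status, detail = "failed", str(exc)
        results.append(StepResult(name, status, time.perf_counter() - start, detail))
        if status == "failed":
            break
    return results
```

Running the same step list once per provider yields two result lists that can be compared step by step, which is one way to approach the E2B vs Daytona comparison in step 6.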

📊 Rich Reporting

  • JSON results with detailed metrics
  • Markdown reports with executive summaries
  • Console output with real-time progress
  • Performance comparisons between providers
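
As a rough illustration of the JSON and Markdown outputs listed above, the sketch below writes both from a single results dictionary. The file names, the `{provider: {step: seconds}}` shape, and the functions `render_markdown` and `write_reports` are assumptions made for this example; the actual report layout produced by the PR may differ.

```python
import json
from pathlib import Path
from typing import Dict, Tuple

# Assumed shape: {provider name: {step name: duration in seconds}}
Results = Dict[str, Dict[str, float]]


def render_markdown(results: Results) -> str:
    """Render per-provider step timings as a Markdown table."""
    lines = [
        "# Codegen Outline Benchmark",
        "",
        "| Provider | Step | Seconds |",
        "|---|---|---|",
    ]
    for provider in sorted(results):
        for step, seconds in results[provider].items():
            lines.append(f"| {provider} | {step} | {seconds:.2f} |")
    return "\n".join(lines) + "\n"


def write_reports(results: Results, out_dir: str) -> Tuple[Path, Path]:
    """Write a JSON file and a Markdown file into out_dir; return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "codegen_outline.json"  # hypothetical file name
    md_path = out / "codegen_outline.md"      # hypothetical file name
    json_path.write_text(json.dumps(results, indent=2))
    md_path.write_text(render_markdown(results))
    return json_path, md_path
```

Keeping the JSON as the source of truth and deriving the Markdown from it means the two reports cannot drift apart.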

🐳 Custom Dockerfile

  • Added benchmarks/dockerfiles/codegen-base.dockerfile, a copy of the Dockerfile provided in CG-18529
  • Used as the base image for the benchmark, matching the codegen.com base image

📚 Documentation

  • Comprehensive documentation in benchmarks/CODEGEN_BENCHMARK.md
  • Updated main README with usage examples
  • Configuration examples and troubleshooting guide

Usage Examples

# Run codegen outline benchmark (tests both E2B and Daytona)
grainchain benchmark --codegen outline

# Test specific provider
grainchain benchmark --codegen outline --provider e2b

# Save results to custom directory
grainchain benchmark --codegen outline --output benchmarks/results/

Key Implementation Details

  • Backward Compatible: Existing benchmark functionality unchanged
  • Provider Agnostic: Runs against both E2B and Daytona so their results can be compared
  • Error Handling: Graceful handling of provider limitations
  • Configurable: Custom configuration support via JSON files
  • Extensible: Easy to add new codegen benchmark types
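
One common way to support custom configuration through JSON files is to overlay a user-supplied file on built-in defaults. The sketch below shows that approach under stated assumptions: the keys `repo_url`, `providers`, and `output_dir` and the function `load_config` are hypothetical and are not taken from benchmarks/configs/codegen_outline.json.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical defaults; only the repository URL comes from the PR description.
DEFAULTS: Dict[str, Any] = {
    "repo_url": "https://github.com/codegen-sh/outline.git",
    "providers": ["e2b", "daytona"],
    "output_dir": "benchmarks/results/",
}


def load_config(path: Optional[str] = None) -> Dict[str, Any]:
    """Return the defaults, overridden by the JSON file at path if one is given."""
    config = dict(DEFAULTS)
    if path is None:
        return config
    overrides = json.loads(Path(path).read_text())
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise ValueError(f"unknown config keys: {unknown}")
    config.update(overrides)
    return config
```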

Files Added/Modified

New Files

  • grainchain/cli/codegen_benchmark.py - Core benchmark implementation
  • benchmarks/dockerfiles/codegen-base.dockerfile - Codegen base image
  • benchmarks/configs/codegen_outline.json - Benchmark configuration
  • benchmarks/CODEGEN_BENCHMARK.md - Comprehensive documentation

Modified Files

  • grainchain/cli/main.py - Added --codegen flag
  • grainchain/cli/benchmark.py - Added codegen benchmark routing
  • README.md - Updated with new benchmark examples
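
To illustrate how an opt-in --codegen flag can route to a separate benchmark while leaving the default path untouched, here is a small stand-in written with the standard-library argparse module. It is an assumption-laden sketch: the real grainchain CLI may use a different framework, and `build_parser` and `route` are hypothetical names. Only the flags shown in the usage examples (--codegen, --provider, --output) are modelled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for `grainchain benchmark` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="grainchain")
    sub = parser.add_subparsers(dest="command", required=True)
    bench = sub.add_parser("benchmark")
    # Opt-in flag: when absent, the existing benchmark path is used.
    bench.add_argument("--codegen", choices=["outline"], default=None)
    bench.add_argument("--provider", choices=["e2b", "daytona"], default=None)
    bench.add_argument("--output", default=None)
    return parser


def route(args: argparse.Namespace) -> str:
    """Decide which benchmark to run; returns a label instead of running it."""
    if args.codegen is not None:
        return f"codegen:{args.codegen}"
    return "standard"
```

Because the flag defaults to `None`, invoking `grainchain benchmark` with no extra arguments still selects the standard path, which is what keeps the codegen benchmark from running by default.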

Testing

  • CLI Integration: New --codegen flag works correctly
  • Backward Compatibility: Existing benchmarks still work
  • Error Handling: Graceful failure when API keys missing
  • Report Generation: JSON and Markdown reports generated correctly

Requirements

  • E2B API key (E2B_API_KEY environment variable)
  • Daytona API key (DAYTONA_API_KEY environment variable)
  • Internet access for cloning outline repository
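
A pre-flight check on the two environment variables above is one way to fail gracefully when a key is missing. The sketch below is illustrative rather than the PR's implementation; `REQUIRED_KEYS` and `available_providers` are hypothetical names, while the variable names E2B_API_KEY and DAYTONA_API_KEY come from the requirements list.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

REQUIRED_KEYS = {"e2b": "E2B_API_KEY", "daytona": "DAYTONA_API_KEY"}


def available_providers(
    requested: List[str], env: Optional[Mapping[str, str]] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Split requested providers into those with a key set and those without.

    Returns (ready, missing), where missing maps each provider to the
    environment variable that still needs to be set.
    """
    env = os.environ if env is None else env
    ready: List[str] = []
    missing: Dict[str, str] = {}
    for provider in requested:
        variable = REQUIRED_KEYS[provider]
        if env.get(variable):  # an empty string counts as missing
            ready.append(provider)
        else:
            missing[provider] = variable
    return ready, missing
```

A caller can then run the benchmark for the providers in `ready` and print a short message naming each variable in `missing`, instead of surfacing an authentication error midway through a run.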

Future Enhancements

This implementation provides a solid foundation for:

  • Additional codegen benchmark types
  • Custom Dockerfile support per benchmark
  • Performance regression detection
  • Integration with CI/CD pipelines

Resolves CG-18529



- Add --codegen flag to benchmark CLI command
- Implement comprehensive codegen outline benchmark workflow:
  * Clone outline repository
  * Make trivial modifications
  * Test snapshot lifecycle
  * Compare E2B vs Daytona performance
- Add codegen-base.dockerfile with provided Dockerfile
- Generate JSON and Markdown reports
- Add comprehensive documentation
- Maintain backward compatibility with existing benchmarks

Resolves CG-18529
linear bot commented May 31, 2025

CG-18529 Special Codegen Benchmark

  • for the grainchain repo (see CG-18511)
  • We have the dockerfile below, which the company codegen.com uses for their base image
  • We want a special benchmark suite, which uses this dockerfile as the base image
    • It should then do git clone for the outline repo
    • It should then make a trivial modification
    • It should then snapshot
    • It should then re-boot the snapshot
  • Some of this is already covered in the benchmarks I think
  • So let's make a new benchmark we can run, which is the CLI command
    • grainchain benchmark --codegen outline
    • This should not be run by default
    • Confirm you can do it!
  • We specifically want to compare the performance of E2B and Daytona for this
  • See the README.md to learn more about how to do this stuff

ARG TARGETPLATFORM=linux/amd64
FROM --platform=$TARGETPLATFORM ghcr.io/astral-sh/uv:python3.13-bookworm

# Set environment variables to prevent interactive prompts during installation
ENV NVM_DIR=/usr/local/nvm \
    NODE_VERSION=22.14.0 \
    DEBIAN_FRONTEND=noninteractive \
    NODE_OPTIONS="--max-old-space-size=8192" \
    PYTHONUNBUFFERED=1 \
    COREPACK_ENABLE_DOWNLOAD_PROMPT=0 \
    PYTHONPATH="/usr/local/lib/python3.13/site-packages" \
    IS_SANDBOX=True

ENV PATH=$NVM_DIR/versions/node/$NODE_VERSION/bin:/usr/local/nvm:/usr/local/bin:$PATH

ARG INVALIDATE_FILES_LAYER=1
# Copy configuration files and set permissions
COPY sshd_config /etc/ssh/sshd_config
COPY ssh_config /etc/ssh/ssh_config
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY start.sh /usr/local/bin/start.sh
COPY setup_ssh_user.sh /usr/local/bin/setup_ssh_user.sh
COPY setup_ssh_keys.sh /usr/local/bin/setup_ssh_keys.sh
COPY nginx.conf /etc/nginx/nginx.conf
COPY error.html /usr/share/nginx/html/error.html
COPY tmux_output_script.sh /usr/local/bin/tmux_output_script.sh

# Install dependencies and set up environment in a single layer
RUN apt-get update && apt-get install -y -o Dpkg::Options::="--force-confold" \
    git \
    curl \
    fd-find \
    gh \
    lsof \
    ripgrep \
    openssh-server \
    nginx-full \
    fcgiwrap \
    tmux \
    nano \
    vim \
    supervisor \
    netcat-openbsd \
    && rm -rf /var/lib/apt/lists/* \
    && mkdir -p -m 755 /etc/apt/keyrings \
    && wget -nv -O- https://cli.github.com/packages/githubcli-archive-keyring.gpg | tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \
    && chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
    # Set up environment variables and save it to /etc/profile.d/nvm.sh
    && echo "export NVM_DIR=\"$NVM_DIR\"" >> /etc/profile.d/nvm.sh \
    && echo "[ -s \"$NVM_DIR/nvm.sh\" ] && \. \"$NVM_DIR/nvm.sh\"" >> /etc/profile.d/nvm.sh \
    && echo "export PATH=\"$NVM_DIR/versions/node/$NODE_VERSION/bin:\$PATH\"" >> /etc/profile.d/nvm.sh \
    && echo "export NVM_BIN=\"$NVM_DIR/versions/node/$NODE_VERSION/bin\"" >> /etc/profile.d/nvm.sh \
    && echo "export NODE_VERSION=\"$NODE_VERSION\"" >> /etc/profile.d/nvm.sh \
    && echo "export NODE_OPTIONS=\"--max-old-space-size=8192\"" >> /etc/profile.d/nvm.sh \
    && echo "export DEBIAN_FRONTEND=noninteractive" >> /etc/profile.d/nvm.sh \
    && echo "export PYTHONUNBUFFERED=1" >> /etc/profile.d/nvm.sh \
    && echo "export COREPACK_ENABLE_DOWNLOAD_PROMPT=0" >> /etc/profile.d/nvm.sh \
    && echo "export PYTHONPATH=\"/usr/local/lib/python3.13/site-packages\"" >> /etc/profile.d/nvm.sh \
    && echo "export IS_SANDBOX=true" >> /etc/profile.d/nvm.sh \
    && echo "export NPM_CONFIG_YES=true" >> /etc/profile.d/nvm.sh \
    && echo "export PIP_NO_INPUT=1" >> /etc/profile.d/nvm.sh \
    && echo "export YARN_ENABLE_IMMUTABLE_INSTALLS=false" >> /etc/profile.d/nvm.sh \
    && chmod +x /etc/profile.d/nvm.sh \
    # Run the SSH setup script
    && /usr/local/bin/setup_ssh_user.sh \
    # Install nvm, Node.js, and code-server
    && mkdir -p $NVM_DIR \
    && curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash \
    && . $NVM_DIR/nvm.sh \
    && nvm install $NODE_VERSION \
    && nvm use $NODE_VERSION \
    && npm install -g yarn pnpm \
    && corepack enable \
    && corepack prepare yarn@stable --activate \
    && corepack prepare pnpm@latest --activate \
    && curl -fsSL https://raw.githubusercontent.com/coder/code-server/refs/tags/v4.99.1/install.sh | sh \
    && uv tool install "uvicorn[standard]"

ENTRYPOINT ["/usr/local/bin/start.sh"]
