ECCO

This repository contains the source code for the paper "ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?"

Dataset

The dataset is available on Hugging Face at CodeEff/ECCO.

It consists of two subsets, edit and generate, each with three splits (train, val, and test).

Loading the dataset

from datasets import load_dataset

dataset = load_dataset('CodeEff/ECCO', 'edit')      # For the history-based editing setting
dataset = load_dataset('CodeEff/ECCO', 'generate')  # For the NL-instructed generation setting
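
As a minimal sketch, the two calls above can be wrapped in a small helper that rejects a mistyped subset name before any download starts. The helper name load_ecco and the constant VALID_SUBSETS are hypothetical and not part of this repository; only the dataset ID and the two subset names come from this README.

```python
# Subset names as stated in this README.
VALID_SUBSETS = ("edit", "generate")


def load_ecco(subset):
    """Load one ECCO subset from the Hugging Face Hub (requires network access).

    `load_ecco` is a hypothetical convenience wrapper, not a function
    provided by this repository.
    """
    if subset not in VALID_SUBSETS:
        raise ValueError(
            f"unknown subset {subset!r}; expected one of {VALID_SUBSETS}"
        )
    # Imported lazily so the name check above works even when the
    # `datasets` package is not installed.
    from datasets import load_dataset

    return load_dataset("CodeEff/ECCO", subset)


# Example (requires network access and the `datasets` package):
#   for name in VALID_SUBSETS:
#       print(name, load_ecco(name))  # prints the splits of each subset
```

Printing the returned object is a quick way to confirm which splits and columns each subset actually exposes before writing code against them.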

Download the test cases

mkdir data && cd data
wget https://huggingface.co/datasets/CodeEff/ECCO/resolve/main/test_cases.zip
unzip test_cases.zip

Experiments

Environment setup

conda env create -f environment.yml
conda activate ecco

Code structure

evaluation contains the scripts that evaluate model-generated code on a Judge0 server hosted on AWS. Please see the instructions to set up the evaluation server.
- edit_eval.py evaluates generated code on the metrics for the history-based editing setting.
- generate_eval.py evaluates generated code on the metrics for the NL-instructed generation setting.
experiments contains the scripts that run the modeling experiments.
- model_classes.py contains the inference engine classes for each benchmarked model.
- inference.py is the entrypoint for running the experiments.
- prompt_formats.py and utils.py contain utilities for prompt building and execution feedback formatting.

Starting up the evaluation setup

Set up the evaluation server by following the guide in the evaluation README.

Running experiments / Generating Code

We run experiments to generate code from the experiments/inference.py entrypoint. An example is provided below:

python experiments/inference.py --model deepseek \
   --temperature 0.4 --num_samples 1 --eval_mode "edit"

Model choices are listed in the registry.

--eval_mode choices are ['edit', 'nl2code', 'self-refine', 'exec-refine', 'nl2code-self-refine', 'nl-exec-refine', 'nl2code-exec-refine', 'nl2code-nl-exec-refine'] for the different experiments. Modes without the nl2code prefix correspond to the history-based editing setting; modes with the prefix correspond to the NL-instructed generation setting.
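
The prefix rule above can be sketched in a few lines of Python. The helper names setting_for and build_command are hypothetical and not part of this repository; the flags and default values are copied from the example command above, and the script only prints the commands rather than running them.

```python
import shlex

# Mode names copied from the --eval_mode list in this README.
EVAL_MODES = [
    "edit", "nl2code", "self-refine", "exec-refine",
    "nl2code-self-refine", "nl-exec-refine",
    "nl2code-exec-refine", "nl2code-nl-exec-refine",
]


def setting_for(mode):
    """Map a mode to its setting using the nl2code prefix rule."""
    if mode.startswith("nl2code"):
        return "NL-instructed generation"
    return "history-based editing"


def build_command(mode, model="deepseek", temperature=0.4, num_samples=1):
    """Build the argv list for one run (hypothetical helper; defaults
    mirror the example command in this README)."""
    if mode not in EVAL_MODES:
        raise ValueError(f"unknown eval mode: {mode!r}")
    return [
        "python", "experiments/inference.py",
        "--model", model,
        "--temperature", str(temperature),
        "--num_samples", str(num_samples),
        "--eval_mode", mode,
    ]


# Print one command per mode, labelled with its setting.
for mode in EVAL_MODES:
    print(f"{setting_for(mode):<26}{shlex.join(build_command(mode))}")
```

Note that nl-exec-refine starts with nl but not with nl2code, so under this rule it belongs to the history-based editing setting.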

Citation

@article{waghjale2024ecco,
  title={ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?},
  author={Waghjale, Siddhant and Veerendranath, Vishruth and Wang, Zora Zhiruo and Fried, Daniel},
  journal={arXiv preprint arXiv:2407.14044},
  year={2024}
}
