## **Generate LLM code**

To run the script, use the following command from the root of this repo:

```bash
python evaluation/scripts/gencode_json.py [options]
```

For example, to generate results with `gpt-4o` using the default settings, run:

```bash
python evaluation/scripts/gencode_json.py --model gpt-4o
```

### Command-Line Arguments

The options below can be combined; see the example after the list.

- `--model` - Specifies the model name used for generating responses.
- `--output-dir` - Directory to store the generated code outputs (Default: `eval_results/generated_code`).
- `--input-path` - Path to the JSONL file describing the problems (Default: `eval/data/problems_all.jsonl`).
- `--prompt-dir` - Directory where prompt files are saved (Default: `eval_results/prompt`).
- `--temperature` - Controls the randomness of the generation (Default: 0).
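
For instance, a run that combines several of these flags might look like this (a sketch; the output directory and temperature values are illustrative, not required settings):

```bash
# Generate with gpt-4o, write outputs to a custom directory,
# and sample with a non-zero temperature.
python evaluation/scripts/gencode_json.py \
  --model gpt-4o \
  --output-dir my_results/generated_code \
  --temperature 0.7
```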

## **Evaluate generated code**

Download the [numeric test results](https://drive.google.com/drive/folders/1W5GZW6_bdiDAiipuFMqdUhvUaHIj6-pR?usp=drive_link) and save them as `./eval/data/test_data.h5`.
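
For example, assuming the file landed in `~/Downloads` (adjust the source path to wherever your browser saved it), you can move it into place and sanity-check it like this:

```bash
# Create the data directory and move the downloaded file into it.
mkdir -p eval/data
mv ~/Downloads/test_data.h5 eval/data/test_data.h5

# Confirm the HDF5 file opens cleanly (requires the h5py package).
python -c "import h5py; h5py.File('eval/data/test_data.h5', 'r').close(); print('OK')"
```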

To run the script, go to the root of this repo and use the following command:

```bash
python evaluation/scripts/test_generated_code.py
```

Please edit the `test_generated_code.py` source file to specify your model name, results directory, and problem set (if it is not `problems_all.jsonl`).
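
To locate the values to edit, you can search the script for likely configuration lines, for example (the search pattern is a guess at plausible identifiers, not the script's confirmed variable names):

```bash
# Show numbered lines mentioning the model or the problem file,
# which is where the settings to edit typically live.
grep -n -E "model|problems_all" evaluation/scripts/test_generated_code.py
```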