This repository is currently used to store essential materials for our submission to Internetware'25.
For easy reference, we have placed the descriptions of the BPC coreset, the BPC set, and the related passes from the paper in the quick_result directory.
- The quick_result/pass_description.txt file contains descriptions of the 124 transform passes from LLVM 10.0.0 used in our experiments.
- The quick_result/BPC_set.txt file stores the BPC set generated at the termination of our iterative algorithm. The passes are identified by numbers from 0 to 123, which correspond sequentially to the passes listed in the quick_result/pass_description.txt file.
- The quick_result/BPC_coreset.txt file contains the BPC coreset generated at the termination of our iterative algorithm, consisting of 50 pass sequences. The passes are also identified by numbers from 0 to 123, which correspond sequentially to the passes listed in the quick_result/pass_description.txt file.
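The index-to-pass mapping described above can be sketched in Python as follows. This is only an illustration under assumptions that are not stated in this README: that pass_description.txt holds one pass per line (line i describing pass i) and that each pass sequence is written as whitespace-separated integers on one line. The function names and the three-entry demo data are hypothetical; please check the actual files in quick_result before relying on this.

```python
from typing import List


def load_pass_names(description_text: str) -> List[str]:
    """Return one entry per non-empty line; line i is taken as pass index i (assumed layout)."""
    return [line.strip() for line in description_text.splitlines() if line.strip()]


def decode_sequence(sequence_line: str, pass_names: List[str]) -> List[str]:
    """Translate a whitespace-separated line of pass indices into pass entries (assumed format)."""
    indices = [int(token) for token in sequence_line.split()]
    for index in indices:
        if not 0 <= index < len(pass_names):
            raise ValueError(f"pass index {index} is outside 0..{len(pass_names) - 1}")
    return [pass_names[index] for index in indices]


if __name__ == "__main__":
    # Made-up stand-in for quick_result/pass_description.txt (the real file has 124 entries,
    # and their order is not reproduced here).
    demo_descriptions = "-adce\n-instcombine\n-simplifycfg\n"
    names = load_pass_names(demo_descriptions)
    print(decode_sequence("2 0 1", names))
```

To use it on the real files, read quick_result/pass_description.txt into a string, pass it to load_pass_names, and decode each line of BPC_set.txt or BPC_coreset.txt, adjusting the parsing if the files use a different separator.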
To simplify the setup process, we provide a Docker snapshot which you can download from the following link:
Use the following commands to create a Docker container from our provided .zip file:
First, extract the file to ebpc4cpo.tar.
unzip ebpc4cpo.zip -d YOUR_DIRECTORY
Next, load ebpc4cpo.tar as a Docker image.
docker import YOUR_DIRECTORY/ebpc4cpo.tar ebpc4cpo
Finally, create a Docker container from the image.
docker run -it --name ebpc4cpo ebpc4cpo bash
We use Conda for Python package management, so after creating the Docker container, please activate the Conda environment with the following command:
conda activate compiler_gym
We have not described how to build the project from scratch because the environment setup is quite complex and has encountered various issues across different operating systems. To ensure you can run our code smoothly, we recommend using the provided Docker image.
Use the following command to generate the BPC coreset (ours).
cd /root/main/Project/EBPC4CPO/Source/Solver/IterativeSolver
python IterativeSolver.py -n base -i 3
-n, or --name, specifies the name of the algorithm. Here, base represents our baseline method, while no_removal and no_trimming are used for the ablation study.
-i, or --iter, specifies the number of iterations. In this case, we iterate 3 times.
The results will be stored in the 'Source/Result/base' directory.
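For readers who want to see how the two options fit together, here is a minimal argparse sketch of a command line with the same flags. It is not taken from IterativeSolver.py: the parser layout, the defaults, and the function names are assumptions made for illustration, and only the flag names and the three algorithm names come from the description above.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented -n/--name and -i/--iter options."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of the IterativeSolver.py options"
    )
    parser.add_argument(
        "-n", "--name",
        choices=["base", "no_removal", "no_trimming"],
        default="base",  # assumed default, not confirmed by the README
        help="algorithm name: base, or one of the two ablation variants",
    )
    parser.add_argument(
        "-i", "--iter",
        type=int,
        default=3,  # assumed default, not confirmed by the README
        help="number of iterations",
    )
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse(["-n", "base", "-i", "3"])
    print(args.name, args.iter)
```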
Use the following command to generate the NVP-1 coreset.
cd /root/main/Project/EBPC4CPO/Source/OtherWork/SearchBasedCoreset
python Search.py -e 7000
-e, or --episode, specifies the number of randomly generated pass sequences for each program.
The results will be stored in the 'Source/OtherWork/SearchBasedCoreset' directory.
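To make the role of the episode count concrete, the sketch below draws a given number of random pass sequences over the 124 pass indices. It is an assumption-laden illustration, not the logic of Search.py: the maximum sequence length, the uniform sampling, the seed, and the function name are all hypothetical.

```python
import random
from typing import List


def random_pass_sequences(
    episodes: int,
    num_passes: int = 124,   # number of transform passes listed in pass_description.txt
    max_length: int = 10,    # hypothetical cap; the README does not state a length
    seed: int = 0,           # fixed seed so the sketch is reproducible
) -> List[List[int]]:
    """Draw `episodes` random sequences of pass indices in the range 0..num_passes-1."""
    rng = random.Random(seed)
    sequences = []
    for _ in range(episodes):
        length = rng.randint(1, max_length)  # inclusive on both ends
        sequences.append([rng.randrange(num_passes) for _ in range(length)])
    return sequences


if __name__ == "__main__":
    # Five sequences only, to keep the demo output short.
    for sequence in random_pass_sequences(episodes=5):
        print(sequence)
```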
The NVP-2 coreset uses the results published in the original paper directly.
Use the following command to generate the ICMC coreset.
cd /root/main/Project/EBPC4CPO/Source/OtherWork/ICMC
python ICMC.py -k 100 -c 5
k and c are two hyperparameters mentioned in the original paper. Through testing different values, we found an optimal set with k=100 and c=5. Here, k represents the number of passes used, i.e., selecting k passes out of 120, and c represents the number of pass subsequences generated by the graph partitioning algorithm.
The results will be stored in the 'Source/OtherWork/ICMC' directory.
Use the following command to generate the GENS coreset.
cd /root/main/Project/EBPC4CPO/Source/OtherWork/CoverSet
python gen_coverset.py
The results will be stored in the 'Source/OtherWork/CoverSet' directory.
After generating all the coresets, you need to move them to the same directory for comparison. Execute the following command:
cd /root/main/Project/EBPC4CPO/Source
cp Result/base/__it_train/__it_train_coreset_2.txt Evaluation/coreset/bpc_coreset.txt
cp OtherWork/SearchBasedCoreset/coreset___it_train_500 Evaluation/coreset/nvp_coreset_1.txt
cp OtherWork/ICMC/icmc_O124P497K100c5_coreset.txt Evaluation/coreset/icmc_coreset.txt
cp OtherWork/CoverSet/gens_coreset.txt Evaluation/coreset/gens_coreset.txt
Then execute the following command to evaluate them.
cd /root/main/Project/EBPC4CPO/Source/Evaluation
python evaluation.py
The results will be stored in the 'Source/Evaluation/coreset/eval' directory.
Use the following command to complete the training and evaluation of MLP:
cd /root/main/Project/EBPC4CPO/Source/Evaluation
python evluation_RQ2.py
The results will be saved in the xxx directory.
We use extensive caching, so if you modify any coreset, you need to delete the corresponding data in the data, eval, and saved_models folders under the Evaluation/MLP directory.
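The cache cleanup can be scripted along the following lines. This is a cautious sketch, not part of the repository: the function name, the glob pattern parameter, and the dry-run default are assumptions. Because the README does not describe how cached files are named, the sketch matches entries by a pattern you supply, and it skips .py files so that scripts stored in these folders (for example Draw_RQ2.py in the eval folder) are left alone.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

CACHE_DIRS = ("data", "eval", "saved_models")  # folder names taken from the README


def clear_mlp_caches(
    mlp_dir: Path,
    pattern: str = "*",               # hypothetical filter, e.g. "*bpc*" for one coreset
    names: Iterable[str] = CACHE_DIRS,
    dry_run: bool = True,             # list only, unless explicitly disabled
) -> List[Path]:
    """Remove matching entries from the cache folders and return the affected paths."""
    affected = []
    for name in names:
        folder = mlp_dir / name
        if not folder.is_dir():
            continue
        for entry in sorted(folder.glob(pattern)):
            if entry.suffix == ".py":
                continue  # keep scripts that live next to the cached results
            affected.append(entry)
            if dry_run:
                continue
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return affected


if __name__ == "__main__":
    # Dry run against the path used inside the Docker image; nothing is deleted.
    mlp = Path("/root/main/Project/EBPC4CPO/Source/Evaluation/MLP")
    for path in clear_mlp_caches(mlp):
        print("would remove:", path)
```

Review the dry-run listing first, then call the function again with dry_run=False once the listed entries are the ones you intend to delete.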
Use the following command to plot the results. The images will be saved in the Source/Evaluation/MLP/eval directory (RQ2.pdf):
cd /root/main/Project/EBPC4CPO/Source/Evaluation/MLP/eval
python Draw_RQ2.py
Use the following commands to complete the ablation experiment for the Refine module:
cd /root/main/Project/EBPC4CPO/Source/Solver/IterativeSolver
python IterativeSolver.py -n no_removal -i 3
python IterativeSolver.py -n no_trimming -i 3
The results will be saved in the Source/Result/base/no_removal (or no_trimming) directory.
Use the following command to complete the ablation experiment for the Coreset generation module:
cd /root/main/Project/EBPC4CPO/Source/Result
python RQ3.py -i 3
-i, or --iter, specifies the number of iterations.
The results will be saved in the current directory, and the related charts will be generated. The output for the Refine module ablation will be printed on the console, while the output for the Coreset generation module ablation will be saved in ablation_2.pdf in the current directory.