A testing framework for Secure Multiparty Computation (MPC) Compilers.
This is the repository for our paper "Cost-Effective Testing of MPC Compilers".
    scripts/build_docker.sh --all -p 10

This creates the base image as well as an image for each MPC compiler:
- babelfuzz-mpspdz
- babelfuzz-emp
- babelfuzz-ezpc
- babelfuzz-silph

-p specifies the number of parallel jobs used during the build process.
In the MP-SPDZ and EMP containers, the script additionally installs all previous versions required for the Time-to-Bug experiments.
By default, it installs relatively recent versions of the compilers (at the time of writing).
If you want to use different versions (e.g. to test the currently available ones), modify the respective Dockerfiles in ./docker.
NOTE: Building the MP-SPDZ image requires the file boost_1_83_0.tar.bz2 to be present in the repository root. The file is available, for example, on SourceForge.
Start the respective container as usual (example: EMP):

    docker run -it babelfuzz-emp

This section assumes that you have built and started the docker container of the compiler you want to test as described above.
To configure where logs, generated programs, etc. are dropped, modify the paths in scripts/config.sh.
The default values drop everything in the /home directory of the docker containers.
To modify the behavior of BabelFuzz (size of generated seed programs etc.), modify the configuration template src/swarm/swarm_config.json.
"SUPPORTED_DSL" is especially important. It controls which compilers are tested.
In the docker containers, the default values are set to the compiler the docker image is built for.
If you want to change the values anyway, modify the "SUPPORTED_DSL" values as follows:

    "SUPPORTED_DSL": {"type": "choose", "values": ["YOUR", "VALUES", "HERE"]},

Here are the values for each compiler:
- MP-SPDZ: ["mpspdz-field", "mpspdz-field-fixed", "mpspdz-ring", "mpspdz-binary", "mpspdz-emulate"]
- EMP: ["emp"]
- EzPC: ["ezpc-arithmetic", "ezpc-binary"]
- Silph: ["silph"]

Note that for EzPC and MP-SPDZ, there are multiple sub-modes.
For example, mpspdz-field-fixed tests the fixed-point implementations of MP-SPDZ, while mpspdz-binary tests its compilation to binary representations.
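
As an illustration of the "SUPPORTED_DSL" change described above, here is a minimal Python sketch that rewrites the entry in an already loaded configuration. It assumes that the template is plain JSON and that "SUPPORTED_DSL" is a top-level key; neither is confirmed here, so check src/swarm/swarm_config.json before relying on it. The helper name `set_supported_dsl` and the `DSL_VALUES` table are made up for this example and are not part of BabelFuzz.

```python
import json

# Sub-mode names per compiler, copied from the list above.
DSL_VALUES = {
    "MP-SPDZ": ["mpspdz-field", "mpspdz-field-fixed", "mpspdz-ring",
                "mpspdz-binary", "mpspdz-emulate"],
    "EMP": ["emp"],
    "EzPC": ["ezpc-arithmetic", "ezpc-binary"],
    "Silph": ["silph"],
}


def set_supported_dsl(config, values):
    """Return a copy of `config` whose SUPPORTED_DSL entry selects `values`.

    Hypothetical helper: assumes SUPPORTED_DSL is a top-level key.
    """
    known = {name for names in DSL_VALUES.values() for name in names}
    unknown = [name for name in values if name not in known]
    if unknown:
        raise ValueError("unknown SUPPORTED_DSL values: %s" % unknown)
    updated = dict(config)
    updated["SUPPORTED_DSL"] = {"type": "choose", "values": list(values)}
    return updated


if __name__ == "__main__":
    # Stand-in for the loaded template; the real file has more keys.
    template = {"SUPPORTED_DSL": {"type": "choose", "values": ["emp"]}}
    print(json.dumps(set_supported_dsl(template, DSL_VALUES["EzPC"]), indent=2))
```

Rejecting unknown names up front is a design choice of this sketch: a typo in a sub-mode name is then reported before a test campaign starts.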
Command:

    scripts/start_swarm.sh -p 10

This is the main testing mode.
BabelFuzz starts p instances, which all run the main loop:
- generate a new config
- execute the testing pipeline using the config
- store a log if an error was discovered
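
The three steps above can be sketched in Python as follows. This is only an outline of the control flow, not the BabelFuzz implementation: `generate_config`, `run_pipeline` and `store_log` are hypothetical callables supplied by the caller, and the fixed iteration count stands in for "run until interrupted".

```python
import random


def swarm_loop(generate_config, run_pipeline, store_log, iterations, seed=0):
    """Run the generate / execute / log cycle `iterations` times.

    All three callables are placeholders for this sketch:
      generate_config(rng) -> config
      run_pipeline(config) -> None if no error was found, else an error description
      store_log(config, error) -> None
    Returns the number of logs that were stored.
    """
    rng = random.Random(seed)
    stored = 0
    for _ in range(iterations):
        config = generate_config(rng)      # step 1: generate a new config
        error = run_pipeline(config)       # step 2: execute the testing pipeline
        if error is not None:              # step 3: store a log only on error
            store_log(config, error)
            stored += 1
    return stored


if __name__ == "__main__":
    logs = []
    count = swarm_loop(
        generate_config=lambda rng: {"size": rng.randint(1, 10)},
        run_pipeline=lambda cfg: "too large" if cfg["size"] > 5 else None,
        store_log=lambda cfg, err: logs.append((cfg, err)),
        iterations=20,
    )
    print(count, "of 20 iterations produced a log")
```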
BabelFuzz prints real-time stats about the current test campaign to the console.
Notes:
- stop testing with Ctrl-c
- if you choose -p 1, you will not get real-time stats
- per default, intermediate programs are deleted at the end of an iteration. If you want to rerun an iteration and have a look at all generated programs and translations, see the next section.

If BabelFuzz finds an error during swarm testing, it stores the config and other information in a log (in the directory specified in scripts/config.sh, default /home/logs).
If you want to rerun an iteration that produced a specific log, run:

    scripts/start_single_with_log.sh path/to/log.txt results_dir

If you already have a .config file you want to rerun, run:

    scripts/start_single.sh path/to/config/file.config results_dir

This drops a log as well as the generated programs in the specified results_dir.

Start the MP-SPDZ or EMP container (depending on the experiments):

    docker run -it babelfuzz-mpspdz

Inside the container, start the experiments like this:

    scripts/start_experiment.sh -p 100 -c experiment_configs/time_to_bug/dt --temp_dir /home/experiments_temp --log_dir /home/logs --rand 42 --compilers_dir /home/compilers

To reproduce the results for each table in our paper, run the following experiments (-c option):
- Table II (Time to Bug DT mode): experiment_configs/time_to_bug/dt (MP-SPDZ), experiment_configs/emp_time_to_bug/dt (EMP)
- Table III (Time to Bug MT mode): experiment_configs/time_to_bug/mt (MP-SPDZ), experiment_configs/emp_time_to_bug/mt (EMP)
- Table IV (Time to Bug DT mode for Bugs originally discovered by MT-MPC): experiment_configs/time_to_bug/related_work/dt (MP-SPDZ)
- Table V (Time to Bug MT mode for Bugs originally discovered by MT-MPC): experiment_configs/time_to_bug/related_work/mt (MP-SPDZ)

The options are:
- -p is the number of parallel processes
  - Since each experiment uses a single core, make sure this is <= the number of physical cores of your system
  - You can't run more than p experiments in parallel. Keep that in mind when specifying which experiments to run.
  - Note also that per default, each experiment starts 10 tasks (different random seeds). Since e.g. experiment_configs/time_to_bug/dt includes 10 configs, each with 10 repeats, you need at least -p 100. If you want fewer repeats per experiment, you have to modify the experiment_repeats value in the respective configs.
- -c is the path to the experiment configs
  - if this is a directory, all configs in this dir and in all sub-dirs are used
  - can also be a single config file
- --temp_dir: directory where all intermediate programs etc. are stored. Will be created if it does not exist.
- --log_dir: where experiment logs are stored. Will be created if it does not exist.
- --rand: the random seed for this run. We used --rand 42 for all our experiment runs.
- --compilers_dir: the directory containing all necessary compiler versions
  - In the docker container, this should be /home/compilers

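
The sizing rule for -p described above (one process per task, and no more processes than physical cores) can be checked with a few lines of Python. This is a sketch under the assumption that the number of tasks is simply the sum of the experiment_repeats values over all selected configs; the function names are made up for this example.

```python
def required_processes(repeats_per_config):
    """Total number of tasks: one per repeat of every selected config."""
    return sum(repeats_per_config)


def check_parallelism(p, repeats_per_config, physical_cores):
    """Return a list of human-readable problems with the chosen -p value."""
    problems = []
    needed = required_processes(repeats_per_config)
    if p < needed:
        problems.append("-p %d is below the %d tasks to run" % (p, needed))
    if p > physical_cores:
        problems.append("-p %d exceeds %d physical cores" % (p, physical_cores))
    return problems


if __name__ == "__main__":
    # 10 configs with 10 repeats each, as in experiment_configs/time_to_bug/dt.
    repeats = [10] * 10
    print(required_processes(repeats))
    print(check_parallelism(100, repeats, physical_cores=128))
```

For the example in the text, 10 configs with 10 repeats each give 10 * 10 = 100 tasks, which matches the -p 100 used in the command above.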
Here are the most important configurations for the experiments:
- experiment_duration: duration in seconds. Default: 7 days (604800 seconds).
- experiment_repeats: number of parallel tasks this experiment starts, where each task uses a different random seed. Default: 10
- collect_stats: If the experiments should collect the number of miscompilations, compiler errors etc. Has no impact on experiment results, but can be used for debugging purposes. Default: true
- config_overrides: Overrides the main configurations (src/swarm/swarm_config.json). E.g. to set the correct compilers to test for a specific experiment.
- emp_versions: specifies the versions for the emp components. This is used so the installer knows which component versions to install.

By default, each experiment task uses a single physical CPU core.
In hyperthreaded systems, one physical core may be represented by two virtual (logical) ones.
BabelFuzz therefore assigns two logical CPUs to each experiment task.
You may have to modify the get_cpus_for_task function in src/experiments/main.py to fit your setup.
Currently, it looks like this:

    def get_cpus_for_task(task_id):
        physical_cores = psutil.cpu_count(logical=False)
        return [task_id, task_id + physical_cores]

This ensures that e.g. task 0 gets CPUs 0 and 8 (assuming we have 8 physical cores), task 1 gets 1 and 9 and so on.
If your OS maps CPUs differently, you have to modify this function to ensure the experiments perform as expected.
Example: if physical core 0 maps to CPUs 0 and 1, physical core 1 to CPUs 2 and 3 etc., the function might look like this:

    def get_cpus_for_task(task_id):
        return [task_id * 2, task_id * 2 + 1]
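
To compare the two CPU layouts side by side, the following Python sketch restates both mappings as pure functions and checks that no logical CPU is handed to two tasks. The number of physical cores is passed in as a parameter instead of being read from psutil, so the snippet runs anywhere; the function names are illustrative and do not exist in src/experiments/main.py.

```python
def split_layout(task_id, physical_cores):
    """Siblings are `physical_cores` apart: core 0 -> CPUs 0 and 8 on 8 cores."""
    return [task_id, task_id + physical_cores]


def adjacent_layout(task_id):
    """Siblings are neighbours: core 0 -> CPUs 0 and 1, core 1 -> CPUs 2 and 3."""
    return [task_id * 2, task_id * 2 + 1]


def assignments_are_disjoint(cpu_lists):
    """True if no logical CPU appears in more than one task's list."""
    seen = set()
    for cpus in cpu_lists:
        for cpu in cpus:
            if cpu in seen:
                return False
            seen.add(cpu)
    return True


if __name__ == "__main__":
    cores = 8
    split = [split_layout(t, cores) for t in range(cores)]
    adjacent = [adjacent_layout(t) for t in range(cores)]
    print(split[0], split[1])
    print(adjacent[0], adjacent[1])
    print(assignments_are_disjoint(split), assignments_are_disjoint(adjacent))
```

Both layouts use each of the 16 logical CPUs exactly once, but only one of them matches a given machine. On Linux, the file /sys/devices/system/cpu/cpu0/topology/thread_siblings_list shows which logical CPUs share physical core 0, which tells you which layout applies.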