A benchmark of reproducible bugs in DNN libraries
- Docker
- Conda (Anaconda/Miniconda)
- [Only if you have an NVIDIA GPU] Nvidia Container Toolkit
- Clone this repository:
$> git clone https://github.com/ncsu-swat/dnnbugs.git
- Add dnnbugs to your PATH:
$> export PATH=$PATH:<dnnbugs_path>/framework
| Command | Description |
|---|---|
| list-tests | List the tests available in this dataset |
| run-test | Run one test |
| run-tests | Run several tests |
| show-info | Show information about the tests available in this benchmark |
| stats | Show statistics about this dataset (e.g., number of tests that require a GPU, number of tests that reproduce bugs in C or Python code, etc.) |
| run-tool | Run a testing tool in the environment of a bug in the dataset to assess the tool's ability to reproduce that bug |
- Show all reproducible bugs (tests) on this dataset:
$> list-tests
- Show help for the show-info command:
$> show-info --help
- Show information about bug 120903 from pytorch (id obtained from the command above):
$> show-info --library-name pytorch --bug-id 120903
- Reproduce that bug:
$> run-test --library-name pytorch --bug-id 120903
Sample output:
...
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.2.0+cpu
[conda] torch 2.2.0+cpu pypi_0 pypi
====== test session starts ======
platform linux -- Python 3.10.0, pytest-8.2.0, pluggy-1.5.0
rootdir: /home/mnaziri/Documents/DL_Testing/dnnbugs/pytorch/issue_120903
collected 1 item
test_issue_120903.py Pytorch issue no. 120903
Seed: 120903
RuntimeError: !needs_dynamic_casting<func_t>::check(iter) INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cpu/Loops.h":310, please report a bug to PyTorch.
====== 1 passed in 0.96s ======
- Run multiple tests, saving logs to the provided directory:
$> run-tests --log-directory ~/dnn-logs
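After a run-tests invocation, the saved logs can be scanned to count how many bugs were reproduced. A minimal sketch, assuming one log file per test containing a pytest summary line such as "1 passed in 0.96s" (the actual log layout produced by run-tests may differ):

```python
# Hypothetical sketch: count reproduced bugs from run-tests logs.
# Assumes each *.log file ends with a pytest summary line; the real
# log format written by run-tests may differ.
import re
from pathlib import Path

def summarize_logs(log_dir):
    """Return (reproduced, not_reproduced) counts for logs in log_dir.

    A test that reproduces its bug exits with a passing pytest summary,
    so a "N passed" line is treated as a successful reproduction.
    """
    reproduced, not_reproduced = 0, 0
    for log in Path(log_dir).glob("*.log"):
        if re.search(r"\b\d+ passed\b", log.read_text()):
            reproduced += 1
        else:
            not_reproduced += 1
    return reproduced, not_reproduced
```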
- Show statistics about the dataset, with the buggy files in a trie format:
$> stats --print-format trie
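The trie format groups the buggy file paths by shared directory prefixes, so files under the same directory appear nested under one node. A minimal sketch of how such a view can be built (illustrative only; the actual stats implementation may differ):

```python
# Illustrative sketch: group file paths into a trie keyed by path
# component, then render it with indentation per depth level.
def build_trie(paths):
    """Nest each path's components into a dict-of-dicts trie."""
    trie = {}
    for path in paths:
        node = trie
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return trie

def render_trie(node, indent=0, lines=None):
    """Return the trie as a list of indented lines, sorted per level."""
    lines = [] if lines is None else lines
    for name, child in sorted(node.items()):
        lines.append("  " * indent + name)
        render_trie(child, indent + 1, lines)
    return lines
```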
Testing tools can be integrated into the dataset: the framework recreates the environment of a specific bug and executes the tool in that environment to check whether the tool can reproduce the bug. Integration has four requirements:
- A Docker container with the tool's environment
- A Python file for preprocessing (i.e., extracting information such as error types and buggy APIs for the library version)
- A Python file for postprocessing (i.e., matching the error messages generated by the fuzzer against the error types recorded in the dataset)
- A script that takes the name of the library as an argument and triggers the preprocessing, the execution of the tool for that library, and the postprocessing
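The driver script's responsibilities can be sketched as follows. This is a hypothetical template, not the dataset's actual script: the preprocess/postprocess filenames and the tool command are placeholders.

```python
#!/usr/bin/env python3
# Hypothetical driver template: takes the library name as an argument
# and triggers preprocessing, tool execution, and postprocessing in
# sequence. All script names below are placeholders.
import subprocess
import sys

def run_tool_pipeline(library):
    steps = [
        # extract error types / buggy APIs for the library version
        ["python3", "preprocess.py", "--library-name", library],
        # execute the testing tool on that library
        ["bash", "run_tool.sh", library],
        # match the tool's failures against the dataset's error types
        ["python3", "postprocess.py", "--library-name", library],
    ]
    for cmd in steps:
        subprocess.run(cmd, check=True)  # abort on the first failing step

if __name__ == "__main__" and len(sys.argv) == 2:
    run_tool_pipeline(sys.argv[1])
```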
Once the Docker container is built and running, pass the container name and the location of the script that executes the tool (either an absolute path or a path relative to the repository root) to the run-tool command, along with the library name and bug id. The framework first reproduces the bug separately, then updates the environment inside the container and runs the tool with the provided script.
A demonstration of this integration is provided with FreeFuzz, a state-of-the-art testing tool. Its scripts can serve as templates for integrating any other fuzzer.
To install and run a container for FreeFuzz:
$> cd tool-integration/FreeFuzz && bash install_freefuzz_docker.sh
To run the tool by specifying a library version instead of a bug ID (e.g., torch version 2.2.0):
$> run-tool --container freefuzz --library-name pytorch --use-library-version 2.2.0 --run-script tool-integration/FreeFuzz/run_freefuzz_docker.sh
Sample output:
Using bug-id 120903 for library pytorch version 2.2.0
...
====== test session starts ======
platform linux -- Python 3.10.0, pytest-8.2.0, pluggy-1.5.0
...
RuntimeError: !needs_dynamic_casting<func_t>::check(iter) INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cpu/Loops.h":310, please report a bug to PyTorch.
====== 1 passed in 1.09s ======
...
Updating environment in the container of the testing tool
Successfully copied 16.4kB to freefuzz:/tmp/issue_120903
__pycache__ install_environment.sh reproduce_bug.sh requirements.txt test_issue_120903.py
...
Running the testing tool on the environment of the bug
APIs under test:
torch.fake_quantize_per_channel_affine
torch.all
torch.compile
Testing on ['torch']
torch.atanh
...
No violation of precision-oracle in the compare-bug category
No violation of precision-oracle in the potential-bug category
No violation of cuda-oracle in the compare-bug category
No violation of cuda-oracle in the potential-bug category
No violation of crash-oracle in the compare-bug category
No violation of crash-oracle in the potential-bug category
-> torch.fake_quantize_per_channel_affine did not face any failures
-> torch.all did not face any failures
-> torch.compile did not face any failures
Reproduced 0 out of 3 bugs
This recreates an environment containing the given library version and checks the tool against the bugs in the dataset that are reproducible with that version. In this case, there were 3 bugs from pytorch 2.2.0 and FreeFuzz did not reproduce any of them.
Alternatively, to run the tool in a single bug's environment (e.g., pytorch issue_120875):
$> run-tool --container freefuzz --library-name pytorch --bug-id 120875 --run-script tool-integration/FreeFuzz/run_freefuzz_docker.sh
After the execution completes, the output (saved inside the container) can be checked for all failures produced by FreeFuzz and manually inspected to see whether any of them reveals the bug in question. In this case, FreeFuzz could not generate a failure that reveals this bug, since the API torch._dynamo.export is not supported by FreeFuzz.
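The kind of check the postprocessing step performs can be sketched as a simple signature match between the tool's failure log and the error recorded for a dataset bug. A minimal sketch; the dataset's real matching logic (error types, buggy APIs) may be more involved, and the signature below is taken from the sample output above:

```python
# Illustrative sketch: decide whether a fuzzer's failure log contains
# the error signature recorded for a dataset bug. Real postprocessing
# may also compare error types and the APIs involved.
def reveals_bug(failure_log, error_signature):
    """Return True if any line of the log contains the expected signature."""
    return any(error_signature in line for line in failure_log.splitlines())

# Usage: match against the assertion text from the sample output above.
log = ('RuntimeError: !needs_dynamic_casting<func_t>::check(iter) '
       'INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cpu/Loops.h":310')
```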
Note: For demonstration purposes, the run_freefuzz_docker.sh script uses a demo configuration of FreeFuzz. For a full run, this script would need to be updated with a different config file. For more instructions, see the documentation of FreeFuzz.