ncsu-swat/bugsindlls

BugsinDLLs

A benchmark of reproducible bugs in DNN libraries

Prerequisites

Steps to configure

  1. Clone this repository: $> git clone https://github.com/ncsu-swat/dnnbugs.git
  2. Add dnnbugs to your PATH: $> export PATH=$PATH:<dnnbugs_path>/framework

Commands

Command      Description
list-tests   List the tests available on this dataset
run-test     Runs one test
run-tests    Runs several tests
show-info    Shows information about the tests available on this benchmark
stats        Shows statistics about this dataset (e.g., number of tests that require GPU, number of tests that reproduce bugs in C or Python code, etc.)
run-tool     Runs a testing tool in the environment of a bug in the dataset to assess the tool's ability to reproduce the bug

Example usage

  • Show all reproducible bugs (tests) on this dataset:
$> list-tests
  • Show help for the show-info command:
$> show-info --help
  • Show information about bug 120903 from pytorch (id obtained from command above):
$> show-info --library-name pytorch --bug-id 120903
  • Reproduce that bug:
$> run-test --library-name pytorch --bug-id 120903

Sample output:

...
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.2.0+cpu
[conda] torch                     2.2.0+cpu                pypi_0    pypi
====== test session starts ======
platform linux -- Python 3.10.0, pytest-8.2.0, pluggy-1.5.0
rootdir: /home/mnaziri/Documents/DL_Testing/dnnbugs/pytorch/issue_120903
collected 1 item                                                                            

test_issue_120903.py Pytorch issue no. 120903
Seed:  120903
RuntimeError: !needs_dynamic_casting<func_t>::check(iter) INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cpu/Loops.h":310, please report a bug to PyTorch. 
====== 1 passed in 0.96s ======
  • Reproduce tests, saving logs in the provided directory:
$> run-tests --log-directory ~/dnn-logs
  • Show statistics about the dataset with the buggy files in a trie format:
$> stats --print-format trie
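
The logs saved by run-tests can be summarized with a short script. The following is a minimal sketch, assuming each file in the log directory contains pytest output with a summary line like the one in the sample output above ("====== 1 passed in 0.96s ======"). The layout of the log directory and the helper names `summarize_log` and `summarize_directory` are assumptions for illustration, not part of the framework.

```python
import re
from pathlib import Path

# Matches a pytest summary line such as "====== 1 passed in 0.96s ======".
_SUMMARY_LINE = re.compile(r"^=+ (?P<body>.+?) in [\d.]+s\b.*=+[ \t]*$", re.MULTILINE)
# Matches the individual outcome counts inside a summary line.
_COUNT = re.compile(r"(\d+) (passed|failed|errors?|skipped)")


def summarize_log(text):
    """Return outcome counts (e.g. {"passed": 1}) found in pytest summary lines."""
    counts = {}
    for line in _SUMMARY_LINE.finditer(text):
        for number, outcome in _COUNT.findall(line.group("body")):
            counts[outcome] = counts.get(outcome, 0) + int(number)
    return counts


def summarize_directory(log_directory):
    """Summarize every regular file in log_directory (hypothetical flat layout)."""
    results = {}
    for path in sorted(Path(log_directory).expanduser().iterdir()):
        if path.is_file():
            results[path.name] = summarize_log(path.read_text(errors="replace"))
    return results
```

For example, `summarize_directory("~/dnn-logs")` would return one dictionary of counts per log file, provided the files follow the assumed format.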

Tool integration

Testing tools can be integrated with the dataset: the framework recreates the environment of a specific bug and runs the tool in that environment to check whether the tool can reproduce the bug. Integration has four requirements:

  1. A Docker container with the environment of the tool
  2. A Python file for preprocessing (i.e., extracting information like error types and buggy APIs from the library version)
  3. A Python file for postprocessing (i.e., matching the error messages generated by the fuzzer with the error types from the dataset)
  4. A script that takes the name of the library as an argument and triggers the preprocessing, the execution of the tool for that library, and the postprocessing
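
The postprocessing requirement can be illustrated with a small matcher. This is a minimal sketch, not the FreeFuzz integration shipped in this repository: the `ExpectedBug` record, the `failures` dictionary (API name mapped to the error messages the fuzzer produced), and both function names are hypothetical. It assumes error messages start with the exception name, as in "RuntimeError: ...".

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExpectedBug:
    """Hypothetical record produced by preprocessing for one bug in the dataset."""
    api: str         # e.g. "torch.all"
    error_type: str  # e.g. "RuntimeError"


def extract_error_type(message: str) -> Optional[str]:
    """Return the leading exception name of a message like 'RuntimeError: ...'."""
    match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_.]*(?:Error|Exception))\s*:", message)
    return match.group(1) if match else None


def match_failures(expected: List[ExpectedBug],
                   failures: Dict[str, List[str]]) -> List[ExpectedBug]:
    """Return the expected bugs whose API produced a failure of the same error type."""
    reproduced = []
    for bug in expected:
        for message in failures.get(bug.api, []):
            if extract_error_type(message) == bug.error_type:
                reproduced.append(bug)
                break  # one matching failure is enough for this bug
    return reproduced
```

Matching on the error type alone is a coarse heuristic, so a match found this way would still need manual inspection before it is counted as a reproduction.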

Once the Docker container is built and running, pass the container name, the location of the script that executes the tool (an absolute path or a path relative to the repository root), the library, and the bug id to the run-tool command. The framework first reproduces the bug separately, then updates the environment inside the container and runs the tool with the provided script.
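
When run-tool is driven from a script, the invocation can be assembled programmatically. The sketch below only builds the argument list from the flags used in this README (--container, --library-name, --bug-id, --run-script); the helper name `build_run_tool_command` is hypothetical, and the code assumes run-tool is on the PATH as configured above.

```python
import shlex


def build_run_tool_command(container, library_name, bug_id, run_script):
    """Build the argv list for a run-tool invocation targeting a single bug."""
    return [
        "run-tool",
        "--container", container,
        "--library-name", library_name,
        "--bug-id", str(bug_id),
        "--run-script", run_script,
    ]


command = build_run_tool_command(
    container="freefuzz",
    library_name="pytorch",
    bug_id=120903,
    run_script="tool-integration/FreeFuzz/run_freefuzz_docker.sh",
)
# Print the shell-quoted command; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Building a list instead of a single string avoids shell quoting problems when the script path contains spaces.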

Demonstration

A demonstration of this integration is provided with FreeFuzz, a state-of-the-art testing tool. Its scripts can serve as templates for integrating any other fuzzer.

To install and run a container for FreeFuzz:

$> cd tool-integration/FreeFuzz && bash install_freefuzz_docker.sh

To run the tool by specifying a library version instead of a bug ID (e.g., torch version 2.2.0):

$> run-tool --container freefuzz --library-name pytorch --use-library-version 2.2.0 --run-script tool-integration/FreeFuzz/run_freefuzz_docker.sh

Sample output:

Using bug-id 120903 for library pytorch version 2.2.0
...
====== test session starts ======
platform linux -- Python 3.10.0, pytest-8.2.0, pluggy-1.5.0
...
RuntimeError: !needs_dynamic_casting<func_t>::check(iter) INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cpu/Loops.h":310, please report a bug to PyTorch.
====== 1 passed in 1.09s ======
...
Updating environment in the container of the testing tool
Successfully copied 16.4kB to freefuzz:/tmp/issue_120903
__pycache__  install_environment.sh  reproduce_bug.sh  requirements.txt  test_issue_120903.py
...
Running the testing tool on the environment of the bug
APIs under test:
torch.fake_quantize_per_channel_affine
torch.all
torch.compile
Testing on  ['torch']
torch.atanh
...
No violation of precision-oracle in the compare-bug category
No violation of precision-oracle in the potential-bug category
No violation of cuda-oracle in the compare-bug category
No violation of cuda-oracle in the potential-bug category
No violation of crash-oracle in the compare-bug category
No violation of crash-oracle in the potential-bug category
-> torch.fake_quantize_per_channel_affine did not face any failures
-> torch.all did not face any failures
-> torch.compile did not face any failures
Reproduced 0 out of 3 bugs

This recreates an environment containing the library version and checks the tool against the bugs in the database that are reproduced with this version. In this case, there were 3 bugs from pytorch 2.2.0 and FreeFuzz did not reproduce any of them.
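
The final lines of the sample output can be extracted mechanically when many runs need to be compared. This is a minimal sketch, assuming the output keeps the exact wording shown above ("-> <API> did not face any failures" and "Reproduced N out of M bugs"); the function name `parse_run_tool_summary` is hypothetical.

```python
import re


def parse_run_tool_summary(output):
    """Extract the per-API results and the reproduction count from run-tool output."""
    apis_without_failures = re.findall(
        r"^-> (\S+) did not face any failures[ \t]*$", output, flags=re.MULTILINE
    )
    match = re.search(
        r"^Reproduced (\d+) out of (\d+) bugs[ \t]*$", output, flags=re.MULTILINE
    )
    return {
        "apis_without_failures": apis_without_failures,
        # None when the summary line is missing (e.g. the run was interrupted).
        "reproduced": int(match.group(1)) if match else None,
        "total": int(match.group(2)) if match else None,
    }
```

The function returns None for the counts instead of raising, so a truncated log is distinguishable from a run that reproduced 0 bugs.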

Alternatively, to run the tool in a single bug's environment (e.g., pytorch issue_120875):

$> run-tool --container freefuzz --library-name pytorch --bug-id 120875 --run-script tool-integration/FreeFuzz/run_freefuzz_docker.sh

After the execution completes, the failures produced by FreeFuzz (saved inside the container) can be manually inspected to see whether any of them reveals the bug in question. In this case, FreeFuzz could not generate a failure that reveals this bug because it does not support the API "torch._dynamo.export".

Note: For demonstration purposes, the run_freefuzz_docker.sh script uses a demo configuration of FreeFuzz. For a full run, update this script to use a different config file. For more instructions, see the FreeFuzz documentation.
