change_info
eval_outcome
information_retrieval
ml_data
reinforcement_learning
README.md
eval_const.py
eval_main.py
eval_metric.py
eval_tcp.py
eval_utils.py
extract_change_features.py
extract_change_info_from_raw_data.py
extract_hist_features.py
extract_ml_data.py
extract_test_features.sh
extract_test_file_occ_hist_features.py

README

Extracting test features

To extract test features for a project, we first need the processed test results and code change data (processed_test_results/ and shadata/), which are generated by the scripts in the parent folder (see the README in the parent folder).

Run ./extract_test_features.sh to extract test features. It creates a folder, tcp_features, that stores test features per test-suite run. In tcp_features, change_feature.csv stores code-change-related features, while historical.csv and test_file_occ.csv store test duration, test outcome, and (test outcome, changed file) occurrence features. These features are used by the evaluated TCP techniques listed in eval_const.py.
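
As a quick sanity check after running the script, the sketch below reports which of the three feature files named above are missing from a given directory. The file names come from this README; the exact directory layout inside tcp_features is not specified here, so the path passed in is an assumption, and missing_feature_files is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# File names as listed in this README. Where exactly they live inside
# tcp_features (e.g. one sub-folder per test-suite run) is an assumption.
FEATURE_FILES = ("change_feature.csv", "historical.csv", "test_file_occ.csv")


def missing_feature_files(run_dir):
    """Return the expected feature files that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [name for name in FEATURE_FILES if not (run_dir / name).is_file()]
```

For example, missing_feature_files("tcp_features/some_run") returns an empty list when all three files are present in that (hypothetical) folder.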

Additionally, we can run extract_ml_data.py to collect data for ML-based TCP techniques. Follow the instructions in information_retrieval/ and reinforcement_learning/ to collect data for IR-based and RL-based TCP techniques.

Running TCP techniques

Run python3 eval_main.py to evaluate TCP techniques on specific projects in the dataset at test-class granularity. The techniques to evaluate (variable EVAL_TCPS) and the evaluation metrics (variable METRIC_NAMES) are configured in eval_const.py, and the projects to evaluate are configured in ../const.py. The evaluation outcome is saved in eval_outcome/[project]/d_[filters]/[tcp_name].csv.zip. By default, all TCP techniques are evaluated on all defined metrics under both one-to-one and many-to-one failure-to-fault mappings.
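
When post-processing results, it can help to recover the project, filter label, and technique name from an outcome path. The sketch below assumes only the path pattern stated above; parse_outcome_path is a hypothetical helper, and how multiple filters are encoded inside the d_[filters] folder name is not specified here, so that part is returned as an opaque string.

```python
from pathlib import PurePosixPath

SUFFIX = ".csv.zip"
PREFIX = "d_"


def parse_outcome_path(path):
    """Split eval_outcome/[project]/d_[filters]/[tcp_name].csv.zip into parts."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4 or parts[-4] != "eval_outcome":
        raise ValueError(f"not an eval_outcome path: {path}")
    project, filter_dir, file_name = parts[-3:]
    if not filter_dir.startswith(PREFIX) or not file_name.endswith(SUFFIX):
        raise ValueError(f"unexpected outcome path layout: {path}")
    return {
        "project": project,
        # Kept as-is: the encoding of filter combinations is an unknown here.
        "filters": filter_dir[len(PREFIX):],
        "tcp_name": file_name[: -len(SUFFIX)],
    }
```

The files themselves are zipped CSVs, so a library such as pandas can typically read them directly from the .csv.zip path.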

We can also choose the version of the dataset used for evaluation by applying different filters, set in eval_const.py (variable FILTER_COMBOS), that keep or omit some of the builds that have failed tests. The labeled version of the test results can be obtained by running ../extract_filtered_test_result.py. Each filter is described below:

FILTER_FIRST keeps the builds that contain a failure of a test that failed for the first time in the collected CI history;
FILTER_JIRA omits test failures caused by flaky tests identified from JIRA/GitHub issues;
FILTER_STAGEUNIQUE omits tests that failed on only one stage when the build runs the same test suite on several stages with different environments (e.g., JDK 8 vs. JDK 11);
FILTER_FREQFAIL omits tests that fail much more frequently than other failed tests, as determined by the three-sigma rule of thumb.
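
To illustrate the three-sigma rule of thumb behind FILTER_FREQFAIL, the sketch below flags tests whose failure count lies more than three standard deviations above the mean failure count. This is a generic illustration, not the repository's implementation: the function name, the input format (a mapping from test name to failure count), and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def frequent_failers(fail_counts, num_sigmas=3.0):
    """Return the tests whose failure count exceeds mean + num_sigmas * std.

    fail_counts maps a test name to the number of builds in which it failed
    (a hypothetical input format chosen for this example).
    """
    counts = list(fail_counts.values())
    if len(counts) < 2:
        return set()  # too few tests to estimate a spread
    threshold = mean(counts) + num_sigmas * pstdev(counts)
    return {test for test, count in fail_counts.items() if count > threshold}
```

Note that with only a handful of tests no single count can exceed a three-sigma threshold, so the rule is only meaningful when many failed tests are compared.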
