Unable to reproduce paper results #9

@melfm

Thanks a lot for the effort to release the codebase. I am trying to reproduce the results from the paper, but I am getting lower performance than what was reported on most of the datasets. Could this be a variance problem related to seed selection? Were the reported results run over a single seed?

In particular, I am having trouble reproducing SeqCombMV, where the performance is significantly lower than reported (lower even than the IG and Dynamask baselines). I get the following results when running the model on this dataset:

Results for ours explainer on seqcomb_mv with split=1
	auprc 	 = 0.2960 +- 0.0023
	aup 	 = 0.7468 +- 0.0020
	aur 	 = 0.3036 +- 0.0021
	iou 	 = 0.1143 +- 0.0013
Results for ours explainer on seqcomb_mv with split=2
	auprc 	 = 0.1231 +- 0.0039
	aup 	 = 0.0888 +- 0.0022
	aur 	 = 0.5560 +- 0.0042
	iou 	 = 0.0584 +- 0.0028
Results for ours explainer on seqcomb_mv with split=3
	auprc 	 = 0.7016 +- 0.0038
	aup 	 = 0.7407 +- 0.0015
	aur 	 = 0.4463 +- 0.0020
	iou 	 = 0.3340 +- 0.0028
Results for ours explainer on seqcomb_mv with split=4
	auprc 	 = 0.2680 +- 0.0031
	aup 	 = 0.7546 +- 0.0034
	aur 	 = 0.1154 +- 0.0023
	iou 	 = 0.1375 +- 0.0020
Results for ours explainer on seqcomb_mv with split=5
	auprc 	 = 0.0812 +- 0.0021
	aup 	 = 0.0551 +- 0.0015
	aur 	 = 0.4215 +- 0.0067
	iou 	 = 0.0384 +- 0.0022
Results for ours explainer on all splits
	auprc 	 = 0.2940 +- 0.0039
	aup 	 = 0.4772 +- 0.0055
	aur 	 = 0.3685 +- 0.0030
	iou 	 = 0.1365 +- 0.0020

And this is what was reported in the paper:

AUPRC AUP AUR
0.6878±0.0021 0.8326±0.0008 0.3872±0.0015
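
As a sanity check on the numbers above, here is a minimal sketch in plain Python that recomputes the across-split mean and the across-split standard deviation from the per-split values in the log, and compares the mean with the paper's row. The values are copied by hand from the log and the paper row above; the `summarize` helper is hypothetical and not part of the repository. It assumes each split is weighted equally, and it makes no assumption about how the `+-` figures in the log are computed, since the log does not say.

```python
from statistics import mean, stdev

# Per-split means copied by hand from the log above (splits 1..5).
per_split = {
    "auprc": [0.2960, 0.1231, 0.7016, 0.2680, 0.0812],
    "aup":   [0.7468, 0.0888, 0.7407, 0.7546, 0.0551],
    "aur":   [0.3036, 0.5560, 0.4463, 0.1154, 0.4215],
    "iou":   [0.1143, 0.0584, 0.3340, 0.1375, 0.0384],
}

# Values reported in the paper, as quoted above (no IoU is quoted).
paper = {"auprc": 0.6878, "aup": 0.8326, "aur": 0.3872}


def summarize(per_split, paper):
    """Hypothetical helper: across-split mean/std and the gap to the paper."""
    rows = {}
    for name, values in per_split.items():
        m = mean(values)
        rows[name] = {
            "mean": m,
            "std": stdev(values),  # sample std across the 5 splits
            "gap": paper[name] - m if name in paper else None,
        }
    return rows


summary = summarize(per_split, paper)
for name, row in summary.items():
    gap = "n/a" if row["gap"] is None else f"{row['gap']:+.4f}"
    print(f"{name:6s} mean={row['mean']:.4f} std={row['std']:.4f} gap_to_paper={gap}")
```

The across-split means come out within rounding of the "all splits" figures in the log, so the aggregate looks like a plain average over splits. The spread between splits is far larger than the `+-` values printed in the log, which is why a single split (for example split 3) can land close to the paper while the average does not.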

I double-checked the hyperparameters as well. Is it possible that there is a problem with the generated data, or an error in the hyperparameters?

Thanks a lot for your help in advance!
