I used the same data format (findings section only) as R2Gen and R2GenCMN (Chen et al.), which this article follows, but I was unable to reproduce the CE metric results reported in the paper.
I tested the provided `epoch=8-val_chen_cider=0.425092.ckpt` checkpoint for the `cvt_21_to_distilgpt2` task and the `epoch=0-val_chen_cider=0.410965.ckpt` checkpoint for the `cvt_21_to_distilgpt2_scst` task, but neither reached the CE metric results reported in the paper.
For the CE metrics, `precision_macro` matches the paper, but `recall_macro` and `f1_macro` fall well short of the reported values.
When calculating the CE metrics, I only consider the findings text. Is any other processing required?
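To make the symptom concrete, here is a minimal sketch of macro-averaged precision, recall, and F1 over binary label vectors. It assumes each per-label score is computed first and then averaged across labels, with undefined ratios counted as 0; whether the repository's CE metrics use exactly this convention is an assumption, and `macro_scores` is a hypothetical helper, not a function from the code base.

```python
def macro_scores(y_true, y_pred):
    """Macro-averaged precision, recall and F1 for rows of 0/1 labels.

    Assumption: undefined ratios (zero denominators) are counted as 0.0.
    """
    num_labels = len(y_true[0])
    precisions, recalls, f1s = [], [], []
    for j in range(num_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return (
        sum(precisions) / num_labels,
        sum(recalls) / num_labels,
        sum(f1s) / num_labels,
    )


# Toy data: two labels, four reports. Label 0 is under-predicted.
y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [0, 0]]
p_macro, r_macro, f1_macro = macro_scores(y_true, y_pred)
print(p_macro, r_macro, f1_macro)
```

In this toy case every predicted positive is correct, so macro precision stays at 1.0, while the missed positives on label 0 pull macro recall and macro F1 down. Missing label mentions in the generated reports would show up the same way.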
Yes, I am using this dataset, and the precision has reached the level reported in the paper. However, the recall is low and does not reach the reported level.
I have also checked and tested the updated source code. The CE metric results did not change much, and the latest source code raises an error at runtime.
The error occurred while I was running the `cvt_21_to_distilgpt2` task.
Line 281 of transmodal.model.py reads `if not getattr(self, metric).compute_on_step:`.
The error indicates that the `compute_on_step` attribute does not exist.
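One possible workaround, sketched below, is to read the attribute defensively so that a metric object without `compute_on_step` does not raise `AttributeError`. This is only a sketch: the two classes are stand-ins rather than the repository's metric classes, `computes_on_step` is a hypothetical helper, and the right fallback value depends on how the installed metric library behaves, which is not confirmed in this thread.

```python
class MetricWithFlag:
    """Stand-in for a metric object that still defines compute_on_step."""
    compute_on_step = False


class MetricWithoutFlag:
    """Stand-in for a metric object that no longer defines the attribute."""


def computes_on_step(metric_obj, default):
    # getattr with a third argument returns `default` instead of raising
    # AttributeError when the attribute is missing.
    return getattr(metric_obj, "compute_on_step", default)


# The fallback (True here) is an assumption chosen for illustration.
print(computes_on_step(MetricWithFlag(), default=True))
print(computes_on_step(MetricWithoutFlag(), default=True))
```

If the attribute is missing because a newer version of the metric library was installed than the one the code was written for, pinning the version listed in the repository's requirements may be the simpler fix.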
The results obtained from performing the `cvt_21_to_distilgpt2` task are as follows:
{'test_ce_f1_example': 0.36598095297813416,
'test_ce_f1_macro': 0.2593880891799927,
'test_ce_f1_micro': 0.4408090114593506,
'test_ce_num_examples': 3858.0,
'test_ce_precision_example': 0.4171517491340637,
'test_ce_precision_macro': 0.3600466549396515,
'test_ce_precision_micro': 0.4919118881225586,
'test_ce_recall_example': 0.3665845990180969,
'test_ce_recall_macro': 0.25423887372016907,
'test_ce_recall_micro': 0.3993246555328369,
'test_chen_bleu_1': 0.39292487502098083,
'test_chen_bleu_2': 0.24805393815040588,
'test_chen_bleu_3': 0.17164887487888336,
'test_chen_bleu_4': 0.1269991397857666,
'test_chen_cider': 0.3902686834335327,
'test_chen_meteor': 0.15456412732601166,
'test_chen_num_examples': 3858.0,
'test_chen_rouge': 0.286588191986084}
The results in the paper are as follows:
precision_macro: 0.3597
recall_macro: 0.4122
f1_macro: 0.3842
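As a quick arithmetic check on these numbers, the sketch below computes the harmonic mean of macro precision and macro recall for the paper's values and for the `cvt_21_to_distilgpt2` run. The helper name `harmonic_mean` is introduced here for illustration only, and how the paper actually derived its F1 is not stated in this thread.

```python
def harmonic_mean(precision, recall):
    """Harmonic mean of two scores; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Values reported in the paper.
paper_f1_from_pr = harmonic_mean(0.3597, 0.4122)

# Values from the cvt_21_to_distilgpt2 run in this issue.
run_f1_from_pr = harmonic_mean(0.3600466549396515, 0.25423887372016907)
run_f1_reported = 0.2593880891799927

print(round(paper_f1_from_pr, 4))
print(round(run_f1_from_pr, 4))
print(round(run_f1_reported, 4))
```

The paper's `f1_macro` of 0.3842 is consistent with the harmonic mean of its macro precision and macro recall. For the run here, the same formula gives roughly 0.298, while the reported `test_ce_f1_macro` is about 0.259, so the two F1 figures may not be computed the same way. This does not explain the gap in `recall_macro` itself.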
The results obtained from performing the `cvt_21_to_distilgpt2_scst` task are as follows:
{'test_ce_f1_example': 0.36484676599502563,
'test_ce_f1_macro': 0.26361414790153503,
'test_ce_f1_micro': 0.4410783648490906,
'test_ce_num_examples': 3858.0,
'test_ce_precision_example': 0.4175392985343933,
'test_ce_precision_macro': 0.3873042166233063,
'test_ce_precision_micro': 0.49624764919281006,
'test_ce_recall_example': 0.3643813729286194,
'test_ce_recall_macro': 0.2558453679084778,
'test_ce_recall_micro': 0.3969484865665436,
'test_chen_bleu_1': 0.39466917514801025,
'test_chen_bleu_2': 0.248764768242836,
'test_chen_bleu_3': 0.1718045324087143,
'test_chen_bleu_4': 0.1269892156124115,
'test_chen_cider': 0.37993040680885315,
'test_chen_meteor': 0.15499255061149597,
'test_chen_num_examples': 3858.0,
'test_chen_rouge': 0.28760746121406555}
To reproduce the results above, I only modified the task parameters in task/mimic_cxr_jpg_chen/jobs.yaml.