Can you provide the prompts used when evaluating the multi-modal baseline models?
When I tried to reproduce the baseline results on the ChartX benchmark, I found that the models' results are sensitive to the prompt, especially on single_class_chart types, because these charts do not contain enough information to extract the header. Typos may also affect the evaluation results, as in these three outputs:
Air\t20.0%\nRail\t15.0%\nRoad\t45.0%\nSea\t20.0%\nOther\t0.0%
Mode \t ratio \n Air \t 20% \n Rail \t 15% \n Road \t 45% \n Sea \t 20% \n Other \t 0% \n
Mode \t Percentage \n Air \t 20% \n Rail \t 15% \n Road \t 45% \n Sea \t 20% \n Other \t 0% \n
So I would like to know how you deal with this when evaluating models that are not fine-tuned on your ChartX dataset.
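To make the concern concrete, here is a minimal sketch of how the three outputs above differ only in header and number formatting. It is an illustration only, not the ChartX evaluation code: `parse_rows` is a hypothetical helper, and treating any row whose second cell is non-numeric as a header is an assumption.

```python
def parse_rows(raw: str) -> dict[str, float]:
    """Parse a tab/newline-separated model output into {label: value}.

    Hypothetical helper for illustration; not part of ChartX.
    """
    # The outputs quoted above contain literal "\t" and "\n" escapes,
    # so convert them to real tab and newline characters first.
    text = raw.replace("\\t", "\t").replace("\\n", "\n")
    rows = {}
    for line in text.split("\n"):
        cells = [c.strip() for c in line.split("\t")]
        if len(cells) != 2 or not cells[0]:
            continue  # skip empty or malformed lines
        value = cells[1].rstrip("%").strip()
        try:
            rows[cells[0].lower()] = float(value)
        except ValueError:
            # Assumption: a non-numeric second cell marks a header row.
            continue
    return rows


out1 = r"Air\t20.0%\nRail\t15.0%\nRoad\t45.0%\nSea\t20.0%\nOther\t0.0%"
out2 = r"Mode \t ratio \n Air \t 20% \n Rail \t 15% \n Road \t 45% \n Sea \t 20% \n Other \t 0% \n"
out3 = r"Mode \t Percentage \n Air \t 20% \n Rail \t 15% \n Road \t 45% \n Sea \t 20% \n Other \t 0% \n"

parsed = [parse_rows(o) for o in (out1, out2, out3)]
# After dropping the header and normalizing numbers, all three agree.
same_data = parsed[0] == parsed[1] == parsed[2]
```

With this kind of normalization the three outputs carry identical data, so any score difference between them would come from the header and formatting rather than from the extracted values.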