charxiv.github.io - CharXiv

Description: Chart understanding is crucial for applying Multimodal Large Language Models (MLLMs) to tasks like analyzing scientific papers and financial reports. However, current datasets often use simplified charts with template-based questions, leading to overly optimistic progress assessments. We introduce CharXiv, an evaluation suite with 2,323 diverse and challenging charts from scientific papers. CharXiv includes two question types: (1) descriptive questions on basic chart elements and (2) reasoning questions requiring the synthesis of information across complex visual elements in the chart.

ai (10440), benchmarks (77), multimodal large language models, chart understanding (1)

Example domain paragraphs

Chart understanding plays a pivotal role when applying Multimodal Large Language Models (MLLMs) to real-world tasks such as analyzing scientific papers or financial reports. However, existing datasets often focus on oversimplified and homogeneous charts with template-based questions, leading to an over-optimistic measure of progress. In this work, we propose CharXiv, a comprehensive evaluation suite involving 2,323 natural, challenging, and diverse charts from scientific papers. CharXiv includes two types of questions: (1) descriptive questions about examining basic chart elements and (2) reasoning questions requiring synthesizing information across complex visual elements in the chart.

Figure: Many open-source models surpass proprietary models on existing benchmarks (subsets of DVQA, FigureQA, and ChartQA from MathVista), yet they consistently fail on reasoning questions from CharXiv.

We evaluate general-purpose MLLMs on CharXiv and provide a leaderboard for the community to track progress. Note that all models are evaluated in a zero-shot setting with a set of instructions for each question type.
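The zero-shot protocol described above (one fixed instruction per question type, the chart image plus a single question, no in-context examples) can be sketched as follows. This is a minimal illustrative sketch, not CharXiv's actual evaluation code: the instruction texts, the `ask_model` callable, and the question-record fields are all assumed names for illustration.

```python
# Hypothetical sketch of a zero-shot chart-QA evaluation loop.
# Each question type gets its own fixed instruction; the model sees the
# chart image and one question with no in-context demonstrations.
# INSTRUCTIONS texts and the ask_model signature are assumptions,
# not CharXiv's real prompts or API.

INSTRUCTIONS = {
    "descriptive": "Answer the question about the chart's basic elements concisely.",
    "reasoning": "Reason over the chart's visual elements, then give a final answer.",
}

def evaluate(questions, ask_model):
    """Score a model zero-shot: one instruction per question type, no demos.

    questions: list of dicts with keys "type", "image", "question", "answer".
    ask_model: callable (image, prompt) -> answer string.
    Returns exact-match accuracy (case-insensitive).
    """
    correct = 0
    for q in questions:
        # Prepend the fixed per-type instruction to the question.
        prompt = f"{INSTRUCTIONS[q['type']]}\nQuestion: {q['question']}"
        answer = ask_model(q["image"], prompt)
        correct += int(answer.strip().lower() == q["answer"].strip().lower())
    return correct / len(questions)
```

In practice, benchmark scoring for free-form answers is usually more forgiving than exact match (e.g. numeric tolerance or an LLM-based grader), but the loop structure is the same.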
