Multimodal LLMs for Visualization Understanding and Generation
Benchmarks and models for chart comprehension, narrative generation, and visual data communication. This line of work shapes how multimodal LLMs interpret, explain, and generate visualizations in real analytic workflows.
Representative papers
- Text2Vis: Generating Multimodal Visualizations from Natural Language (EMNLP 2025)
- ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild (COLING 2025)
- BigCharts-R1: Visual Reinforcement Learning for Chart Reasoning (COLM 2025)
- ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning (Findings of ACL 2022)
- Chart-to-Text: A Large-Scale Benchmark for Chart Summarization (ACL 2022)
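To make the chart-comprehension setting concrete, here is a minimal inference sketch that poses one chart question to ChartGemma through Hugging Face Transformers. It is a sketch under stated assumptions, not an official pipeline: the Hub checkpoint ID `ahmed-masry/chartgemma`, the local image path, and the prompt wording are assumptions; consult the model card for authoritative usage.

```python
# Minimal chart-QA sketch with ChartGemma (a PaliGemma-based model).
# Assumptions are flagged inline; see the model card for exact usage.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

# Assumption: the released checkpoint lives at this Hub ID.
MODEL_ID = "ahmed-masry/chartgemma"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# Assumption: a local chart screenshot; any bar/line/pie chart image works.
image = Image.open("chart.png").convert("RGB")
question = "Which category has the highest value?"

inputs = processor(text=question, images=image, return_tensors="pt").to(model.device)

# Greedy decoding keeps the example deterministic.
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Strip the prompt tokens and keep only the generated answer.
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```

The same loop covers the benchmark entries above: swap the prompt for a ChartQA-style reasoning question or a Chart-to-Text summarization instruction to probe the other tasks in the list.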