[Minor] Support Gemini 2.5 Flash / Pro by kennymckormick · Pull Request #958 · open-compass/VLMEvalKit

kennymckormick · 2025-04-27T07:25:20Z

No description provided.

* add vgrpbench
* remove unnecessary files
* [Improvement] Allow setting model name for lmdeploy wrapper (#913)
Signed-off-by: Isotr0py <[email protected]>
* [Minor] Add GPT-4.1
* [Fix] Fix function extract_subjective in dataset/creation.py (#911)
* creation: extract_subjective
* fix lint
* [Fix] fix LA mode in HLE
* [Fix] Fix COT Prompt BUG (#922)
* [Patch] Bypass SSL (#923)
* [Benchmark] Add PHYSICS Benchmark for Open-Ended Physics Reasoning (#931)
* add physic.py and update dataset logic
* Initial commit:integrated physics prompt eval
* fix lint
* [Fix] update get judge model logic in physics dataset
* edit the prompt in auxeval
* fix auxeval in physices
* fix lint
---------
Co-authored-by: FangXinyu-0913 <[email protected]>
* [Dataset ] add support for SAIL-VL-1.5 (#926)
* Modify commit
* Rename commit
* Remove commit
* Remove commit
* Rename file
* Rename file
* Fix formatting
---------
Co-authored-by: jinfeng.km <[email protected]>
Co-authored-by: qiuyan.kk <[email protected]>
* remove unnecessary file
* [Fix] fix physics_yale with not using custom prompt in internvl series
* [Model] Support SAIL-VL-1.6 (#939)
Co-authored-by: qiuyan.kk <[email protected]>
* [Minor] More info in tqdm progress bar (#937)
* [Feature] Add vLLM support for Qwen2-VL/Qwen2.5-VL (#935)
Co-authored-by: TianhaoLiang2000 <[email protected]>
* [Benchmark] Support MMIFEval (#938)
* add mmifeval
* add req nltk
* [Fix] update url and remove unnecessary log
---------
Co-authored-by: FangXinyu-0913 <[email protected]>
* [Model] Add support for Janus-Pro-1B (#945)
add support for Janus-Pro-1B
* [Minor] Patch to fix DynaMath preprocess
* add vgrpbench
* [Fix] Fix Lint
* remove files
* add vgrpbench's format json files, and update gitignore rule
* [Benchmark] Add PHYSICS Benchmark for Open-Ended Physics Reasoning (#931)
* add physic.py and update dataset logic
* Initial commit:integrated physics prompt eval
* fix lint
* [Fix] update get judge model logic in physics dataset
* edit the prompt in auxeval
* fix auxeval in physices
* fix lint
---------
Co-authored-by: FangXinyu-0913 <[email protected]>
* [Benchmark] Add Support for Spatial457 Benchmark (CVPR 2025 Highlight) (#932)
* update spatial457
* fix format
* update readme
* update README.md
* update summarize.py
* update dataset/__init__.py
* update summarize.py
* Revert image_vqa.py
* add back spatial457
* Implement a more robust strategy for Spatial457 answer matching
---------
Co-authored-by: kennymckormick <[email protected]>
Co-authored-by: Haodong Duan <[email protected]>
* [Model] add support for Qwen2.5-Omni (#883)
* add support for qwen2.5_omni
* add support for qwen2.5_omni (only single process)
* update model cls for qwen2_5omni
* Delete VIDEO_DLC_scripts/MMSci_internvl2_8b.sh
* Delete VIDEO_DLC_scripts/video_lb_update_cu118_smol.sh
* Delete VIDEO_DLC_scripts/video_lb_update_qwen2_5_vl_7b.sh
* Delete files
* [Fix] Fix Lint
---------
Co-authored-by: Haodong Duan <[email protected]>
Co-authored-by: kennymckormick <[email protected]>
* [Benchmark] Support VisuLogic (#944)
Co-authored-by: Haodong Duan <[email protected]>
* [Minor] Support Gemini 2.5 Flash / Pro (#958)
* [Minor] Add Explicit Format Instruction for AMBER (#961)
* [Minor] Fix all_finished return null (#951)
* [Benchmark] Support CVBench (CV-Bench-2D, CV-Bench-3D) (#909)
* [Benchmark] Support CVBench, including CV-Bench-2D, CV-Bench-3D two sub tasks.
* fix(image_mcq.py): prompt error
* [Fix] Fix vllm with config (#953)
* fix use config with vllm
* fix
* update
* [Fix] Fix MM-IFEval & Custom Prompt in InternVL (#959)
* [Model] Support SenseNova-V6-Pro (#964)
* [Model] Support SenseNova-V6
* update model config
* update
* update
* update config
* [Benchmark] Support TDBench (#947)
* [Benchmark] Add TDBench for top-down images
* fix REresult symlink and index
* fix symlink
* fix lint
* [Fix] Refactor Task Launching Policy (#952)
* Update run.py
* [Refactor] Set CUDA_VISIBLE_DEVICES at the beginning
* [Minor] auto / cuda device for several VLMs
* [Doc] Update Doc
* [Minor] Update CV-Bench URL
* [Fix] Fix tmp ans load error in MM-IFEval (#969)
* Fix tmp ans load error in MM-IFEval
* Fix KeyError 0
* add vgrpbench
* [Benchmark] Add PHYSICS Benchmark for Open-Ended Physics Reasoning (#931)
* add physic.py and update dataset logic
* Initial commit:integrated physics prompt eval
* fix lint
* [Fix] update get judge model logic in physics dataset
* edit the prompt in auxeval
* fix auxeval in physices
* fix lint
---------
Co-authored-by: FangXinyu-0913 <[email protected]>
---------
Signed-off-by: Isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: kennymckormick <[email protected]>
Co-authored-by: Shengyuan Ding <[email protected]>
Co-authored-by: Xinyu Fang <[email protected]>
Co-authored-by: Haodong Duan <[email protected]>
Co-authored-by: suencgo <[email protected]>
Co-authored-by: cmatachuan <[email protected]>
Co-authored-by: jinfeng.km <[email protected]>
Co-authored-by: qiuyan.kk <[email protected]>
Co-authored-by: Xiangyu Zhao <[email protected]>
Co-authored-by: TianhaoLiang2000 <[email protected]>
Co-authored-by: TianhaoLiang2000 <[email protected]>
Co-authored-by: Jiang Li <[email protected]>
Co-authored-by: Xingrui Wang <[email protected]>
Co-authored-by: xwy-bit <[email protected]>
Co-authored-by: psp_dada <[email protected]>
Co-authored-by: MaoSong2022 <[email protected]>
Co-authored-by: Scott Zhao <[email protected]>

[Minor] Support Gemini 2.5 Flash / Pro

c6cde8c

kennymckormick merged commit 5bf4a4f into main Apr 27, 2025
8 checks passed

kennymckormick added a commit to ryf1123/VLMEvalKit that referenced this pull request Apr 30, 2025

[Minor] Support Gemini 2.5 Flash / Pro (open-compass#958)

692bf18

Koii2k3 pushed a commit to wjnwjn59/VLMEvalKit that referenced this pull request Nov 13, 2025

[Minor] Support Gemini 2.5 Flash / Pro (open-compass#958)

507c049

kennymckormick deleted the gemini25 branch December 10, 2025 06:14

kennymckormick merged 1 commit into main from gemini25
