Commit fea3806

Merge pull request #108 from EvolvingLMMs-Lab/internal_main_dev
[Upgrade to v0.2] Embracing Video Evaluations with LMMs-Eval
2 parents: d99a24a + 05dc8e8

368 files changed: 14960 additions & 1343 deletions

.github/issue_template.md

File mode changed: 100644 → 100755

.github/pull_request_template.md

File mode changed: 100644 → 100755

.github/workflows/black.yml

File mode changed: 100644 → 100755

.gitignore

File mode changed: 100644 → 100755
Lines changed: 8 additions & 0 deletions
@@ -29,3 +29,11 @@ ckpt
 pretrained/
 LLaVA/
 *logs
+temp/
+InternVL/
+logs/
+data/
+llava-video/
+Video-MME/
+VATEX/
+lmms_eval/tasks/vatex/__pycache__/utils.cpython-310.pyc

.pre-commit-config.yaml

File mode changed: 100644 → 100755

README.md

File mode changed: 100644 → 100755
Lines changed: 167 additions & 233 deletions
Large diffs are not rendered by default.

docs/README.md

File mode changed: 100644 → 100755

docs/commands.md

File mode changed: 100644 → 100755

docs/current_tasks.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+# Current Tasks
+
+> () indicates the task name in the lmms_eval. The task name is also used to specify the dataset in the configuration file.
+> The following is manually updated documentation. You could use `lmms_eval task --list` to list all supported tasks and their task names.
+
+- AI2D (ai2d)
+- ChartQA (chartqa)
+- CMMMU (cmmmu)
+- CMMMU Validation (cmmmu_val)
+- CMMMU Test (cmmmu_test)
+- COCO Caption (coco_cap)
+- COCO 2014 Caption (coco2014_cap)
+- COCO 2014 Caption Validation (coco2014_cap_val)
+- COCO 2014 Caption Test (coco2014_cap_test)
+- COCO 2017 Caption (coco2017_cap)
+- COCO 2017 Caption MiniVal (coco2017_cap_val)
+- COCO 2017 Caption MiniTest (coco2017_cap_test)
+- [ConBench](https://github.com/foundation-multimodal-models/ConBench) (conbench)
+- DOCVQA (docvqa)
+- DOCVQA Validation (docvqa_val)
+- DOCVQA Test (docvqa_test)
+- Ferret (ferret)
+- Flickr30K (flickr30k)
+- Ferret Test (ferret_test)
+- GQA (gqa)
+- HallusionBenchmark (hallusion_bench_image)
+- Infographic VQA (info_vqa)
+- Infographic VQA Validation (info_vqa_val)
+- Infographic VQA Test (info_vqa_test)
+- LLaVA-Bench (llava_in_the_wild)
+- LLaVA-Bench-COCO (llava_bench_coco)
+- MathVerse (mathverse)
+- MathVerse Text Dominant (mathverse_testmini_text_dominant)
+- MathVerse Text Only (mathverse_testmini_text_only)
+- MathVerse Text Lite (mathverse_testmini_text_lite)
+- MathVerse Vision Dominant (mathverse_testmini_vision_dominant)
+- MathVerse Vision Intensive (mathverse_testmini_vision_intensive)
+- MathVerse Vision Only (mathverse_testmini_vision_only)
+- MathVista (mathvista)
+- MathVista Validation (mathvista_testmini)
+- MathVista Test (mathvista_test)
+- MMBench (mmbench)
+- MMBench English (mmbench_en)
+- MMBench English Dev (mmbench_en_dev)
+- MMBench English Test (mmbench_en_test)
+- MMBench Chinese (mmbench_cn)
+- MMBench Chinese Dev (mmbench_cn_dev)
+- MMBench Chinese Test (mmbench_cn_test)
+- MME (mme)
+- MMMU (mmmu)
+- MMMU Validation (mmmu_val)
+- MMMU Test (mmmu_test)
+- MMUPD (mmupd)
+- MMUPD Base (mmupd_base)
+- MMAAD Base (mmaad_base)
+- MMIASD Base (mmiasd_base)
+- MMIVQD Base (mmivqd_base)
+- MMUPD Option (mmupd_option)
+- MMAAD Option (mmaad_option)
+- MMIASD Option (mmiasd_option)
+- MMIVQD Option (mmivqd_option)
+- MMUPD Instruction (mmupd_instruction)
+- MMAAD Instruction (mmaad_instruction)
+- MMIASD Instruction (mmiasd_instruction)
+- MMIVQD Instruction (mmivqd_instruction)
+- MMVet (mmvet)
+- Multi-DocVQA (multidocvqa)
+- Multi-DocVQA Validation (multidocvqa_val)
+- Multi-DocVQA Test (multidocvqa_test)
+- NoCaps (nocaps)
+- NoCaps Validation (nocaps_val)
+- NoCaps Test (nocaps_test)
+- OKVQA (ok_vqa)
+- OKVQA Validation 2014 (ok_vqa_val2014)
+- POPE (pope)
+- RefCOCO (refcoco)
+- refcoco_seg_test
+- refcoco_seg_val
+- refcoco_seg_testA
+- refcoco_seg_testB
+- refcoco_bbox_test
+- refcoco_bbox_val
+- refcoco_bbox_testA
+- refcoco_bbox_testB
+- RefCOCO+ (refcoco+)
+- refcoco+_seg
+- refcoco+_seg_val
+- refcoco+_seg_testA
+- refcoco+_seg_testB
+- refcoco+_bbox
+- refcoco+_bbox_val
+- refcoco+_bbox_testA
+- refcoco+_bbox_testB
+- RefCOCOg (refcocog)
+- refcocog_seg_test
+- refcocog_seg_val
+- refcocog_bbox_test
+- refcocog_bbox_val
+- ScienceQA (scienceqa_full)
+- ScienceQA Full (scienceqa)
+- ScienceQA IMG (scienceqa_img)
+- ScreenSpot (screenspot)
+- ScreenSpot REC / Grounding (screenspot_rec)
+- ScreenSpot REG / Instruction Generation (screenspot_reg)
+- SeedBench (seedbench)
+- SeedBench 2 (seedbench_2)
+- ST-VQA (stvqa)
+- TextCaps (textcaps)
+- TextCaps Validation (textcaps_val)
+- TextCaps Test (textcaps_test)
+- TextVQA (textvqa)
+- TextVQA Validation (textvqa_val)
+- TextVQA Test (textvqa_test)
+- VizWizVQA (vizwiz_vqa)
+- VizWizVQA Validation (vizwiz_vqa_val)
+- VizWizVQA Test (vizwiz_vqa_test)
+- VQAv2 (vqav2)
+- VQAv2 Validation (vqav2_val)
+- VQAv2 Test (vqav2_test)
+- WebSRC (websrc)
+- WebSRC Validation (websrc_val)
+- WebSRC Test (websrc_test)
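
The note at the top of the added docs/current_tasks.md says that the text in parentheses is the task name. As a rough illustration of that convention, the sketch below pulls task names out of list entries written in this style. It is not part of the commit: `parse_task_line`, `task_names`, and both regular expressions are hypothetical helpers, and the sketch assumes each entry is either `- Display Name (task_name)` or a bare `- task_name`, as with the refcoco entries.

```python
import re

# "- Display Name (task_name)": the parenthesized suffix is the task name.
NAMED_ENTRY = re.compile(r"^-\s+(?P<label>.+?)\s+\((?P<task>[^()\s]+)\)\s*$")
# "- task_name": some entries (e.g. the refcoco_* items) carry only the task name.
BARE_ENTRY = re.compile(r"^-\s+(?P<task>\S+)\s*$")


def parse_task_line(line):
    """Return (label, task_name) for a list entry, or None for any other line."""
    text = line.strip()
    named = NAMED_ENTRY.match(text)
    if named:
        return named.group("label"), named.group("task")
    bare = BARE_ENTRY.match(text)
    if bare:
        # No display name was given, so the label is left empty.
        return None, bare.group("task")
    return None


def task_names(lines):
    """Collect task names in document order, skipping headings and notes."""
    parsed = (parse_task_line(line) for line in lines)
    return [entry[1] for entry in parsed if entry is not None]


# A few lines copied from the added file, used as sample input.
sample = [
    "# Current Tasks",
    "- AI2D (ai2d)",
    "- RefCOCO+ (refcoco+)",
    "- refcoco+_seg_val",
]
print(task_names(sample))
```

Since the file describes itself as manually updated, a parser like this only reflects what the document lists, which may lag behind the tasks the tool actually supports.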

docs/model_guide.md

File mode changed: 100644 → 100755
