Add Lifescience Sub-set Support for MMLU & SciEval (datasets + configs + loader) by tchenglv520 · Pull Request #2059 · open-compass/opencompass

tchenglv520 · 2025-04-28T08:26:57Z

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

OpenCompass currently lacks out-of-the-box support for evaluating models on the lifescience / biology sub-sets of the MMLU and SciEval benchmarks. These slices are critical for biomedical and healthcare LLM research. This PR introduces native loaders and config files so users can run evaluations with a single command.

Modification

Dataset loader
• opencompass/datasets/SciEval_lifescience.py — parses the biology + multiple-choice sub-set of SciEval and returns a DatasetDict.

Configs – SciEval
• opencompass/configs/datasets/SciEval_lifscience/SciEval_lifescience_gen.py
• opencompass/configs/datasets/SciEval_lifscience/SciEval_lifescience_llmjudge_gen.py

Configs – MMLU
• opencompass/configs/datasets/mmlu_lifescience/* — evaluation configs for the MMLU lifescience sub-set.

Registry
• opencompass/datasets/__init__.py — explicitly imports SciEvalDataset; removes wildcard import.

Code quality
All new/modified files formatted with isort, yapf, docformatter; flake8 passes with zero errors.

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

[√] Pre-commit or other linting tools are used to fix the potential lint issues.
[√] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
[√] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
[√] The documentation has been modified accordingly, such as docstrings or example tutorials.

After PR:

[√] If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
[√] CLA has been signed and all committers have signed the CLA in this PR.

MaiziXiao · 2025-05-07T12:08:41Z

For a different MMLU configuration, create a new config directly in the existing MMLU folder; do not create a new Dataset.

MaiziXiao · 2025-05-07T12:09:58Z

Please update dataset-index.yml following https://opencompass.readthedocs.io/zh-cn/latest/advanced_guides/new_dataset.html

MaiziXiao · 2025-05-07T12:12:18Z

Do not arbitrarily change the contents of pre-commit-config.

tchenglv520 · 2025-05-09

Fixed, thanks.

MaiziXiao

LGTM

MaiziXiao · 2025-05-09T07:02:11Z

+SciEval_lifescience_subsets = [
+    'biology',        # college biology
+]


Please put the subsets configuration into the corresponding dataset config file.
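One possible reading of this request, sketched in plain Python: the subset list lives inside the dataset config module itself, and one config entry is built per subset. Only `SciEval_lifescience_subsets` and the `SciEvalDataset` name come from the PR; the helper `build_dataset_entries`, the `abbr` naming scheme, and the placeholder `path` are hypothetical and are not taken from the OpenCompass codebase.

```python
# Hypothetical contents of a SciEval dataset config module.
SciEval_lifescience_subsets = [
    'biology',  # college biology
]


def build_dataset_entries(subsets):
    """Return one config dict per subset (the keys are illustrative only)."""
    return [
        dict(
            abbr=f'scieval_{name}',   # assumed naming scheme
            type='SciEvalDataset',    # a string stands in for the class here
            path='path/to/SciEval',   # placeholder path, an assumption
            name=name,
        )
        for name in subsets
    ]


SciEval_datasets = build_dataset_entries(SciEval_lifescience_subsets)
```

Keeping the list next to the entries that consume it means adding a subset is a one-line change in a single file.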

MaiziXiao · 2025-05-09T07:04:01Z

+            examples: List[dict] = []
+            for ex in raw_iter:
+                if (ex.get('category') != 'biology'
+                        or ex.get('type') != 'multiple-choice'):
+                    continue
+
+                ans_list = ex.get('answer') or ex.get('answers') or []
+                if not ans_list:
+                    continue
+                target = ans_list[0]


Does this loading approach work for all SciEval sub-dimensions? Do not hard-code the sub-dimension in the loading logic.
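A minimal sketch of what a non-hard-coded version of the filter might look like, assuming the record fields shown in the diff (`category`, `type`, `answer` / `answers`). The function name `select_examples`, its parameters, and the `question` field are illustrative assumptions, not the PR's actual code.

```python
from typing import Dict, Iterable, List, Optional


def select_examples(
    raw_iter: Iterable[dict],
    category: Optional[str] = None,
    question_type: Optional[str] = 'multiple-choice',
) -> List[Dict[str, str]]:
    """Keep examples matching the requested category and question type.

    Passing ``None`` for either filter disables that filter.
    """
    examples: List[Dict[str, str]] = []
    for ex in raw_iter:
        if category is not None and ex.get('category') != category:
            continue
        if question_type is not None and ex.get('type') != question_type:
            continue
        # Accept either an 'answer' or an 'answers' field, as in the diff.
        ans_list = ex.get('answer') or ex.get('answers') or []
        if not ans_list:
            continue
        examples.append({
            'question': ex.get('question', ''),  # assumed field name
            'target': ans_list[0],
        })
    return examples


# Toy records for illustration only; they are not real SciEval data.
rows = [
    {'category': 'biology', 'type': 'multiple-choice',
     'question': 'Q1', 'answer': ['A']},
    {'category': 'chemistry', 'type': 'multiple-choice',
     'question': 'Q2', 'answers': ['B']},
    {'category': 'biology', 'type': 'filling',
     'question': 'Q3', 'answer': ['x']},
    {'category': 'physics', 'type': 'multiple-choice',
     'question': 'Q4', 'answer': []},
]

biology_mc = select_examples(rows, category='biology')
all_mc = select_examples(rows)
```

With the category passed in from the config, the same loader could serve any sub-dimension rather than only `biology`.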

style: pass all formatting hooks (yapf & quote fixer)

e2f8057

mm-assistant bot assigned bittersweet1999 Apr 28, 2025

revise name: Add Lifescience Sub-set Support for MMLU & SciEval (datasets + configs + loader)

aedfbcc

MaiziXiao reviewed May 7, 2025

tchenglv520 temporarily deployed to prod May 8, 2025 06:08 — with GitHub Actions Inactive

root added 2 commits May 9, 2025 04:48

revise name: Add Lifescience SciEval (datasets + configs + loader+dataset-index.yml)

f238298

Add Lifescience SciEval (datasets + configs + loader+dataset-index.yml)

9c8244a

MaiziXiao approved these changes May 9, 2025

tchenglv520 temporarily deployed to prod May 9, 2025 06:20 — with GitHub Actions Inactive

MaiziXiao merged commit c5048bf into open-compass:main May 9, 2025
8 checks passed

MaiziXiao added a commit that referenced this pull request May 9, 2025

Revert "[Dataset] Add Lifescience Sub-set Support for SciEval (#2059)"

b1b429b

This reverts commit c5048bf.

MaiziXiao mentioned this pull request May 9, 2025

[Revert] "Add Lifescience Sub-set Support for MMLU & SciEval (datasets + configs + loader)" #2087

Merged

MaiziXiao added a commit that referenced this pull request May 9, 2025

[Revert] Add Lifescience Sub-set Support for SciEval (#2059) (#2087)

d72df59

This reverts commit c5048bf.

MaiziXiao reviewed May 9, 2025

stephen-nju pushed a commit to stephen-nju/opencompass that referenced this pull request May 14, 2025

[Revert] Add Lifescience Sub-set Support for SciEval (open-compass#2059) (open-compass#2087)

b5d2946

This reverts commit c5048bf.

zyc140345 pushed a commit to zyc140345/opencompass that referenced this pull request Oct 23, 2025

[Revert] Add Lifescience Sub-set Support for SciEval (open-compass#2059) (open-compass#2087)

953aed0

This reverts commit c5048bf.

iamkaia pushed a commit to iamkaia/opencompass that referenced this pull request Feb 4, 2026

[Revert] Add Lifescience Sub-set Support for SciEval (open-compass#2059) (open-compass#2087)

e8708fc

This reverts commit 16c034a.

MaiziXiao merged 4 commits into open-compass:main from tchenglv520:mmlu_andscieval__tc
