
Add Lifescience Sub-set Support for MMLU & SciEval (datasets + configs + loader)#2059

Merged
MaiziXiao merged 4 commits into open-compass:main from tchenglv520:mmlu_andscieval__tc
May 9, 2025

@tchenglv520
Contributor

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

OpenCompass currently lacks out-of-the-box support for evaluating models on the lifescience / biology sub-sets of the MMLU and SciEval benchmarks. These slices are critical for biomedical and healthcare LLM research. This PR introduces native loaders and config files so users can run evaluations with a single command.

Modification

Dataset loader
• opencompass/datasets/SciEval_lifescience.py — parses the biology + multiple-choice sub-set of SciEval and returns a DatasetDict.

Configs – SciEval
• opencompass/configs/datasets/SciEval_lifscience/SciEval_lifescience_gen.py
• opencompass/configs/datasets/SciEval_lifscience/SciEval_lifescience_llmjudge_gen.py

Configs – MMLU
• opencompass/configs/datasets/mmlu_lifescience/* — evaluation configs for the MMLU lifescience sub-set.

Registry
• opencompass/datasets/__init__.py — explicitly imports SciEvalDataset; removes wildcard import.

Code quality
All new/modified files formatted with isort, yapf, docformatter; flake8 passes with zero errors.

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

  • [√] Pre-commit or other linting tools are used to fix potential lint issues.
  • [√] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • [√] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • [√] The documentation has been modified accordingly, such as docstrings or example tutorials.

After PR:

  • [√] If the modification could affect downstream or other related projects, this PR should be tested with those projects.
  • [√] CLA has been signed and all committers have signed the CLA in this PR.

@MaiziXiao
Contributor

For a different MMLU configuration, create a new config directly in the existing MMLU folder; do not create a new Dataset.

@MaiziXiao
Contributor

Comment thread .pre-commit-config.yaml

Do not arbitrarily change the contents of pre-commit-config.

Contributor Author

Fixed, thanks.

@MaiziXiao MaiziXiao left a comment

LGTM

@MaiziXiao MaiziXiao merged commit c5048bf into open-compass:main May 9, 2025
8 checks passed
MaiziXiao added a commit that referenced this pull request May 9, 2025
MaiziXiao added a commit that referenced this pull request May 9, 2025
Comment on lines +1 to +3
SciEval_lifescience_subsets = [
    'biology',  # college biology
]

Please put the subsets configuration in the corresponding dataset config file.
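
One way to read this request, as a minimal sketch only: the subset list lives in the same config file that builds the dataset entries, rather than in a separate module. The helper `build_datasets`, the placeholder path, and the dict keys shown here are illustrative assumptions, not the actual contents of the PR's config files, which would carry more fields.

```python
# Hypothetical single config file: the subset list and the dataset
# entries derived from it sit side by side.
SciEval_lifescience_subsets = [
    'biology',  # college biology
]


def build_datasets(subsets, path='path/to/SciEval'):
    """Return one plain config dict per subset.

    The keys (abbr, path, name) and the default path are illustrative
    assumptions, not taken from the PR.
    """
    return [
        dict(abbr=f'scieval_{name}', path=path, name=name)
        for name in subsets
    ]


SciEval_datasets = build_datasets(SciEval_lifescience_subsets)
```

With this layout, adding another subset would only require appending its name to the list in that one file.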

Comment on lines +36 to +45
examples: List[dict] = []
for ex in raw_iter:
    if (ex.get('category') != 'biology'
            or ex.get('type') != 'multiple-choice'):
        continue

    ans_list = ex.get('answer') or ex.get('answers') or []
    if not ans_list:
        continue
    target = ans_list[0]

Does this loading approach work for all SciEval sub-dimensions? Do not hard-code the sub-dimension in the loading logic.
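
A minimal sketch of what a non-hard-coded filter could look like, assuming the same record fields as the reviewed snippet (`category`, `type`, `answer`/`answers`). The function name `select_examples`, its parameters, and the `target` output key are hypothetical and are not taken from the merged code.

```python
from typing import Dict, Iterable, List, Optional


def select_examples(raw_iter: Iterable[dict],
                    category: Optional[str] = None,
                    task_type: str = 'multiple-choice') -> List[Dict]:
    """Keep examples matching the requested category and task type.

    category=None keeps every category, so the same loader can serve
    any SciEval sub-dimension instead of only 'biology'.
    """
    examples: List[Dict] = []
    for ex in raw_iter:
        if category is not None and ex.get('category') != category:
            continue
        if ex.get('type') != task_type:
            continue
        # Same fallback as the reviewed snippet: 'answer' or 'answers'.
        ans_list = ex.get('answer') or ex.get('answers') or []
        if not ans_list:
            continue
        # 'target' as an output key is an assumption for illustration.
        examples.append({**ex, 'target': ans_list[0]})
    return examples
```

The sub-dimension then becomes an argument that a config can supply, for example `select_examples(raw_iter, category='biology')`, and the loading logic itself no longer names any particular subset.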

stephen-nju pushed a commit to stephen-nju/opencompass that referenced this pull request May 14, 2025

* style: pass all formatting hooks (yapf & quote fixer)

* revise name:Add Lifescience Sub-set Support for MMLU & SciEval (datasets + configs + loader)

* revise name:Add Lifescience SciEval (datasets + configs + loader+dataset-index.yml)

* Add Lifescience SciEval (datasets + configs + loader+dataset-index.yml)

---------

Co-authored-by: root <[email protected]>
zyc140345 pushed a commit to zyc140345/opencompass that referenced this pull request Oct 23, 2025
iamkaia pushed a commit to iamkaia/opencompass that referenced this pull request Feb 4, 2026