[Dataset] Add SeedBench Dataset by ChenZiHong-Gavin · Pull Request #2020 · open-compass/opencompass

ChenZiHong-Gavin · 2025-04-14T06:27:57Z

Motivation

This PR introduces a new domain-specific benchmark dataset, SeedBench, which is the first multi-task benchmark designed to evaluate large language models (LLMs) in seed science, focusing on seed breeding.

Modification

Added a new dataset class, SeedBenchDataset, and implemented evaluation metrics such as F1Evaluator in opencompass/datasets/SeedBench.py.

Added the configuration files seedbench_gen_5d5ea1.py and seedbench_gen.py, along with README.md, in configs/datasets/SeedBench/.

Registered the dataset in datasets/__init__.py.

Updated datasets_info.py with dataset metadata.

Updated dataset-index.yml with dataset metadata.
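
For readers unfamiliar with F1-style scoring of generated answers, here is a minimal sketch of a token-overlap F1 metric. It is an illustration only: the function names `token_f1` and `mean_f1` are hypothetical, and the F1 evaluator actually added in opencompass/datasets/SeedBench.py may tokenize, normalize, or aggregate differently.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between one prediction and one reference.

    Assumption: whitespace tokenization and lowercasing; the PR's own
    evaluator may use a different normalization.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, references) -> float:
    """Average the per-sample F1 over paired predictions and references."""
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0
```

Averaging per-sample scores keeps every question equally weighted regardless of answer length.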

BC-breaking (Optional)

No backward compatibility breaking changes introduced.

Use cases (Optional)

SeedBench assesses LLMs across three core seed breeding stages:

Gene Information Retrieval
Gene Function and Regulation Analysis
Variety Breeding with Agronomic Trait Optimization

Built with domain experts, SeedBench features 2,264 expert-validated questions across 11 task types and 10 subcategories, initially targeting rice breeding. Future updates will include other crops like maize, soybean, and wheat.

Following the instructions, a model can be evaluated on SeedBench with:

DATASET_SOURCE=ModelScope python run.py --hf-type chat --hf-path Qwen/Qwen2.5-0.5B-Instruct --datasets seedbench_gen --debug
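
The same invocation can be assembled from a script, which is convenient when sweeping over several models. The sketch below only builds the environment and argument list from the flags shown above; the helper name `build_seedbench_command` is hypothetical, and launching the process is left as a commented-out step that assumes the working directory is the opencompass repository root.

```python
import os
import shlex


def build_seedbench_command(hf_path: str, debug: bool = True):
    """Return (env, argv) mirroring the command in the PR description."""
    # DATASET_SOURCE=ModelScope is set as an environment variable,
    # exactly as in the shell one-liner.
    env = dict(os.environ, DATASET_SOURCE="ModelScope")
    argv = [
        "python", "run.py",
        "--hf-type", "chat",
        "--hf-path", hf_path,
        "--datasets", "seedbench_gen",
    ]
    if debug:
        argv.append("--debug")
    return env, argv


env, argv = build_seedbench_command("Qwen/Qwen2.5-0.5B-Instruct")
# Print the equivalent shell command for inspection.
print("DATASET_SOURCE=" + env["DATASET_SOURCE"], shlex.join(argv))

# To actually run it (from the repository root):
# import subprocess
# subprocess.run(argv, env=env, check=True)
```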

Checklist

Before PR:

Pre-commit or other linting tools are used to fix the potential lint issues.
Tested on ModelScope and local environment.
The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
CLA has been signed and all committers have signed the CLA in this PR.

tpoisonooo · 2025-04-14T07:32:06Z

cc @tonysy

Copilot

Pull Request Overview

This PR adds a new domain-specific benchmark dataset, SeedBench, for evaluating LLMs in seed science and breeding.

Introduces the SeedBenchDataset and multiple evaluators (F1ScoreEvaluator, AverageRougeScoreEvaluator, AccScoreStr_Evaluator) in opencompass/datasets/SeedBench.py.
Adds a new dataset configuration along with corresponding documentation and metadata updates in datasets_info.py, dataset-index.yml, and configs/datasets/SeedBench/.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| opencompass/utils/datasets_info.py | Registers SeedBench metadata in dataset info |
| opencompass/datasets/__init__.py | Imports the new SeedBench dataset module |
| opencompass/datasets/SeedBench.py | Implements SeedBenchDataset and its evaluators |
| opencompass/configs/datasets/SeedBench/seedbench_gen_5d5ea1.py | Provides configuration for SeedBench evaluation |
| opencompass/configs/datasets/SeedBench/seedbench_gen.py | Reads base configuration for SeedBench datasets |
| opencompass/configs/datasets/SeedBench/README.md | Documents the SeedBench dataset details |
| dataset-index.yml | Adds SeedBench entry for dataset indexing |

Comments suppressed due to low confidence (2)

opencompass/datasets/SeedBench.py:305

[nitpick] The evaluator class name 'AccScoreStr_Evaluator' is inconsistent with its base class naming convention. Consider renaming it to 'AccScoreStrEvaluator' to maintain clarity and consistency.

class AccScoreStr_Evaluator(AccScoreStrEvaluator):

opencompass/utils/datasets_info.py:233

[nitpick] The dataset key 'opencompass/seedbench' uses lowercase while the corresponding module file is named 'SeedBench.py'. Ensure consistent casing across modules and identifiers to avoid potential issues on case-sensitive systems.

"opencompass/seedbench": {

tonysy · 2025-06-05T13:39:03Z

Please fix the lint issue.

ChenZiHong-Gavin · 2025-06-09T02:32:53Z

> Please fix the lint issue.

The lint issue has been fixed.

tpoisonooo · 2025-08-13T03:12:14Z

@MaiziXiao @Myhs-phz @bittersweet1999 pls review.

Myhs-phz

LGTM

tonysy

LGTM

* [Dataset] Add SeedBench Dataset
* docs: add README for SeedBench
* refactor: delete unnecessary comment
* fix: fix load function for SeedBenchDataset
* fix: delete unnecessary code
* fix: fix typo
* fix: fix lint problem
* docs: update summary of SeedBench
* docs: add paper link
* Update dataset-index.yml

---------

Co-authored-by: Songyang Zhang <[email protected]>
Co-authored-by: Myhs_phz <[email protected]>

[Dataset] Add SeedBench Dataset

dae700e

mm-assistant bot assigned bittersweet1999 Apr 14, 2025

ChenZiHong-Gavin and others added 7 commits April 14, 2025 19:51

docs: add README for SeedBench

f9b1636

refactor: delete unnecessary comment

db04df7

Merge branch 'main' into SeedBench

8000375

fix: fix load function for SeedBenchDataset

c9ea024

Merge branch 'open-compass:main' into SeedBench

cbfac1e

fix: delete unnecessary code

39b34d6

Merge branch 'SeedBench' of https://github.com/ChenZiHong-Gavin/opencompass into SeedBench

332acdf

ChenZiHong-Gavin marked this pull request as ready for review April 15, 2025 06:27

ChenZiHong-Gavin and others added 2 commits April 15, 2025 15:34

fix: fix typo

e335b29

Merge branch 'main' into SeedBench

2ded84a

ChenZiHong-Gavin temporarily deployed to prod April 24, 2025 11:31 — with GitHub Actions Inactive

tonysy requested review from MaiziXiao, Myhs-phz and Copilot April 24, 2025 11:31

Copilot AI reviewed Apr 24, 2025

Merge branch 'main' into SeedBench

d26e808

tpoisonooo mentioned this pull request Jun 5, 2025

How to use SeedBench ? InternScience/SeedBench#11

Closed

Merge branch 'main' into SeedBench

873a448

tonysy temporarily deployed to prod June 5, 2025 13:35 — with GitHub Actions Inactive

ChenZiHong-Gavin and others added 2 commits June 8, 2025 12:42

Merge branch 'open-compass:main' into SeedBench

6dcfc1f

fix: fix lint problem

c2bd67b

Merge branch 'open-compass:main' into SeedBench

9980697

ChenZiHong-Gavin mentioned this pull request Jun 25, 2025

opencompass中找不到Seedbench InternScience/SeedBench#13

Closed

ChenZiHong-Gavin and others added 2 commits July 1, 2025 14:31

Merge branch 'open-compass:main' into SeedBench

b89cca2

docs: update summary of SeedBench

64a8445

Merge branch 'main' into SeedBench

1e6f491

ChenZiHong-Gavin and others added 4 commits August 19, 2025 15:05

Merge branch 'open-compass:main' into SeedBench

da6cee1

docs: add paper link

8ce498a

Merge branch 'open-compass:main' into SeedBench

a77b845

Update dataset-index.yml

b18b2e7

Myhs-phz approved these changes Aug 28, 2025

tonysy approved these changes Sep 2, 2025

Myhs-phz merged commit e055d12 into open-compass:main Sep 3, 2025
8 checks passed

Myhs-phz merged 22 commits into open-compass:main from ChenZiHong-Gavin:SeedBench

6 participants
