LTD-Bench

Setup

Before running LTD-Bench, make sure Xvfb is installed in your Linux environment, as it may be required for the Hard-level generation tasks.

Install it, along with Ghostscript, using one of the following sets of commands.

apt-get install xvfb
apt-get install ghostscript

or

yum install xorg-x11-server-Xvfb
yum install ghostscript

Then start Xvfb in the background:

Xvfb :1 -screen 0 800x600x24&
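
Before moving on, it can help to confirm that the tools installed above are actually reachable. The following is a minimal sketch, not part of LTD-Bench: the helper names missing_tools and display_env are hypothetical, and it assumes that the Ghostscript executable is named gs and that any program needing a display should be pointed at the :1 server started above through the DISPLAY environment variable. Whether the LTD-Bench scripts read DISPLAY themselves is not stated in this README.

```python
import os
import shutil

# Hypothetical helper, not part of LTD-Bench.
# "gs" is assumed to be the Ghostscript executable name on this system.
REQUIRED_TOOLS = ["Xvfb", "gs"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def display_env(display=":1"):
    """Return a copy of the current environment with DISPLAY set.

    ":1" matches the display number used in the Xvfb command above.
    """
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Xvfb and Ghostscript were found on PATH")
```

The dictionary returned by display_env can be passed as the env argument of subprocess.run when launching a program that needs the virtual display.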

Set up your Python environment

pip install -r requirements.txt

Run

Set the model configuration in the "run.sh" file, including your model_id, API_BASE_URL, and API_KEY.
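
A missing or blank setting is easier to catch before inference starts than halfway through a run. The sketch below is illustrative only and is not part of LTD-Bench: the helper unset_settings is hypothetical, and it assumes the three values named above are also available as environment variables under the same names, which this README does not confirm.

```python
import os

# Setting names are taken from this README. Treating them as environment
# variables is an assumption made for illustration.
REQUIRED_SETTINGS = ("model_id", "API_BASE_URL", "API_KEY")


def unset_settings(env, names=REQUIRED_SETTINGS):
    """Return the names that are missing from `env` or set to a blank value."""
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = unset_settings(os.environ)
    if missing:
        print("Please set:", ", ".join(missing))
    else:
        print("All model settings are present")
```

Because unset_settings takes any mapping, the same check can be applied to values parsed from a file instead of os.environ.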

Then you can start running model inference!

sh run.sh

Evaluation

Set your GPT-4.1 configuration in the "run_eval.sh" file, including your OPENAI_BASE_URL and OPENAI_KEY.

Then you can run the automatic GPT-4.1 evaluation.

sh run_eval.sh

walktaster/LTD-Bench
