xxxiaol/RefTool

Code and Data of REFTOOL: Enhancing Model Reasoning with Reference-Guided Tool Creation

Code

Tool Creation

Code related to tool creation is in code/tool_creation/.

The steps below use the causality book as an example. First place the book's LaTeX source in books/.

  • Extract the book structure:
python extract_book_structure.py --domain causality
  • Generate tools:
python initial_tool_generation.py --model_name gpt4o --domain causality
  • Validate tools:
python validate_tools.py --model_name gpt4o --domain causality --stage unfiltered
  • Refine tools:
python refine_tools.py --model_name gpt4o --domain causality
  • Validate again:
python validate_tools.py --model_name gpt4o --domain causality --stage refined
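The five steps above can be chained from a small driver script. What follows is only a sketch: the function names `tool_creation_commands` and `run_all` are hypothetical, and running the scripts from inside code/tool_creation/ is an assumption rather than something the repository states. Only the script names and flags are taken from the commands listed above.

```python
import shlex
import subprocess
import sys


def tool_creation_commands(model_name="gpt4o", domain="causality"):
    """Return the five tool-creation commands, in order, as argv lists."""
    common = ["--model_name", model_name, "--domain", domain]
    return [
        [sys.executable, "extract_book_structure.py", "--domain", domain],
        [sys.executable, "initial_tool_generation.py", *common],
        [sys.executable, "validate_tools.py", *common, "--stage", "unfiltered"],
        [sys.executable, "refine_tools.py", *common],
        [sys.executable, "validate_tools.py", *common, "--stage", "refined"],
    ]


def run_all(commands, cwd="code/tool_creation", dry_run=True):
    """Print each command; if dry_run is False, run it and stop on the first failure.

    Assumption: the scripts are meant to be launched from `cwd`.
    """
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands.
    run_all(tool_creation_commands())
```

With `dry_run=False`, `check=True` makes `subprocess.run` raise as soon as a step exits with a non-zero status, so refinement would not start on top of a failed validation.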

Inference

Code related to tool utilization and evaluation is in code/inference/.

Tool Utilization

  • Chapter selection:
python select_chapter.py --model_name gpt4o --domain causality
  • Tool selection within chapter:
python select_skills_by_chapter.py --model_name gpt4o --domain causality
  • Solution generation:
python run_tool_0shot.py --model_name gpt4o --domain causality

Evaluation

python evaluator.py --model_name gpt4o --domain causality --method tool_0shot --force_generate
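The three tool-utilization commands and the evaluation command can also be joined into a single shell line with `&&`, so that a later step runs only if the previous one succeeded. The helper below is a hypothetical convenience, not part of the repository. It assumes the steps are meant to run in the order listed and from code/inference/. Values other than `gpt4o` and `causality` for the two flags are not documented here.

```python
import shlex


def inference_one_liner(model_name="gpt4o", domain="causality"):
    """Join the inference and evaluation commands with '&&' into one shell line."""
    shared = ["--model_name", model_name, "--domain", domain]
    steps = [
        ["python", "select_chapter.py", *shared],
        ["python", "select_skills_by_chapter.py", *shared],
        ["python", "run_tool_0shot.py", *shared],
        # Method name and --force_generate are copied as-is from the command above.
        ["python", "evaluator.py", *shared, "--method", "tool_0shot", "--force_generate"],
    ]
    return " && ".join(shlex.join(step) for step in steps)


if __name__ == "__main__":
    print(inference_one_liner())
```

The printed line can be pasted into a shell opened in code/inference/.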

Evaluation Data

Evaluation questions for causality, physics, and chemistry are in evaluation_data/qrdata_causal.json, evaluation_data/theoremqa_phy.json, and evaluation_data/scibench_chem.json, respectively.
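Before running inference, it may help to confirm that the three question files are where the paths above say they are. This is a minimal sketch: `missing_eval_files` is a hypothetical helper, and the dictionary keys are descriptive labels only, not necessarily valid `--domain` values.

```python
from pathlib import Path

# Paths as listed above; the keys are labels for reporting only.
EVAL_FILES = {
    "causality": "evaluation_data/qrdata_causal.json",
    "physics": "evaluation_data/theoremqa_phy.json",
    "chemistry": "evaluation_data/scibench_chem.json",
}


def missing_eval_files(repo_root="."):
    """Return {label: relative_path} for every question file not found under repo_root."""
    root = Path(repo_root)
    return {
        label: rel
        for label, rel in EVAL_FILES.items()
        if not (root / rel).is_file()
    }


if __name__ == "__main__":
    for label, rel in missing_eval_files().items():
        print(f"missing {label} questions: {rel}")
```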

For causality, also download the accompanying data from the original QRData benchmark: https://github.com/xxxiaol/QRData/blob/main/benchmark/data.zip. Unzip it and place the resulting data/ directory under evaluation_data/.
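Once data.zip has been downloaded manually, the unzip-and-place step could look like the sketch below. The function name `place_qrdata` is hypothetical, and the code assumes the archive contains a top-level data/ directory, as the instruction above suggests. It raises an error if that turns out not to be the case.

```python
import zipfile
from pathlib import Path


def place_qrdata(zip_path, eval_dir="evaluation_data"):
    """Extract data.zip into eval_dir and return the path of the extracted data/ directory."""
    eval_dir = Path(eval_dir)
    eval_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(eval_dir)
    data_dir = eval_dir / "data"
    # Assumption: the archive unpacks to a top-level data/ directory.
    if not data_dir.is_dir():
        raise FileNotFoundError(f"expected {data_dir} after extraction")
    return data_dir
```

For example, `place_qrdata("data.zip")` run from the repository root would be expected to leave the files under evaluation_data/data/.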

We do not provide the LaTeX files of the reference materials because of copyright restrictions.
