RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models
Xufeng Liu* , Yixuan Ding*† , Jingxiang Qu, Yichi Zhang, Wenhan Gao‡ , Yi Liu‡
* Equal contribution.
† Work done during an internship at Stony Brook University.
‡ Equal senior contribution.
Findings of ACL 2025
Introduction: Recent advances in large language models (LLMs) across diverse domains highlight their potential to transform scientific discovery, including drug editing. Traditional drug editing relies on iterative conversations with domain experts, refining a molecule until the desired property is achieved. This interactive process mirrors the strengths of LLMs. However, existing approaches edit each molecule independently without leveraging knowledge from past edits.
Human experts develop intuition about effective modifications over time by learning from historical experience. Accumulating past knowledge is pivotal for both humans and LLMs. In this work, we propose RL-Guider — a reinforcement-learning agent that suggests edits to LLMs and improves over time by learning from evaluation feedback on past results.
RL-Guider is the first framework to combine the comprehensive "world-level" knowledge of LLMs with knowledge accumulated from historical feedback. As a result, RL-Guider mitigates shortcomings of existing approaches and achieves superior performance.
All dependencies are listed in requirements.txt. Install them with:
```shell
pip install -r requirements.txt
```

Run an example to perform drug editing (small molecule) with LLaMA:

```shell
python run_ChatDrug.py --task_id=101 --C=0 --constraint='loose' --conversational_LLM='llama' --conversation_type='single'
```

Arguments:

- `--task_id`: The task identifier.
- `--C`: Constraint strength.
- `--constraint`: Editing constraint type (e.g., `loose`, `strict`).
- `--conversational_LLM`: Choice of LLM (e.g., `llama`).
- `--conversation_type`: Mode of conversation (e.g., `single`, `multi`).
Follow these steps to use RL-Guider:
- Run the scripts in `gather_buffer/` to interact with LLMs and collect data.
- Run the scripts in `process_buffer/` to preprocess the collected data.
- Train RL models using the scripts in the `rl_train/` folder.
- Run `run_planner_tree.py` to perform drug editing with RL-Guider.
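The steps above can be sketched as a single command sequence. The per-step script names below (everything except `run_planner_tree.py`) are hypothetical placeholders; substitute the actual script names found in each folder of this repository.

```shell
# Hypothetical end-to-end pipeline; only run_planner_tree.py is a
# confirmed entry point, the other script names are placeholders.

# 1. Interact with the LLM and collect a buffer of edit decisions
python gather_buffer/gather.py       # placeholder script name

# 2. Preprocess the collected buffer into RL training data
python process_buffer/process.py     # placeholder script name

# 3. Train the RL agent on the processed buffer
python rl_train/train.py             # placeholder script name

# 4. Perform drug editing with the trained RL-Guider
python run_planner_tree.py
```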
Released under the MIT License. See LICENSE.
If you find this work useful, please cite:
```bibtex
@inproceedings{liu2025rl,
  title={RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models},
  author={Liu, Xufeng and Ding, Yixuan and Qu, Jingxiang and Zhang, Yichi and Gao, Wenhan and Liu, Yi},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2025},
  pages={13121--13138},
  year={2025}
}
```