This repository provides a pipeline for PDDL generation, refinement, and evaluation with Large Language Models (LLMs) and retrieval-augmented generation (RAG).
- Install Elasticsearch 8.1.2
- Set your API key as an environment variable:
export API_KEY=your_api_key
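A script that depends on this variable can check for it up front instead of failing midway through a run. The following is a minimal sketch, not the repository's own code: `load_api_key` is a hypothetical helper, and only the variable name `API_KEY` comes from the instructions above.

```python
import os


def load_api_key(env=None, name="API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; run `export {name}=your_api_key` first")
    return key
```

Calling `load_api_key()` before any LLM request surfaces a missing key as a single clear error message.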
Generate PDDL from natural language:
python source/pipeline/llm_as_formalizer.py \
--domain DOMAIN \
--model MODEL \
--data DATA \
--index_start INDEX_START \
--index_end INDEX_END

The results will be saved in the output/ folder.
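When the generation step is launched from another script, it can help to assemble the command as an argument list rather than a shell string. This is a sketch under stated assumptions: `build_formalizer_command` is a hypothetical helper, and the argument values are placeholders borrowed from the examples elsewhere in this README.

```python
import shlex


def build_formalizer_command(domain, model, data, index_start, index_end):
    """Assemble the argv list for llm_as_formalizer.py (hypothetical helper)."""
    return [
        "python", "source/pipeline/llm_as_formalizer.py",
        "--domain", domain,
        "--model", model,
        "--data", data,
        "--index_start", str(index_start),
        "--index_end", str(index_end),
    ]


cmd = build_formalizer_command(
    "blocksworld",
    "meta-llama/llama-4-maverick-17b-128e-instruct",
    "Heavily_Templated_BlocksWorld-100",
    1,
    101,
)
# Print a copy-pasteable shell line; to execute it instead,
# pass `cmd` to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems if a model or dataset name ever contains spaces or shell metacharacters.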
Refine PDDL using solver feedback:
python source/pipeline/run_solver_error_rag_LLM.py \
--domain DOMAIN \
--model MODEL \
--data DATA \
--index_start INDEX_START \
--index_end INDEX_END \
--solver SOLVER

python source/pipeline/run_val.py \
--domain DOMAIN \
--model MODEL \
--data DATA \
--index_start INDEX_START \
--index_end INDEX_END \
--prediction_type PREDICTION_TYPE \
--csv_result \
--pipeline_type PIPELINE_TYPE

python source/pipeline/run_val_rag.py \
--domain DOMAIN \
--model MODEL \
--data DATA \
--index_start INDEX_START \
--index_end INDEX_END \
--prediction_type PREDICTION_TYPE \
--csv_result \
--pipeline_type PIPELINE_TYPE

--domain: Planning domain (e.g., blocksworld)
--model: LLM model name (e.g., meta-llama/llama-4-maverick-17b-128e-instruct)
--data: Dataset name (e.g., Heavily_Templated_BlocksWorld-100)
--index_start / --index_end: Index range of tasks
--solver: Solver name (used in refinement stage)
--prediction_type: Type of prediction to evaluate
--pipeline_type: One of ["rag", "formalize", "rag_refine", "steady", ...]
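The flags listed above can be mirrored in an `argparse` parser, for example when writing a wrapper script around the pipeline. This is only a sketch and may not match how the repository's scripts actually parse their arguments: the integer type for the indices and the boolean behaviour of `--csv_result` are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags; not the repository's own code."""
    parser = argparse.ArgumentParser(description="PDDL pipeline arguments (sketch)")
    parser.add_argument("--domain", required=True, help="Planning domain, e.g. blocksworld")
    parser.add_argument("--model", required=True, help="LLM model name")
    parser.add_argument("--data", required=True, help="Dataset name")
    # Integer indices are an assumption; the README only says "index range of tasks".
    parser.add_argument("--index_start", type=int, required=True)
    parser.add_argument("--index_end", type=int, required=True)
    parser.add_argument("--solver", help="Solver name (refinement stage)")
    parser.add_argument("--prediction_type", help="Type of prediction to evaluate")
    # Assumed to be a boolean switch because it takes no value in the commands above.
    parser.add_argument("--csv_result", action="store_true")
    # The documented list of pipeline types is open-ended, so no `choices` are enforced.
    parser.add_argument("--pipeline_type")
    return parser


args = build_parser().parse_args([
    "--domain", "blocksworld",
    "--model", "meta-llama/llama-4-maverick-17b-128e-instruct",
    "--data", "Heavily_Templated_BlocksWorld-100",
    "--index_start", "1",
    "--index_end", "101",
    "--csv_result",
])
print(args.domain, args.index_start, args.index_end, args.csv_result)
```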
python run_solver_error_rag.py \
--domain blocksworld \
--model meta-llama/llama-4-maverick-17b-128e-instruct \
--data Heavily_Templated_BlocksWorld-100 \
--index_start 1 \
--index_end 101 \
--solver dual-bfws-ffparser

All generated or refined PDDL files and logs are stored in the output/ directory.
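After a run, a quick way to see what was written is to count the files under `output/` by extension. The sketch below assumes nothing about the directory's internal layout, which this README does not describe; `summarize_outputs` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path


def summarize_outputs(root="output"):
    """Count files under `root` by extension; return an empty Counter if `root` is absent."""
    root_path = Path(root)
    if not root_path.is_dir():
        return Counter()
    return Counter(
        p.suffix or "(no extension)"
        for p in root_path.rglob("*")
        if p.is_file()
    )


# Print one line per extension found, in alphabetical order.
for suffix, count in sorted(summarize_outputs().items()):
    print(f"{suffix}: {count}")
```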