llm-as-formalizer-constraints consists of datasets and pipelines for using LLMs to generate PDDL and plans when constraints are introduced.
To run the pipeline, the following are needed:
- Spot: install the `spot` library for LTL support. An easy way is to install it through conda:
  ```bash
  conda install -c conda-forge spot
  ```
- OpenAI: DeepSeek models are called through the OpenAI client. In each file where DeepSeek is called, replace the following line so that it reads your own API key:
  ```python
  openai_api_key = open(f'../../../../_private/key_deepseek.txt').read()
  ```
- Kani: install Kani with:
  ```bash
  pip install "kani[all]" torch 'accelerate>=0.26.0'
  ```
- VAL: follow the VAL GitHub repository for instructions to install VAL, then change the following line (line 57 in `source/pipeline/run_val.py`) to point to your VAL executable:
  ```python
  validate_executable = "../../../../../Research/PDDL_Project/VAL/build/macos64/Release/bin/Validate"
  ```
- TCORE: follow the TCORE GitHub repository for instructions to install TCORE, then change the following line (line 70 in `source/pipeline/pddl3_formalizer/run_compiler.py`) to point to your TCORE executable:
  ```python
  tcore_executable = '../../../../tcore/launch_tcore.py'
  ```
All datasets can be found in the `/data` folder, which contains the following subfolders:
- `blocksworld`: datasets for the BlocksWorld domain
  - `BlocksWorld-100`: taken from Huang and Zhang (2025), with added constraints
  - `BlocksWorld-100-XL`: BlocksWorld instances where each instance contains 50 blocks
- `coin_collector`: dataset for the CoinCollector domain
  - `CoinCollector-100_includeDoors0`: fully-observed CoinCollector problems where closed doors are removed
- `mystery_blocksworld`: datasets for the Mystery BlocksWorld domain
  - `Mystery_BlocksWorld-100`: Mystery BlocksWorld instances; each instance is a `BlocksWorld-100` instance where the keywords are obfuscated
Each dataset has the following folders:
- `action_heads`: contains `action_heads.txt`, the PDDL action heads for the domain when no constraints are introduced
- `constraints`: the constraints that augment the dataset. Each dataset has the following categories:
  - `baseline`: no constraint added, used as a baseline
  - `goal`: new goal introduced to the problem
  - `initial`: new initial state introduced to the problem
  - `action`: constraints that impact the plan found for the problem
  - `state`: constraints that impact the state space of the problem

  Each constraint category contains the following folders:
  - `action_heads`: an action-heads file for each problem-constraint pair, named `p**_constraint**.txt`. Each file contains the required action heads for the problem-constraint pair.
  - `constraints`: constraint descriptions for the category, named `constraint**.txt`. Each file contains a natural language description of the constraint, and each category contains 20 description files.
  - `pddl`: ground-truth PDDL for each problem-constraint pair. This folder contains problem folders `p**`, where each folder contains two files per pair: a domain file (`p**_constraint**_df.pddl`) and a problem file (`p**_constraint**_pf.pddl`). There is also a `.jsonl` file that records whether a plan exists for the given problem-constraint pair.
- `descriptions`: non-constrained natural language descriptions of the domain and problem. Each problem consists of a domain description (`p**_domain.txt`) and a problem description (`p**_problem.txt`).
- `pddl`: non-constrained ground-truth PDDL files for each problem instance. There is one domain file (`domain.pddl`) and 100 problem files (`p**.pddl`) for the 100 problem instances in each dataset.
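As a concrete illustration (assuming the directory nesting follows the structure described above, and using hypothetical two-digit indices `01` and `03` for the `p**`/`constraint**` naming pattern), the files for one problem-constraint pair in the `goal` category of `BlocksWorld-100` would be:
```bash
# Hypothetical example paths; the indices 01/03 are placeholders.
cat data/blocksworld/BlocksWorld-100/constraints/goal/constraints/constraint03.txt       # natural language constraint
cat data/blocksworld/BlocksWorld-100/constraints/goal/action_heads/p01_constraint03.txt  # required action heads
cat data/blocksworld/BlocksWorld-100/constraints/goal/pddl/p01/p01_constraint03_df.pddl  # ground-truth domain file
cat data/blocksworld/BlocksWorld-100/constraints/goal/pddl/p01/p01_constraint03_pf.pddl  # ground-truth problem file
```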
To run the LLM-as-Planner pipeline, run the following:
```bash
cd source/pipeline
python llm-as-planner.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where
- `--domain` is which domain to evaluate (`blocksworld` or `mystery_blocksworld`)
- `--data` is which dataset to use (`BlocksWorld-100`, `Mystery_BlocksWorld-100`, or `BlocksWorld-100-XL`)
- `--model` is which model to run (`deepseek-reasoner`, `deepseek-chat`, `Qwen3-32B`, or `Qwen2.5-Coder-32B-Instruct`)
- `--constraint_type` is which constraint category to run (`baseline`, `goal`, `initial`, `action`, or `state`)
- `--default` indicates that the default problem-constraint pairs in that category are run
- `--problems` is which problems to run; can be a single number, a comma-separated list, or a range (e.g., `1,3,5` or `1-10`)
- `--constraints` is which constraints to pair with the problems; can be a single number, a comma-separated list, or a range (e.g., `1,3,5` or `1-10`)

Output will be written to `/output/llm-as-planner/DOMAIN/DATA/MODEL/CONSTRAINT_TYPE`.
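For example, the following invocation (argument values chosen purely for illustration from the options above) runs `deepseek-chat` as a planner on `BlocksWorld-100` with `goal` constraints, pairing problems 1-10 with constraints 1, 3, and 5:
```bash
cd source/pipeline
python llm-as-planner.py --domain blocksworld --data BlocksWorld-100 --model deepseek-chat \
    --constraint_type goal --problems 1-10 --constraints 1,3,5
```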
To generate the entire PDDL (without revision), run the following:
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_generate.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-pddl-formalizer/DOMAIN/DATA/generate/MODEL/CONSTRAINT_TYPE`.
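For example, to generate PDDL for the default problem-constraint pairs of the `action` category on `Mystery_BlocksWorld-100` with `deepseek-reasoner` (values chosen for illustration):
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_generate.py --domain mystery_blocksworld --data Mystery_BlocksWorld-100 \
    --model deepseek-reasoner --constraint_type action --default
```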
To first generate the PDDL without constraints and then edit the output to satisfy the new constraint (without revision), run the following:
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_edit_1.py --domain DOMAIN --data DATA --model MODEL [--default | --problems PROBLEMS]
python pddl_formalizer_edit_2.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-pddl-formalizer/DOMAIN/DATA/edit/MODEL/CONSTRAINT_TYPE`.
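For example, a two-step edit run with `Qwen3-32B` on `BlocksWorld-100` (illustrative values; the first step generates unconstrained PDDL for the chosen problems, the second edits that output for the chosen constraints):
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_edit_1.py --domain blocksworld --data BlocksWorld-100 --model Qwen3-32B --problems 1-10
python pddl_formalizer_edit_2.py --domain blocksworld --data BlocksWorld-100 --model Qwen3-32B \
    --constraint_type state --problems 1-10 --constraints 1,3,5
```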
To revise generated PDDL with DeepSeek models (`deepseek-reasoner` or `deepseek-chat`), run the following:
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_revision_deepseek.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE --solver SOLVER [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where
- `--run_type` is which method was used to generate the PDDL (`generate` or `edit`)
- `--solver` is which PDDL solver to use for revisions (the default is `dual-bfws-ffparser`)
To revise generated PDDL with Qwen models (`Qwen3-32B` or `Qwen2.5-Coder-32B-Instruct`), run the following:
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_revision_qwen.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE --solver SOLVER [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
For both revision scripts, output will be written to `/output/llm-as-pddl-formalizer/DOMAIN/DATA/revisions/RUN_TYPE/MODEL/CONSTRAINT_TYPE`.
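For example, to revise PDDL that was produced by the generate method, using `deepseek-chat` and the default solver (illustrative values):
```bash
cd source/pipeline/pddl_formalizer
python pddl_formalizer_revision_deepseek.py --domain blocksworld --data BlocksWorld-100 --model deepseek-chat \
    --run_type generate --constraint_type goal --solver dual-bfws-ffparser --default
```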
After generating the PDDL, we can run the solver using the following:
```bash
cd source/pipeline/pddl_formalizer
python run_solver.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--revision] --solver SOLVER [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where `--revision` is an optional flag to run the solver on the generated PDDL after revision has been applied.

Output will be written as `.txt` files in the same folder as the generated PDDL.
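For example, to solve the PDDL produced by the revision run above (illustrative values; `dual-bfws-ffparser` is assumed to be a valid `--solver` value here as well, and `--revision` is included because the PDDL went through a revision pass):
```bash
cd source/pipeline/pddl_formalizer
python run_solver.py --domain blocksworld --data BlocksWorld-100 --model deepseek-chat \
    --run_type generate --constraint_type goal --revision --solver dual-bfws-ffparser --default
```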
To generate the entire PDDL3 (without revision), run the following:
```bash
cd source/pipeline/pddl3_formalizer
python pddl3_formalizer_generate.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-pddl3-formalizer/DOMAIN/DATA/generate/MODEL/CONSTRAINT_TYPE`.
To first generate the PDDL3 without constraints and then edit the output to satisfy the new constraint (without revision), run the following:
```bash
cd source/pipeline/pddl3_formalizer
python pddl3_formalizer_edit_1.py --domain DOMAIN --data DATA --model MODEL [--default | --problems PROBLEMS]
python pddl3_formalizer_edit_2.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-pddl3-formalizer/DOMAIN/DATA/edit/MODEL/CONSTRAINT_TYPE`.
After generating the PDDL3 code, we can run the compiler using the following:
```bash
cd source/pipeline/pddl3_formalizer
python run_compiler.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--revision --attempt ATTEMPT] --solver SOLVER [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where `--revision` is an optional flag to run the compiler on generated PDDL3 after revision has been applied, and `--attempt` indicates which revision trial to run.

Output will be written as `.txt` files in the same folder as the generated PDDL3.
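For example, to compile PDDL3 that was generated and then revised (illustrative values; the attempt numbering is assumed to start at 1, and the `--solver` value is assumed to match the one used elsewhere in this README):
```bash
cd source/pipeline/pddl3_formalizer
python run_compiler.py --domain blocksworld --data BlocksWorld-100 --model Qwen3-32B \
    --run_type generate --constraint_type state --revision --attempt 1 \
    --solver dual-bfws-ffparser --default
```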
After compiling the PDDL3, we can run the solver using the following:
```bash
cd source/pipeline/pddl3_formalizer
python run_solver.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--revision --attempt ATTEMPT] --solver SOLVER [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
To revise generated PDDL3, run the following:
```bash
cd source/pipeline/pddl3_formalizer
python pddl3_formalizer_revision.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS] --attempt ATTEMPT
```
Output will be written to `/output/llm-as-pddl3-formalizer/DOMAIN/DATA/revisions/RUN_TYPE/MODEL/CONSTRAINT_TYPE/attemptATTEMPT`.
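For example, to run a first revision attempt on PDDL3 produced by the edit method with `Qwen2.5-Coder-32B-Instruct` (illustrative values; attempt numbering assumed to start at 1):
```bash
cd source/pipeline/pddl3_formalizer
python pddl3_formalizer_revision.py --domain blocksworld --data BlocksWorld-100 \
    --model Qwen2.5-Coder-32B-Instruct --run_type edit --constraint_type action --default --attempt 1
```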
To generate the entire SMT encoding (without revision), run the following:
```bash
cd source/pipeline/smt_formalizer
python smt_formalizer_generate.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-smt-formalizer/DOMAIN/DATA/generate/MODEL/CONSTRAINT_TYPE`.
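For example, to generate SMT for the default problem-constraint pairs of the `initial` category on the larger `BlocksWorld-100-XL` dataset (illustrative values; the dataset identifiers are assumed to be the same as for the other pipelines):
```bash
cd source/pipeline/smt_formalizer
python smt_formalizer_generate.py --domain blocksworld --data BlocksWorld-100-XL \
    --model deepseek-reasoner --constraint_type initial --default
```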
To first generate the SMT without constraints and then edit the output to satisfy the new constraint (without revision), run the following:
```bash
cd source/pipeline/smt_formalizer
python smt_formalizer_edit_1.py --domain DOMAIN --data DATA --model MODEL [--default | --problems PROBLEMS]
python smt_formalizer_edit_2.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-smt-formalizer/DOMAIN/DATA/edit/MODEL/CONSTRAINT_TYPE`.
To revise generated SMT, run the following:
```bash
cd source/pipeline/smt_formalizer
python smt_formalizer_revision.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where
- `--run_type` is which method was used to generate the SMT (`generate` or `edit`)

Output will be written to `/output/llm-as-smt-formalizer/DOMAIN/DATA/revisions/RUN_TYPE/MODEL/CONSTRAINT_TYPE`.
After generating the SMT, we can run the solver using the following:
```bash
cd source/pipeline/smt_formalizer
python run_solver.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE [--revision] [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
where `--revision` is an optional flag to run the solver on the generated SMT after revision has been applied.

Output will be written as `.txt` files in the same folder as the generated SMT.
To run the LTL pipeline for DeepSeek models (`deepseek-reasoner` or `deepseek-chat`), run the following:
```bash
python ltl_formalizer_deepseek.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
To run the LTL pipeline for Qwen models (`Qwen3-32B` or `Qwen2.5-Coder-32B-Instruct`), run the following:
```bash
python ltl_formalizer_qwen.py --domain DOMAIN --data DATA --model MODEL --constraint_type CONSTRAINT_TYPE [--default | --problems PROBLEMS --constraints CONSTRAINTS]
```
Output will be written to `/output/llm-as-ltl-formalizer/DOMAIN/DATA/generate/MODEL/CONSTRAINT_TYPE`.

Note: LTL currently only works for the Generate setting on CoinCollector, using only the `action`, `state`, and `goal` constraints.
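For example (the `coin_collector` domain and `CoinCollector-100_includeDoors0` dataset identifiers are assumed to match the folder names listed in the data section; other values are illustrative):
```bash
python ltl_formalizer_deepseek.py --domain coin_collector --data CoinCollector-100_includeDoors0 \
    --model deepseek-reasoner --constraint_type goal --default
```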
To evaluate the results of LLM-as-PDDL-Formalizer, LLM-as-PDDL3-Formalizer, LLM-as-SMT-Formalizer, LLM-as-LTL-Formalizer, or LLM-as-Planner using VAL, run the following:
```bash
cd source/pipeline
python run_val.py --domain DOMAIN --data DATA --model MODEL --run_type RUN_TYPE --constraint_type CONSTRAINT_TYPE --prediction_type PREDICTION_TYPE [--revision] [--attempt ATTEMPT] [--csv_result] [--baseline_type BASELINE_TYPE]
```
where
- `--prediction_type` is which pipeline to evaluate
- `--attempt` is used for PDDL3 only, to evaluate a specific revision trial
- `--csv_result` is an optional flag that outputs all results (plans and errors) in a CSV file

VAL output will be written as `.txt` files in the same folder as the generated PDDL. Correctness and the number of syntax errors will be printed in the terminal. The optional CSV file will be saved to the directory containing all constraint categories (`edit`, `generate`, `revisions/generate`, or `revisions/edit`).
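For example, to evaluate revised PDDL output and also write a CSV summary (illustrative values; `PREDICTION_TYPE` is left symbolic because the accepted identifiers are defined in `run_val.py` and not listed here):
```bash
cd source/pipeline
python run_val.py --domain blocksworld --data BlocksWorld-100 --model deepseek-chat \
    --run_type generate --constraint_type goal --prediction_type PREDICTION_TYPE \
    --revision --csv_result
```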