PDDLego+: Zero-Shot Iterative Formalization and Planning in Partially Observable Environments

This repository contains the implementation for our paper: "Zero-Shot Iterative Formalization and Planning in Partially Observable Environments".

Overview

PDDLego+ is a framework that iteratively formalizes, plans, grows, and refines PDDL (Planning Domain Definition Language) representations in a zero-shot manner for partially observable environments. Unlike previous approaches that assume fully observable environments, PDDLego+ targets the more realistic and challenging setting in which complete information is unavailable, and it does not require access to existing trajectories.

Our framework demonstrates:

Superior performance in textual simulated environments
Robustness against increasing problem complexity
Interpretable domain knowledge that benefits future tasks

Environments

We evaluate PDDLego+ on two textual simulated environments:

CoinCollector: A grid-world environment where an agent must collect coins while avoiding obstacles
ALFWorld: A household environment with complex object interactions

Models

We evaluate the performance of various language models with our framework:

DeepSeek-R1-671B
GPT-4.1-2025-04-14
o3-mini-2025-01-31
o4-mini-2025-04-16
gpt-4o-2024-05-13
DeepSeek-R1-Distill-Llama-70B
DeepSeek-R1-Distill-Qwen-32B

Repository Structure

PDDLEGO-PLUS/
├── df_cache/                    # Cache for fixed domain file operations
├── Figures/                     # Figures generated for the paper
├── output/                      # Trial output logs
│   ├── Alfworld/               # ALFWorld trial logs
│   ├── CoinCollector/          # CoinCollector trial logs
│   ├── Alfworld.csv            # Complete ALFWorld trial data
│   ├── Alfworld_summary.csv    # Summarized ALFWorld metrics
│   ├── CoinCollector.csv       # Complete CoinCollector trial data
│   └── CoinCollector_summary.csv # Summarized CoinCollector metrics
├── test_files/                  # Test files for development
├── .gitignore                  
├── domain.pddl                  # Example PDDL domain file
├── extract_actions.py           # Script to extract action counts from trial results
├── interactive_Alfworld.py      # Main ALFWorld environment interaction code
├── interactive_CoinCollector.py # Main CoinCollector environment interaction code
├── LICENSE                      
├── plan.txt                     # Example generated plan
├── problem.pddl                 # Example PDDL problem file
├── requirement.txt              # Dependencies and requirements
├── results_parse_final.py       # Script to parse trial data into metrics
└── visualize_pddlego+.ipynb     # Jupyter notebook for result visualization

Experiment Configuration

Our experiments test different configurations:

PlanGen: Direct action generation without formal planning
PDDLego+: Our iterative PDDL formalization and planning approach
Goal Types:
- detailed: Full goal description (simple + hint + goal description)
- subgoal: simple prompt
- without_hint: simple + goal description
- without_detailed_goal: simple + hint
Fixed Domain File: Experiments controlling domain files from df_cache folder
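
The goal types above differ only in which prompt pieces they include. The following is a minimal sketch of that mapping, not the repository's actual code: the component names (`simple`, `hint`, `goal_description`) follow the list above, while `GOAL_TYPE_COMPONENTS` and `build_prompt` are hypothetical names introduced for illustration.

```python
# Prompt components per goal type, as listed in the configuration above.
# The dictionary and helper names are hypothetical, not taken from the scripts.
GOAL_TYPE_COMPONENTS = {
    "detailed": ("simple", "hint", "goal_description"),
    "subgoal": ("simple",),
    "without_hint": ("simple", "goal_description"),
    "without_detailed_goal": ("simple", "hint"),
}


def build_prompt(goal_type: str, parts: dict[str, str]) -> str:
    """Join the prompt pieces that the given goal type includes."""
    if goal_type not in GOAL_TYPE_COMPONENTS:
        raise ValueError(f"unknown goal type: {goal_type!r}")
    return "\n".join(parts[name] for name in GOAL_TYPE_COMPONENTS[goal_type])


if __name__ == "__main__":
    # Placeholder text standing in for the real prompt pieces.
    parts = {
        "simple": "<simple prompt>",
        "hint": "<hint>",
        "goal_description": "<goal description>",
    }
    print(build_prompt("without_hint", parts))
```

Keeping the mapping in one table makes it easy to see that `without_hint` and `without_detailed_goal` are each `detailed` with one piece removed.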

Installation and Setup

The experiments are implemented in two Python scripts, interactive_Alfworld.py and interactive_CoinCollector.py. Results are analyzed with results_parse_final.py and extract_actions.py, and visualized in the Jupyter notebook visualize_pddlego+.ipynb.

# Create a virtual environment (optional but recommended)
conda create -n pddlego-plus python=3.10
conda activate pddlego-plus

# Install dependencies
pip install -r requirement.txt

# Set your OpenAI and DeepSeek API keys
export OPENAI_API_KEY="your_openai_api_key_here"
export deepseek_API="your_deepseek_api_key_here"
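
Before launching a long run, it can help to confirm that both keys are visible to Python. The check below is a small sketch that uses only the two variable names exported above; the helper `missing_api_keys` is a hypothetical name and is not part of the repository.

```python
import os

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "deepseek_API")


def missing_api_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All API keys are set.")
```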

Usage

Run interactive_CoinCollector.py or interactive_Alfworld.py to launch experiments. Each script supports:

single-trial runs
fixed 100-trial batches (*_50)
fixed-df experiments that reuse a cached domain file

All helper functions are already defined in the two scripts; call the ones you need from the block that runs when the file is executed as the main program.

Before running, set

folder_name: sub-folder under output/ where raw logs will be written
result_name: CSV file name logging the summary of each trial

Simply uncomment the blocks you need.

Available Models

deepseek: representing DeepSeek-R1-671B
o3-mini-2025-01-31
o4-mini-2025-04-16
gpt-4.1-2025-04-14
gpt-4o-2024-05-13
DeepSeek-R1-Distill-Llama-70B
DeepSeek-R1-Distill-Qwen-32B

Note:

To run models on a server, uncomment the server block at the beginning of the Python file.
To run other OpenAI models, add their names to close_source_model_lists.

Available Goal Types

detailed: Full detailed goal specification
subgoal: Simplified goal specification
without_hint: Goal without hints (ALFWorld only)
without_detailed_goal: Simplified goal without details (ALFWorld only)
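
Since two of the goal types apply to ALFWorld only, a quick guard can catch an unsupported combination before a run starts. This is an illustrative sketch based solely on the list above; `supported_goal_types` and the environment strings it accepts are assumptions, not names from the scripts.

```python
# Goal types from the list above; the last two are marked "ALFWorld only".
ALL_GOAL_TYPES = {"detailed", "subgoal", "without_hint", "without_detailed_goal"}
ALFWORLD_ONLY = {"without_hint", "without_detailed_goal"}


def supported_goal_types(environment: str) -> set[str]:
    """Return the goal types usable in the given environment (hypothetical helper)."""
    if environment == "ALFWorld":
        return set(ALL_GOAL_TYPES)
    if environment == "CoinCollector":
        return ALL_GOAL_TYPES - ALFWORLD_ONLY
    raise ValueError(f"unknown environment: {environment!r}")


if __name__ == "__main__":
    for env_name in ("CoinCollector", "ALFWorld"):
        print(env_name, sorted(supported_goal_types(env_name)))
```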

Running CoinCollector Experiments

Run the interactive_CoinCollector.py script with the following functions:

# Configuration variables
model = "your_chosen_model"
folder_name = "results_folder"
result_name = "results_file"
i = 0  # starting trial index
num_trials = 10  # number of trials for single trial runs

# PDDLego+ with detailed goals (single trial batch)
run_iterative_model(model, i, i+num_trials, folder_name, result_name, goal_type="detailed")

# PDDLego+ with detailed goals (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="detailed")

# PDDLego+ with simple prompt (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="subgoal")

# PlanGen baseline (single trial batch)
run_baseline_model(model, i, i+num_trials, folder_name, result_name)

# PlanGen baseline (fixed 100 trials)
run_baseline_model_50(model, folder_name, result_name)

# PDDLego+ with fixed domain file (100 trials)
run_iterative_model_fixed_df(model, folder_name, result_name, goal_type="detailed")
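
When a larger set of trials is split across several calls to the single-batch functions, the start and end indices have to be computed for each call. The sketch below shows one way to do that. It assumes the end index is exclusive, which is inferred from the `i, i+num_trials` call pattern above and is not confirmed by the source; `trial_batches` is a hypothetical helper.

```python
def trial_batches(start: int, total: int, batch_size: int):
    """Yield (begin, end) pairs covering `total` trials from `start`.

    Assumes `end` is exclusive, matching the i, i+num_trials pattern.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    end = start + total
    for begin in range(start, end, batch_size):
        yield begin, min(begin + batch_size, end)


if __name__ == "__main__":
    for begin, end in trial_batches(0, 25, 10):
        # Inside the script this could become, for example:
        # run_iterative_model(model, begin, end, folder_name, result_name, goal_type="detailed")
        print(begin, end)
```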

Running ALFWorld Experiments

Run the interactive_Alfworld.py script with the following functions:

# Configuration variables
model = "your_chosen_model"
folder_name = "results_folder"
result_name = "results_file"
i = 0  # starting trial index
num_trials = 10  # number of trials for single trial runs

# PDDLego+ with detailed goals (single trial batch)
run_iterative_model(model, i, i+num_trials, folder_name, result_name, goal_type="detailed")

# PDDLego+ with detailed goals (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="detailed")

# PDDLego+ with simple prompt (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="subgoal")

# PDDLego+ with simple+hint prompt (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="without_detailed_goal")

# PDDLego+ with simple+goal prompt (fixed 100 trials)
run_iterative_model_50(model, folder_name, result_name, goal_type="without_hint")

# PlanGen baseline (single trial batch)
run_baseline_alfworld(model, i, i+num_trials, folder_name, result_name)

# PlanGen baseline (fixed 100 trials)
run_baseline_alfworld_50(model, folder_name, result_name)

# PDDLego+ with fixed domain file (100 trials)
run_iterative_model_fixed_df(model, folder_name, result_name, goal_type="detailed")

Analyzing Results

Parsing Results

Modify the file_name variable in results_parse_final.py to generate summary statistics:

python results_parse_final.py
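
For a quick look at a trial CSV without the full parsing script, a per-model success rate can be computed with the standard library. This is only a sketch: the column names `model` and `success` are hypothetical and may not match the actual CSV layout, and the sample rows are made up purely to show the shape of the computation.

```python
import csv
import io
from collections import defaultdict


def success_rate_by_model(rows):
    """Compute the fraction of successful trials per model.

    Expects dict-like rows with hypothetical 'model' and 'success' columns.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        totals[row["model"]] += 1
        if row["success"].strip().lower() == "true":
            wins[row["model"]] += 1
    return {model: wins[model] / totals[model] for model in totals}


# Made-up sample rows, for illustration only.
SAMPLE_CSV = "model,success\nmodel_a,True\nmodel_a,False\nmodel_b,True\n"

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    print(success_rate_by_model(reader))
```

To use it on a real file, replace the in-memory sample with `open(path, newline="")` and adjust the column names to match the file's header.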

Extracting Action Data

Update the folder path in extract_actions.py to analyze action counts:

python extract_actions.py

Visualizing Results

Use the Jupyter notebook visualize_pddlego+.ipynb to generate visualizations of experimental results.
