🌐 Project Page | 📄 Paper
Kaiqu Liang1,2, Julia Kruk1, Shengyi Qian1, Xianjun Yang1, Shengjie Bi1, Yuanshun Yao1, Shaoliang Nie1, Mingyang Zhang1, Lijuan Liu1, Jaime Fernández Fisac2, Shuyan Zhou1,3, Saghar Hosseini1
1Meta Superintelligence Labs, 2Princeton University, 3Duke University
The system implements two types of personal agents:
- Embodied Agent: For robot assistance tasks in household scenarios
- Shopping Agent: For e-commerce product recommendation tasks
Both agents support:
- Memory-based learning (SQL or FAISS backends)
- Pre-action feedback (asking clarifying questions)
- Post-action feedback (learning from corrections)
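The two feedback channels can be sketched as follows. This is a minimal illustration of the idea; the function and field names here are hypothetical, not the actual API in `agents/base.py`:

```python
# Sketch of the two PAHF feedback channels (names are illustrative,
# not the real agents/base.py interface).

def pre_action_feedback(task, candidates):
    """Ask a clarifying question before acting when the task is ambiguous."""
    if len(candidates) > 1:
        return f"For '{task}', did you mean {candidates[0]} or {candidates[1]}?"
    return None  # unambiguous: act directly

def post_action_feedback(action, correction, memory):
    """Store a user correction after acting so future runs improve."""
    if correction is not None:
        memory.append(f"User corrected '{action}' to '{correction}'.")

memory = []
question = pre_action_feedback("bring my favourite drink",
                               ["black coffee", "herbal tea"])
post_action_feedback("black coffee", "herbal tea", memory)
```

In the real system the stored corrections are embedded and written to the memory bank rather than a plain list.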
```
PAHF/
├── run_agent.py            # Main entry point
├── agents/                 # Agent implementations
│   ├── base.py             # BasePersonalAgent class
│   ├── embodied_agent.py   # EmbodiedAgent for robot tasks
│   └── shopping_agent.py   # ShoppingAgent for shopping tasks
├── data/                   # Datasets
│   ├── embodied/           # Embodied agent data
│   │   ├── personas/       # User persona definitions
│   │   └── scenarios/      # Task scenarios
│   └── shopping/           # Shopping agent data
├── memory/                 # Memory system
│   ├── banks.py            # SQLiteMemoryBank, FAISSMemoryBank
│   └── utils.py            # Memory retrieval utilities
├── prompts/                # Prompt templates
│   ├── embodied_prompts.py # Prompts for embodied agent
│   └── shopping_prompts.py # Prompts for shopping agent
└── utils/                  # Utility functions
    ├── llm.py              # LLMClient for OpenAI API
    ├── json_utils.py       # JSON read/write utilities
    └── agent_utils.py      # Metrics, memory initialization
```
- Python 3.8 or higher
- OpenAI API key
- Clone this repository
- Install dependencies:
```
pip install -r requirements.txt
```
- Set your OpenAI API key:
```
export OPENAI_API_KEY='your-api-key-here'
```
Alternatively, you can pass the API key directly when initializing the LLM client in the code.
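A small sketch of how the key lookup can be handled, assuming the client accepts an explicit key. The helper name is hypothetical; check `utils/llm.py` for the actual `LLMClient` signature:

```python
import os

# Hypothetical helper: resolve the OpenAI API key from an explicit
# argument, falling back to the OPENAI_API_KEY environment variable.
def resolve_api_key(explicit_key=None):
    key = explicit_key or os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No OpenAI API key found; set OPENAI_API_KEY or pass one explicitly."
        )
    return key
```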
```
python run_agent.py --agent embodied
python run_agent.py --agent shopping
```
Command-line options:
- `--agent`: Choose agent type (`embodied` or `shopping`)
- `--mem_style`: Memory backend (`sql` or `faiss`, default: `sql`)
- `--no-memory`: Run baseline without memory (disables PAHF feedback loop)
- `--learning_iter`: Number of learning iterations for phases 1 and 3 (default: 3)
- `--model`: LLM model to use (default: `gpt-4o`)
Run PAHF embodied agent with SQLite memory:
```
python run_agent.py --agent embodied
```
Run PAHF embodied agent with FAISS memory:
```
python run_agent.py --agent embodied --mem_style faiss
```
Run baseline (no memory) for comparison:
```
python run_agent.py --agent embodied --no-memory
```
Run with custom learning iterations:
```
python run_agent.py --agent embodied --learning_iter 5
```
Key files:
- `utils/llm.py`: OpenAI API client for LLM generation
- `memory/banks.py`: Memory bank implementations (SQL and FAISS)
- `agents/base.py`: Base class for personal agents
- `agents/embodied_agent.py`: Implementation for robot assistance tasks
- `agents/shopping_agent.py`: Implementation for shopping recommendation tasks
- `run_agent.py`: Entry point for running experiments
The system supports two memory backends:
- **SQLite**: Stores memories in a relational database with embeddings
  - Good for: persistent storage, easy inspection
  - Storage: single database file per experiment
- **FAISS**: Uses Facebook AI Similarity Search for vector storage
  - Good for: fast similarity search, large-scale deployments
  - Storage: index file + metadata file per experiment
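Both backends ultimately perform embedding-based similarity search over stored memories. A minimal sketch of that retrieval step, using a toy hand-made vector in place of a real embedding:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, stored, top_k=2):
    """Rank stored (memory_id, vector) pairs by similarity to the query."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [mid for mid, _ in ranked[:top_k]]

stored = [("m1", [1.0, 0.0]), ("m2", [0.0, 1.0]), ("m3", [0.9, 0.1])]
print(search([1.0, 0.0], stored))  # m1 and m3 are closest to the query
```

FAISS accelerates exactly this ranking at scale; the SQLite backend stores the vectors alongside the text and scores them at query time.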
Data files are organized in `data/embodied/`:
- `scenarios/original_scenarios_A.json`: Training scenarios (phase 1)
- `scenarios/original_scenarios_B.json`: Evaluation scenarios (phase 2)
- `scenarios/evolved_scenarios_A.json`: Training with evolved personas (phase 3)
- `scenarios/evolved_scenarios_B.json`: Evaluation with evolved personas (phase 4)
- `personas/original_persona.py`: Original user personas
- `personas/evolved_persona.py`: Evolved user personas
Each scenario in the JSON files has the following structure:
```json
{
  "index": 0,
  "scene": "a black coffee, an herbal tea, a green juice, walnuts",
  "task": "Could you bring me my favourite drink?",
  "context": "General preference",
  "user": "Alex",
  "user_intent_object": "herbal tea",
  "user_intent_location": "pick-up",
  "scene_objects": ["black coffee", "herbal tea", "green juice", "walnuts"]
}
```
Data files are in `data/shopping/`:
- `phase1.json`: Training shopping scenarios
- `phase2.json`: Evaluation shopping scenarios
- `phase3.json`: Training with updated preferences
- `phase4.json`: Evaluation with updated preferences
- `original_persona.json`: Original user personas
- `updated_persona.json`: Updated user personas
- `persona_info.json`: Detailed preference information
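As a sanity check, a scenario entry can be loaded and validated against the embodied structure shown above (the entry here is the example from this README):

```python
import json

# Parse the example embodied scenario and check its invariants.
example = json.loads("""
{
  "index": 0,
  "scene": "a black coffee, an herbal tea, a green juice, walnuts",
  "task": "Could you bring me my favourite drink?",
  "context": "General preference",
  "user": "Alex",
  "user_intent_object": "herbal tea",
  "user_intent_location": "pick-up",
  "scene_objects": ["black coffee", "herbal tea", "green juice", "walnuts"]
}
""")

# The ground-truth target object should always appear in the scene.
assert example["user_intent_object"] in example["scene_objects"]
print(example["task"])  # Could you bring me my favourite drink?
```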
Note on Shopping Performance: The shopping agent in this release may perform better than the version reported in the paper due to minor prompt and feedback-logic refinements.
Prompts are defined in the `prompts/` directory:
- `prompts/embodied_prompts.py`: Prompts for embodied agent
- `prompts/shopping_prompts.py`: Prompts for shopping agent
You can customize these prompts to change agent behavior.
You can customize the human persona (simulator) that provides feedback to the agent. The persona definitions are located in:
- `data/embodied/personas/original_persona.py`: Embodied agent personas
- `data/embodied/personas/evolved_persona.py`: Evolved embodied personas
- `data/shopping/original_persona.json`: Shopping agent personas
- `data/shopping/updated_persona.json`: Updated shopping personas
Each persona defines user preferences that the simulated human uses when providing feedback. You can modify these to test different user behaviors or create entirely new personas.
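For instance, a persona can be thought of as a preference mapping that the simulated human consults when generating feedback. The field names and feedback logic below are hypothetical; see the persona files listed above for the actual schema:

```python
# Hypothetical persona structure (the real schema lives in the
# persona files listed above).
persona = {
    "name": "Alex",
    "preferences": {
        "favourite drink": "herbal tea",
        "delivery location": "pick-up",
    },
}

def simulated_feedback(persona, slot, agent_choice):
    """Return a correction when the agent's choice violates the persona."""
    preferred = persona["preferences"].get(slot)
    if preferred and preferred != agent_choice:
        return f"Actually, I prefer {preferred}."
    return "Thanks, that's right."

print(simulated_feedback(persona, "favourite drink", "black coffee"))
```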
Future Research Direction: The human simulator is currently using predefined personas. An interesting research direction is to develop more sophisticated human simulators that can:
- Model diverse user communication styles
- Simulate realistic user patience and feedback quality
- Adapt feedback based on agent performance history
- Generate more naturalistic preference expressions
To add a new memory backend, implement the `MemoryBank` abstract class in `memory/banks.py` with these methods:
- `add(text)`: Add a new memory
- `search(query, top_k)`: Search for similar memories
- `find_similar_memory(text, threshold)`: Find the most similar memory above a threshold
- `update_memory(memory_id, new_text)`: Update an existing memory
- `get_memory(memory_id)`: Get a memory by ID
- `get_all_memories()`: Get all memories
- `close()`: Clean up resources
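As a starting point, here is a minimal in-memory sketch of that interface. It uses a trivial word-overlap similarity instead of the real embedding-based retrieval, so it is for illustration only:

```python
class InMemoryBank:
    """Toy MemoryBank implementation; similarity is Jaccard word overlap."""

    def __init__(self):
        self._memories = {}
        self._next_id = 0

    def _similarity(self, a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def add(self, text):
        mid = self._next_id
        self._memories[mid] = text
        self._next_id += 1
        return mid

    def search(self, query, top_k=3):
        ranked = sorted(self._memories.items(),
                        key=lambda kv: self._similarity(query, kv[1]),
                        reverse=True)
        return ranked[:top_k]

    def find_similar_memory(self, text, threshold=0.5):
        best = max(self._memories.items(),
                   key=lambda kv: self._similarity(text, kv[1]),
                   default=None)
        if best and self._similarity(text, best[1]) >= threshold:
            return best
        return None

    def update_memory(self, memory_id, new_text):
        self._memories[memory_id] = new_text

    def get_memory(self, memory_id):
        return self._memories.get(memory_id)

    def get_all_memories(self):
        return dict(self._memories)

    def close(self):
        self._memories.clear()
```

A real backend would replace `_similarity` with embedding distance and back the dictionary with persistent storage, as the SQLite and FAISS implementations do.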
If you use this code in your research, please cite:
@article{liang2026learning,
title={Learning Personalized Agents from Human Feedback},
author={Liang, Kaiqu and Kruk, Julia and Qian, Shengyi and Yang, Xianjun and Bi, Shengjie and Yao, Yuanshun and Nie, Shaoliang and Zhang, Mingyang and Liu, Lijuan and Fisac, Jaime Fern{\'a}ndez and others},
journal={arXiv preprint arXiv:2602.16173},
year={2026}
}
This project is MIT licensed, as found in the LICENSE file.
- This public version focuses on GPT models
- The codebase has been simplified for public release
