This repository contains the code to reproduce the results of the paper *Grounded Test-Time Adaptation for LLM Agents*.
| Parametric Adaptation Framework | Non-Parametric Adaptation Framework |
|---|---|
| ![Parametric adaptation framework]() | ![Non-parametric adaptation framework]() |
We adopt NNetnav's codebase for web navigation exploration and task evaluation. To reproduce our results on WebArena, please refer to this.
For the BFCLv3 experiments, we build our method on top of the official gorilla codebase. To reproduce our results on BFCLv3, please refer to this.
For the Tau-Bench experiments, please refer to the official codebase with parametric adaptation enabled.
If you find this work useful, please cite:
@article{chen2025grounded,
title={Grounded Test-Time Adaptation for LLM Agents},
author={Chen, Arthur and Liu, Zuxin and Zhang, Jianguo and Prabhakar, Akshara and Liu, Zhiwei and Heinecke, Shelby and Savarese, Silvio and Zhong, Victor and Xiong, Caiming},
journal={arXiv preprint arXiv:2511.04847},
year={2025}
}

This work is licensed under the MIT License.

