Code and data for ACL'25 paper "TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models"
TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models

Paper

We propose TablePilot, a pioneering tabular data analysis framework leveraging large language models to autonomously generate comprehensive and superior analytical results without relying on user profiles or prior interactions. The framework incorporates key designs in analysis preparation and analysis optimization to enhance accuracy. Additionally, we construct DART, a benchmark tailored for comprehensive tabular data analysis recommendation.

Quick Start 🚀

Step 1: Build Environment

conda create -n tablepilot python  # include python so the env gets its own pip
conda activate tablepilot

pip install -r requirements.txt

Step 2: Tabular Data Processing

cd data_process
bash table_txt_fmt.sh
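The script above converts tables into a plain-text format the LLM can read in a prompt. The repo's actual serialization may differ; as a minimal sketch, assuming CSV input (the function name and delimiter are illustrative, not the repo's API):

```python
import csv
import io

def table_to_txt(csv_text: str, max_rows: int = 20) -> str:
    """Serialize a CSV table into a pipe-delimited text block
    suitable for placing in an LLM prompt (truncated to max_rows)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:max_rows + 1]
    lines = [" | ".join(header)]
    lines += [" | ".join(r) for r in body]
    return "\n".join(lines)

sample = "city,year,sales\nParis,2023,120\nTokyo,2023,95\n"
print(table_to_txt(sample))
```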

Step 3: Analysis Generation

This step is the core generation component of TablePilot and consists of two main phases:

  1. Table Explanation Generation
  2. Module-based Analysis Generation, which includes three parts:
    • Basic Analysis
    • Visualization
    • Modeling

Swap in the corresponding .py file for the content you want to generate, then run:

bash run_generation.sh
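The module-based phase above can be pictured as dispatching one prompt per analysis module. The following is a hypothetical sketch only: the prompt templates, the `call_llm` stand-in, and the function names are illustrative, not the repo's actual code.

```python
# One prompt template per analysis module (Basic Analysis, Visualization,
# Modeling), each filled with the serialized table text.
MODULE_PROMPTS = {
    "basic": "Given the table below, propose {k} basic analysis queries.",
    "visualization": "Given the table below, propose {k} chart specifications.",
    "modeling": "Given the table below, propose {k} modeling tasks.",
}

def call_llm(prompt: str) -> str:
    # Placeholder for whatever model client the repo actually uses.
    return f"[model output for: {prompt[:40]}...]"

def generate_for_module(module: str, table_txt: str, k: int = 5) -> str:
    prompt = MODULE_PROMPTS[module].format(k=k) + "\n\n" + table_txt
    return call_llm(prompt)

for m in MODULE_PROMPTS:
    print(m, "->", generate_for_module(m, "city | sales\nParis | 120"))
```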

Step 4: Analysis Optimization

We employ a multimodal revision approach to refine the generated data analysis operations.

  • Before revision, we first obtain the execution results of the initial round of generated data analysis operations:

      cd execution/run
      bash run_code_exec_error.sh
  • After that, we perform optimization based on these results:

      cd generation/run
      bash run_revision.sh
  • We perform only a single round of revision to obtain the final optimized results:

      cd execution/run
      bash run_code_exec_revision.sh
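The execute → revise → re-execute cycle above can be sketched as a single revision round. This is an assumption-laden toy: `execute_analysis` and `revise` are hypothetical names, and the string replacement stands in for the multimodal LLM revision call.

```python
import traceback

def execute_analysis(code: str):
    """Run a generated analysis snippet; return error text, or None on success."""
    try:
        exec(code, {})
        return None
    except Exception:
        return traceback.format_exc()

def revise(code: str, error: str) -> str:
    # Placeholder for the LLM revision call; a toy string fix stands in
    # for a model-proposed edit informed by the error message.
    return code.replace("jsonn", "json")

code = "import jsonn"                 # deliberately broken first-round output
error = execute_analysis(code)        # initial execution (run_code_exec_error)
if error is not None:
    code = revise(code, error)        # single revision round (run_revision)
final_error = execute_analysis(code)  # execute revised code (run_code_exec_revision)
print("fixed:", final_error is None)
```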

Step 5: Analysis Ranking

After optimization, the ranking module is used to return the highest-quality recommendations.

  • We first need to aggregate all the results from the module-based analysis:

      cd evaluation/run
      bash run_process_module_res.sh
  • Then we apply the ranking module to return the highest-quality recommendations:

      cd generation/run
      bash run_rank.sh
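The aggregate-then-rank flow above can be sketched as follows. Everything here is illustrative: the function names are hypothetical, and the length-based score is a toy stand-in for the LLM-based ranking module.

```python
def aggregate(module_results):
    """Flatten per-module candidate analyses into one pool."""
    return [r for results in module_results.values() for r in results]

def rank(candidates, score_fn, k=3):
    """Return the top-k candidates by score; score_fn stands in
    for the LLM-based ranking module."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]

pool = aggregate({
    "basic": ["mean sales by city"],
    "visualization": ["bar chart of sales"],
    "modeling": ["forecast next-year sales"],
})
top = rank(pool, score_fn=len, k=2)   # toy score: longer description wins
print(top)
```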

Step 6: Evaluation

  • Execution Rate

      cd evaluation/run
      bash run_exec_rate.sh
  • Recall
    • Total Recall, the overall recall of all results generated by the framework:

      bash run_recall_all_results.sh
    • Recall@k, where k is the number of recommended data analysis operations the user wishes to receive:

      bash run_sum_ranking_res.sh
      bash run_recall_ranked_res.sh
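The two metrics above have standard formulations, sketched below; the exact definitions used for DART may differ, and the function names are illustrative.

```python
def execution_rate(results):
    """Fraction of generated analyses that executed without error."""
    return sum(1 for ok in results if ok) / len(results)

def recall_at_k(ranked, relevant, k):
    """Fraction of human-preferred analyses that appear in the
    top-k recommendations."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

print(execution_rate([True, True, False, True]))           # 0.75
print(recall_at_k(["a", "b", "c", "d"], {"a", "c"}, k=2))  # 0.5
```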

Citation

If you find this repository useful, please consider giving it a ⭐ or citing:

@article{yi2025tablepilot_arxiv,
  title={TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models},
  author={Yi, Deyin and Liu, Yihao and Cao, Lang and Zhou, Mengyu and Dong, Haoyu and Han, Shi and Zhang, Dongmei},
  journal={arXiv preprint arXiv:2503.13262},
  year={2025}
}

@inproceedings{yi2025tablepilot,
    title = "{T}able{P}ilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models",
    author={Yi, Deyin and Liu, Yihao and Cao, Lang and Zhou, Mengyu and Dong, Haoyu and Han, Shi and Zhang, Dongmei},
    editor = "Rehm, Georg and Li, Yunyao",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-industry.28/",
    pages = "355--410",
    ISBN = "979-8-89176-288-6",
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
