pip install -r requirements.txt
pip install flash-attn --no-build-isolation
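flash-attn compiles against the PyTorch already installed in the environment (which is why --no-build-isolation is needed), so install PyTorch with CUDA support first. A minimal setup sketch; the environment name and Python version below are arbitrary choices, not repo requirements:

# Illustrative setup; versions are assumptions, not pinned by this repo.
conda create -n foodlmm python=3.9 -y
conda activate foodlmm
pip install torch torchvision                 # flash-attn builds against this torch
pip install -r requirements.txt
pip install flash-attn --no-build-isolation   # no isolation so the build can see torch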
The training data comes from five public datasets: VIREO Food-172, Recipe1M, Nutrition5k, FoodSeg103, and UECFoodPixComplete.
Note: you only need the extracted Nutrition5k images provided with our FoodDialogues dataset; downloading the original Nutrition5k dataset is not required.
Download the datasets from the links above and organize them as follows:
├── dataset
│   ├── FoodSeg103
│   │   ├── category_id.txt
│   │   ├── FoodReasonSeg_test.json
│   │   ├── FoodReasonSeg_train.json
│   │   ├── Images
│   │   │   └── ...
│   │   └── ImageSets
│   │       └── ...
│   ├── UECFOODPIXCOMPLETE
│   │   └── data
│   │       ├── category.txt
│   │       ├── train9000.txt
│   │       ├── test1000.txt
│   │       └── UECFOODPIXCOMPLETE
│   │           ├── train
│   │           │   └── ...
│   │           └── test
│   │               └── ...
│   ├── VireoFood172
│   │   ├── train_id.json
│   │   ├── ingre.json
│   │   ├── foodlist.json
│   │   └── ready_chinese_food
│   │       └── ...
│   ├── Recipe1M
│   │   ├── recipe1m_train_1488.json
│   │   └── images
│   │       └── ...
│   └── Nutrition5k
│       ├── train_id.json
│       ├── cafe_1_id.json
│       ├── cafe_2_id.json
│       ├── dish_metadata_cafe1.csv
│       ├── dish_metadata_cafe2.csv
│       ├── FoodDialogues_train.json
│       ├── FoodDialogues_test.json
│       └── images
│           └── ...
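Before training, you can quickly verify the layout with a one-off shell check like the following (the checked files are taken from the tree above; this snippet is a convenience, not part of the repo):

# Sanity-check that key annotation files are in place.
for f in dataset/FoodSeg103/category_id.txt \
         dataset/UECFOODPIXCOMPLETE/data/category.txt \
         dataset/VireoFood172/train_id.json \
         dataset/Recipe1M/recipe1m_train_1488.json \
         dataset/Nutrition5k/FoodDialogues_train.json; do
  [ -f "$f" ] && echo "ok: $f" || echo "MISSING: $f"
done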
To train FoodLMM, download the pre-trained LISA-7B-v1-explanatory weights and the SAM ViT-H weights, and set their paths in train_config_Stage1.yaml.
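For reference, a download sketch: the SAM URL below is the official segment-anything release, while the Hugging Face repo id for LISA is our best guess and should be checked against the LISA project page.

# SAM ViT-H checkpoint (official segment-anything release URL).
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
# LISA-7B-v1-explanatory; the repo id below is assumed, verify it before use.
git lfs install
git clone https://huggingface.co/xinlai/LISA-7B-v1-explanatory
# Then point the corresponding entries in train_config_Stage1.yaml at these two paths.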
deepspeed --master_port=XXX train_ds_Stage1.py --cfg_file=train_config_Stage1.yaml
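Here --master_port only needs to be a free TCP port, and DeepSpeed uses all visible GPUs by default. A concrete example (the port value is arbitrary; --include is the standard DeepSpeed launcher flag for selecting GPUs):

# Any free TCP port works; 24999 is arbitrary.
deepspeed --master_port=24999 train_ds_Stage1.py --cfg_file=train_config_Stage1.yaml
# To restrict training to specific GPUs (e.g. 0 and 1):
deepspeed --include localhost:0,1 --master_port=24999 train_ds_Stage1.py --cfg_file=train_config_Stage1.yaml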
The weight merging process runs automatically. If you cannot find the merged weights under the configured path ('./runs/FoodLMM_S1' by default), try the following commands.
cd ./runs/EXP_NAME/ckpt_model && python zero_to_fp32.py . ../pytorch_model.bin
CUDA_VISIBLE_DEVICES="" python merge_lora_weights_and_save_hf_model.py \
  --cfg_file=train_config.yaml \
  --weight="PATH_TO_pytorch_model.bin" \
  --save_path="PATH_TO_SAVED_MODEL"
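Putting both steps together for the default Stage 1 output directory (all paths below are illustrative: the run directory comes from the './runs/FoodLMM_S1' default, while the save path and the choice of config file are our assumptions):

# 1. Collapse the DeepSpeed ZeRO checkpoint shards into a single fp32 file.
cd ./runs/FoodLMM_S1/ckpt_model && python zero_to_fp32.py . ../pytorch_model.bin
cd ../../..
# 2. Merge the LoRA weights into the base model and save it in Hugging Face format.
CUDA_VISIBLE_DEVICES="" python merge_lora_weights_and_save_hf_model.py \
  --cfg_file=train_config_Stage1.yaml \
  --weight="./runs/FoodLMM_S1/pytorch_model.bin" \
  --save_path="./runs/FoodLMM_S1/merged_hf_model"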
Stage 2 training follows the same pattern with its own config file:
deepspeed --master_port=XXX train_ds_Stage2.py --cfg_file=train_config_Stage2.yaml
To chat with the trained model, launch the online demo:
CUDA_VISIBLE_DEVICES=0 python online_demo.py --version='PATH_TO_FoodLMM_Chat'
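For example, pointing the demo at a locally merged Stage 2 model (the path is illustrative; any FoodLMM-Chat checkpoint saved in Hugging Face format should work):

# Substitute wherever you saved the merged Stage 2 model.
CUDA_VISIBLE_DEVICES=0 python online_demo.py --version='./runs/FoodLMM_S2/merged_hf_model'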
If you find this project useful in your research, please consider citing:
@article{yin2023foodlmm,
  title={FoodLMM: A Versatile Food Assistant using Large Multi-modal Model},
  author={Yin, Yuehao and Qi, Huiyan and Zhu, Bin and Chen, Jingjing and Jiang, Yu-Gang and Ngo, Chong-Wah},
  journal={arXiv preprint arXiv:2312.14991},
  year={2023}
}
- This work is built upon LISA, and our datasets are generated from Nutrition5k and FoodSeg103 using GPT-4.