KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints
- Table of Contents
- 🤗KORE
- 🤗KORE-Augmentations
- 🛠️Requirements and Installation
- 💥Training
- 🤖Evaluation
- 🤝 Acknowledgments
- 📝 Citation
To address the challenge of balancing knowledge adaptation and retention, we propose KORE, a synergistic method of Knowledge-oRientEd augmentations and constraints.
Existing methods suffer from poor generalization. General data augmentation is often "superficial and discrete" (e.g., simple rephrasing or rotation), creating isolated data points. This approach fails to build a coherent knowledge structure and offers limited support for true "knowledge internalization".
Knowledge-oRientEd AUGMENTATION uses an automated pipeline to convert knowledge into a "profound and structured" format. It builds a comprehensive knowledge structure by generating "multi-round dialogue" data (the trunk) and "instruction task" data (the branches), such as VQA and image captioning. This process yields the KORE-74K dataset, enabling the model to achieve accurate adaptation and true "knowledge internalization" rather than mere "data memorization".
You can download the data from the 🤗 Huggingface Dataset. The expected file structure is:
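As an illustration of the trunk-and-branches structure, one augmented sample might look like the sketch below. All field names here are hypothetical, not the actual KORE-74K schema; see `KORE-74K-training_data.json` for the real format.

```python
# Hypothetical sketch of one knowledge-oriented augmented sample.
# Field names are illustrative only, NOT the real KORE-74K schema.
sample = {
    "knowledge_id": "entity_0001",
    # The "trunk": a multi-round dialogue that walks through the new
    # knowledge coherently instead of as isolated data points.
    "dialogue": [
        {"role": "user", "content": "Who is shown in <image>?"},
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "What is this person known for?"},
        {"role": "assistant", "content": "..."},
    ],
    # The "branches": instruction tasks derived from the same knowledge,
    # such as VQA and image captioning.
    "instruction_tasks": [
        {"task": "vqa", "question": "...", "answer": "..."},
        {"task": "image_caption", "caption": "..."},
    ],
}
```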
KORE-74K
|-- json/jsonl
| |-- KORE-74K-training_data.json
|-- imgs
| |-- imgs_of_recognition_caption_description.zip
| |-- imgs_of_vqa
| | |-- split_zip_part_00
| | |-- split_zip_part_01
| | |-- split_zip_part_02
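The `imgs_of_vqa` archive ships as split parts. Assuming the parts are a plain byte-split of a single zip (an assumption on our part; check the dataset card), they can be concatenated back with `cat split_zip_part_* > imgs_of_vqa.zip && unzip imgs_of_vqa.zip`, or in Python:

```python
import pathlib

def reassemble_split_zip(parts_dir: str, out_zip: str) -> None:
    """Concatenate split_zip_part_* files (in name order) into one zip.

    Assumes the parts are a byte-split of a single archive (e.g. made
    with `split`); if they were made with `zip -s`, use `zip -FF` to
    repair the multi-part archive instead.
    """
    parts = sorted(pathlib.Path(parts_dir).glob("split_zip_part_*"))
    with open(out_zip, "wb") as out:
        for part in parts:
            out.write(part.read_bytes())
```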
conda env create -f kore.yml
If you run into issues, refer to https://github.com/haotian-liu/LLaVA, or set up the environment manually:
conda create -n kore python=3.10 -y
cd env
pip install -r kore.txt
Step 1: extract covariance matrix and reconstruct weights
bash kore_tool/extract_covariance_matrix/step1_benchmark.sh -d "MME MMBench_DEV_EN" -n 128 -r 235 -s 233
The choices for `-d` correspond to `DATASET_CONFIG` in `benchmark_load.py`, e.g. MME, HallusionBench, MathVision, etc.
bash kore_tool/extract_covariance_matrix/step1_onevision_data.sh -d "onevision" -n 64 -r 235 -s 233

The OneVision dataset used here can be downloaded from 🤗 LLaVA-OneVision-Data.
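Conceptually, Step 1 feeds calibration data (benchmark or OneVision samples) through the model, accumulates the input covariance of each linear layer, and decomposes the weights against that statistic. A minimal numpy sketch in the spirit of CorDA follows; it is not the repo's implementation, and the function names are ours.

```python
import numpy as np

def accumulate_input_covariance(calibration_batches, in_features):
    """Accumulate C = sum_i x_i^T x_i over all calibration inputs."""
    cov = np.zeros((in_features, in_features))
    for x in calibration_batches:        # each x: (tokens, in_features)
        cov += x.T @ x
    return cov

def context_oriented_svd(W, cov):
    """Decompose W against the covariance: SVD of W @ C, so the leading
    singular directions reflect how the layer is actually used on data."""
    U, S, Vt = np.linalg.svd(W @ cov, full_matrices=False)
    return U, S, Vt
```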
Step 2: training
bash kore_tool/training/training_kore.sh --data_path KORE-74K-training_data.json --output_dir train_ckpt/kore_epoch1 --num_train_epochs 1 --swanlab_project "kore" --swanlab_experiment_name "epoch1"
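Training uses a LoRA-style adapter whose `A` matrix may or may not be frozen (the `--lora_null_v1`/`--lora_null_v2` options). A toy numpy sketch of that distinction, illustrative only and not the repo's LoRA-Null implementation:

```python
import numpy as np

class LoRAAdapter:
    """Toy LoRA adapter. freeze_A=False is in the spirit of --lora_null_v1
    (A stays trainable); freeze_A=True of --lora_null_v2 (only B updates)."""

    def __init__(self, W, rank, freeze_A=False, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                 # frozen pretrained weight
        self.A = 0.01 * rng.standard_normal((rank, d_in))
        self.B = np.zeros((d_out, rank))           # B = 0 => no drift at init
        self.freeze_A = freeze_A

    def forward(self, x):
        return x @ (self.W + self.B @ self.A).T

    def trainable(self):
        return ["B"] if self.freeze_A else ["A", "B"]
```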
--lora_null_v1 True does not freeze the 'A' matrix, whereas --lora_null_v2 True does.

Step 3: merge
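Merging folds the low-rank update back into the base weights so the checkpoint needs no adapter at inference. Per weight matrix this is conceptually the sketch below (`merge_llava.py` does this across the whole checkpoint; `scaling` is the usual LoRA alpha/rank factor):

```python
import numpy as np

def merge_lora_weight(W, A, B, scaling=1.0):
    """Return the merged weight W' = W + scaling * (B @ A)."""
    return W + scaling * (B @ A)
```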
python kore_tool/merge/merge_llava.py --model_id training_model --save_model True --save_path merge_model

Evaluate EVOKE
CUDA_VISIBLE_DEVICES=0,1,2,3 bash kore_tool/evaluate_evoke/evoke.sh -c /path/to/checkpoint -o /path/to/output -q EVOKE/evoke_evaluation_data.jsonl

Evaluate Knowledge Retention Benchmark (MME, MMBench, POPE, and ScienceQA are based on the LLaVA framework itself)
bash kore_tool/evaluate_retention_benchmark/mmbench.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/mme.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/pope.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/sqa.sh -m /path/to/model/checkpoint

Other benchmarks are based on VLMEvalKit. Replace the ckpt path with the trained model here.
We thank the following open-source projects for making this work possible:
- LLaVA for the model training framework.
- CorDA and LoRA-Null for the constraint fine-tuning framework.
- EVOKE for the knowledge adaptation evaluation.
- VLMEvalKit for the knowledge retention evaluation.
- MCITlib and CoIN for the continual learning methods framework.
If you find our paper and code useful in your research, please consider giving a star ⭐ and citation 📝 :)
@article{jiang2025kore,
  title   = {KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints},
  author  = {Jiang, Kailin and Jiang, Hongbo and Jiang, Ning and Gao, Zhi and Bi, Jinhe and Ren, Yuchen and Li, Bin and Du, Yuntao and Liu, Lei and Li, Qing},
  journal = {arXiv preprint arXiv:2510.19316},
  year    = {2025},
  url     = {https://arxiv.org/abs/2510.19316}
}


