KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints
- Table of Contents
- 🤗KORE
- 🤗KORE-Augmentations
- 🛠️Requirements and Installation
- 💥Training
- 🤖Evaluation
- 🤝 Acknowledgments
- 📝 Citation
To address the challenge of balancing knowledge adaptation and retention, we propose KORE, a synergistic method of Knowledge-oRientEd augmentations and constraints.
Existing methods suffer from poor generalization. General data augmentation is often "superficial and discrete" (e.g., simple rephrasing or rotation), creating isolated data points. This approach fails to build a coherent knowledge structure and offers limited support for true "knowledge internalization".
Knowledge-oRientEd AUGMENTATION uses an automated pipeline to convert knowledge into a "profound and structured" format. It builds a comprehensive knowledge structure by generating "multi-round dialogue" data (the trunk) and "instruction task" data (the branches), such as VQA and image captioning. This process yields the KORE-74K dataset, enabling the model to achieve accurate adaptation and true "knowledge internalization" rather than mere "data memorization".
You can download the data from the 🤗 Huggingface Dataset. The expected file structure is:
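As an illustration of the trunk-and-branches structure, one augmented sample might look like the sketch below. All field names here are hypothetical, not the actual KORE-74K schema; see `KORE-74K-training_data.json` for the real format.

```python
# Hypothetical sketch of one knowledge-oriented augmented sample.
# Field names are illustrative only, NOT the real KORE-74K schema.
sample = {
    "knowledge_id": "entity_0001",
    # The "trunk": a multi-round dialogue that walks through the new
    # knowledge coherently instead of as isolated data points.
    "dialogue": [
        {"role": "user", "content": "Who is shown in <image>?"},
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "What is this person known for?"},
        {"role": "assistant", "content": "..."},
    ],
    # The "branches": instruction tasks derived from the same knowledge,
    # such as VQA and image captioning.
    "instruction_tasks": [
        {"task": "vqa", "question": "...", "answer": "..."},
        {"task": "image_caption", "caption": "..."},
    ],
}
```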
KORE-74K
|-- json/jsonl
| |-- KORE-74K-training_data.json
|-- imgs
| |-- imgs_of_recognition_caption_description.zip
| |-- imgs_of_vqa
| | |-- split_zip_part_00
| | |-- split_zip_part_01
| | |-- split_zip_part_02
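The `imgs_of_vqa` archive ships as split parts. Assuming the parts are a plain byte-split of a single zip (an assumption on our part; check the dataset card), they can be concatenated back with `cat split_zip_part_* > imgs_of_vqa.zip && unzip imgs_of_vqa.zip`, or in Python:

```python
import pathlib

def reassemble_split_zip(parts_dir: str, out_zip: str) -> None:
    """Concatenate split_zip_part_* files (in name order) into one zip.

    Assumes the parts are a byte-split of a single archive (e.g. made
    with `split`); if they were made with `zip -s`, use `zip -FF` to
    repair the multi-part archive instead.
    """
    parts = sorted(pathlib.Path(parts_dir).glob("split_zip_part_*"))
    with open(out_zip, "wb") as out:
        for part in parts:
            out.write(part.read_bytes())
```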
conda env create -f kore.yml
If you run into issues, refer to https://github.com/haotian-liu/LLaVA, or set up the environment manually:
conda create -n kore python=3.10 -y
cd env
pip install -r kore.txt
Step 1: extract covariance matrix and reconstruct weights
bash kore_tool/extract_covariance_matrix/step1_benchmark.sh -d "MME MMBench_DEV_EN" -n 128 -r 235 -s 233
The choices for `-d` correspond to `DATASET_CONFIG` in `benchmark_load.py`, e.g. MME, HallusionBench, MathVision, etc.
bash kore_tool/extract_covariance_matrix/step1_onevision_data.sh -d "onevision" -n 64 -r 235 -s 233

The OneVision dataset used here can be downloaded from 🤗 LLaVA-OneVision-Data.
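Conceptually, Step 1 feeds calibration data (benchmark or OneVision samples) through the model, accumulates the input covariance of each linear layer, and decomposes the weights against that statistic. A minimal numpy sketch in the spirit of CorDA follows; it is not the repo's implementation, and the function names are ours.

```python
import numpy as np

def accumulate_input_covariance(calibration_batches, in_features):
    """Accumulate C = sum_i x_i^T x_i over all calibration inputs."""
    cov = np.zeros((in_features, in_features))
    for x in calibration_batches:        # each x: (tokens, in_features)
        cov += x.T @ x
    return cov

def context_oriented_svd(W, cov):
    """Decompose W against the covariance: SVD of W @ C, so the leading
    singular directions reflect how the layer is actually used on data."""
    U, S, Vt = np.linalg.svd(W @ cov, full_matrices=False)
    return U, S, Vt
```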
Step 2: training
bash kore_tool/training/training_kore.sh --data_path KORE-74K-training_data.json --output_dir train_ckpt/kore_epoch1 --num_train_epochs 1 --swanlab_project "kore" --swanlab_experiment_name "epoch1"
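Training uses a LoRA-style adapter whose `A` matrix may or may not be frozen (the `--lora_null_v1`/`--lora_null_v2` options). A toy numpy sketch of that distinction, illustrative only and not the repo's LoRA-Null implementation:

```python
import numpy as np

class LoRAAdapter:
    """Toy LoRA adapter. freeze_A=False is in the spirit of --lora_null_v1
    (A stays trainable); freeze_A=True of --lora_null_v2 (only B updates)."""

    def __init__(self, W, rank, freeze_A=False, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                 # frozen pretrained weight
        self.A = 0.01 * rng.standard_normal((rank, d_in))
        self.B = np.zeros((d_out, rank))           # B = 0 => no drift at init
        self.freeze_A = freeze_A

    def forward(self, x):
        return x @ (self.W + self.B @ self.A).T

    def trainable(self):
        return ["B"] if self.freeze_A else ["A", "B"]
```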
--lora_null_v1 True does not freeze the 'A' matrix, whereas --lora_null_v2 True does.

Step 3: merge
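Merging folds the low-rank update back into the base weights so the checkpoint needs no adapter at inference. Per weight matrix this is conceptually the sketch below (`merge_llava.py` does this across the whole checkpoint; `scaling` is the usual LoRA alpha/rank factor):

```python
import numpy as np

def merge_lora_weight(W, A, B, scaling=1.0):
    """Return the merged weight W' = W + scaling * (B @ A)."""
    return W + scaling * (B @ A)
```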
python kore_tool/merge/merge_llava.py --model_id training_model --save_model True --save_path merge_model

Evaluate EVOKE
CUDA_VISIBLE_DEVICES=0,1,2,3 bash kore_tool/evaluate_evoke/evoke.sh -c /path/to/checkpoint -o /path/to/output -q EVOKE/evoke_evaluation_data.jsonl

Evaluate Knowledge Retention Benchmark (MME, MMBench, POPE, and ScienceQA are based on the LLaVA framework itself)
bash kore_tool/evaluate_retention_benchmark/mmbench.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/mme.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/pope.sh -m /path/to/model/checkpoint
bash kore_tool/evaluate_retention_benchmark/sqa.sh -m /path/to/model/checkpoint

Other benchmarks are based on VLMEvalKit. Replace the ckpt path with the trained model here.
We thank the following open-source projects for making this work possible:
- LLaVA for the model training framework.
- CorDA and LoRA-Null for the constraint fine-tuning framework.
- EVOKE for the knowledge adaptation evaluation.
- VLMEvalKit for the knowledge retention evaluation.
- MCITlib and CoIN for the continual learning methods framework.
If you find our paper and code useful in your research, please consider giving a star ⭐ and citation 📝 :)
@article{jiang2025kore,
  title   = {KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints},
  author  = {Jiang, Kailin and Jiang, Hongbo and Jiang, Ning and Gao, Zhi and Bi, Jinhe and Ren, Yuchen and Li, Bin and Du, Yuntao and Liu, Lei and Li, Qing},
  journal = {arXiv preprint arXiv:2510.19316},
  year    = {2025},
  url     = {https://arxiv.org/abs/2510.19316}
}


