Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of language-specific representations. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed to enhance cross-lingual open information extraction by injecting language-specific knowledge into the shared model. Specifically, the cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., the embedding matrix) with the encoder fixed, and the remaining components are optimized in the second stage. After sufficient training, we freeze the pre-trained model and tune multiple extra low-rank language-specific modules using mixture-of-LoRAs for model-based cross-lingual transfer. In addition, we leverage two-stage prompting to encourage a large language model (LLM) to annotate multilingual raw data for data-based cross-lingual transfer. The model is trained with multilingual objectives on our proposed dataset OpenIE4++ by combining the model-based and data-based transfer techniques. Experimental results on various benchmarks emphasize the importance of aggregating multiple plug-and-play language-specific modules and demonstrate the effectiveness of MT4CrossOIE in cross-lingual OIE.
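To make the third stage concrete, below is a minimal PyTorch sketch of what a mixture-of-LoRAs linear layer might look like: the pre-trained weight is frozen, and several low-rank language-specific adapters are combined by a learned router. This is an illustrative assumption rather than the paper's released implementation; the class name `MoLoRALinear` and the hyperparameters `num_experts` and `rank` are made up for the example.

```python
import torch
import torch.nn as nn


class MoLoRALinear(nn.Module):
    """A frozen base linear layer augmented with several low-rank
    language-specific adapters, mixed by a softmax router.
    Hypothetical sketch of a mixture-of-LoRAs layer, not the official code."""

    def __init__(self, base: nn.Linear, num_experts: int = 4, rank: int = 8):
        super().__init__()
        self.base = base
        # Stage 3: the pre-trained weights stay frozen; only adapters train.
        for p in self.base.parameters():
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        # One (A, B) low-rank pair per language-specific expert.
        # B starts at zero so the layer initially matches the frozen base.
        self.A = nn.Parameter(torch.randn(num_experts, in_f, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, out_f))
        self.router = nn.Linear(in_f, num_experts)  # token-level gate over experts

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = torch.softmax(self.router(x), dim=-1)          # (..., E)
        # Low-rank update per expert: x @ A_e @ B_e.
        delta = torch.einsum("...d,edr,ero->...eo", x, self.A, self.B)
        lora_out = (gate.unsqueeze(-1) * delta).sum(dim=-2)   # (..., out_f)
        return self.base(x) + lora_out
```

Under this sketch, wrapping a backbone projection as `MoLoRALinear(layer)` leaves the shared model untouched while training only the per-language adapters and the router that aggregates them.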
Please refer to the LICENSE file for more details.
- arXiv: https://arxiv.org/abs/2308.06552
```bibtex
@article{wang2023mt4crossoie,
  title   = {MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction},
  author  = {Wang, Zixiang and Chai, Linzheng and Yang, Jian and Bai, Jiaqi and Yin, Yuwei and
             Liu, Jiaheng and Guo, Hongcheng and Li, Tongliang and Yang, Liqun and
             Hebboul, Zine el-abidine and Li, Zhoujun},
  journal = {arXiv preprint arXiv:2308.06552},
  year    = {2023},
}
```
