Code for our EMNLP 2019 paper,
Core Semantic First: A Top-down Approach for AMR Parsing. [paper][bib]
Deng Cai and Wai Lam.
Two Python versions are required:

- `python2` == Python 2.7
- `python3` == Python 3.6

Run `sh setup.sh` to install dependencies.
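For example, you can confirm both interpreters resolve to the right versions before installing:

```sh
python2 --version   # should report Python 2.7.x
python3 --version   # should report Python 3.6.x
sh setup.sh         # installs dependencies
```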
In the directory `preprocessing`:
- Make a directory, for example `preprocessing/2017`. Put the files `train.txt`, `dev.txt`, `test.txt` in it. The format of these files should be the same as our example file `preprocessing/data/dev.txt` (a sample entry is sketched after this list).
- Run `sh go.sh` (you may make necessary changes in `convertingAMR.java`).
- Run `python2 preprocess.py`.
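For reference only, an entry in such files typically pairs a sentence with its AMR graph in PENMAN notation, as in the sketch below; `preprocessing/data/dev.txt` is the authoritative example of the exact format expected.

```
# ::id example_0001
# ::snt The boy wants to go.
(w / want-01
      :ARG0 (b / boy)
      :ARG1 (g / go-02
            :ARG0 b))
```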
- We use Stanford CoreNLP to extract NER, POS, and lemma information; see `go.sh` and `convertingAMR.java` for details.
- We already provide the alignment information used for concept prediction in `common/out_standford` (LDC2017T10, aligned by the aligner of Oneplus/tamr).
In the directory `parser`:

- Run `python3 extract.py && mv *vocab *table ../preprocessing/2017/.` to build vocabularies for the dataset in `../preprocessing/2017` (you may make necessary changes in `extract.py` and in the command line as well).
- Run `sh train.sh`. Be patient! Checkpoints will be saved in the directory `ckpt` by default (you may make necessary changes in `train.sh`).
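Putting the two steps together:

```sh
cd parser
# build vocabularies and move them next to the preprocessed data
python3 extract.py && mv *vocab *table ../preprocessing/2017/.
# train; checkpoints are written to ckpt/ by default
sh train.sh
```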
In the directory `parser`:

- Run `sh work.sh` (you should make necessary changes in `work.sh`).
- The most important argument is `--load_path`, which should be set to a specific checkpoint file, for example, `somewhere/some_ckpt`. The output file will be written to the same folder as the checkpoint file, for example, `somewhere/some_ckpt_test_out`.
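For example, with the placeholder checkpoint path above:

```sh
cd parser
# in work.sh, set the checkpoint to load, e.g.:
#   --load_path somewhere/some_ckpt
sh work.sh
# the parsed output appears next to the checkpoint
less somewhere/some_ckpt_test_out
```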
In the directory `amr-evaluation-tool-enhanced`:

Run `python2 smatch/smatch.py --help` to see all options.

A large portion of the code under this directory is borrowed from ChunchuanLv/amr-evaluation-tool-enhanced; we add the following options:
- `--weighted` whether to use weighted smatch or not
- `--levels LEVELS` how deep to evaluate; -1 indicates unlimited, i.e., the full graph
- `--max_size MAX_SIZE` only consider AMR graphs of size <= MAX_SIZE; -1 indicates no limit
- `--min_size MIN_SIZE` only consider AMR graphs of size >= MIN_SIZE; -1 indicates no limit

For example:
- To calculate the smatch-weighted metric in our paper:

  `python2 smatch/smatch.py --pr -f parsed_data golden_data --weighted`

- To calculate the smatch-core metric in our paper:

  `python2 smatch/smatch.py --pr -f parsed_data golden_data --levels 4`
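The size filters compose the same way; for instance, to restrict evaluation to graphs of size at most 30 (the threshold here is an arbitrary illustration):

```sh
python2 smatch/smatch.py --pr -f parsed_data golden_data --max_size 30
```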
We release our pretrained model on Google Drive.

To use the pretrained model, move the vocabulary files under `[Google Drive]/vocabs` to `preprocessing/2017/` and adjust `work.sh` accordingly (set `--load_path` to point to `[Google Drive]/model.ckpt`), as sketched below.
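A minimal sketch, assuming the Drive contents were downloaded into a local folder `pretrained/` (the folder name is hypothetical):

```sh
# put the released vocabularies where the parser expects them
mv pretrained/vocabs/* preprocessing/2017/
# then, in parser/work.sh, point the parser at the released checkpoint:
#   --load_path pretrained/model.ckpt
```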
We also provide the exact model output reported in our paper. The output file and the corresponding reference file are in the `legacy` folder.
If you find the code useful, please cite our paper.
@inproceedings{cai-lam-2019-core,
title = "Core Semantic First: A Top-down Approach for {AMR} Parsing",
author = "Cai, Deng and
Lam, Wai",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1393",
pages = "3790--3800",
}
For any questions, please drop an email to Deng Cai.