Modern Talking: Key-Point Analysis using Modern Natural Language Processing
Participation in the Quantitative Summarization – Key Point Analysis Shared Task (data on GitHub).
First, install Python 3.9 or higher and then clone this repository. From inside the repository directory, create a virtual environment and activate it:
python3.9 -m venv venv/
source venv/bin/activate
Then, install the test dependencies:
pip install -e .
Run a pipeline to train and evaluate a matcher with respect to a given metric:
python -m modern_talking [MATCHER] [MATCHER_OPTIONS] [METRIC]
This automatically downloads all datasets, trains the matcher on the train set, and evaluates the metric for the predicted labels on the dev and test sets (test evaluation is skipped if the test labels are unknown).
Predicted labels are also saved to data/out/predictions-[MATCHER].json in JSON format as described in the shared task documentation.
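As an illustration of how the saved predictions might be consumed, the sketch below assumes the file maps each argument ID to a dictionary of key point IDs and match scores. This layout, the sample IDs, and the helper name `best_key_points` are assumptions made for illustration; the shared task documentation remains the authoritative description of the format.

```python
import json
from typing import Dict, Tuple

# Assumed layout: {"<argument id>": {"<key point id>": <score>, ...}, ...}
Predictions = Dict[str, Dict[str, float]]


def best_key_points(predictions: Predictions) -> Dict[str, Tuple[str, float]]:
    """Return the highest-scoring key point (and its score) per argument."""
    return {
        arg_id: max(scores.items(), key=lambda item: item[1])
        for arg_id, scores in predictions.items()
        if scores  # skip arguments without any candidate key point
    }


# Hypothetical sample data standing in for a real predictions file.
sample_json = (
    '{"arg_0_0": {"kp_0_0": 0.83, "kp_0_1": 0.12},'
    ' "arg_0_1": {"kp_0_1": 0.99}}'
)
sample = json.loads(sample_json)
best = best_key_points(sample)
print(best)
```

To inspect a real run, the sample string could be replaced by the contents of the saved predictions file, provided it follows the assumed layout.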
List available matchers with:
python -m modern_talking --help
List an individual matcher's options with:
python -m modern_talking [MATCHER] --help
Term overlap baseline:
python -m modern_talking term-overlap map
Term overlap baseline (with preprocessing):
python -m modern_talking term-overlap --stemming --stop-words --custom-stop-words --synonyms map
BERT classifier:
python -m modern_talking transformers --type bert --name bert-base-uncased map
Evaluate predicted matches in JSON format:
python modern_talking/evaluation/track_1_kp_matching.py data/ data/out/predictions-[METRIC]-[MATCHER].json
Replace data/out/predictions-[METRIC]-[MATCHER].json with the path to a file containing predicted matches in JSON format as described in the shared task documentation.
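For intuition about what a ranking metric such as `map` measures, here is a minimal sketch of mean average precision, assuming `map` refers to that metric. It is not the organizers' evaluation script, which may group, truncate, or label pairs differently; the function names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for binary labels ordered by descending score."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(groups: List[List[Tuple[float, int]]]) -> float:
    """Mean of per-group average precision over (score, label) pairs."""
    if not groups:
        return 0.0
    total = 0.0
    for pairs in groups:
        # Rank each group's pairs from highest to lowest predicted score.
        ranked = sorted(pairs, key=lambda pair: pair[0], reverse=True)
        total += average_precision([label for _, label in ranked])
    return total / len(groups)


# Hypothetical groups (e.g. one per topic) of (predicted score, gold label).
groups = [
    [(0.9, 1), (0.8, 0), (0.7, 1)],
    [(0.6, 0), (0.5, 1)],
]
result = mean_average_precision(groups)
print(round(result, 4))
```

This variant normalizes by the number of positive pairs that appear in the ranking, which is one common definition; other definitions divide by the total number of known positives.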
Run all unit tests:
pytest
This repository is licensed under the MIT License, except for the evaluation script from the shared task organizers, which is licensed under the Apache License 2.0.