This repository contains the code and evaluation scripts for matching the data of 59 argument corpora with three topic ontologies. The data used in this experiment can be found here.
The units and topics of the 59 argument corpora were cleaned and organized with an individual script for each corpus in preprocessing.
The topics of 39 argument corpora were matched manually with the topics of the three topic ontologies using the judgment interface. Candidate topic matches were generated using BM25 (Whoosh) in manual topic_matching.
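To illustrate how BM25 can propose candidate topic matches, here is a minimal pure-Python sketch. It is not the repository's Whoosh-based implementation: the function name `bm25_rank`, the whitespace tokenizer, the parameter values, and the example topics are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; Whoosh uses its own analyzers).
    return text.lower().split()


def bm25_rank(query, documents, k1=1.2, b=0.75, top_n=5):
    """Rank documents (dict: id -> text) against a query with Okapi BM25.

    Returns up to top_n (id, score) pairs, best first. Documents that share
    no term with the query are left out.
    """
    if not documents:
        return []
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical ontology topics with short descriptions.
ontology_topics = {
    "abortion": "abortion pregnancy rights",
    "gun control": "gun control firearms laws regulation",
    "climate change": "climate change global warming emissions",
}
# A hypothetical corpus topic used as the query.
candidates = bm25_rank("Should gun laws be stricter", ontology_topics)
```

In this sketch, each corpus topic acts as a query and each ontology topic as a document, so the top-ranked ontology topics become the candidates shown to the annotator.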
Automatic matching of units with the topic ontologies was implemented using two approaches: semantic indexing (explicit semantic analysis) and BERT (and other transformers). These approaches were evaluated in a depth-5 pooling setup, which was carried out using the judgement interface. The implementation of the two approaches can be found here and a baseline can be found here.
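Depth-5 pooling means that, for each unit, only the top five suggestions of every approach are collected and their union is sent for judgment. The following is a minimal sketch of that idea, not the repository's code: the function name `pool_suggestions`, the data layout, and the unit and topic identifiers are assumptions.

```python
def pool_suggestions(rankings, depth=5):
    """Build a judgment pool from several ranked suggestion lists.

    rankings: dict mapping approach name -> dict mapping unit id -> list of
              topic ids, ordered from best to worst suggestion.
    Returns a dict mapping unit id -> set of topic ids to be judged.
    """
    pool = {}
    for per_unit in rankings.values():
        for unit_id, ranked_topics in per_unit.items():
            # Keep only the top `depth` suggestions of this approach.
            pool.setdefault(unit_id, set()).update(ranked_topics[:depth])
    return pool


# Hypothetical rankings from two approaches for a single unit.
rankings = {
    "esa": {"u1": ["t1", "t2", "t3", "t4", "t5", "t6"]},
    "bert": {"u1": ["t2", "t7", "t1", "t8", "t9", "t10"]},
}
pool = pool_suggestions(rankings, depth=5)
```

Because suggestions shared by several approaches are judged only once, the pool for a unit holds at most five topics per approach and usually fewer.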
To evaluate the suggestions of the three approaches, the following script can be used:
automatic topic matching evaluation
The judgement interface was used for both the manual matching of units with the three topic ontologies and the evaluation of the automatic approaches.
Topic Ontologies for Arguments has been published in Findings of EACL 2023 and can be cited as follows:
@InProceedings{ajjour:2023,
author = {Yamen Ajjour and Johannes Kiesel and Benno Stein and Martin Potthast},
booktitle = {17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)},
month = may,
numpages = 17,
publisher = {Association for Computational Linguistics},
site = {Dubrovnik, Croatia},
title = {{Topic Ontologies for Arguments}},
todo = {dataurl, doi, editor, url, pages},
year = 2023
}