Encyclopedic knowledge graphs, such as Wikidata, host an extensive repository of millions of knowledge statements. However, domain-specific knowledge from fields such as history, physics, or medicine is significantly underrepresented in those graphs. Although a few domain-specific knowledge graphs exist (e.g., PubMed for medicine), developing specialized retrieval applications for many domains still requires constructing knowledge graphs from scratch. To facilitate knowledge graph construction, we introduce WAKA: a Web application that allows domain experts to create knowledge graphs through the medium with which they are most familiar: natural language.
To use WAKA, you can either use the publicly available service or deploy WAKA locally on your machine.
The public service is available at https://waka.webis.de.
In addition to the knowledge graph authoring GUI, there is an API endpoint available to automatically construct knowledge graphs from text.
Endpoint: POST https://waka.webis.de/api/v1/kg
Request body:
{"content": "<your text>"}

Response body:
{
"text": "<your text>",
"triples": [
{"subject": "<ENTITY_OBJ>", "predicate": "<PROPERTY_OBJ>", "object": "<ENTITY_OBJ>"},
{"subject": "<ENTITY_OBJ>", "predicate": "<PROPERTY_OBJ>", "object": "<ENTITY_OBJ>"},
"..."
],
"entities": [
"<ENTITY_OBJ>",
"<ENTITY_OBJ>",
"..."
],
"entity_mentions": [
"<ENTITY_MENTION_OBJ>",
"<ENTITY_MENTION_OBJ>",
"..."
]
}

JSON object schemas:
{ // <ENTITY_OBJ>
"url": "http://www.wikidata.org/entity/...",
"label": "label in Wikidata",
"description": "description in Wikidata",
"score": 1.0,
"mentions": [
"<ENTITY_MENTION_OBJ>",
"..."
]
}

{ // <ENTITY_MENTION_OBJ>
"url": "http://www.wikidata.org/entity/...",
"label": "label in Wikidata",
"description": "description in Wikidata",
"start_idx": 0,
"end_idx": 20,
"text": "mention span content",
"score": 1.0,
"e_type": "NER Type"
}

{ // <PROPERTY_OBJ>
"url": "http://www.wikidata.org/prop/direct/...",
"label": "label in Wikidata",
"description": "description in Wikidata"
}

Example call with curl:
curl -X POST -H "Content-Type: application/json" -d "{\"content\": \"The Bauhaus-Universität Weimar is a university located in Weimar, Germany.\"}" https://waka.webis.de/api/v1/kg
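The same request can also be issued from Python. The following is a minimal sketch that uses only the standard library and assumes the request and response schemas described above. The helper names `build_request`, `construct_kg`, and `triple_labels` are illustrative and not part of WAKA, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust it for a local deployment.
API_URL = "https://waka.webis.de/api/v1/kg"


def build_request(text, url=API_URL):
    """Build a POST request whose JSON body is {"content": text}."""
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def construct_kg(text, url=API_URL, timeout=60):
    """Send the text to the service and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def triple_labels(response):
    """Reduce each triple to a (subject, predicate, object) tuple of Wikidata labels."""
    return [
        (t["subject"]["label"], t["predicate"]["label"], t["object"]["label"])
        for t in response.get("triples", [])
    ]
```

Passing the result of `construct_kg("...")` to `triple_labels` gives one label tuple per extracted triple. Which triples come back depends on the service.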
The local deployment of WAKA requires an NVIDIA GPU with at least 10 GB of VRAM and at least 20 GB of RAM.
Clone this repository and execute the following command (requires build-essential):
make clean install
make run

After starting the server, WAKA will be available at http://localhost:8000/static/index.html
A prebuilt Docker image of WAKA is available. To spawn a container from this image, execute the following command (requires nvidia-container-toolkit for GPU support):
docker run --gpus all -p 8000:8000 registry.webis.de/code-lib/public-images/waka:latest

After the container is done setting up, WAKA is available at http://localhost:8000/static/index.html
Building the Docker image yourself only becomes necessary if you make adjustments to the code. Execute the following command from the project directory to build a new image of WAKA:
docker build -t <my-name>:<version> .

Performance is measured on the test set of the RED^FM dataset (446 texts).
| Step | Task | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 |
|---|---|---|---|---|---|---|---|
| 1 | Entity Recognition | 0.0675 | 0.9162 | 0.1220 | 0.1544 | 0.9892 | 0.2671 |
| 2 | Entity Retrieval | 0.0021 | 0.8258 | 0.0042 | 0.0016 | 0.8340 | 0.0042 |
| 3 | Entity Reranking | 0.0110 | 0.7849 | 0.0212 | 0.0063 | 0.7907 | 0.0124 |
| 4 | Relation Extraction | 0.3033 | 0.7775 | 0.4069 | 0.5505 | 1.0000 | 0.7101 |
| 5 | Relation Linking | 0.3033 | 0.7775 | 0.4069 | 0.5505 | 1.0000 | 0.7101 |
| 6 | Knowledge Fusion | 0.1548 | 0.3028 | 0.1824 | 0.1425 | 0.3065 | 0.1946 |
| 7 | Natural Language Inference | 0.2057 | 0.3284 | 0.2270 | 0.1999 | 0.3325 | 0.2497 |
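The evaluation protocol behind the macro and micro columns is not spelled out here. The sketch below shows the conventional definitions, assuming each test text contributes a `(tp, fp, fn)` tuple of true positives, false positives, and false negatives. Micro scores pool the counts over all texts before computing precision, recall, and F1. Macro scores compute the three values per text and then average them. The function names are illustrative and not taken from the WAKA code base.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts; undefined ratios become 0.0."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro(counts):
    """Pool the counts of all texts, then compute one precision/recall/F1."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return prf(tp, fp, fn)


def macro(counts):
    """Compute precision/recall/F1 per text, then average each over the texts."""
    scores = [prf(*c) for c in counts]
    if not scores:
        return 0.0, 0.0, 0.0
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))
```

Under these definitions, micro F1 is the harmonic mean of micro precision and micro recall, which matches the Entity Recognition row (0.1544 and 0.9892 give 0.2671). Macro F1 is an average of per-text F1 values and so generally differs from the harmonic mean of the macro precision and recall columns.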
If you make use of WAKA's authoring GUI or the knowledge graph creation algorithm, please cite the following work.
@InProceedings{gohsen:2024a,
author = {Marcel Gohsen and Benno Stein},
booktitle = {9th ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR 2024)},
doi = {10.1145/3627508.3638340},
isbn = {979-8-4007-0434-5/24/03},
month = mar,
publisher = {ACM},
site = {Sheffield, United Kingdom},
title = {{Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language}},
year = 2024
}