Turkish Tourism is a domain-specific treebank consisting of 19,750 manually annotated sentences and 92,200 tokens. The sentences were taken from original customer reviews of a tourism company.
Turkish Tourism is the first domain-specific treebank of Turkish. It consists of 19,750 manually annotated sentences and 92,200 tokens. The corpus is made up of hotel/restaurant reviews from a booking company. The data is split in half between the test and training files.
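Universal Dependencies treebanks are distributed as CoNLL-U files, so the sentence and token counts above can be checked with a short script. The following is a minimal sketch, not an official tool: the function name `count_conllu` is hypothetical, the two sample sentences are made up for illustration (they are not taken from the treebank and carry no real annotation), and it assumes that "tokens" means syntactic words, so multiword-token ranges and empty nodes are skipped.

```python
import io


def count_conllu(stream):
    """Return (sentences, tokens) for a CoNLL-U text stream.

    Assumption: tokens are syntactic words, so ID ranges such as "1-2"
    (multiword tokens) and decimal IDs such as "1.1" (empty nodes) are skipped.
    """
    sentences = 0
    tokens = 0
    in_sentence = False
    for line in stream:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment lines such as "# text = ...".
            continue
        token_id = line.split("\t", 1)[0]
        if "-" in token_id or "." in token_id:
            continue
        tokens += 1
        in_sentence = True
    if in_sentence:
        # The file ended without a trailing blank line.
        sentences += 1
    return sentences, tokens


def row(token_id, form):
    """Build a 10-column CoNLL-U row with all annotation fields left as "_"."""
    return "\t".join([token_id, form] + ["_"] * 8)


# Illustrative input only: invented sentences, not data from the treebank.
SAMPLE = "\n".join([
    "# text = Otel çok güzel.",
    row("1", "Otel"), row("2", "çok"), row("3", "güzel"), row("4", "."),
    "",
    "# text = Yemekler lezzetli.",
    row("1", "Yemekler"), row("2", "lezzetli"), row("3", "."),
    "",
])

if __name__ == "__main__":
    print(count_conllu(io.StringIO(SAMPLE)))
```

To run it on the real data, pass an open file handle for a treebank file instead of the `io.StringIO` sample, for example `open(path, encoding="utf-8")`, where `path` points at whichever `.conllu` file you have downloaded.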
We wish to thank Starlang Software for funding and supporting this work.
- 2021-05-15 v2.8
- Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.8
License: CC BY-SA 4.0
Includes text: yes
Parallel: no
Genre: reviews
Lemmas: converted from manual
UPOS: converted from manual
XPOS: converted from manual
Features: converted from manual
Relations: converted from manual
Contributors: Kuzgun, Aslı; Cesur, Neslihan; Yıldız, Olcay Taner; Kuyrukçu, Oğuzhan; Marşan, Büşra; Arıcan, Bilge Nas; Kara, Neslihan; Aslan, Deniz Baran; Sanıyar, Ezgi; Asmazoğlu, Cengiz
Contributing: elsewhere
Contact: [email protected] / [email protected]
===============================================================================