The Pashto-Sikaram treebank is a native UD treebank with manually annotated texts from various sources.
The treebank contains manual annotations of 40 sentences created natively in UD. This includes:
- 20 Cairo CICLing sentences with interesting syntactic constructions translated from English
- 20 original Pashto sentences from the book "Pashto and the Need for Translation" (Salih Mohammad Salih)

In the future, the treebank will be populated with more sentences from the book and, hopefully, news articles.
Apart from the manual native annotation of lemmas, universal part-of-speech tags and dependency relations, the Pashto-Sikaram treebank contains transliterations of forms and lemmas into the Latin alphabet, as well as English translations and glosses.
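As a rough illustration of how these annotation layers could be read, here is a minimal Python sketch that parses CoNLL-U text with the standard library only. It assumes the treebank follows the common UD conventions of storing transliterations and glosses in the MISC column as `Translit`, `LTranslit` and `Gloss`, and the English translation in a `# text_en` sentence comment; this README does not confirm those attribute names. The sample sentence uses invented placeholder values (`w1`, `t1`, ...) and is not taken from the treebank.

```python
def parse_misc(misc):
    """Turn a MISC string such as 'Translit=t1|Gloss=g1' into a dict."""
    if misc == "_":
        return {}
    return dict(item.split("=", 1) for item in misc.split("|") if "=" in item)


def read_conllu(lines):
    """Yield (comments, tokens) for each sentence in an iterable of CoNLL-U lines."""
    comments, tokens = {}, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line ends the current sentence.
            if tokens:
                yield comments, tokens
            comments, tokens = {}, []
        elif line.startswith("#"):
            # Sentence-level comment of the form '# key = value'.
            key, sep, value = line[1:].partition("=")
            if sep:
                comments[key.strip()] = value.strip()
        else:
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue  # skip multiword-token ranges and empty nodes
            tokens.append({
                "id": int(cols[0]),
                "form": cols[1],
                "lemma": cols[2],
                "upos": cols[3],
                "head": int(cols[6]),
                "deprel": cols[7],
                "misc": parse_misc(cols[9]),
            })
    if tokens:
        yield comments, tokens


# Placeholder data only; the MISC attribute names are assumptions (see above).
SAMPLE = "\n".join([
    "# sent_id = demo-1",
    "# text_en = placeholder translation",
    "\t".join(["1", "w1", "l1", "NOUN", "_", "_", "2", "nsubj", "_",
               "Translit=t1|LTranslit=lt1|Gloss=g1"]),
    "\t".join(["2", "w2", "l2", "VERB", "_", "_", "0", "root", "_",
               "Translit=t2|LTranslit=lt2|Gloss=g2"]),
    "",
])

sentences = list(read_conllu(SAMPLE.splitlines()))

for comments, tokens in sentences:
    print(comments.get("text_en"))
    for tok in tokens:
        print(tok["form"], tok["upos"], tok["deprel"], tok["misc"].get("Translit"))
```

To read an actual treebank file, the same function can be applied to an open file handle, for example `read_conllu(open(path, encoding="utf-8"))`, where `path` points to one of the treebank's `.conllu` files.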
We thank Salih Mohammad Salih and Asmatullah Sarwan for providing the texts for the treebank and language consultations. We also thank Shah Wali Faryad for helping with the manual annotation.
- 2024-05-15 v2.14
- Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.15
License: CC BY-SA 4.0
Includes text: yes
Parallel: cairo
Genre: grammar-examples nonfiction
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: manual native
Relations: manual native
Contributors: Faryad, Ján; Zeman, Daniel
Contributing: here
Contact: [email protected]
===============================================================================