UniversalDependencies/UD_Azerbaijani-TueCL

Summary

This is a small treebank of grammatical examples for Azerbaijani. The treebank tries to be neutral about the particular variety (North or South Azerbaijani) and hence uses the ISO code for the macrolanguage (az).

Introduction

The Azerbaijani-TueCL treebank consists of 148 sentences: 20 sentences from the Cairo corpus, 9 additional versions of Cairo sentences that we added to account for pronominal subject omission, and 119 sentences provided by the UD Turkic Group. This treebank is a part of the UD Turkic Treebank. Note that the original grammatical examples are in Turkish, and some of their constructions and translations do not exist in Azerbaijani. As a result, the treebank contains the following duplicate sentences:

  • Sentence IDs 14 and 15
  • Sentence IDs 35, 36, and 37
  • Sentence IDs 54 and 55
  • Sentence IDs 74 and 75
  • Sentence IDs 78 and 79
  • Sentence IDs 80 and 81

Currently, Azerbaijani is written in three different alphabets: the Perso-Arabic alphabet in the South, and the Cyrillic and Latin alphabets in the North. The current treebank includes only sentences written in the Latin script; we plan to add sentences in the Perso-Arabic alphabet in the future. English translations of all sentences are also available.
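
The duplicate sentences listed above can be located programmatically. The following is a minimal sketch, assuming the treebank is distributed as a CoNLL-U file in which each sentence carries `# sent_id = ...` and `# text = ...` comment lines, as is usual for UD treebanks. The function names and the toy sentences are illustrative only; they are not taken from the treebank.

```python
from collections import defaultdict


def read_sentence_metadata(conllu_text):
    """Return one dict of '# key = value' comments per sentence block.

    Token lines are ignored; only comment metadata is collected, so a
    block without any comment lines is skipped.
    """
    sentences = []
    current = {}
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence block.
            if current:
                sentences.append(current)
                current = {}
        elif line.startswith("#") and "=" in line:
            key, _, value = line[1:].partition("=")
            current[key.strip()] = value.strip()
    if current:
        sentences.append(current)
    return sentences


def find_duplicate_texts(sentences):
    """Map each surface text that occurs more than once to its sentence IDs."""
    groups = defaultdict(list)
    for meta in sentences:
        groups[meta.get("text")].append(meta.get("sent_id"))
    return {text: ids for text, ids in groups.items() if len(ids) > 1}


# Toy input (hypothetical IDs and sentences, NOT actual treebank data).
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "# text = Mən gəldim.",
    "1\tMən\tmən\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "",
    "# sent_id = toy-2",
    "# text = Gəldim.",
    "1\tGəldim\tgəl\tVERB\t_\t_\t0\troot\t_\t_",
    "",
    "# sent_id = toy-3",
    "# text = Mən gəldim.",
    "1\tMən\tmən\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "",
])

if __name__ == "__main__":
    for text, ids in find_duplicate_texts(read_sentence_metadata(SAMPLE)).items():
        print(text, ids)
```

To run this on the real data, read the treebank's CoNLL-U file into a string (with UTF-8 encoding) and pass it to `read_sentence_metadata` in place of `SAMPLE`. Other comment keys, such as the `parallel_id` metadata mentioned in the changelog, end up in the same per-sentence dictionaries.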

Acknowledgments

We are deeply thankful to the UD Turkic Group for their weekly informative meetings and discussions and for all the support we have received.

References

  • (citation)

Changelog

  • 2025-09-04 v2.16
    • Add parallel corpus information to machine-readable metadata.
    • Add parallel data support with parallel_id metadata for cross-lingual sentence matching.
  • 2024-05-15 v2.14
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.14
License: CC BY-SA 4.0
Includes text: yes
Parallel: cairo tuecl
Genre: grammar-examples
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: not available
Relations: manual native
Contributors: Eslami, Soudabeh; Çöltekin, Çağrı
Contributing: here
Contact: [email protected], [email protected]
===============================================================================