UniversalDependencies/UD_Japanese-Modern

# Summary

This Universal Dependencies (UD) Japanese treebank follows the UD Japanese
conventions described in the UD documentation.
The original sentences are from the 'Corpus of Historical Japanese' (CHJ).

# Introduction

The Japanese UD treebank contains sentences from the CHJ Meiji Era / Taishō Era
Series I: Magazines - Meiroku Zasshi samples [1]
(http://pj.ninjal.ac.jp/corpus_center/chj/meiji_taisho-en.html),
with annotation [5] compatible with BCCWJ-DepPara [2].

We prepared conversion rules from BCCWJ-DepPara to the UD_Japanese v2.1 guidelines [3][4].

## Splitting

All data in UD_Japanese-Modern is test data; there are no training or development splits.

test: all
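
Because the treebank ships as a single test split, loading it amounts to reading one file. The following is a minimal sketch, assuming the data is in the standard CoNLL-U format used by UD treebanks and that the file is named `ja_modern-ud-test.conllu` (the usual UD naming pattern; this README does not state the file name). The helper `read_conllu` and the sample sentence are illustrative only and are not taken from the treebank.

```python
from typing import Dict, Iterable, Iterator, List

# The ten CoNLL-U columns, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts (hypothetical helper)."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# sent_id = ..." are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        # Handle a final sentence with no trailing blank line.
        yield sentence


# Made-up sample in CoNLL-U layout; NOT an actual sentence from this treebank.
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = 本を読む\n"
    "1\t本\t本\tNOUN\t_\t_\t3\tobj\t_\t_\n"
    "2\tを\tを\tADP\t_\t_\t1\tcase\t_\t_\n"
    "3\t読む\t読む\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE.splitlines()))
print(len(sentences), "sentence(s),", sum(len(s) for s in sentences), "token(s)")

# To read the real data instead (file name is an assumption):
# with open("ja_modern-ud-test.conllu", encoding="utf-8") as f:
#     sentences = list(read_conllu(f))
```

Since every sentence belongs to the test split, this data is suited to evaluating parsers trained on other treebanks rather than to training.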

## Citation

You are encouraged to cite the following paper when you refer to the
Universal Dependencies Japanese Treebank.

Omura, M., Takahashi, Y., & Asahara, M. (2017).
Universal Dependency for Japanese Modern.
In JADH-2017.

Asahara, M., Kanayama, H., Tanaka, T., Miyao, Y., Uematsu, S., Mori, S.,
Matsumoto, Y., Omura, M., & Murawaki, Y. (2018).
Universal Dependencies Version 2 for Japanese.
In LREC-2018.

# Acknowledgments

This work was supported by JSPS KAKENHI Grant Numbers JP15K12888 and
JP17H00917 and is a project of the Center for Corpus Development, NINJAL.

The original treebank was provided by:

- National Institute for Japanese Language and Linguistics, Japan

The corpus was converted by:

- Mai Omura
- Masayuki Asahara

through discussion and validation with

- Yuta Takahashi

# License

See file LICENSE.txt

# References

[1] National Institute for Japanese Language and Linguistics, Center for Corpus Development (Kondō, Asuko; Mabuchi, Yōko; Hattori, Noriko, et al.) (eds.) (2017) Corpus of Historical Japanese, Meiji Era / Taishō Era Series I: Magazines (Short Unit Word Data Version 1.1) http://pj.ninjal.ac.jp/corpus_center/chj/meiji_taisho.html (accessed March 27, 2018)
[2] Asahara, M., & Matsumoto, Y. (2016). BCCWJ-DepPara: A syntactic annotation treebank on the ‘Balanced Corpus of Contemporary Written Japanese’. In Proceedings of the 12th Workshop on Asian Language Resources (ALR12) (pp. 49-58).
[3] Tanaka, T., Miyao, Y., Asahara, M., Uematsu, S., Kanayama, H., Mori, S., &
Matsumoto, Y. (2016). Universal Dependencies for Japanese. In LREC-2016.
[4] Asahara, M., Kanayama, H., Tanaka, T., Miyao, Y., Uematsu, S., Mori, S.,
Matsumoto, Y., Omura, M., & Murawaki, Y. (2018). Universal Dependencies Version 2 for Japanese. In LREC-2018.
[5] Omura, M., Takahashi, Y., & Asahara, M. (2017). Universal Dependency for Japanese Modern. In JADH-2017.

# Changelog

2018-11-01   v2.4
  * Update v2.3 to v2.4
2018-11-01   v2.3
  * Update v2.2 to v2.3
2018-03-28   v2.2
  * Initial release in Universal Dependencies.

=== Machine-readable metadata =================================================
Data available since: UD v2.2
License: CC BY-NC-ND 3.0
Includes text: yes
Genre: nonfiction
Lemmas: converted from manual
UPOS: converted from manual
XPOS: manual native
Features: not available
Relations: converted from manual
Contributors: Omura, Mai; Asahara, Masayuki; Takahashi, Yuta
Contributing: elsewhere
Contact: [email protected]
===============================================================================
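
The metadata block above is a set of `Key: value` lines enclosed by two delimiter lines that start with `===`. As a rough sketch of how such a block could be read programmatically, the hypothetical helper `parse_metadata` below collects those pairs into a dictionary. It is an illustration under that layout assumption, not the parser used by the UD infrastructure.

```python
from typing import Dict


def parse_metadata(readme_text: str) -> Dict[str, str]:
    """Collect 'Key: value' pairs from the machine-readable metadata block."""
    meta: Dict[str, str] = {}
    inside = False
    for line in readme_text.splitlines():
        if line.startswith("==="):
            if inside:
                # The second delimiter line closes the block.
                break
            # Only the delimiter that names the block opens it.
            inside = "Machine-readable metadata" in line
            continue
        if inside and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


# Excerpt of the block from this README, used as sample input.
SAMPLE_README = """\
# License

See file LICENSE.txt

=== Machine-readable metadata ===
Data available since: UD v2.2
License: CC BY-NC-ND 3.0
Genre: nonfiction
Features: not available
=================================
"""

meta = parse_metadata(SAMPLE_README)
print(meta["License"])
```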
