UniversalDependencies/UD_French-Rhapsodie

 
 

Summary

A Universal Dependencies corpus for spoken French.

Introduction

The corpus was converted automatically from the Rhapsodie treebank, with manual corrections. The treebank is maintained in the repository SUD_French-Rhapsodie, in the SUD annotation schema.

The SUD version is also available with prosodic annotation (see SUD README.md).

Structure

  • fr_rhapsodie-ud-train.conllu: 1,288 sentences, 19,144 tokens
  • fr_rhapsodie-ud-dev.conllu: 1,081 sentences, 12,907 tokens
  • fr_rhapsodie-ud-test.conllu: 840 sentences, 12,191 tokens
  • total: 3,209 sentences, 44,242 tokens
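
The counts above can be checked against the files with a few lines of Python. This is a minimal sketch, not the script used to produce the figures: the function name `count_conllu` and the sample sentences are illustrative, and it assumes that "tokens" means lines with a plain integer ID (multiword-token ranges such as `1-2` and empty nodes such as `1.1` are skipped).

```python
def count_conllu(lines):
    """Return (sentences, words) for an iterable of CoNLL-U lines."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment (sent_id, text, ...).
            continue
        token_id = line.split("\t", 1)[0]
        if token_id.isdigit():
            # Assumption: only plain integer IDs are counted.
            words += 1
            in_sentence = True
    if in_sentence:
        # File did not end with a blank line.
        sentences += 1
    return sentences, words


# Made-up sample, not taken from the corpus.
SAMPLE = (
    "# sent_id = demo-1\n"
    "# text = bonjour\n"
    "1\tbonjour\tbonjour\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# sent_id = demo-2\n"
    "# text = oui merci\n"
    "1\toui\toui\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tmerci\tmerci\tINTJ\t_\t_\t1\tparataxis\t_\t_\n"
    "\n"
)

print(count_conllu(SAMPLE.splitlines()))  # (2, 3)
```

To run it on a real file, pass an open file handle, for example `count_conllu(open("fr_rhapsodie-ud-train.conllu", encoding="utf-8"))`.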

Changelog

  • 2025-11-15 v2.17
    • More metadata
    • Sound alignment
  • 2024-05-15 v2.14
    • Fix a few inconsistent annotations of idioms
    • See SUD commit logs for more details
  • 2021-11-15 v2.9
    • Repository renamed from UD_French-Spoken to UD_French-Rhapsodie.
  • 2020-11-15 v2.7
    • Morphology added
  • 2018-04-15 v2.2
    • Initial release

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.2
License: CC BY-SA 4.0
Includes text: yes
Parallel: no
Genre: spoken
Lemmas: converted from manual
UPOS: converted with corrections
XPOS: not available
Features: not available
Relations: converted with corrections
Contributors: Gerdes, Kim; Kahane, Sylvain; Nakhlé, Mariam; Yan, Chunxiao; Etienne, Aline; Courtin, Marine
Contributing: here
Contact: [email protected]
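
The metadata block is meant to be read by tools. A minimal sketch of such a reader follows, assuming the block is stored with one `Key: value` pair per line after the `=== Machine-readable metadata` marker; the function name `parse_ud_metadata` is hypothetical and the sample reuses a few fields from this README.

```python
def parse_ud_metadata(readme_text):
    """Collect 'Key: value' pairs that follow the metadata marker line."""
    marker = "=== Machine-readable metadata"
    fields = {}
    seen_marker = False
    for line in readme_text.splitlines():
        if line.startswith(marker):
            seen_marker = True
            continue
        if not seen_marker or ":" not in line:
            # Ignore everything before the marker and any non key/value line.
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


# Subset of the fields listed in this README.
SAMPLE_README = """Summary

A Universal Dependencies corpus for spoken French.

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.2
License: CC BY-SA 4.0
Genre: spoken
XPOS: not available
"""

metadata = parse_ud_metadata(SAMPLE_README)
print(metadata["License"])  # CC BY-SA 4.0
```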
