Handling of language tags in KGCL #60
Description
Currently handling of language tags is under-specified in KGCL, both in terms of
Recall also that most OBO ontologies use a mixture of untyped literals, xsd:string, and @en to denote English-language labels.
As a general principle, the KGCL DSL is intended to be user-friendly. The user shouldn't need detailed knowledge of how each ontology is implemented; in fact, these details are very hard to find out. As a case in point, for the following two terms in ENVO it's impossible to know from OLS that the first uses an explicit @en and the second does not:
- https://www.ebi.ac.uk/ols4/ontologies/envo/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FENVO_1000745
- https://www.ebi.ac.uk/ols4/ontologies/envo/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FENVO_2000038
At the most recent OMO meeting there was heated discussion about whether we should expect cardinality=1 of rdfs:label given that some ontologies may want to be international. It's not up to KGCL to adjudicate here. However, we can make things easy for users:
- matching should be liberal; if a language tag is not specified this should not be interpreted as "must match untyped literal", it should instead be interpreted as "match this at the string level"
- application should be configurable at the ontology level
- if the user does not specify a language tag, and the ontology is configured to always use language tags, then the configured default language should be applied
- if the user does specify a language tag then this should be used (it is up to the ontology to configure GitHub Actions to reject any or all language tags if their policy is to always use untyped literals)
This does place more of a burden on implementors, as there needs to be some configuration mechanism, but having this default to untyped literals will work for pretty much all OBO ontologies for now.
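
To make the proposal concrete, here is a minimal sketch of the matching and application rules in Python. It is illustrative only and is not KGCL's actual implementation: `Literal`, `LanguagePolicy`, `matches` and `to_write` are hypothetical names. It also assumes that when the user does give a language tag, matching requires that tag to be present on the existing literal; the rules above leave that case open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """A simplified RDF literal (hypothetical helper type)."""
    value: str
    lang: Optional[str] = None      # e.g. "en"
    datatype: Optional[str] = None  # e.g. "xsd:string"


@dataclass(frozen=True)
class LanguagePolicy:
    """Hypothetical per-ontology configuration.

    default_lang=None means the ontology writes untyped literals,
    which is the proposed default.
    """
    default_lang: Optional[str] = None


def matches(existing: Literal, text: str, lang: Optional[str] = None) -> bool:
    """Liberal matching: with no tag given, compare at the string level only."""
    if existing.value != text:
        return False
    if lang is None:
        # Untyped, xsd:string and language-tagged literals all match.
        return True
    # Assumption: an explicit tag in the change must match the existing tag.
    return (existing.lang or "").lower() == lang.lower()


def to_write(text: str, lang: Optional[str], policy: LanguagePolicy) -> Literal:
    """Decide which literal to write when applying a change."""
    if lang is not None:
        # An explicit user tag is always used as given.
        return Literal(text, lang=lang)
    # Otherwise fall back to the ontology's configured default (possibly None).
    return Literal(text, lang=policy.default_lang)
```

With this sketch, a change that names the label `forest` without a tag would match `"forest"`, `"forest"^^xsd:string` and `"forest"@en` alike, while a new label would be written untyped unless the ontology's policy sets a default language.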