Need for a common approach to modeling dataset series in DCAT-AP #155
Comments
I can confirm that the Czech National Open Data Catalog (https://data.gov.cz) will soon be implementing dataset series through |
Also, the article - DCAT-AP: How to model Dataset series? - has previously been published but the document is 4 years old and the status is unclear. |
@aidig thanks for the good overview. Let's work towards a clearer proposal |
There are several examples of an approach not mentioned in the above list, namely specifying the annual 'versions' as dcat:Distributions. For instance: This approach does not seem to take into consideration DCAT's note on the use of dcat:Distribution, namely that "all distributions of one dataset should broadly contain the same data." DCAT also states that "the distributions might have different levels of fidelity to the underlying data" and that the interpretation is 'application specific', but such use seems problematic, and advice and recommendations from the DCAT Application Profile are still required. It would be great if DCAT-AP's proposed guidelines on this topic could address this too. |
In addition, note that JoinUp uses dct:isVersionOf (in an ADMS-AP) to link solutions (e.g. a vocabulary modelled as a dcat:Dataset) to a release (e.g. a versioned vocabulary modelled as a dcat:Dataset). This is a different scenario from time series, but relevant for scoping the properties needed. See related issue: Modelling three-level structures with DCAT/ADMS #149 |
The DCAT Application Profile for Base Registries (bregDCAT-AP) has, as noted above, already decided to model relationships in which one dataset is contained in another (that is, a dataset is a subset of another) using dct:hasPart/dct:isPartOf, and states that any similar mechanism adopted in the future should be based on these Dublin Core terms. To ensure interoperability, please ensure close coordination and collaboration between DCAT-AP and bregDCAT-AP. Generally, there is a need for modelling a dataset that is part of another dataset, and one can only hope that the various profiles of DCAT take the same approach in modelling this relationship. |
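To make the dct:hasPart/dct:isPartOf pattern concrete, here is a minimal sketch in plain Python, with triples held as tuples and no RDF library assumed. The example.org IRIs and the helper names `series_triples` and `missing_inverses` are hypothetical and only serve as illustration; this is not taken from bregDCAT-AP or any other profile.

```python
# Dublin Core Terms namespace, expanded by hand (no RDF library assumed).
DCT = "http://purl.org/dc/terms/"
HAS_PART = DCT + "hasPart"
IS_PART_OF = DCT + "isPartOf"

# Hypothetical dataset IRIs, for illustration only.
SERIES = "https://example.org/dataset/budget"
MEMBERS = [
    "https://example.org/dataset/budget-2019",
    "https://example.org/dataset/budget-2020",
]


def series_triples(series, members):
    """Return both directions of the containment link for each member."""
    triples = set()
    for member in members:
        triples.add((series, HAS_PART, member))
        triples.add((member, IS_PART_OF, series))
    return triples


def missing_inverses(triples):
    """List containment statements whose inverse is absent from the set."""
    inverse = {HAS_PART: IS_PART_OF, IS_PART_OF: HAS_PART}
    return sorted(
        (o, inverse[p], s)
        for (s, p, o) in triples
        if p in inverse and (o, inverse[p], s) not in triples
    )


graph = series_triples(SERIES, MEMBERS)
```

Whether a profile requires both directions or only one is a design choice. A check like `missing_inverses` is one way a harvester could flag catalogs that publish only dct:hasPart or only dct:isPartOf.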
To complement your survey, @aidig , DCAT-AP_IT (the Italian profile of DCAT-AP) provides guidelines on the use of I include below the (automatic) English translation:
|
Many thanks for the info @andrea-perego! Much appreciated :-) How does the solution below align with the semantics of dcat:Distribution and the corresponding W3C note: "all distributions of one dataset should broadly contain the same data"? DCAT also states, though, that "the question of whether different representations can be understood to be distributions of the same dataset, or distributions of different datasets, is application specific."
|
I think it complies with it. The |
@jakubklimek will that be with direct use of the DC properties or by creation of more specific subproperties? |
@pebran It will be the direct use of |
So we now have two types of Datasets, those inSeries and normal Datasets. The Dataset member of a Dataset Series has only 6 properties. I think this should be better explained. |
Indeed, there are 2 types. Observe that the type InSeries Dataset is a subclass of a normal Dataset. Only the properties that require special attention for an InSeries Dataset are included for that class. So we rely on users understanding the notion of a subclass as: "all rules and constraints of the superclass apply to me". We could add an additional sentence to the class usage note, such as "This class is a subclass of Dataset and therefore all properties with their constraints apply to it. For readability purposes these are not copied to this class." Note that a similar general statement w.r.t. DCAT is mentioned in the last paragraph of https://semiceu.github.io/DCAT-AP/releases/3.0.0/#specoverview. |
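A minimal sketch of that reading of "subclass", in plain Python: the effective constraints of the in-series class are the superclass constraints plus the few properties listed for it. The property selection and cardinalities below are illustrative assumptions, not the values from the DCAT-AP specification, and `effective_constraints` is a hypothetical helper.

```python
# Illustrative constraints only: property -> (min, max) cardinality,
# where None means unbounded. NOT the cardinalities from the DCAT-AP spec.
DATASET_CONSTRAINTS = {
    "dct:title": (1, None),
    "dct:description": (1, None),
    "dcat:distribution": (0, None),
}

# Only the properties needing special attention are listed for the subclass.
IN_SERIES_OVERRIDES = {
    "dcat:inSeries": (1, None),
}


def effective_constraints(parent, overrides):
    """Inherit every parent constraint, then apply the subclass-specific ones."""
    merged = dict(parent)
    merged.update(overrides)
    return merged


IN_SERIES_CONSTRAINTS = effective_constraints(
    DATASET_CONSTRAINTS, IN_SERIES_OVERRIDES
)
```

Under these assumptions, a reader who looks only at the short property list for the in-series class sees one property, while the class effectively carries four.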
I think the subclass relationship is difficult because it's technically not a subclass. It uses the same URI as the "normal" Dataset. My suggestion would be to remove the subclass relationship and adjust the usage note to something like this: If a Dataset is used as part of a DatasetSeries, the properties listed here can be used additionally, or slightly differently to those listed for the Dataset outside of a DatasetSeries. |
The need for a common approach to modeling dataset series has already been identified as a significant outstanding issue in DCAT (w3c/dxwg#868), and it will hopefully be addressed in DCAT 3.
However, in the meantime, various national and domain-specific profiles of DCAT-AP 2.0 already propose structures to handle dataset series, even though neither DCAT nor DCAT-AP offers the necessary properties/classes or specific guidelines for this directly in the specification documents. Several approaches seem to favour dct:hasPart/dct:isPartOf, although other proposals have also emerged.
It would be beneficial if this issue could be prioritised in DCAT-AP future work.