Generalize dcat:byteSize to dcat:size #313
Comments
I agree that a discussion on this topic is merited. But I am not sure an additional property is warranted. Having too many ways to do the same thing has a cost. While a more flexible property makes things easier for the data provider, it creates more work for the consumer. I believe |
In fact, DCAT 2014 originally had a property
I found some of the old discussions here: |
Right. The additional point in that post
addresses the lurking issue ('what if I only want to indicate the size in round numbers?')
Perhaps this could be clarified with an example or two? |
@agbeltran It seems to me that |
Thanks @makxdekkers - according to the discussion, I think we would not un-deprecate the property, but keep |
Some examples of the use of |
@agbeltran Are you now proposing to drop the idea of adding a more general 'size' property, and just revise the axiom (datatype) of |
Yes, we opened the issue to investigate if the property I wonder, though, whether the current representation is too cumbersome (or has limitations) for representing dataset distributions that are actually terabytes of data (e.g. multi-dimensional microscopy images can weigh up to several TB each, and datasets can be hundreds of TB in total). |
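As a rough check on the terabyte concern, here is a minimal sketch. It assumes decimal (SI) units, i.e. 1 TB = 10^12 bytes, and uses illustrative figures of 3 TB for one image and 500 TB for a dataset. Under those assumptions the byte counts are long to read but stay well within a signed 64-bit integer.

```python
# Sketch: how large do byte counts get for TB-scale distributions?
# Assumption: decimal (SI) units, 1 TB = 10**12 bytes.
TB = 10**12

# Largest signed 64-bit integer, used here as a common implementation limit.
INT64_MAX = 2**63 - 1


def bytes_for_terabytes(terabytes: int) -> int:
    """Return the exact byte count for a size given in whole terabytes."""
    return terabytes * TB


# The figures below are illustrative, not taken from a real dataset.
for label, tb in [("single image", 3), ("whole dataset", 500)]:
    size = bytes_for_terabytes(tb)
    print(f"{label}: {size} bytes, {len(str(size))} digits, "
          f"fits in int64: {size <= INT64_MAX}")
```

So the limitation, if any, seems to be readability rather than range.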
About changing the datatype, I agree that we should be careful about current implementations. Maybe we can continue that discussion in the specific issue #125 |
I just saw the proposal from @riccardoAlbertoni in today's call https://www.w3.org/2018/08/30-dxwgdcat-minutes#x10 to create a new class for size with a number and a unit of measurement. @agbeltran then said that the object would be assigned an IRI. |
Thanks @makxdekkers - I totally agree with your view. My comment on the call was pointing out that I don't think creating a size object is useful, as it would require assigning an IRI to such an object, which is not really reusable and carries the maintenance costs that you referred to. |
@makxdekkers I am not sure about what is realistic and what is not. My comment in today's call was more a reaction to an emerging proposal to have distinct size properties for every possible unit of measure, which sounds to me like bad modelling, and dangerous in a longer-term perspective. If the rationale behind this discussion is to make users more comfortable in expressing and reading the size, we have to consider that the names for multiples of bytes will evolve and that the scale to use might be application dependent: if we add the property TerabyteSize, sooner or later we might need to add exabyteSize ... etc. I am not against the use of a blank node for an n-ary relation in this specific case, if there is such a dire need to express the size in different units of measure. However, I tend to agree with you: if we do not want to have blank nodes, and no solutions other than adding new properties with hard-coded scale/size are on the table, we should replicate the simple approach from VOID, which probably means living with byteSize. |
@riccardoAlbertoni The issue of granularity/scale -- whether the size is expressed in bytes, kilobytes, megabytes etc -- is really a case of trying to be helpful to people at the expense of efficiency of data. Creating a complex mechanism with an additional class to reduce the number of digits, e.g. from "1000000000" (bytes) to "1" (gigabyte), will actually increase the number of bytes on the wire. |
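To make the "bytes on the wire" point concrete, here is a small sketch comparing two Turtle-style statements for the same size (10^9 bytes, i.e. one gigabyte in SI units). The names `ex:size`, `ex:value`, `ex:unit` and `ex:Gigabyte` are hypothetical placeholders for a structured size node, not terms from DCAT or any unit ontology, and the `xsd:decimal` typing is only for illustration.

```python
# Sketch: serialized length of a plain literal vs. a structured size node.
# ex:size, ex:value, ex:unit and ex:Gigabyte are hypothetical names,
# used only to illustrate the shape of a "number plus unit" pattern.
plain = '<dist> dcat:byteSize "1000000000"^^xsd:decimal .'
structured = (
    '<dist> ex:size [ ex:value "1"^^xsd:decimal ; ex:unit ex:Gigabyte ] .'
)

plain_len = len(plain.encode("utf-8"))
structured_len = len(structured.encode("utf-8"))

print(f"plain literal:   {plain_len} bytes")
print(f"structured node: {structured_len} bytes")
print(f"structured is longer: {structured_len > plain_len}")
```

The exact numbers depend on prefixes and serialization, but the shorter literal does not pay for the extra node and unit reference.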
There seem to be four distinct issues with dcat:byteSize as the only option:
What feels "reasonable" to me is to keep byteSize with a tighter definition of its expected semantics, and to introduce a new term that takes a simple string literal in a microformat, e.g. dcat:approxSize "23 MB". Such microformats are extremely common, but I haven't had much luck tracking down a standard for the format as a whole, although there are standards for the actual postfix part.
Here are two major development platforms that explicitly support such formats, without citing standards conformance, but do reference this issue of interpretation. |
@makxdekkers @agbeltran As in #300, I have to say that I do not see the problem in creating an IRI such as
|
I have a strong preference for using actual values rather than URIs for things like numbers or timestamps. For programming and for human readability, looking up a URI for such a thing strikes me as far more complex than necessary, to the point of being somewhat comical. |
Though the examples of programmatic formatting of numbers of bytes are the reverse of what I would call programmatic support of the suggested microformats (they take a long and turn it into a string with a convenient number and unit, whereas support of the suggested microformats would require a function to read the particular microformat and return the long), I don't think it's too much to ask of a programmer to write such a thing, if we can specify the microformat. I would not worry about KiB etc, as they can be converted to KB etc, and they are rarely used. |
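For illustration, a reader function of the kind described above might look like the sketch below. The grammar is an assumption, since no microformat has been specified in this thread: a non-negative decimal number, optional whitespace, and a decimal (SI) unit from B to PB, matched case-insensitively. Binary units such as KiB are deliberately left out.

```python
import re
from decimal import Decimal

# Assumed decimal (SI) multipliers; the thread does not fix a unit list.
_UNITS = {
    "B": 1,
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
}

# Assumed grammar: number, optional whitespace, alphabetic unit.
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_size(text: str) -> int:
    """Parse a string such as '23 MB' and return the size in bytes."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognised size string: {text!r}")
    number, unit = match.group(1), match.group(2).upper()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    # Decimal avoids binary floating-point rounding for values like '1.5'.
    return int(Decimal(number) * _UNITS[unit])


print(parse_size("23 MB"))
```

Note that the result is only as precise as the input: "23 MB" round-trips to a round number of bytes, which is the point of an approximate-size property.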
Any reason not to relate So even for the simple, single-property-including-units case, you relate to a comprehensive ontology for complex cases. There is also I can't see anything in QUDT about approximate values but perhaps there are. |
@nicholascar What would be the advantage of including a relationships between |
This is clearly an area that has the potential for revision as part of future work beyond DCAT 2. As well as Tagging for future work, and moving to future milestone (alongside #84) |
There has been no further discussion on this issue since 2018, and DCAT 2 did not, in the end, include a @agbeltran , do you think we can close it? |
Noting no objections, I'm closing this issue. |
At the moment, DCAT provides a property to indicate the size of a distribution in bytes (dcat:byteSize). We discussed that this should be generalized to dcat:size with an additional indication of the unit of measurement. For the latter, we would consider an existing ontology (such as UO, QUDT, OM etc).
Related to #125
As per discussions in meeting (https://www.w3.org/2018/07/19-dxwgdcat-minutes.html#x07) and action (https://www.w3.org/2017/dxwg/track/actions/158).