Schema.org Release v7 by Turnerj · Pull Request #136 · RehanSaeed/Schema.NET

Turnerj · 2020-02-09T07:21:13Z

Adds support for PronounceableText-to-string
Adds all other Schema.org v7 changes
Minor allocation cleanup for SchemaClass - no longer allocates an array per call to IsPrimitive

Closes #126

Turnerj · 2020-02-09T07:25:01Z

So it looks like we will have a unique problem - the Offers type now has this as a property:

/// <summary>
/// An item being offered (or demanded). The transactional nature of the offer or demand is documented using &lt;a class="localLink" href="http://schema.org/businessFunction"&gt;businessFunction&lt;/a&gt;, e.g. sell, lease etc. While several common expected types are listed explicitly in this definition, others can be used. Using a second type, such as Product or a subtype of Product, can clarify the nature of the offer.
/// </summary>
Values<IAggregateOffer, ICreativeWork, IEvent, IMenuItem, IProduct, IService, ITrip> ItemOffered { get; set; }

I think we're gonna need to make a Values7 type...

My thought is rather than maintain all of these ValuesX types manually is actually having them be built by a tool as well.

RehanSaeed · 2020-02-10T10:29:47Z

The order of some inherited interfaces changed which caused some noise in the generated code. If we ordered them by name, we could solve that problem in future at the expense of a one time noisy commit.

Not sure about generating ValuesN, might be worth the effort. I'll leave that to your judgement.

Turnerj · 2020-02-11T03:23:31Z

If we ordered them by name, we could solve that problem in future at the expense of a one time noisy commit.

Yeah, that's a fair call. Might be worth doing as a separate PR to this after this is merged.

That said, I know we have a little bit of logic in the ValuesJsonConverter that handles the types from right-to-left. We might want to handle from the most derived to the least in terms of order?

Not sure about generating ValuesN, might be worth the effort. I'll leave that to your judgement.

I'm just thinking the overhead of manually updating Values2, Values3, Values4 and now potentially have a Values5, Values6 and Values7. If we were wanting to change or fix one thing between them, it could be a lot of effort plus we have a higher chance of making a mistake.

Not sure what we would do for the tests though - would be kinda silly to automatically generate tests for it as we would have potential for the tests themselves to be wrong.

RehanSaeed · 2020-02-11T09:15:13Z

Good point about tests. Feels a bit weird but then generating code using StringBuilder feels weird too.

Have you seen the news about code generation in .NET? Early days but they plan to add code generation as a 1st class feature. They talk about a pre-compile step where code-gen code in your method is run and then spits out autogenerated code. We might be able to use that when it's more concrete.

Turnerj · 2020-03-17T07:34:28Z

Sorry, I've been meaning to get back to this for a while - I've looked up a few solutions that we could use now however they have different pros/cons:

T4 Templates

Compile time output
Code structure can be similar to the desired output
Would require some (minor) modification to support greater number of generic arguments
Lack of syntax highlighting in VS2019 (this one kinda surprised me)

CodeDOM

Runtime output
Code structure would resemble very little to the desired output
Coded in C# with normal syntax highlighting
Greater flexibility to what we could output

I think with the right helper methods, we could make a solution with CodeDOM work pretty well and not be too much of a burden to the code base.

This tutorial from MS would be a good start: https://docs.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/how-to-create-a-class-using-codedom
Followed with: https://docs.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/how-to-create-an-xml-documentation-file-using-codedom

We should be able to generate code that is 100% compatible with the editor config rules etc so we don't need to deal with any weirdness.

Turnerj · 2020-03-17T07:36:55Z

While not in the scope of this PR, if we did want to migrate away from using the StringBuilder approach for all the other classes, we could potentially use CodeDOM for those too. That or just wait for that 1st class code generation feature so we don't need to do the work twice.

RehanSaeed · 2020-03-17T09:26:49Z

I'm hesitant to invest heavily in code generation when I know that Microsoft is building something as we speak. The thing is I'm not certain when that will become available. At this stage it's all experimental. If I had to guess, it'd take at least several months.

RehanSaeed · 2020-03-17T09:28:14Z

Lets just fix up the CI and get this merged for now. Schema.org v7 came out with some Corona Virus updates. Might be worth getting that in too.

Turnerj · 2020-03-17T13:31:32Z

I've added support for Values7 (that took a long while to write all the tests). Looks to now be just breaking changes with some of the other models and their tests.

RehanSaeed · 2020-03-17T13:34:54Z

Looks like there are 6 failed tests.

Thanks for updating. It'd be nice to stick it to the CoronaVirus in a very very small way with this PR.

ISO 8601 defines the offset as numbers after a "+" as the offset or after a "Z" meaning UTC0. Without this change, the timezone offset of the current machine is used which is (in many cases) undesired.

Turnerj · 2020-03-18T02:56:51Z

So the fix for those tests was a slight change to how we handle DateTimeOffset.

For a Values<int?, DateTime?, DateTimeOffset?>, keep in mind we read the generic parameters from right-to-left in our conversion code. With a string like "2020-03-18", when we encounter the DateTimeOffset, we check if DateTimeOffset.TryParse is successful which, even if it doesn't have an offset, it is - that right there was our problem.

Some ValuesN properties have DateTime and DateTimeOffset, some have just DateTimeOffset. Any time we had DateTimeOffset, it would always usurp anything that was also a valid DateTime. This is an additional problem because if you don't specify an offset, it will use the current system's offset. This is what happened in our tests, our test JSON strings were dates but they were converted to DateTimeOffset rather than DateTime so had the offset of the target machine.

To alleviate this, I made the check to try parsing as a DateTimeOffset include specifically checking for the ISO 8601 offset characters of "+" and "Z" - if either are there, try parsing as a DateTimeOffset otherwise don't.

Turnerj · 2020-03-18T03:19:09Z

There are a few quirks with what I've mentioned above so will be adding some tests and tweaking some more behaviour.

Turnerj · 2020-03-18T03:58:56Z

One obvious mistake I missed, I'm not handling negative offsets. Whoops!

Turnerj · 2020-03-18T06:10:39Z

I'm going down a rabbit hole of little issues. The short of it is, I'm going to disable the built-in DateTime parsing by JSON.NET and rely entirely on our own as there are various issues around offset handling. With that in mind, I'm going to create our own SchemaSerializer class which has all the configuration necessary to both read and write - currently we only have our overloads on Thing.ToString() etc which allow writing.

Using this SchemaSerializer is entirely optional for reading with the exception that offsets for datetime values do not work. This class will replace the inner-workings of Thing.ToString() etc however those methods will still exist and just point to SchemaSerializer.

The reason these changes belong in this PR is that my first "fix" for the tests, while passes, actually fails at the few additional tests I'm adding around those changes. I'd rather have a properly tested and functional PR than split a half-working PR into two PRs.

Turnerj · 2020-03-18T07:33:58Z

To summarise everything:

Schema.org v7 Models are now used
Fixing issues with DateTime and DateTimeOffset - only using DateTimeOffset if there is an offset (this is what broke those 6 tests the other day)

Its that last part which brings through these changes:

Removed support for using JSON.NET Date parsing - it prevented the offsets from correctly coming through under certain conditions
Added new SchemaSerializer which helps fix the date parsing issue by disabling the built-in JSON.NET Date parsing. This does mean for anyone to use our "fixed" date time handling for reads, they need to use SchemaSerializer instead of JsonConvert directly. In theory, if people haven't yet encountered weirdness around date time handling, they won't notice a difference by not using it. I think though it might be good to say this is the "recommended" way as it may help us moving to System.Text.Json in the future.
As all date parsing is now done on our end, fixed support for MS DateTime string (eg. /Date(12356789000-0100)/) (JSON.NET didn't support offsets for this)

With that last point, it means we are no longer skipping one of our tests for MS DateTime.

Turnerj · 2020-03-18T07:43:14Z

Oh, one other little note - I added an allocation free path using Span for the date parsing. While that would normally be compatible, there is no int.TryParse that takes in Span for anything less than .NET Core - with this in mind, I added .NET Core 3.1 as an additional target for the main library.

RehanSaeed · 2020-03-18T08:41:07Z

Looks reasonable to me. I'll see if I can arrange a release today.

Thanks for your efforts as usual. Stay safe!

RehanSaeed · 2020-04-30T11:33:00Z

@Turnerj Source generators is here now. It came earlier than I expected.

https://devblogs.microsoft.com/dotnet/introducing-c-source-generators/

Turnerj · 2020-04-30T12:03:11Z

Yeah it sounds interesting though I'm not actually sure its going to be much of a difference for this library.

As the source is injected into the build, the source it injects in terms of namespacing and actual code still looks to be built with a StringBuilder (at least based on the example in that blog). The advantage it seems to provide is that the generated code doesn't actually need to exist in the repository

While the generated code might not be in the repository isn't necessarily bad (its great in theory for PRs and general repository size), I actually thought source generators were more like a new paradigm in source generation whereas it seems to just be a new way to hook code into compilation without files. I imagine when dealing with compilation issues, source generator ones may be a lot harder to debug vs file-generated source (like we do now) as we likely can't see the full "generated" source to see our errors more obviously (sometimes error messages from compilation aren't the most intuitive).

Hopefully there are some more examples in the future where source generators are used that our workflow seems more suited for.

Turnerj added 2 commits February 9, 2020 17:47

PronounceableText support

742ce2d

Schema v6 entity updates

e3b6a6b

Turnerj added the enhancement Issues describing an enhancement or pull requests adding an enhancement. label Feb 9, 2020

Turnerj added 3 commits March 17, 2020 22:18

Update to schema.org release v7

42e364e

Added Values7 type

3f8b8a8

Added tests for Values7

a382c6d

Turnerj changed the title ~~Schemaorg Release v6~~ Schemaorg Release v7 Mar 17, 2020

Turnerj changed the title ~~Schemaorg Release v7~~ Schema.org Release v7 Mar 17, 2020

Only support DateTimeOffset conversion if it looks like it has an offset

19ed537

ISO 8601 defines the offset as numbers after a "+" as the offset or after a "Z" meaning UTC0. Without this change, the timezone offset of the current machine is used which is (in many cases) undesired.

Turnerj added 2 commits March 18, 2020 17:36

Adding better support for time offsets in dates

ad1f217

Fixed bug in time offset handling via substrings

a0c3c92

RehanSaeed merged commit 3664cf7 into RehanSaeed:master Mar 18, 2020

Turnerj deleted the schemaorg-release-v6 branch April 30, 2020 12:03

Uh oh!

Conversation

Turnerj commented Feb 9, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turnerj commented Feb 9, 2020

Uh oh!

RehanSaeed commented Feb 10, 2020

Uh oh!

Turnerj commented Feb 11, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RehanSaeed commented Feb 11, 2020

Uh oh!

Turnerj commented Mar 17, 2020

Uh oh!

Turnerj commented Mar 17, 2020

Uh oh!

RehanSaeed commented Mar 17, 2020

Uh oh!

RehanSaeed commented Mar 17, 2020

Uh oh!

Turnerj commented Mar 17, 2020

Uh oh!

RehanSaeed commented Mar 17, 2020

Uh oh!

Turnerj commented Mar 18, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turnerj commented Mar 18, 2020

Uh oh!

Turnerj commented Mar 18, 2020

Uh oh!

Turnerj commented Mar 18, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Turnerj commented Mar 18, 2020

Uh oh!

Turnerj commented Mar 18, 2020

Uh oh!

RehanSaeed commented Mar 18, 2020

Uh oh!

RehanSaeed commented Apr 30, 2020

Uh oh!

Turnerj commented Apr 30, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Turnerj commented Feb 9, 2020 •

edited

Loading

Turnerj commented Feb 11, 2020 •

edited

Loading

Turnerj commented Mar 18, 2020 •

edited

Loading

Turnerj commented Mar 18, 2020 •

edited

Loading