Skip to content

Schema.org Release v7#136

Merged
RehanSaeed merged 8 commits intoRehanSaeed:masterfrom
Turnerj:schemaorg-release-v6
Mar 18, 2020
Merged

Schema.org Release v7#136
RehanSaeed merged 8 commits intoRehanSaeed:masterfrom
Turnerj:schemaorg-release-v6

Conversation

@Turnerj
Copy link
Copy Markdown
Collaborator

@Turnerj Turnerj commented Feb 9, 2020

  • Adds support for PronounceableText-to-string
  • Adds all other Schema.org v7 changes
  • Minor allocation cleanup for SchemaClass - no longer allocates an array per call to IsPrimitive

Closes #126

@Turnerj Turnerj added the enhancement Issues describing an enhancement or pull requests adding an enhancement. label Feb 9, 2020
@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Feb 9, 2020

So it looks like we will have a unique problem - the Offers type now has this as a property:

/// <summary>
/// An item being offered (or demanded). The transactional nature of the offer or demand is documented using &lt;a class="localLink" href="http://schema.org/businessFunction"&gt;businessFunction&lt;/a&gt;, e.g. sell, lease etc. While several common expected types are listed explicitly in this definition, others can be used. Using a second type, such as Product or a subtype of Product, can clarify the nature of the offer.
/// </summary>
Values<IAggregateOffer, ICreativeWork, IEvent, IMenuItem, IProduct, IService, ITrip> ItemOffered { get; set; }

I think we're gonna need to make a Values7 type...

My thought is rather than maintain all of these ValuesX types manually is actually having them be built by a tool as well.

@RehanSaeed
Copy link
Copy Markdown
Owner

The order of some inherited interfaces changed which caused some noise in the generated code. If we ordered them by name, we could solve that problem in future at the expense of a one time noisy commit.

Not sure about generating ValuesN, might be worth the effort. I'll leave that to your judgement.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Feb 11, 2020

If we ordered them by name, we could solve that problem in future at the expense of a one time noisy commit.

Yeah, that's a fair call. Might be worth doing as a separate PR to this after this is merged.

That said, I know we have a little bit of logic in the ValuesJsonConverter that handles the types from right-to-left. We might want to handle from the most derived to the least in terms of order?

Not sure about generating ValuesN, might be worth the effort. I'll leave that to your judgement.

I'm just thinking the overhead of manually updating Values2, Values3, Values4 and now potentially have a Values5, Values6 and Values7. If we were wanting to change or fix one thing between them, it could be a lot of effort plus we have a higher chance of making a mistake.

Not sure what we would do for the tests though - would be kinda silly to automatically generate tests for it as we would have potential for the tests themselves to be wrong.

@RehanSaeed
Copy link
Copy Markdown
Owner

Good point about tests. Feels a bit weird but then generating code using StringBuilder feels weird too.

Have you seen the news about code generation in .NET? Early days but they plan to add code generation as a 1st class feature. They talk about a pre-compile step where code-gen code in your method is run and then spits out autogenerated code. We might be able to use that when it's more concrete.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 17, 2020

Sorry, I've been meaning to get back to this for a while - I've looked up a few solutions that we could use now however they have different pros/cons:

T4 Templates

  • Compile time output
  • Code structure can be similar to the desired output
  • Would require some (minor) modification to support greater number of generic arguments
  • Lack of syntax highlighting in VS2019 (this one kinda surprised me)

CodeDOM

  • Runtime output
  • Code structure would resemble very little to the desired output
  • Coded in C# with normal syntax highlighting
  • Greater flexibility to what we could output

I think with the right helper methods, we could make a solution with CodeDOM work pretty well and not be too much of a burden to the code base.

This tutorial from MS would be a good start: https://docs.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/how-to-create-a-class-using-codedom
Followed with: https://docs.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/how-to-create-an-xml-documentation-file-using-codedom

We should be able to generate code that is 100% compatible with the editor config rules etc so we don't need to deal with any weirdness.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 17, 2020

While not in the scope of this PR, if we did want to migrate away from using the StringBuilder approach for all the other classes, we could potentially use CodeDOM for those too. That or just wait for that 1st class code generation feature so we don't need to do the work twice.

@RehanSaeed
Copy link
Copy Markdown
Owner

I'm hesitant to invest heavily in code generation when I know that Microsoft is building something as we speak. The thing is I'm not certain when that will become available. At this stage it's all experimental. If I had to guess, it'd take at least several months.

@RehanSaeed
Copy link
Copy Markdown
Owner

Lets just fix up the CI and get this merged for now. Schema.org v7 came out with some Corona Virus updates. Might be worth getting that in too.

@Turnerj Turnerj changed the title Schemaorg Release v6 Schemaorg Release v7 Mar 17, 2020
@Turnerj Turnerj changed the title Schemaorg Release v7 Schema.org Release v7 Mar 17, 2020
@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 17, 2020

I've added support for Values7 (that took a long while to write all the tests). Looks to now be just breaking changes with some of the other models and their tests.

@RehanSaeed
Copy link
Copy Markdown
Owner

Looks like there are 6 failed tests.

Thanks for updating. It'd be nice to stick it to the CoronaVirus in a very very small way with this PR.

ISO 8601 defines the offset as numbers after a "+" as the offset or after a "Z" meaning UTC0. Without this change, the timezone offset of the current machine is used which is (in many cases) undesired.
@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

So the fix for those tests was a slight change to how we handle DateTimeOffset.

For a Values<int?, DateTime?, DateTimeOffset?>, keep in mind we read the generic parameters from right-to-left in our conversion code. With a string like "2020-03-18", when we encounter the DateTimeOffset, we check if DateTimeOffset.TryParse is successful which, even if it doesn't have an offset, it is - that right there was our problem.

Some ValuesN properties have DateTime and DateTimeOffset, some have just DateTimeOffset. Any time we had DateTimeOffset, it would always usurp anything that was also a valid DateTime. This is an additional problem because if you don't specify an offset, it will use the current system's offset. This is what happened in our tests, our test JSON strings were dates but they were converted to DateTimeOffset rather than DateTime so had the offset of the target machine.

To alleviate this, I made the check to try parsing as a DateTimeOffset include specifically checking for the ISO 8601 offset characters of "+" and "Z" - if either are there, try parsing as a DateTimeOffset otherwise don't.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

There are a few quirks with what I've mentioned above so will be adding some tests and tweaking some more behaviour.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

One obvious mistake I missed, I'm not handling negative offsets. Whoops!

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

I'm going down a rabbit hole of little issues. The short of it is, I'm going to disable the built-in DateTime parsing by JSON.NET and rely entirely on our own as there are various issues around offset handling. With that in mind, I'm going to create our own SchemaSerializer class which has all the configuration necessary to both read and write - currently we only have our overloads on Thing.ToString() etc which allow writing.

Using this SchemaSerializer is entirely optional for reading with the exception that offsets for datetime values do not work. This class will replace the inner-workings of Thing.ToString() etc however those methods will still exist and just point to SchemaSerializer.

The reason these changes belong in this PR is that my first "fix" for the tests, while passes, actually fails at the few additional tests I'm adding around those changes. I'd rather have a properly tested and functional PR than split a half-working PR into two PRs.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

To summarise everything:

  • Schema.org v7 Models are now used
  • Fixing issues with DateTime and DateTimeOffset - only using DateTimeOffset if there is an offset (this is what broke those 6 tests the other day)

Its that last part which brings through these changes:

  • Removed support for using JSON.NET Date parsing - it prevented the offsets from correctly coming through under certain conditions
  • Added new SchemaSerializer which helps fix the date parsing issue by disabling the built-in JSON.NET Date parsing. This does mean for anyone to use our "fixed" date time handling for reads, they need to use SchemaSerializer instead of JsonConvert directly. In theory, if people haven't yet encountered weirdness around date time handling, they won't notice a difference by not using it. I think though it might be good to say this is the "recommended" way as it may help us moving to System.Text.Json in the future.
  • As all date parsing is now done on our end, fixed support for MS DateTime string (eg. /Date(12356789000-0100)/) (JSON.NET didn't support offsets for this)

With that last point, it means we are no longer skipping one of our tests for MS DateTime.

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Mar 18, 2020

Oh, one other little note - I added an allocation free path using Span for the date parsing. While that would normally be compatible, there is no int.TryParse that takes in Span for anything less than .NET Core - with this in mind, I added .NET Core 3.1 as an additional target for the main library.

@RehanSaeed RehanSaeed merged commit 3664cf7 into RehanSaeed:master Mar 18, 2020
@RehanSaeed
Copy link
Copy Markdown
Owner

Looks reasonable to me. I'll see if I can arrange a release today.

Thanks for your efforts as usual. Stay safe!

@RehanSaeed
Copy link
Copy Markdown
Owner

@Turnerj
Copy link
Copy Markdown
Collaborator Author

Turnerj commented Apr 30, 2020

Yeah it sounds interesting though I'm not actually sure its going to be much of a difference for this library.

As the source is injected into the build, the source it injects in terms of namespacing and actual code still looks to be built with a StringBuilder (at least based on the example in that blog). The advantage it seems to provide is that the generated code doesn't actually need to exist in the repository

While the generated code might not be in the repository isn't necessarily bad (its great in theory for PRs and general repository size), I actually thought source generators were more like a new paradigm in source generation whereas it seems to just be a new way to hook code into compilation without files. I imagine when dealing with compilation issues, source generator ones may be a lot harder to debug vs file-generated source (like we do now) as we likely can't see the full "generated" source to see our errors more obviously (sometimes error messages from compilation aren't the most intuitive).

Hopefully there are some more examples in the future where source generators are used that our workflow seems more suited for.

@Turnerj Turnerj deleted the schemaorg-release-v6 branch April 30, 2020 12:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Issues describing an enhancement or pull requests adding an enhancement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Upgrading to Schema.org Release 7.0 - Issues with extending primitives

2 participants