Skip to content

Conversation

@grendello
Copy link
Contributor

@grendello grendello commented Jan 24, 2019

Context: https://issuetracker.google.com/issues/116182838

Google shipped invalid (not well-formed) XML annotation documents in
the platform-tools 28.0.1 package, with < present within element
attributes:

<item name="android.view.View void addFocusables(java.util.ArrayList<android.view.View>, int) 1">

Note the java.util.ArrayList<android.view.View> within the
//item/@name attribute value, and that the < isn't escaped.

This invalid XML prevents System.Xml from loading the
annotations.xml documents, with errors similar to:

android/view/annotations.xml failed: '<', hexadecimal value 0x3C, is an invalid attribute character. Line 206, position 71

This commit takes advantage of the forgiving nature of the
HtmlAgilityPack HTML parser -- which loads those files fine -- and
makes it possible for us to write them back using XmlWriter in
order to generate a well-formed XML document which we can load into
XDocument.

Slow, but it works...


static XDocument SafeXmlLoad (Stream s, string fileName)
{
// We must save to a temporary file because the stream doesn't support seeking and should the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must we use a temporary file? As opposed to copying into a MemoryStream?

The largest annotations.xml I see within platform-tools_r28.0.1-darwin.zip is 76017 bytes for android/renderscript/annotations.xml, which isn't that large, and the 76KB should be quickly collectable once it's copied into the (much larger!) XDocument.

{
// Google ships not well-formed XML files in the platform-tools 28.0.1 package (in the
// annotations.zip file), so we need to load the files with a forgiving parser in order to fix
// them up before loading with a validating XML parser.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably mention https://issuetracker.google.com/issues/116182838, either here, in the commit message, or both. (I like "both". :-)

@jonpryor
Copy link
Contributor

Commit message for merging:

Context: https://issuetracker.google.com/issues/116182838

Google shipped invalid (not well-formed) XML annotation documents in
the `platform-tools` 28.0.1 package, with `<` present within element
attributes:

	<item name="android.view.View void addFocusables(java.util.ArrayList<android.view.View>, int) 1">

Note the `java.util.ArrayList<android.view.View>` within the
`//item/@name` attribute value, and that the `<` isn't escaped.

This invalid XML prevents System.Xml from loading the
`annotations.xml` documents, with errors similar to:

	android/view/annotations.xml failed: '<', hexadecimal value 0x3C, is an invalid attribute character. Line 206, position 71

This commit takes advantage of the forgiving nature of the
HtmlAgilityPack HTML parser -- which loads those files fine -- and
makes it possible for us to write them back using `XmlWriter` in
order to generate a well-formed XML document which we can load into
`XDocument`.

Slow, but it works...

@grendello grendello force-pushed the fix-annotations-loading branch from 0efa286 to edae3b6 Compare January 24, 2019 14:45
} catch {
// ignore
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think using blocks, you could avoid the need for this method entirely. The empty try-catch is the part we shouldn't really need.

Was there some trouble with a Stream getting double-disposed? It might be there is just a place you need to use an overload of leaveOpen: true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The streams in question aren't placed in using blocks because I don't want to nest the blocks too much and the streams may be null. The method provides for cleaner code and the empty catch is to discard any and all errors in a place where we really don't care about any errors.

Copy link
Member

@jonathanpeppers jonathanpeppers Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the streams would be null if using was used?

It seems cleaner to me to just do something like:

	static XDocument SafeXmlLoad (Stream s, string fileName)
	{
		// We must save to a temporary stream because the stream doesn't support seeking and should the
		// parsing fail we won't be able to go back to its beginnig to reparse it.
		using (var ms = new MemoryStream ()) {
			s.CopyTo (ms);
			ms.Seek (0, SeekOrigin.Begin);

			try {
				return XDocument.Load (ms);
			} catch (XmlException ex) {
				Console.Error.WriteLine ($"Warning: failed to load annotation document '{fileName}' directly from the annotations archive. {ex.Message}");
				Console.Error.WriteLine ("Attempting to fix up and reload");
			}

			try {
				using (var ns = FixAnnotationXML (ms))
					return XDocument.Load (ns);
			} catch (Exception ex) {
				throw new InvalidOperationException ($"Failed to fix up invalid XML in annotation document '{fileName}'. {ex.Message}", ex);
			}
		}
	}

Context: https://issuetracker.google.com/issues/116182838

Google shipped invalid (not well-formed) XML annotation documents in
the `platform-tools` 28.0.1 package, with `<` present within element
attributes:

	<item name="android.view.View void addFocusables(java.util.ArrayList<android.view.View>, int) 1">

Note the `java.util.ArrayList<android.view.View>` within the
`//item/@name` attribute value, and that the `<` isn't escaped.

This invalid XML prevents System.Xml from loading the
`annotations.xml` documents, with errors similar to:

	android/view/annotations.xml failed: '<', hexadecimal value 0x3C, is an invalid attribute character. Line 206, position 71

This commit takes advantage of the forgiving nature of the
HtmlAgilityPack HTML parser -- which loads those files fine -- and
makes it possible for us to write them back using `XmlWriter` in
order to generate a well-formed XML document which we can load into
`XDocument`.

Slow, but it works...
@grendello grendello force-pushed the fix-annotations-loading branch from edae3b6 to 7e1c797 Compare January 24, 2019 17:21
@grendello
Copy link
Contributor Author

Timed out...

@grendello
Copy link
Contributor Author

build

@jonpryor jonpryor merged commit 4073f3e into dotnet:master Jan 24, 2019
@jonpryor
Copy link
Contributor

Note: on the next Java.Interop bump within xamarin-android we'll need need to update external/Java.Interop.tpnitems to mention HtmlAgilityPack.

@grendello grendello deleted the fix-annotations-loading branch January 24, 2019 19:07
@github-actions github-actions bot locked and limited conversation to collaborators Apr 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants