Whitespace in text tokenized as IGNORABLE_WHITESPACE in XmlReader #241
Closed
Description
When I have this XML

<user>dude &amp; &lt;dudette&gt;</user>

I expect to get the following events when I iterate through it via the XmlReader:

1. START_DOCUMENT
2. START_ELEMENT localName="user"
3. TEXT text="dude "
4. ENTITY_REF text="&"
5. TEXT text=" "
6. ENTITY_REF text="<"
7. TEXT text="dudette"
8. ENTITY_REF text=">"
9. END_ELEMENT localName="user"
10. END_DOCUMENT
However, number 5 doesn't turn up as a TEXT but as an IGNORABLE_WHITESPACE.
I think this is a bug: the space sits between two entity references inside the element's text content, so it is not ignorable whitespace. Whitespace between XML elements, such as in <user>abc</user> <id>1234</id>, would be ignorable.
(By the way, the existence of CDSECT and ENTITY_REF was a pitfall (aka footgun) for me: I had assumed that the XmlReader would already deliver the complete text content, i.e. I expected just TEXT text="dude & <dudette>" followed by END_ELEMENT.)
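For anyone hitting the same pitfall, here is a minimal sketch of how a consumer could merge adjacent text-like events into the single TEXT event expected above. It does not use the library's API: `Kind`, `Event`, and `coalesceText` are hypothetical stand-ins defined here purely for illustration, and the sketch assumes an ENTITY_REF event already carries its resolved text, as in the event list in the description. A run that consists only of IGNORABLE_WHITESPACE stays ignorable, so whitespace between elements is not promoted to text.

```kotlin
// Hypothetical stand-ins for a reader's event stream; these are NOT the library's real types.
enum class Kind {
    START_DOCUMENT, START_ELEMENT, TEXT, ENTITY_REF, CDSECT,
    IGNORABLE_WHITESPACE, END_ELEMENT, END_DOCUMENT
}

// `value` holds the local name for element events and the (resolved) text for text-like events.
data class Event(val kind: Kind, val value: String = "")

private val textLike = setOf(Kind.TEXT, Kind.ENTITY_REF, Kind.CDSECT, Kind.IGNORABLE_WHITESPACE)

/**
 * Merges each run of adjacent text-like events into one event.
 * The merged event is TEXT if the run contains any TEXT, ENTITY_REF or CDSECT event;
 * a run made up only of IGNORABLE_WHITESPACE events stays IGNORABLE_WHITESPACE.
 */
fun coalesceText(events: List<Event>): List<Event> {
    val result = mutableListOf<Event>()
    val pending = StringBuilder()
    var inRun = false        // true while we are collecting a text-like run
    var sawRealText = false  // true if the run holds anything besides ignorable whitespace

    fun flush() {
        if (!inRun) return
        val kind = if (sawRealText) Kind.TEXT else Kind.IGNORABLE_WHITESPACE
        result += Event(kind, pending.toString())
        pending.setLength(0)
        inRun = false
        sawRealText = false
    }

    for (event in events) {
        if (event.kind in textLike) {
            pending.append(event.value)
            inRun = true
            if (event.kind != Kind.IGNORABLE_WHITESPACE) sawRealText = true
        } else {
            flush()          // a structural event ends the current run
            result += event
        }
    }
    flush()                  // emit a run that reaches the end of the input
    return result
}
```

Feeding in the events actually observed for the `user` element (with number 5 reported as IGNORABLE_WHITESPACE) should yield START_ELEMENT, a single TEXT with text="dude & <dudette>", and END_ELEMENT, which sidesteps the misclassified whitespace on the consumer side without fixing the underlying tokenization.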