
Merge text and CDATA events in serde deserializer #474

@Mingun

CDATA sections cannot contain the sequence ]]>. When that sequence appears in the data, it should be split into two pieces, and each piece should be put in its own CDATA section:

]]>

becomes

<![CDATA[]]]>
<![CDATA[]>]]>

or

<![CDATA[]]]]>
<![CDATA[>]]>
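As an illustration of the second splitting variant, here is a minimal sketch of the writer side using only the standard library. The function name `to_cdata` is hypothetical and is not part of quick-xml's API.

```rust
/// Wraps `text` in one or more CDATA sections (hypothetical helper).
/// Every `]]>` is split between `]]` and `>`, so no single section
/// contains the forbidden sequence.
fn to_cdata(text: &str) -> String {
    // Close the current section right after `]]` and open a new one
    // that starts with `>`.
    let escaped = text.replace("]]>", "]]]]><![CDATA[>");
    format!("<![CDATA[{}]]>", escaped)
}

fn main() {
    println!("{}", to_cdata("]]>"));
    println!("{}", to_cdata("a]]>b"));
}
```

A reader only gets the original string back if it concatenates the payloads of the adjacent sections, which is the point of this issue.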

Currently the serde deserializer processes only one CDATA event at a time, which means that deserializing

<root>
  <string><![CDATA[]]]]><![CDATA[>]]></string>
</root>

into

struct AnyName {
  string: String,
}

would fail or wrongly return ]] instead of ]]>.
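The difference between reading one section and reading all adjacent sections can be sketched without any XML library. The scanner below, `cdata_payloads`, is a naive hypothetical helper for illustration only; it is neither a real XML parser nor the deserializer's actual code.

```rust
/// Returns the payload of every CDATA section found in `xml`, in order.
/// Naive scanner for illustration only (hypothetical helper).
fn cdata_payloads(xml: &str) -> Vec<&str> {
    const OPEN: &str = "<![CDATA[";
    const CLOSE: &str = "]]>";
    let mut out = Vec::new();
    let mut rest = xml;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                out.push(&after_open[..end]);
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unterminated section: stop scanning.
            None => break,
        }
    }
    out
}

fn main() {
    let xml = "<string><![CDATA[]]]]><![CDATA[>]]></string>";
    let parts = cdata_payloads(xml);
    // Taking only the first section loses the trailing `>`.
    let first_only = parts[0];
    // Concatenating all adjacent sections restores the original value.
    let merged: String = parts.concat();
    println!("first only: {first_only}");
    println!("merged:     {merged}");
}
```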

To fix that we should merge CDATA events, but there are some ambiguities that should be investigated:

  1. should we merge CDATA and text events:
    <![CDATA[one]]>two
    should return onetwo?

    Judging from that and that, we should do that

  2. should we ignore comments between CDATA events? Between CDATA and text events?
    <![CDATA[one]]><!--comment--><![CDATA[two]]>
    should return onetwo?
    <![CDATA[one]]><!--comment-->two
    should return onetwo?
    Currently all comments are skipped at a very early stage, so the deserializer sees
    <![CDATA[one]]><!--comment--><![CDATA[two]]>
    as
    <![CDATA[one]]><![CDATA[two]]>
  3. should we ignore processing instructions between CDATA events? Between CDATA and text events?
    <![CDATA[one]]><?pi?><![CDATA[two]]>
    should return onetwo?
    <![CDATA[one]]><?pi?>two
    should return onetwo?
    Currently all processing instructions are skipped at a very early stage, so the deserializer sees
    <![CDATA[one]]><?pi?><![CDATA[two]]>
    as
    <![CDATA[one]]><![CDATA[two]]>
  4. should we ignore whitespace between CDATA sections? Between CDATA and text?
    <![CDATA[one]]>
    <![CDATA[two]]>
    should return onetwo?
    <![CDATA[one]]>
    two
    should return onetwo?
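One possible answer to questions 1 to 3 can be sketched as follows. The `Event` enum is a simplified, hypothetical model and not quick-xml's real event type, and `merge_text` is an assumed helper name. The sketch answers "yes" to merging text with CDATA, and drops comments and processing instructions before merging. Question 4 is deliberately left open: whitespace-only text is kept as is, so it ends up inside the merged string.

```rust
/// Hypothetical, simplified event model; not quick-xml's real `Event` type.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(String),
    End(String),
    Text(String),
    CData(String),
    Comment(String),
    PI(String),
}

/// Merges runs of text and CDATA events into a single `Text` event.
/// Comments and processing instructions are dropped, so they do not
/// interrupt a run.
fn merge_text(events: Vec<Event>) -> Vec<Event> {
    let mut out: Vec<Event> = Vec::new();
    for event in events {
        match event {
            // Skipped entirely, as if they were never in the stream.
            Event::Comment(_) | Event::PI(_) => {}
            Event::Text(s) | Event::CData(s) => {
                if let Some(Event::Text(prev)) = out.last_mut() {
                    // Extend the run that is already being collected.
                    prev.push_str(&s);
                } else {
                    out.push(Event::Text(s));
                }
            }
            // Start/End tags terminate the current run.
            other => out.push(other),
        }
    }
    out
}

fn main() {
    let events = vec![
        Event::Start("string".into()),
        Event::CData("one".into()),
        Event::Comment("comment".into()),
        Event::PI("pi".into()),
        Event::Text("two".into()),
        Event::End("string".into()),
    ];
    println!("{:?}", merge_text(events));
}
```

With this policy, `<![CDATA[one]]>` followed by a newline and `<![CDATA[two]]>` would merge to `one`, newline, `two`; trimming that whitespace would be a separate decision.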

Labels: bug, help wanted, serde (Issues related to mapping from Rust types to XML)
