XML deserialization failed when text contains &()_+-=; #719

@Xuanwo

Hi, I'm from the Apache OpenDAL community. I found an interesting case where quick-xml fails to deserialize content like &()_+-=;.

Context

We are developing WebDAV support with JFrog Artifactory. I'm not sure how JFrog implements WebDAV support, but it seems they do not correctly escape the XML content.

We found that JFrog returns invalid content like:

<D:href>/artifactory/example-repo-local/bfc28925-81ec-4b69-b494-e2ee0f957ee8/aba0a24a-3a2b-48e2-ae65-ef2dff027458%20!%40%23%24%25%5E%26()_%2B-%3D%3B'%2C.txt/bfc28925-81ec-4b69-b494-e2ee0f957ee8/aba0a24a-3a2b-48e2-ae65-ef2dff027458%20!@%23$%25%5E&()_+-=;',.txt</D:href>

quick-xml recognizes &()_+-=; as an XML entity reference, which leads to a deserialization failure:

test panicked: stat must succeed: Unexpected (persistent) at stat => deserialize xml

Context:
   service: webdav
   path: 2c9433ab-450c-49e9-bb88-703cb2f327a0 !@#$%^&()_+-=;',.txt

Source:
   Error while escaping character at range 238..244: Unrecognized escape symbol: "()_+-=": Error while escaping character at range 238..244: Unrecognized escape symbol: "()_+-=": Error while escaping character at range 238..244: Unrecognized escape symbol: "()_+-="
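
The error message suggests that the text between `&` and the next `;` is being looked up as an entity name. The following is a minimal, stdlib-only sketch of that kind of scan, not quick-xml's actual implementation; `entity_candidates` is a hypothetical helper introduced only for illustration. It shows why the unescaped tail of the href yields `()_+-=` as a candidate name.

```rust
/// Hypothetical helper (not part of quick-xml): returns the text between
/// each `&` and the next `;`, i.e. what an entity-style scanner would
/// treat as an entity name.
fn entity_candidates(text: &str) -> Vec<&str> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(amp) = rest.find('&') {
        let after = &rest[amp + 1..];
        match after.find(';') {
            Some(semi) => {
                found.push(&after[..semi]);
                rest = &after[semi + 1..];
            }
            // Unterminated `&`: nothing more to report.
            None => break,
        }
    }
    found
}

fn main() {
    // Unescaped tail of the <D:href> value from the response above.
    let href_tail = "%20!@%23$%25%5E&()_+-=;',.txt";
    for name in entity_candidates(href_tail) {
        // The five entities predefined by XML 1.0.
        let known = matches!(name, "lt" | "gt" | "amp" | "apos" | "quot");
        println!("candidate {:?}, predefined entity: {}", name, known);
    }
}
```

The only candidate found is `()_+-=`, which matches the "Unrecognized escape symbol" reported in the error.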

To Reproduce

Deserialize content like

<?xml version="1.0" encoding="utf-8" ?>
<D:multistatus xmlns:D="DAV:" xmlns:ns0="DAV:">
<D:response xmlns:lp2="http://apache.org/dav/props/" xmlns:lp1="DAV:">
<D:href>/artifactory/example-repo-local/bfc28925-81ec-4b69-b494-e2ee0f957ee8/aba0a24a-3a2b-48e2-ae65-ef2dff027458%20!%40%23%24%25%5E%26()_%2B-%3D%3B'%2C.txt/bfc28925-81ec-4b69-b494-e2ee0f957ee8/aba0a24a-3a2b-48e2-ae65-ef2dff027458%20!@%23$%25%5E&()_+-=;',.txt</D:href>
<D:propstat><D:prop>
<lp1:getcontentlength>2066093</lp1:getcontentlength>
<lp1:getcontenttype>text/plain</lp1:getcontenttype>
<lp1:resourcetype/><lp1:getlastmodified>Tue, 27 Feb 2024 07:13:54 GMT</lp1:getlastmodified>
<lp1:getetag>W/"2066093-Tue, 27 Feb 2024 07:13:54 GMT"</lp1:getetag>
<lp1:creationdate>2024-02-27T07:13:54Z</lp1:creationdate>
<D:displayname><![CDATA[aba0a24a-3a2b-48e2-ae65-ef2dff027458 !@#$%^&()_+-=;',.txt]]></D:displayname>
<D:source></D:source>
</D:prop>
<D:status>HTTP/1.1 200 OK</D:status>
</D:propstat>
</D:response>
</D:multistatus>

into

use serde::Deserialize;

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct MultistatusOptional {
    pub response: Option<Vec<ListOpResponse>>,
}

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct ListOpResponse {
    pub href: String,
    pub propstat: Propstat,
}

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct Propstat {
    pub prop: Prop,
    pub status: String,
}

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct Prop {
    #[serde(default)]
    pub displayname: String,
    pub getlastmodified: String,
    pub getetag: Option<String>,
    pub getcontentlength: Option<String>,
    pub getcontenttype: Option<String>,
    pub resourcetype: ResourceTypeContainer,
}

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct ResourceTypeContainer {
    #[serde(rename = "$value")]
    pub value: Option<ResourceType>,
}

#[derive(Deserialize, Debug, PartialEq, Eq, Clone)]
#[serde(rename_all = "lowercase")]
pub enum ResourceType {
    Collection,
}

Related Issues


I understand this seems more like an issue for JFrog to address, but I believe it's also valuable to inform the quick-xml maintainers about this situation.

  • What do you think about this case?
  • Does it make sense to skip invalid/unknown XML entities and keep them as is?
  • Can we skip invalid/unknown XML entities and keep them as is?
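
To make "keep them as is" concrete, here is a minimal, stdlib-only sketch of a lenient unescape. It is an assumption about what such behavior could look like, not an existing quick-xml API: `unescape_lenient` is a hypothetical name, it resolves only the five predefined XML entities, and anything else after `&` (including numeric character references, omitted for brevity) is copied through unchanged.

```rust
/// Hypothetical helper (not part of quick-xml): resolves the five
/// predefined XML entities and leaves every other `&...;` sequence,
/// or a bare `&`, untouched instead of returning an error.
fn unescape_lenient(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(amp) = rest.find('&') {
        // Copy everything before the `&` verbatim.
        out.push_str(&rest[..amp]);
        let after = &rest[amp + 1..];
        // Try to resolve `name;` right after the `&`.
        let resolved = after.find(';').and_then(|semi| {
            let ch = match &after[..semi] {
                "lt" => '<',
                "gt" => '>',
                "amp" => '&',
                "apos" => '\'',
                "quot" => '"',
                _ => return None, // unknown name: keep as is
            };
            Some((ch, semi))
        });
        match resolved {
            Some((ch, semi)) => {
                out.push(ch);
                rest = &after[semi + 1..];
            }
            None => {
                // Keep the `&` literally and continue scanning after it.
                out.push('&');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    // File name from the failing test: passes through unchanged.
    let name = "!@#$%^&()_+-=;',.txt";
    assert_eq!(unescape_lenient(name), name);
    // Predefined entities are still resolved.
    assert_eq!(unescape_lenient("a &lt; b &amp; c"), "a < b & c");
    println!("{}", unescape_lenient(name));
}
```

The trade-off is conformance versus interoperability: a bare `&` in character data is not well-formed XML, so rejecting it is the strict reading, while a lenient mode like this would accept what servers such as the one above actually send.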

Labels: question, serde (Issues related to mapping from Rust types to XML)
