UnclosedTag on validated XML file. #939

@LegoWolf

Starting with v0.39.1 of quick_xml (more precisely commit 74bc46e), I get a SyntaxError::UnclosedTag when parsing the XML files that I get from Apple Health Kit's export feature.

I've attached a sample export that I've trimmed down as small as I can while still being valid XML and reproducing the error. You can reproduce the problem by creating the following project and placing the attached file exporter-test.xml in the same folder as Cargo.toml.

Cargo.toml:

[package]
name = "testxml"
version = "0.1.0"
edition = "2024"

[dependencies]
quick-xml = "=0.39.1"

main.rs:

use std::error::Error;
use std::fs::File;
use std::io::BufReader;
use quick_xml::events::Event;
use quick_xml::reader::Reader;

const BUFFER_SIZE: usize = 4 * 1024 * 1024;

fn main() -> Result<(), Box<dyn Error>> {
    let xml_file = BufReader::new(File::open("exporter-test.xml")?);
    let mut xml_file = Reader::from_reader(xml_file);
    let mut buf = Vec::with_capacity(BUFFER_SIZE);
    loop {
        match xml_file.read_event_into(&mut buf)? {
            Event::Start(_e) => {},
            Event::Empty(_e) => {},
            Event::End(_e) => {},
            Event::Eof => break,
            _ => { },
        }
        buf.clear();
    }
    Ok(())
}

Sample output of the error:

Error: Syntax(UnclosedTag)

I shrank the XML file from a real export by removing large chunks of the data tags between the HealthData tags, scrubbing it of personally identifying information, and checking its validity with XML ValidatorBuddy Plus. Along the way I noticed that the XML line where the failure occurs shifts wildly in response to seemingly innocuous edits. For example, changing the string length of one attribute value moved the error from line ~1700 to line ~4400. Because of this, I suspect the error is related to buffer management rather than to parsing itself, but I don't know the code well enough to diagnose it.
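
One way to probe the buffer-boundary suspicion is to control how much data the parser sees per refill. The sketch below uses only the standard library and shows that `BufReader::with_capacity` determines the slice sizes a `BufRead` consumer receives. The helper name `observed_chunks` and the sample input are hypothetical and exist only for illustration; this has not been verified against quick_xml itself.

```rust
use std::io::{self, BufRead, BufReader};

/// Hypothetical helper: returns the sizes of the slices that a `BufRead`
/// consumer would see when `data` is read through a `BufReader` whose
/// internal buffer holds `capacity` bytes. `capacity` must be non-zero,
/// otherwise the first refill looks like end of input.
fn observed_chunks(data: &[u8], capacity: usize) -> io::Result<Vec<usize>> {
    let mut reader = BufReader::with_capacity(capacity, data);
    let mut sizes = Vec::new();
    loop {
        // `fill_buf` refills the internal buffer only when it is empty.
        let n = reader.fill_buf()?.len();
        if n == 0 {
            break; // end of input
        }
        sizes.push(n);
        // Mark everything as consumed so the next call refills the buffer.
        reader.consume(n);
    }
    Ok(sizes)
}
```

In the reproduction above, the same idea would amount to constructing the file reader with `BufReader::with_capacity(n, File::open("exporter-test.xml")?)` for a few different values of `n`. If the failing line moves with `n`, that would support the buffer-boundary hypothesis; if it stays put, the cause is more likely elsewhere.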

I've tested this example code on Windows 11 and Amazon Linux 2023 with the same result.
