Not able to parse string containing '&' to Node #291

@cy745

Try this test. In my opinion, the '&' should behave like part of the string, but it appears to become a separate Node, and creating that node throws an exception.

    val text = "<tag>&amp;Content</tag>"

    @XmlSerialName(value = "tag")
    @Serializable
    data class TAG(
        @XmlValue
        val content: String
    )

    @XmlSerialName(value = "tag")
    @Serializable
    data class TAG2(
        @XmlValue
        val content: List<Node> = emptyList()
    )

    @XmlSerialName(value = "tag")
    @Serializable
    data class TAG3(
        @XmlValue
        val content: List<CompactFragment> = emptyList()
    )

    @Test
    fun testParser() {
        val ampTestResult = xml.decodeFromString<TAG>(text)
        println(ampTestResult)   // output: TAG(content=&Content)

        val amp3TestResult = xml.decodeFromString<TAG3>(text)
        println(amp3TestResult) // TAG3(content=[{namespaces=[], content=&}, {namespaces=[], content=Content}])

        // Throws: Invalid XML value at position: 1:8: Creating entity references is not supported (or incorrect) in most browsers
        val amp2TestResult = xml.decodeFromString<TAG2>(text)
        println(amp2TestResult)
    }
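
For comparison, here is a minimal sketch of the behavior I would expect, using the JDK's built-in DOM parser rather than xmlutil. The helper name `rootText` is made up for this illustration. It only shows that `&amp;` is a predefined entity that resolves to a literal '&' inside the element's text, not that xmlutil's internals should work the same way.

```kotlin
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

// Hypothetical helper: parses the XML with the JDK DOM parser (not xmlutil)
// and returns the text content of the root element.
fun rootText(xmlText: String): String {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val document = builder.parse(InputSource(StringReader(xmlText)))
    // textContent concatenates all text below the element, so the result
    // does not depend on how many Text nodes the parser created.
    return document.documentElement.textContent
}

fun main() {
    println(rootText("<tag>&amp;Content</tag>")) // prints: &Content
}
```

I would expect the List<Node> and List<CompactFragment> variants to expose the same text, ideally without splitting it at the entity.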

By the way, I use List<Node> to deserialize XML like the sample below. Is there a better way to deserialize it?

      <p begin="24.954" end="29.012" itunes:key="L3" ttm:agent="v2">
        <span begin="24.954" end="25.125">To</span>
        <span begin="25.125" end="25.487">watch</span>
        <span begin="25.487" end="25.921">our</span>
        <span begin="25.921" end="26.881">life</span>
        <span ttm:role="x-bg">
          <span begin="25.763" end="26.091">(To</span>
          <span begin="26.091" end="26.437">watch</span>
          <span begin="26.437" end="26.987">our</span>
          <span begin="26.987" end="27.837">life</span>
          <span begin="27.837" end="28.224">to</span>
          <span begin="28.224" end="29.012">gether)</span>
        </span>
      </p>
@Serializable
data class TTMLSpan(
    @XmlSerialName("begin")
    val begin: String? = null,
    @XmlSerialName("end")
    val end: String? = null,
    @XmlSerialName(
        value = "role",
        prefix = "ttm",
        namespace = "http://www.w3.org/ns/ttml#metadata",
    )
    val role: String? = null,
    @XmlSerialName(
        value = "lang",
        prefix = "xml",
        namespace = "http://www.w3.org/XML/1998/namespace",
    )
    val lang: String? = null,
    @XmlValue
    private val value: List<Node>? = null,
) {
    fun isTranslation(): Boolean = role == "x-translation"

    fun content(): String? {
        return value?.firstOrNull()?.takeIf { it is Text }
            ?.getTextContent()
    }

    fun children(): List<TTMLSpan>? {
        return value?.mapNotNull {
            if (it !is Element) return@mapNotNull null

            TTMLSpan(
                begin = it.getAttribute("begin"),
                end = it.getAttribute("end"),
                role = it.getAttribute("role"),
                lang = it.getAttribute("lang"),
                value = it.getChildNodes().toList()
            )
        }
    }
}
