Closed
Labels: feature (a feature request or enhancement)
Description
Found in the XML representation of an edge case R file:
library(xml2)
library(xmlparsedata)
p = parse("https://raw.githubusercontent.com/mwaldstein/edgarWebR/fb9a38e6a57186ffd1c93cc1aa00c4fdf1bc5514/tests/cache/browse-edgar-11457c.R")
xml = read_xml(xml_parse_data(p))
Printing this is painfully slow:
system.time(print(xml))
# {xml_document}
# <exprlist>
# [1] <expr line1="1" col1="1" line2="5944" col2="43" start="145" end="855979">\n <expr line1="1" col1="1" line2="1" col2="9" start="145" end="153">\n <SYMBOL_FUNCTION_CALL li ...
# user system elapsed
# 2.906 0.048 2.958
I took a brief look, and encodeString() appears to be the culprit:
# ** debugging inside show_nodes() **
system.time(vapply(x, as.character, FUN.VALUE = character(1)))
# user system elapsed
# 0.248 0.017 0.268
system.time(encodeString(vapply(x, as.character, FUN.VALUE = character(1))))
# user system elapsed
# 2.959 0.024 3.007
Is it possible to apply substr() twice -- once after as.character(), then again after encodeString()?
chr = vapply(x, as.character, FUN.VALUE = character(1))
nchar(chr)
# [1] 18965721
At almost 19 million characters, this is already far too wide to print (width = 180 for me).
I believe we can always just apply
x %>%
  substring(1, n) %>%
  encodeString() %>%
  substring(1, n)
since by default encodeString() only escapes non-printable characters (for example, a newline becomes \n), so its output is never narrower than its input.
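A minimal sketch of the double truncation in base R, without the pipe. The helper name truncate_encoded() and the test string are hypothetical and are not part of xml2. The sketch assumes encodeString() escapes character by character with its default arguments, so truncating first should not change the first n characters of the result.

```r
# Hypothetical helper (not an xml2 function): truncate before and after escaping.
truncate_encoded <- function(x, n) {
  # First cut: drop everything past n characters so encodeString()
  # never has to process the very long tail.
  x <- substring(x, 1, n)
  # Escape non-printable characters; each element can only stay the
  # same width or get wider.
  x <- encodeString(x)
  # Second cut: escaping may have pushed the string past n again.
  substring(x, 1, n)
}

# Made-up input: a long string containing newlines, similar in shape
# to a serialized XML node.
long <- paste(rep("<expr>\n  <SYMBOL>x</SYMBOL>\n</expr>", 1e4), collapse = "")

fast <- truncate_encoded(long, 80)
slow <- substring(encodeString(long), 1, 80)

# Both routes should give the same 80-character preview.
identical(fast, slow)
```

Note that the second cut can land in the middle of an escape sequence (leaving a trailing backslash), which is the same behavior as truncating only after encodeString().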
Happy to file a PR if that sounds good.