Indentation of XML documents

A nice way to auto-indent your XML documents is via the free editor Notepad++.

Load your XML document, then select this menu entry:
TextFX -> TextFX HTML Tidy -> Tidy: Reindent XML

The preconfigured HTML Tidy also applies some other fixes to your XML code, e.g. converting special characters to ampersand (entity) notation and wrapping lines at 69 characters. To disable those, edit the TIDYCFG.INI file in the folder “plugins\Config\tidy” of your Notepad++ program directory and add a new configuration like this:

[Tidy: Reindent XML]
input-xml:yes
indent:yes
wrap:0
wrap-sections:no

See the Tidy project page for a list of all HTML Tidy options.
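If you would rather script the re-indentation than click through an editor, the following is a minimal sketch of the same idea in Python. It is not what Tidy does internally. It assumes Python 3.9 or later (for `xml.etree.ElementTree.indent`), and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, unindented sample document (not taken from a real log file).
raw = ('<log><event><evalresult>FALSE</evalresult>'
       '<warnlevel>1</warnlevel></event></log>')

root = ET.fromstring(raw)

# Rewrite the whitespace between elements in place, two spaces per level.
ET.indent(root, space="  ")

# Serialize back to a str; text content of the elements is left untouched.
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Unlike the Tidy configuration above, this does no line wrapping at all, so there is nothing to switch off.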

Libxml2 and Python on Windows

To use XPath and XQuery from my Python scripts for generating test statistics, I decided to try libxml2. I decided against ElementTree mainly because its website states that its XPath subset does not support queries like:


count(//event[evalresult/text()="FALSE"][warnlevel/text()="1"])

Also, I hope the C-based implementation of libxml2 will be faster. For installation, I only had to download the pre-bundled lxml Windows binary from the Python package repository; it comes with libxml2 included.

The following code is enough to get the above count:

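from lxml import etree

doc = etree.parse(filePath)  # filePath: path to your XML file
# Select the matching nodes and count them in Python
result = doc.xpath('//event[evalresult/text()="FALSE"][warnlevel/text()="1"]')
count = len(result)
# Alternative: let XPath do the counting; count() returns a float
count2 = doc.xpath('count(//event[evalresult/text()="FALSE"][warnlevel/text()="1"])')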
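To try both counting variants without a file on disk, here is a small self-contained sketch. The sample XML is invented for illustration; only the element names come from the query above. Note that the XPath `count()` function has to be part of the expression string, and that lxml returns its result as a float.

```python
from io import BytesIO
from lxml import etree

# Hypothetical sample data; two of the four events match both predicates.
SAMPLE = b"""<log>
  <event><evalresult>FALSE</evalresult><warnlevel>1</warnlevel></event>
  <event><evalresult>FALSE</evalresult><warnlevel>2</warnlevel></event>
  <event><evalresult>TRUE</evalresult><warnlevel>1</warnlevel></event>
  <event><evalresult>FALSE</evalresult><warnlevel>1</warnlevel></event>
</log>"""

QUERY = '//event[evalresult/text()="FALSE"][warnlevel/text()="1"]'

doc = etree.parse(BytesIO(SAMPLE))

# Variant 1: fetch the matching elements and count them in Python.
count_by_len = len(doc.xpath(QUERY))

# Variant 2: wrap the query in XPath's count(); the result is a float.
count_by_xpath = doc.xpath('count(%s)' % QUERY)

print(count_by_len, count_by_xpath)
```

The first variant is handy when you need the matching elements anyway; the second avoids building a Python list when only the number matters.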

About XQuery: lxml serves as a frontend for libxml2 and libxslt, neither of which supports XQuery.

Further information:
lxml homepage
SketchPath, a good tool for evaluating XPath expressions