XML to dict / XML to JSON
Problem
You have an XML file and you want to convert it to dict or JSON.
Well, if you have a dict, you can convert it to JSON with json.dump() (or json.dumps() if you want a string), so the real question is: how do you convert an XML file to a dictionary?
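For the easy half (dict to JSON), here is a minimal sketch using only the standard library. The sample dict is made up for illustration, and an in-memory buffer stands in for a real output file.

```python
import io
import json

# Made-up sample data standing in for a parsed XML document.
data = {"note": {"to": "Alice", "body": "Hello"}}

# json.dumps() returns the JSON text as a string.
text = json.dumps(data, indent=4)
print(text)

# json.dump() writes the same JSON to any file-like object.
# StringIO is used here instead of a real file opened with open(..., "w").
buffer = io.StringIO()
json.dump(data, buffer, indent=4)
```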
Solution
There is an excellent library for this purpose called xmltodict. Its usage is very simple:
import xmltodict

# It doesn't work with Python 3! Read on for the solution!
def convert(xml_file, xml_attribs=True):
    with open(xml_file) as f:
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d
This worked well under Python 2.7 but I got an error under Python 3. I checked the project’s documentation and it claimed to be Python 3 compatible. What the hell?
The error message was this:
Traceback (most recent call last):
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 247, in parse
    parser.ParseFile(xml_input)
TypeError: read() did not return a bytes object (type=str)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./xml2json.py", line 27, in <module>
    print(convert(sys.argv[1]))
  File "./xml2json.py", line 17, in convert
    d = xmltodict.parse(f, xml_attribs=xml_attribs)
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 249, in parse
    parser.Parse(xml_input, True)
TypeError: '_io.TextIOWrapper' does not support the buffer interface
I even filed an issue ticket :)
After some debugging I found a hint here: you need to open the XML file in binary mode!
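The ParseFile call in the traceback looks like the expat parser from the standard library, so the same behaviour can be reproduced without xmltodict. The sketch below rests on that assumption; the helper parse_stream and the sample XML are made up for illustration. Under Python 3 a binary stream parses fine, while a text stream raises a TypeError, because read() returns str instead of bytes.

```python
import io
from xml.parsers import expat

def parse_stream(stream):
    """Feed a file-like object to expat and collect the element names it sees."""
    names = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.ParseFile(stream)
    return names

xml_text = "<root><item/></root>"

# Binary stream: read() returns bytes, which is what ParseFile expects.
print(parse_stream(io.BytesIO(xml_text.encode("utf-8"))))

# Text stream: read() returns str, so ParseFile raises TypeError on Python 3.
try:
    parse_stream(io.StringIO(xml_text))
except TypeError as exc:
    print("text mode failed:", exc)
```

Opening the file with "rb" gives the parser the raw bytes and lets it handle the encoding declared in the XML itself.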
XML to dict (Python 2 & 3)
So the correct version that works with Python 3 too is this:
import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d
XML to JSON (Python 2 & 3)
If you want JSON output:
import json
import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return json.dumps(d, indent=4)
