detect duplicate keys in a JSON file
March 6, 2016
Problem
I want to edit a JSON file by hand, but I’m afraid of accidentally introducing a duplicate key somewhere. If that happens, the second value silently overwrites the first one. Example:
$ cat input.json
{
    "content": {
        "a": 1,
        "a": 2
    }
}
Naive approach:
import json

with open("input.json") as f:
    d = json.load(f)

print(d)
# {'content': {'a': 2}}
If there is a duplicate key, loading should fail! Instead, json.load stays silent and you have no idea that you just lost some data.
Solution
I found the solution here.
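import json


def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    # The decoder calls this once per JSON object, passing all of its
    # (key, value) pairs in file order, duplicates included.
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        else:
            d[k] = v
    return d


def main():
    with open("input.json") as f:
        d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)

    print(d)


if __name__ == "__main__":
    main()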
Now you get a nice error message:
Traceback (most recent call last):
  File "./check_duplicates.py", line 28, in <module>
    main()
  File "./check_duplicates.py", line 21, in main
    d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)
  File "/usr/lib64/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib64/python3.5/json/__init__.py", line 332, in loads
    return cls(**kw).decode(s)
  File "/usr/lib64/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.5/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "./check_duplicates.py", line 13, in dict_raise_on_duplicates
    raise ValueError("duplicate key: %r" % (k,))
ValueError: duplicate key: 'a'
If your JSON file has no duplicates, then the code above simply prints its content.
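The solution above stops at the first duplicate it meets. If you would rather see every duplicate in one pass, the same object_pairs_hook idea can collect keys instead of raising. This is only a sketch: the names find_duplicate_keys and collect are made up for illustration, and it reads the JSON from a string with json.loads rather than from a file.

```python
import json


def find_duplicate_keys(text):
    """Return the duplicate keys found anywhere in the JSON text."""
    duplicates = []

    def collect(ordered_pairs):
        # Called once per JSON object, innermost objects first.
        d = {}
        for k, v in ordered_pairs:
            if k in d:
                duplicates.append(k)
            d[k] = v  # same "last one wins" behaviour as the default
        return d

    json.loads(text, object_pairs_hook=collect)
    return duplicates


sample = '{"content": {"a": 1, "a": 2, "b": 3, "b": 4}, "content": {}}'
print(find_duplicate_keys(sample))
# ['a', 'b', 'content']
```

Nested objects are finished before their parent, so the inner duplicates show up in the list before the outer one.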
