Prettify a JSON file
Problem
You want to make a JSON file (string) human-readable.
Solution
curl -s http://www.reddit.com/r/nsfw/.json | python -mjson.tool
For some more alternatives please refer to this post.
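The same result can be obtained from inside a Python program with the standard json module. The following is only a minimal sketch: the function name prettify_json and the sample string are made up for illustration and are not part of the original post.

```python
import json


def prettify_json(text, indent=4):
    """Parse a JSON string and return it re-serialized with indentation."""
    data = json.loads(text)  # raises ValueError if the input is not valid JSON
    return json.dumps(data, indent=indent)


# Hypothetical sample input, all on one line.
compact = '{"name": "jq", "tags": ["json", "cli"], "stars": 1}'
print(prettify_json(compact))
```

Parsing first and serializing afterwards means invalid input is rejected with an error instead of being printed back unchanged.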
Update (20121021)
Another way is to use jq, a powerful command-line tool for querying JSON files. Piping the output into jq with the identity filter (jq '.') pretty-prints it.
Prettify HTML with BeautifulSoup
With the Python library BeautifulSoup (BS), you can extract information from HTML pages very easily. However, there is one thing you should keep in mind: HTML pages are often malformed. BS tries to correct such a page, which means that BS’s internal representation of the page can differ slightly from the original source. Thus, when you want to locate a part of an HTML page, you should work with the internal representation rather than the raw source.
The following script takes the URL of an HTML page and prints the page in a corrected form, i.e. it shows how BS stores the given page. You can also use it to prettify the source:
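To see the difference between the source and the internal representation, here is a small sketch. It assumes BeautifulSoup 4 (imported from bs4) and Python's built-in "html.parser". The sample markup is invented, and other parsers may repair it differently.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Malformed input: neither <b> nor <p> is ever closed.
source = "<p>Hello <b>world"

# "html.parser" ships with Python; lxml or html5lib may produce a different tree.
soup = BeautifulSoup(source, "html.parser")
repaired = str(soup)

print(repaired)            # the markup as BS stores it, with the missing end tags added
print(repaired == source)  # False: the internal representation differs from the source
print(soup.b.get_text())   # queries run against the repaired tree
```

Because searches operate on the repaired tree, it is worth inspecting that form when a lookup unexpectedly returns nothing.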
#!/usr/bin/env python
# prettify.py
# Usage: prettify <URL>

import sys
import urllib
from BeautifulSoup import BeautifulSoup


class MyOpener(urllib.FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15'


def process(url):
    myopener = MyOpener()
    #page = urllib.urlopen(url)
    page = myopener.open(url)
    text = page.read()
    page.close()
    soup = BeautifulSoup(text)
    return soup.prettify()
# process(url)


def main():
    if len(sys.argv) == 1:
        print "Jabba's HTML Prettifier v0.1"
        print "Usage: %s <URL>" % sys.argv[0]
        sys.exit(-1)
    # else, if at least one parameter was passed
    print process(sys.argv[1])
# main()

if __name__ == "__main__":
    main()
You can find the latest version of the script at https://github.com/jabbalaci/Bash-Utils.
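The script above is written for Python 2 and BeautifulSoup 3. A rough Python 3 equivalent might look like the sketch below. It assumes BeautifulSoup 4 with the built-in "html.parser", and it uses urllib.request.Request in place of FancyURLopener, which is deprecated in Python 3. The helper names build_request and prettify_html are illustrative and do not come from the original script.

```python
import urllib.request

from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Same browser-like identifier as in the script above.
USER_AGENT = ('Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) '
              'Gecko/20110303 Firefox/3.6.15')


def build_request(url):
    """Create a request that carries the custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def prettify_html(text):
    """Parse HTML and return BeautifulSoup's corrected, indented form."""
    return BeautifulSoup(text, "html.parser").prettify()


def process(url):
    """Download the page at `url` and return its prettified HTML."""
    with urllib.request.urlopen(build_request(url)) as page:
        return prettify_html(page.read())
```

Calling print(process("http://example.com/")) would then print the corrected page. Keeping the download and the parsing in separate functions makes it possible to try prettify_html on a local string without network access.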
