Archive
Check Gmail for new messages
Problem
I want to check my new Gmail messages periodically. When I get a message from a specific sender (with a specific subject), I want to trigger some action. How can I do that?
Solution
Fortunately, there is an Atom feed of unread Gmail messages at https://mail.google.com/mail/feed/atom. All you have to do is request this URL with your login credentials, fetch the feed and process it.
import urllib2

FEED_URL = 'https://mail.google.com/mail/feed/atom'

def get_unread_msgs(user, passwd):
    auth_handler = urllib2.HTTPBasicAuthHandler()
    auth_handler.add_password(
        realm='New mail feed',
        uri='https://mail.google.com',
        user='{user}@gmail.com'.format(user=user),
        passwd=passwd
    )
    opener = urllib2.build_opener(auth_handler)
    urllib2.install_opener(opener)
    feed = urllib2.urlopen(FEED_URL)
    return feed.read()

##########

if __name__ == "__main__":
    import getpass

    user = raw_input('Username: ')
    passwd = getpass.getpass('Password: ')
    print get_unread_msgs(user, passwd)
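The script above targets Python 2 (urllib2, raw_input, the print statement). On Python 3 the same handler lives in urllib.request. Here is a minimal sketch of that port, not a tested replacement: it assumes the feed still accepts HTTP basic authentication with the realm used above, and build_feed_opener is a name made up for this example.

```python
import urllib.request

FEED_URL = 'https://mail.google.com/mail/feed/atom'


def build_feed_opener(user, passwd):
    """Return an opener that sends basic-auth credentials for the feed.

    Hypothetical helper: it reuses the realm and URI of the Python 2 script.
    """
    auth_handler = urllib.request.HTTPBasicAuthHandler()
    auth_handler.add_password(
        realm='New mail feed',
        uri='https://mail.google.com',
        user='{user}@gmail.com'.format(user=user),
        passwd=passwd,
    )
    return urllib.request.build_opener(auth_handler)


def get_unread_msgs(user, passwd):
    """Fetch the raw Atom feed as bytes (this performs a network request)."""
    opener = build_feed_opener(user, passwd)
    with opener.open(FEED_URL) as feed:
        return feed.read()
```

Returning the opener instead of installing it globally keeps the credentials away from other urllib calls in the same program.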
For reading XML I use the untangle module:
import untangle    # sudo pip install untangle

xml = get_unread_msgs(USER, PASSWORD)
o = untangle.parse(xml)
try:
    for e in o.feed.entry:
        title = e.title.cdata
        print title
except IndexError:
    pass    # no new mail
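The snippet above only prints the titles, while the original problem was to react to a specific sender and subject. The following sketch shows one way to do that filtering. It is written for Python 3 and uses the standard library's xml.etree.ElementTree instead of untangle so that it runs without extra installs. The sample feed is an assumption about the shape of the Gmail feed (an entry with a title and an author/email), and matching_entries is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the feed, trimmed to the fields used below.
SAMPLE_FEED = """<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Gmail - Inbox</title>
  <entry>
    <title>Build failed on master</title>
    <author><name>CI Bot</name><email>ci@example.com</email></author>
  </entry>
  <entry>
    <title>Lunch?</title>
    <author><name>Alice</name><email>alice@example.com</email></author>
  </entry>
</feed>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit('}', 1)[-1]


def find_child(elem, name):
    """Return the first direct child with the given local name, or None."""
    for child in elem:
        if local_name(child.tag) == name:
            return child
    return None


def matching_entries(feed_xml, sender, subject):
    """Return titles of entries sent by `sender` whose title contains `subject`."""
    root = ET.fromstring(feed_xml)
    hits = []
    for entry in root:
        if local_name(entry.tag) != 'entry':
            continue
        title = find_child(entry, 'title')
        author = find_child(entry, 'author')
        email = find_child(author, 'email') if author is not None else None
        title_text = (title.text or '') if title is not None else ''
        email_text = (email.text or '') if email is not None else ''
        # Case-insensitive comparison on both the address and the subject.
        if email_text.lower() == sender.lower() and subject.lower() in title_text.lower():
            hits.append(title_text)
    return hits


if __name__ == "__main__":
    for title in matching_entries(SAMPLE_FEED, 'ci@example.com', 'build failed'):
        print('Triggering action for:', title)
```

To check periodically, a function like this could be called from a cron job or from a loop with time.sleep between fetches.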
Links
- How to auto log into gmail atom feed with Python? (I took the script from here)
- Read XML painlessly (the untangle module)
Upload an image to imgur.com from Python
If you are familiar with reddit, you must have noticed that most images are hosted on imgur. I would like to upload several images from my computer and I want to collect their URLs on imgur. Let’s see how to do that.
Imgur has an API, and this is what we’ll use. Anonymous upload is fine for my needs. For this you need to register to get an API key. Under the examples there is a very simple piece of Python code. When you execute it, pycurl prints the server’s XML response to the standard output. How can we store that in a variable instead? We want to extract some data from that XML.
Here is an extended version of the uploader script:
#!/usr/bin/env python

import pycurl
import cStringIO
import untangle    # XML parser

def upload_from_computer(image):
    response = cStringIO.StringIO()    # XML response is stored here
    c = pycurl.Curl()

    values = [
        ("key", your_api_key),
        ("image", (c.FORM_FILE, image))]
    # OR: ("image", "http://example.com/example.jpg")]
    # OR: ("image", "YOUR_BASE64_ENCODED_IMAGE_DATA")]

    c.setopt(c.URL, "http://api.imgur.com/2/upload.xml")
    c.setopt(c.HTTPPOST, values)
    c.setopt(c.WRITEFUNCTION, response.write)    # put the server's output in here
    c.perform()
    c.close()

    return response.getvalue()

def process(xml):
    o = untangle.parse(xml)
    url = o.upload.links.original.cdata
    delete_page = o.upload.links.delete_page.cdata
    print 'url: ', url
    print 'delete page:', delete_page

#############################################################################

if __name__ == "__main__":
    img = '/tmp/something.jpg'
    xml = upload_from_computer(img)
    process(xml)
The tip for storing the XML output in a variable is from here. Untangle is a lightweight XML parser; more info here.
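The buffering trick can be tried without pycurl or an API key. The sketch below (Python 3, standard library only) feeds a few byte chunks to the write method of an in-memory buffer, the same kind of callable the script passes as WRITEFUNCTION, and then pulls the two links out of the result. The sample response is an assumption: its element names are inferred from the attribute path used in process(), and collect_chunks and extract_links are hypothetical helpers.

```python
import io
import xml.etree.ElementTree as ET

# Assumed response, split into chunks the way a transfer might deliver it.
SAMPLE_CHUNKS = [
    b"<upload><links><original>http://i.imgur.com/abc.jpg</original>",
    b"<delete_page>http://imgur.com/delete/xyz</delete_page></links></upload>",
]


def collect_chunks(chunks):
    """Gather callback-delivered chunks into a single bytes object."""
    buf = io.BytesIO()
    write_callback = buf.write    # the callable a transfer library would call
    for chunk in chunks:
        write_callback(chunk)
    return buf.getvalue()


def extract_links(xml_bytes):
    """Return the image URL and the delete page from the response XML."""
    root = ET.fromstring(xml_bytes)    # root is the <upload> element
    return {
        'original': root.findtext('links/original'),
        'delete_page': root.findtext('links/delete_page'),
    }


if __name__ == "__main__":
    links = extract_links(collect_chunks(SAMPLE_CHUNKS))
    print('url:        ', links['original'])
    print('delete page:', links['delete_page'])
```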
Read XML painlessly
Problem
I had an XML file (an RSS feed) from which I wanted to extract some data. I tried some XML libraries but I didn’t like any of them. Is there a simple, brain-friendly way to do this? After all, it’s Python, so everything should be simple.
Solution
Yes, there is a simple library for reading XML called “untangle”, developed by Chris Stefanescu. It’s in PyPI, so installation is very easy:
sudo pip install untangle
For some examples, visit the project page.
Use Case
Let’s see a simple, real-world example. From the RSS feed of Planet Python, let’s extract the post titles and their URLs.
#!/usr/bin/env python

import untangle

#XML = 'examples/planet_python.xml'    # can read a file too
XML = 'http://planet.python.org/rss20.xml'

o = untangle.parse(XML)
for item in o.rss.channel.item:
    title = item.title.cdata
    link = item.link.cdata
    if link:
        print title
        print ' ', link
It couldn’t be any simpler :)
Limitations
According to Chris, untangle doesn’t support documents with namespaces (yet).
Alternatives (update 20111031)
Here are some alternatives (thanks reddit).
- Python and XML (overview)
- lxml
- amara [official tutorial]
- xmltodict (converts XML to dict; added on 20141229)
lxml and amara are heavyweight solutions and are built upon C libraries so you may not be able to use them everywhere. untangle is a lightweight parser that can be a perfect choice to read a small and simple XML file.
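For comparison, here is a sketch of the same title/link extraction with xml.etree.ElementTree, which is not in the list above but ships with Python, so there is nothing to install. It also reads one namespaced element, the case mentioned under Limitations. The sample feed, the dc:creator element and the list_posts helper are assumptions made up for this example (Python 3).

```python
import xml.etree.ElementTree as ET

# Made-up feed with one namespaced element (dc:creator).
SAMPLE_RSS = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Sample planet</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <dc:creator>Alice</dc:creator>
    </item>
    <item>
      <title>Post without a link</title>
      <link></link>
    </item>
  </channel>
</rss>"""

# Prefix-to-URI mapping used when looking up namespaced tags.
NS = {'dc': 'http://purl.org/dc/elements/1.1/'}


def list_posts(rss_xml):
    """Return (title, link, author) tuples for items that have a link."""
    root = ET.fromstring(rss_xml)    # root is the <rss> element
    posts = []
    for item in root.findall('channel/item'):
        title = item.findtext('title', default='')
        link = item.findtext('link', default='')
        author = item.findtext('dc:creator', default='', namespaces=NS)
        if link:
            posts.append((title, link, author))
    return posts


if __name__ == "__main__":
    for title, link, author in list_posts(SAMPLE_RSS):
        print(title)
        print(' ', link)
```

It is more verbose than the untangle version, which is the trade-off for avoiding a dependency.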
