Archive
Should I use Python 2 or Python 3?
“Should I use Python 2 or Python 3?”
This is a very common question when someone wants to learn Python. Here is a nice article about this topic: http://wiki.python.org/moin/Python2orPython3.
(Thanks Jaume for the link.)
Update (20110404)
If you are ready to dive into Python 3, here are some tutorials:
- The official Python 3 tutorial (HTML, PDF)
- Further Python 3 docs (c-api.pdf, distutils.pdf, documenting.pdf, extending.pdf, faq.pdf, howto-advocacy.pdf, howto-cporting.pdf, howto-curses.pdf, howto-descriptor.pdf, howto-doanddont.pdf, howto-functional.pdf, howto-logging-cookbook.pdf, howto-logging.pdf, howto-pyporting.pdf, howto-regex.pdf, howto-sockets.pdf, howto-sorting.pdf, howto-unicode.pdf, howto-urllib2.pdf, howto-webservers.pdf, install.pdf, library.pdf, reference.pdf, tutorial.pdf, using.pdf, whatsnew.pdf)
- Dive Into Python 3 (HTML and PDF)
Update (20110526)
I follow a simple guideline: I use the version of Python that ships with Ubuntu by default. In Ubuntu 10.10 it was Python 2.6; in Ubuntu 11.04 it’s Python 2.7. When they switch to Python 3.x, I will switch too.
Where does a page redirect to?
Question
We have a page that redirects to another page. How can we figure out where the redirection points to?
Answer
import urllib

s = "https://pythonadventures.wordpress.com?random"  # returns a random post
page = urllib.urlopen(s)
print page.geturl()  # e.g. http://pythonadventures.wordpress.com/2010/10/08/python-challenge-1/
Credits
I found it in this thread.
Update (20121202)
With requests:
>>> import requests
>>> r = requests.get('https://pythonadventures.wordpress.com?random')
>>> r.url
u'https://pythonadventures.wordpress.com/2010/09/30/create-import-module/'
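In Python 3, `urllib.urlopen` has moved to `urllib.request.urlopen`, so the first snippet could be sketched roughly as follows. The function name `final_url` and its `opener` parameter are my own (hypothetical) additions; the parameter is only there so the function can be tried without network access.

```python
from urllib.request import urlopen


def final_url(url, opener=urlopen):
    """Return the address we end up at after following redirects.

    `opener` is a hypothetical hook (not part of urllib); by default the
    standard urlopen() is used, which follows HTTP redirects on its own.
    """
    with opener(url) as page:
        return page.geturl()


# Usage (needs network access, so it is left commented out):
# print(final_url("https://pythonadventures.wordpress.com?random"))
```

The response object returned by `urlopen()` works as a context manager, so the connection is closed as soon as the final URL has been read.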
Reading and writing a file
Here is a mini cheat sheet for reading and writing a text file.
Read a text file line by line and write each line to another file (copy):
f1 = open('./in.txt', 'r')
to = open('./out.txt', 'w')
for line in f1:
    to.write(line)
f1.close()
to.close()
Variations:
text = f.read()           # read the entire file
line = f.readline()       # read one line at a time
lineList = f.readlines()  # read the entire file as a list of lines
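The same copy can also be written with the `with` statement, which closes both files automatically even if an error occurs halfway. This is only a sketch in Python 3; the function name `copy_text_file` and the temporary file names are made up for the demo.

```python
import os
import tempfile


def copy_text_file(src, dst):
    """Copy src to dst line by line; both files are closed automatically."""
    with open(src, "r") as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(line)


# Try it on a throwaway file in a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.txt")
dst = os.path.join(workdir, "out.txt")
with open(src, "w") as f:
    f.write("first line\nsecond line\n")

copy_text_file(src, dst)

with open(dst) as f:
    lines = f.readlines()   # the whole file as a list of lines
print(lines)                # ['first line\n', 'second line\n']
```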
unicode to ascii
Problem
I had the following unicode string: “Kellemes Ünnepeket!”, which I wanted to simplify to “Kellemes Unnepeket!”, that is, replace “Ü” with “U”. Furthermore, most of the strings were plain ASCII; only some of them were unicode.
Solution
import unicodedata
title = ... # get the string somehow
try:
    # if the title is a unicode string, normalize it
    title = unicodedata.normalize('NFKD', title).encode('ascii','ignore')
except TypeError:
    # if it was not a unicode string => OK, do nothing
    pass
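In Python 3 every `str` is already a unicode string, so the `try`/`except` is no longer needed there. A minimal sketch of the same normalization, with a helper name (`to_ascii`) that I made up:

```python
import unicodedata


def to_ascii(text):
    """Drop accents and any other non-ASCII characters from text."""
    # NFKD splits an accented letter into the base letter plus a
    # combining mark; encoding with 'ignore' then drops the mark.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Kellemes Ünnepeket!"))   # Kellemes Unnepeket!
```

Note that characters without an ASCII base letter are removed entirely rather than replaced.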
Credits
I used the following resources:
- http://www.peterbe.com/plog/unicode-to-ascii
- http://stackoverflow.com/questions/196345/how-to-check-if-a-string-in-python-is-in-ascii
Using MySQL from Python
Problem
You want to interact with a MySQL database from your Python script.
Solution
First of all, you need to install the following package:
sudo apt-get install python-mysqldb
Then try the following basic script to check if everything is OK:
#!/usr/bin/env python
import MySQLdb
conn = MySQLdb.connect (host = "localhost",
                        user = "testuser",
                        passwd = "testpass",
                        db = "test")
cursor = conn.cursor ()
cursor.execute ("SELECT VERSION()")
row = cursor.fetchone ()
print "server version: ", row[0]
cursor.close ()
conn.close ()
Example:
We have a .csv file with two columns, symbol and name, separated by a semicolon. Iterate through the lines and insert each line into a database table as a record.
#!/usr/bin/env python
import MySQLdb
f1 = open('./NYSE.csv', 'r')
# A line looks like this:
# ZLC; Zale Corporation
conn = MySQLdb.connect(host = "localhost",
                       user = "user",
                       passwd = "passwd",
                       db = "table")
cursor = conn.cursor()
for line in f1:
    pieces = map(str.strip, line.split(';'))
    #print "'%s' => '%s'" % (pieces[0], pieces[1])
    # pass the values as parameters so that the driver escapes quotes in the data
    query = "INSERT INTO symbol_name (symbol, name) VALUES (%s, %s)"
    cursor.execute(query, (pieces[0], pieces[1]))
f1.close()
conn.commit()
cursor.close ()
conn.close ()
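If you have no MySQL server at hand, the insert loop can be tried with the `sqlite3` module from the standard library, which follows the same DB-API pattern (connect, cursor, execute, commit). Keep in mind that this is SQLite swapped in for MySQL, not MySQLdb itself: its placeholder is `?` instead of `%s`. The second sample line is invented for the demo.

```python
import io
import sqlite3

# Two sample lines in the same "symbol; name" layout as NYSE.csv.
sample = io.StringIO('ZLC; Zale Corporation\nXYZ; Example "Quoted" Co\n')

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
cursor = conn.cursor()
cursor.execute("CREATE TABLE symbol_name (symbol TEXT, name TEXT)")

for line in sample:
    symbol, name = [piece.strip() for piece in line.split(";")]
    # The values travel separately from the SQL text, so quotes in the
    # data cannot break the statement.
    cursor.execute(
        "INSERT INTO symbol_name (symbol, name) VALUES (?, ?)",
        (symbol, name),
    )

conn.commit()
rows = cursor.execute(
    "SELECT symbol, name FROM symbol_name ORDER BY symbol"
).fetchall()
print(rows)
conn.close()
```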
Links
There are lots of Python-MySQL tutorials on the net. Let’s see some of them:
- http://www.kitebird.com/articles/pydbapi.html
- http://zetcode.com/databases/mysqlpythontutorial/
- http://www.tutorialspoint.com/python/python_database_access.htm
Rename multiple files
Problem
I scanned in 66 pages that are numbered from 11 to 76. However, the scanning software saved the files under the names Scan10032.JPG, Scan10033.JPG, …, Scan10097.JPG. I want to rename them to reflect the real numbering of the pages, i.e. 11.jpg, 12.jpg, …, 76.jpg.
Solution
#!/usr/bin/env python
import glob
import re
import os
files = glob.glob('*.JPG') # get *.JPG in a list (not sorted!)
files.sort() # sort the list _in place_
cnt = 11 # start new names with 11.jpg
for f in files:
    original = f                                # save the original file name
    result = re.search(r'Scan(\d+)\.JPG', f)    # pattern to match
    if result:                                  # Is there a match?
        new_name = str(cnt) + '.jpg'            # create the new name
        print "%s => %s" % (original, new_name) # verify if it's OK
        # os.rename(original, new_name)         # then uncomment to rename
        cnt += 1                                # increment the counter
Comments are inside the source code.
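The "print first, rename later" idea can also be expressed as a function that only computes the renaming plan and touches nothing on disk. This is a Python 3 sketch; `plan_renames` and the sample file names are hypothetical.

```python
import re


def plan_renames(names, start=11):
    """Return (old_name, new_name) pairs for the scanned pages, in sorted order."""
    plan = []
    counter = start
    for name in sorted(names):
        # only files that look like Scan<digits>.JPG get a new name
        if re.match(r"Scan(\d+)\.JPG$", name):
            plan.append((name, "%d.jpg" % counter))
            counter += 1
    return plan


for old, new in plan_renames(["Scan10033.JPG", "notes.txt", "Scan10032.JPG"]):
    print(old, "=>", new)   # check the plan before calling os.rename(old, new)
```

Once the printed plan looks right, each pair can be passed to `os.rename()`.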
If you need a simpler rename (like removing a part of the file names), you can also use the rename command. In this post I give an example for that.
Fluffy is gone
We are sad to inform you that Fluffy, the world’s longest snake living in captivity, has died. The 18-year-old, 300-pound Fluffy held the Guinness World Records title of longest snake and was a hit attraction at the Columbus Zoo.
Find more info here.
Levenshtein distance
The Levenshtein distance (or edit distance) between two strings is the minimal number of “edit operations” required to change one string into the other. The two strings can have different lengths. There are three kinds of “edit operations”: deletion, insertion, or alteration of a character in either string.
Example: the Levenshtein distance of “ag-tcc” and “cgctca” is 3.
#!/usr/bin/env python
def LD(s,t):
    s = ' ' + s
    t = ' ' + t
    d = {}
    S = len(s)
    T = len(t)
    for i in range(S):
        d[i, 0] = i
    for j in range (T):
        d[0, j] = j
    for j in range(1,T):
        for i in range(1,S):
            if s[i] == t[j]:
                d[i, j] = d[i-1, j-1]
            else:
                d[i, j] = min(d[i-1, j] + 1, d[i, j-1] + 1, d[i-1, j-1] + 1)
    return d[S-1, T-1]
a = 'ag-tcc'
b = 'cgctca'
print LD(a, b) # 3
The implementation is from here.
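The dictionary above stores the whole table, although each cell only depends on the current and the previous row. A possible Python 3 variant that keeps just two rows is sketched below; it is my own rewrite (the name `levenshtein` included), not the implementation linked above.

```python
def levenshtein(s, t):
    """Edit distance between s and t, keeping only two rows of the table."""
    previous = list(range(len(t) + 1))   # distances from '' to each prefix of t
    for i, sc in enumerate(s, start=1):
        current = [i]                    # distance from s[:i] to ''
        for j, tc in enumerate(t, start=1):
            cost = 0 if sc == tc else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or alteration
            ))
        previous = current
    return previous[-1]


print(levenshtein("ag-tcc", "cgctca"))   # 3
```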
Hamming distance
The Hamming distance is defined between two strings of equal length. It measures the number of positions with mismatching characters.
Example: the Hamming distance between “toned” and “roses” is 3.
#!/usr/bin/env python
def hamming_distance(s1, s2):
    assert len(s1) == len(s2)
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))

if __name__=="__main__":
    a = 'toned'
    b = 'roses'
    print hamming_distance(a, b)   # 3
If you need the number of matching character positions:
#!/usr/bin/env python
def similarity(s1, s2):
    assert len(s1) == len(s2)
    return sum(ch1 == ch2 for ch1, ch2 in zip(s1, s2))

if __name__=="__main__":
    a = 'toned'
    b = 'roses'
    print similarity(a, b)   # 2
Actually this is equal to len(s1) - hamming_distance(s1, s2). Remember, len(s1) == len(s2).
More info on zip() here.
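To see what `zip()` contributes, here is a small Python 3 sketch that prints the character pairs and checks the relation between matches and mismatches. Raising `ValueError` instead of using `assert` is my own choice here (assertions are skipped when Python runs with `-O`).

```python
def hamming_distance(s1, s2):
    """Number of positions where s1 and s2 differ."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))


a, b = "toned", "roses"
pairs = list(zip(a, b))   # characters paired up position by position
print(pairs)              # [('t', 'r'), ('o', 'o'), ('n', 's'), ('e', 'e'), ('d', 's')]

matches = sum(ch1 == ch2 for ch1, ch2 in pairs)
print(hamming_distance(a, b), matches)   # 3 2
```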
Permutations of a list
Update (20120321): The methods presented here can generate all the permutations. However, the permutations are not ordered lexicographically. If you need the permutations in lexicographical order, refer to this post.
Problem
You need all the permutations of a list.
Solution
With generators:
#!/usr/bin/env python
def perms01(li):
    if len(li) <= 1:
        yield li
    else:
        for perm in perms01(li[1:]):
            for i in range(len(perm)+1):
                yield perm[:i] + li[0:1] + perm[i:]

for p in perms01(['a','b','c']):
    print p
Output:
['a', 'b', 'c']
['b', 'a', 'c']
['b', 'c', 'a']
['a', 'c', 'b']
['c', 'a', 'b']
['c', 'b', 'a']
This tip is from here.
Without generators:
def perms02(l):
    sz = len(l)
    if sz <= 1:
        return [l]
    return [p[:i]+[l[0]]+p[i:] for i in xrange(sz) for p in perms02(l[1:])]

for p in perms02(['a','b','c']):
    print p
Output:
['a', 'b', 'c']
['a', 'c', 'b']
['b', 'a', 'c']
['c', 'a', 'b']
['b', 'c', 'a']
['c', 'b', 'a']
This tip is from here.
The two outputs contain the same elements in a different order.
Notes
If S is a finite set of n elements, then there are n! permutations of S. For instance, if we have 4 letters (say a, b, c, and d), then we can arrange them in 4! = 4 * 3 * 2 * 1 = 24 different ways.
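The n! count is easy to check with `itertools.permutations` from the standard library. It is not one of the two methods above, but for a sorted input it happens to produce the permutations in lexicographical order. A short Python 3 sketch:

```python
from itertools import permutations
from math import factorial

letters = ["a", "b", "c", "d"]
all_perms = [list(p) for p in permutations(letters)]

print(len(all_perms), factorial(len(letters)))   # 24 24
print(all_perms[:3])   # [['a', 'b', 'c', 'd'], ['a', 'b', 'd', 'c'], ['a', 'c', 'b', 'd']]
```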

