Archive
get the title of a web page
Problem
You need the title of a web page.
Solution
from bs4 import BeautifulSoup

# html holds the page source as a string
soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)
I found the solution here.
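If BeautifulSoup is not installed, something similar can probably be done with the standard library alone. The following is only a sketch built on html.parser, not the BeautifulSoup approach above; the names TitleParser and get_title are made up for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        text = "".join(self._parts).strip()
        return text or None


def get_title(html):
    """Return the page title as a string, or None if there is no title."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


page = "<html><head><title> Example page </title></head><body></body></html>"
print(get_title(page))
```

For real-world pages with broken markup, BeautifulSoup is likely the more forgiving choice.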
python-markdown: add support for strikethrough
Problem
In a webapp of mine I use markdown with the excellent Python-Markdown package. However, it doesn’t support strikethrough by default.
Solution
The good news is that you can add 3rd-party extensions to Python-Markdown. With the extension “mdx_del_ins” you can use the <del> and <ins> tags.
Here is a Python function that converts markdown to HTML:
import bleach
from markdown import markdown
def md_to_html(md):
    """
    Markdown to HTML conversion.
    """
    allowed_tags = ['a', 'abbr', 'acronym', 'b',
                    'blockquote', 'code', 'em',
                    'i', 'li', 'ol', 'pre', 'strong',
                    'ul', 'h1', 'h2', 'h3', 'p', 'br', 'ins', 'del']
    return bleach.linkify(bleach.clean(
        markdown(md, output_format='html', extensions=['nl2br', 'del_ins']),
        tags=allowed_tags, strip=True))
Input:
TODO list
---------
* ~~strikethrough in Python-Markdown~~
Output:
TODO list
* strikethrough in Python-Markdown
[flask] render a template and jump to an anchor
Problem
You render a page but you want to jump to an anchor on the rendered page.
Solution
Here is a route:
@app.route('/about/')
def fn_about():
    return render_template('about.html')
In the template (about.html) add this (assuming you have jQuery):
<script> $(function(){ window.location.hash = "jump_here"; }); </script>
It’ll run once the HTML is loaded. Found it here.
create a virtual environment easily
Problem
I prefer to put my larger projects in a virtual environment (there are people who put every project in a virt. env. …). I keep my projects in Dropbox (thus they are available on all my machines), but the virt. env.’s are kept outside Dropbox since they can grow quite big (and they are easily reproducible).
For creating virt. env.’s, I use virtualenvwrapper, which (by default) puts virt. env.’s in the folder ~/.virtualenvs. Say I have a project in Dropbox, and I want to create a virt. env. for it. How to do it easily?
Solution
First, you need to know if your project is written in Python 2 or Python 3. Then, you need to use the mkvirtualenv command but I always need to look up its syntax. Solution: in the root folder of my project I want a script that will create a virt. env. for the project. Here is the script:
#!/usr/bin/env bash
# mk_venv.sh
# which Python version to use in the created virt. env. (2 or 3)
PYTHON_VER=2
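# name of the current folder, without the leading path
a=$(pwd)
f=${a##*/}
# the quotes keep paths and folder names with spaces in one piece
source "$(which virtualenvwrapper.sh)" && mkvirtualenv -p "$(which python${PYTHON_VER})" "$f"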
Just set PYTHON_VER and launch the script. It will figure out the name of the current folder and create a virt. env. with this name. For instance, if you have your project in ~/projects/stuff, then the virt. env. will be created in the folder ~/.virtualenvs/stuff.
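The folder-name trick in the script (${a##*/} strips everything up to the last slash) has a direct counterpart in Python. The sketch below only computes names and paths and creates nothing; project_name and venv_dir are hypothetical helpers, and the default ~/.virtualenvs location is assumed.

```python
from pathlib import Path


def project_name(project_dir):
    """Last path component, the Python counterpart of ${a##*/}."""
    return Path(project_dir).name


def venv_dir(project_dir, venv_home="~/.virtualenvs"):
    """Where the virt. env. would end up, assuming venv_home is the wrapper's folder."""
    return Path(venv_home).expanduser() / project_name(project_dir)


print(project_name("/home/me/projects/stuff"))
print(venv_dir("/home/me/projects/stuff"))
```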
[flask] validate a URL
Problem
In a Flask application I wanted to verify if a user-given URL is valid.
Solution
I found a simple validator package for that called validators (see it on GitHub).
Sample usage:
$ pip install validators
$ python
>>> import validators
>>> url = "http://index.hu"
>>> validators.url(url)
True
>>> url = "http://index.h/"
>>> validators.url(url)
ValidationFailure(func=url, args={'value': 'http://index.h/', 'require_tld': True})
The ValidationFailure class implements the __bool__ method, so you can easily check if validation failed:
if not validators.url(url):
    flash("Error: you must provide a valid URL!")
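To see why the "if not ..." test works, here is a small sketch of the mechanism. It is not the code of the validators package: Failure and has_http_scheme are hypothetical stand-ins that only show how an object with a __bool__ method behaves in a condition.

```python
class Failure(object):
    """Hypothetical stand-in for a failed validation result."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        # A failure always counts as false in a boolean context.
        return False

    __nonzero__ = __bool__  # the same hook under its Python 2 name

    def __repr__(self):
        return "Failure(value={!r})".format(self.value)


def has_http_scheme(url):
    """Toy check (hypothetical): True for http(s) URLs, a Failure object otherwise."""
    if url.startswith(("http://", "https://")):
        return True
    return Failure(url)


result = has_http_scheme("index.hu")
if not result:
    print("validation failed:", result)
```

The failure object is falsy, yet it still carries details about what went wrong, which a plain False could not do.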
working with zip files
Problem
In a project of mine I had to deal with folders, where a single folder can contain several thousand small text files. I kept this project on Dropbox, so I could use it on all my machines. However, Dropbox is quite slow when it has to synchronize several thousand files. So I decided to pack the files of each folder into a zip file.
So the question is: how to deal with zip files? How to do basic operations with them: create zip, delete from zip, list zip, add to zip, move to zip, extract from zip, etc.
Solution
In this project of mine I used the external zip command as well as the zipfile package from the stdlib. Let’s see both of them.
Manipulating zip files from the command-line
Let’s see some examples. Compress every .json file in the current directory except the desc.json file:
zip -9 files.zip *.json -x desc.json
The switch “-9” gives the best compression, files.zip is the output, and “-x” is short for “--exclude”. From Python you can call it as an external command, for instance with os.system().
The previous example creates a zip file and leaves the original files. Now let’s move files into a zip file (and delete the original files when they were added successfully to the archive):
zip -9 -m files.zip *.json -x desc.json
Delete a file from an archive:
zip -d files.zip desc.json
It will delete desc.json from the zip file.
List the content of a zip file:
zipinfo files.zip
Add a file to the archive:
zip -g files.zip new.json
Where “-g” means: grow.
Extract just one file from a zip file:
# basic:
unzip files.zip this.json
# extract to a specific folder:
unzip files.zip this.json -d /extract/here/
It will extract this.json from the archive.
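Calling the zip command from Python could look roughly like the sketch below. It uses subprocess.run instead of os.system(), which is my substitution, and since no shell is involved, the *.json wildcard has to be expanded with glob. The helper names build_zip_command and run_zip are made up for this example.

```python
import glob
import os
import shutil
import subprocess


def build_zip_command(archive, pattern, exclude, directory="."):
    """Argument list for: zip -9 ARCHIVE <files matching PATTERN> -x EXCLUDE."""
    # No shell expands the wildcard for us, so glob does it here.
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    names = [os.path.basename(m) for m in matches]
    return ["zip", "-9", archive] + names + ["-x", exclude]


def run_zip(archive, pattern, exclude, directory="."):
    """Run the command inside DIRECTORY; raises CalledProcessError on failure."""
    if shutil.which("zip") is None:
        raise RuntimeError("the zip command is not installed")
    cmd = build_zip_command(archive, pattern, exclude, directory)
    subprocess.run(cmd, cwd=directory, check=True)


print(build_zip_command("files.zip", "*.json", "desc.json"))
```

Passing the arguments as a list avoids quoting problems with unusual file names.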
Read the content of a zip file in Python
OK, say we have a zip file that contains some files. How to get the filenames? How to read them? I found some nice examples here.
List the file names in a zip file:
import zipfile
zfile = zipfile.ZipFile("files.zip", "r")
for name in zfile.namelist():
    print(name)
Read files in a zip file:
import zipfile
zfile = zipfile.ZipFile("files.zip", "r")
for name in zfile.namelist():
    data = zfile.read(name)
    print(data)
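Creating and growing an archive can also be done with the zipfile module, without the external command. This is a sketch under my own assumptions: zip_json_files and add_to_zip are hypothetical helpers, and they only roughly mirror the "zip ... -x desc.json" and "zip -g" commands shown earlier.

```python
import os
import zipfile


def zip_json_files(folder, archive_name="files.zip", exclude=("desc.json",)):
    """Put every .json file of FOLDER (except EXCLUDE) into an archive; return its path."""
    archive_path = os.path.join(folder, archive_name)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zfile:
        for name in sorted(os.listdir(folder)):
            if name.endswith(".json") and name not in exclude:
                # arcname stores the bare file name, without the folder part
                zfile.write(os.path.join(folder, name), arcname=name)
    return archive_path


def add_to_zip(archive_path, file_path):
    """Append one file to an existing archive (mode "a")."""
    with zipfile.ZipFile(archive_path, "a", zipfile.ZIP_DEFLATED) as zfile:
        zfile.write(file_path, arcname=os.path.basename(file_path))
```

Unlike "zip -m", this sketch leaves the original files in place; removing them afterwards would be a separate step.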
Links
- The zipfile module at effbot.org.
trending Python repositories on GitHub
“Find what repositories the GitHub community is most excited about today.”
Reading (writing) unicode text from (to) files
Problem
You want to write some special characters to a file (e.g. f.write("voilá")), but you immediately get a Unicode error.
Solution
Instead of messing with the encode and decode methods, use the codecs module.
import codecs
# read
with codecs.open(fname, "r", "utf-8") as f:
    text = f.read()
# write
with codecs.open(tmp, "w", "utf-8") as to:
    to.write(text)
As can be seen, its usage is very similar to the well-known open function.
This tip is from here.
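In Python 3 the built-in open function accepts an encoding argument, so the same round trip can be sketched without codecs. The helper names write_text and read_text and the file name demo.txt are made up for this example.

```python
import os
import tempfile


def write_text(path, text):
    """Write TEXT to PATH as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text(path):
    """Read PATH back, decoding it as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), "demo.txt")
write_text(path, "voilá")
print(read_text(path))
```

Naming the encoding explicitly keeps the result independent of the platform's default encoding.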
screenshot.py
I made a simple wrapper script called screenshot.py that calls phantomjs and convert to do the real job. The advantage of screenshot.py is its very simple usage.
Usage
screenshot.py -full http://reddit.com full.jpg
Screenshot of the entire page (can be very tall).
screenshot.py -window http://reddit.com window.jpg
Screenshot of the area that you see in the browser.
screenshot.py -thumb http://reddit.com thumb.jpg
Thumbnail of the area that you see in the browser.
Links
- sample screenshots
- project’s page: https://github.com/jabbalaci/screenshot.py
- this project of mine appeared in ImportPython Weekly Newsletter – Issue No 39
