python | Python Adventures

get the title of a web page

September 8, 2015 Jabba Laci

Problem
You need the title of a web page.

Solution

from bs4 import BeautifulSoup

# html holds the source of the page as a string
soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)

I found the solution here.
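
The snippet above assumes that html already holds the page source (for instance read with urllib.request.urlopen(url).read().decode("utf-8") in Python 3). If you can't install BeautifulSoup, here is a rough sketch of the same idea using only the stdlib. This is a swapped-in technique, not the BeautifulSoup way, and the names TitleParser and get_title are made up for this example:

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        # the title text may arrive in several chunks
        if self.in_title:
            self.parts.append(data)


def get_title(html):
    """Return the page title, or None if there is no <title>."""
    parser = TitleParser()
    parser.feed(html)
    if not parser.done:
        return None
    return "".join(parser.parts).strip()


sample = "<html><head><title>Python Adventures</title></head><body></body></html>"
print(get_title(sample))
```

For messy real-world HTML BeautifulSoup is probably still the more forgiving choice.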

Categories: python Tags: beautifulsoup, html, scrape, title

python-markdown: add support for strikethrough

September 7, 2015 Jabba Laci

Problem
In a webapp of mine I use markdown with the excellent Python-Markdown package. However, it doesn’t support strikethrough by default.

Solution
The good news is that you can add 3rd-party extensions to Python-Markdown. With the extension “mdx_del_ins” you can use the <del> and <ins> tags.

Here is a Python function that converts markdown to HTML:

import bleach
from markdown import markdown

def md_to_html(md):
    """
    Markdown to HTML conversion.
    """
    allowed_tags = ['a', 'abbr', 'acronym', 'b',
                    'blockquote', 'code', 'em',
                    'i', 'li', 'ol', 'pre', 'strong',
                    'ul', 'h1', 'h2', 'h3', 'p', 'br', 'ins', 'del']
    return bleach.linkify(bleach.clean(
        markdown(md, output_format='html', extensions=['nl2br', 'del_ins']),
        tags=allowed_tags, strip=True))

Input:

TODO list
---------
* ~~strikethrough in Python-Markdown~~

Output (as rendered in the browser; the list item is struck through, i.e. wrapped in <del> tags):
TODO list
* strikethrough in Python-Markdown

Categories: python Tags: markdown, strikethrough

[flask] render a template and jump to an anchor

September 6, 2015 Jabba Laci

Problem
You render a page but you want to jump to an anchor on the rendered page.

Solution

Here is a route:

@app.route('/about/')
def fn_about():
    return render_template('about.html')

In the template (about.html) add this (assuming you have jQuery):

<script> $(function(){ window.location.hash = "jump_here"; }); </script>

It’ll run once the HTML is loaded. Found it here.

Categories: flask, python Tags: anchor, jquery, render_template

create a virtual environment easily

August 30, 2015 Jabba Laci

Problem
I prefer to put my larger projects in a virtual environment (there are people who put every project in a virt. env. …). I keep my projects in Dropbox (thus they are available on all my machines), but the virt. env.’s are kept outside Dropbox since they can grow quite big (and they are easily reproducible).

For creating virt. env.’s, I use virtualenvwrapper, which (by default) puts virt. env.’s in the folder ~/.virtualenvs. Say I have a project in Dropbox, and I want to create a virt. env. for it. How to do it easily?

Solution
First, you need to know if your project is written in Python 2 or Python 3. Then, you need to use the mkvirtualenv command but I always need to look up its syntax. Solution: in the root folder of my project I want a script that will create a virt. env. for the project. Here is the script:

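#!/usr/bin/env bash

# mk_venv.sh

# which Python version to use in the created virt. env. (2 or 3)
PYTHON_VER=2

# name the virt. env. after the current folder
a=$(pwd)
f=${a##*/}
# the quotes keep folder names with spaces in one piece
source "$(which virtualenvwrapper.sh)" && mkvirtualenv -p "$(which python${PYTHON_VER})" "$f"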

Just set PYTHON_VER and launch the script. It will figure out the name of the current folder and create a virt. env. with this name. For instance, if you have your project in ~/projects/stuff, then the virt. env. will be created in the folder ~/.virtualenvs/stuff.
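
If the ${a##*/} trick looks cryptic: it just keeps the last component of the path. Here is a small Python sketch of the same name derivation. The function name venv_dir_for is made up, and I assume the default ~/.virtualenvs location mentioned above:

```python
import os


def venv_dir_for(project_dir, venv_home="~/.virtualenvs"):
    """Return the folder where the virt. env. of project_dir would be created."""
    # same idea as f=${a##*/} in the script: keep only the last path component
    name = os.path.basename(os.path.normpath(project_dir))
    return os.path.join(os.path.expanduser(venv_home), name)


print(venv_dir_for("~/projects/stuff"))
```

It only computes the path; creating the virt. env. is still the job of mkvirtualenv.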


Categories: bash, python Tags: virtualenv, virtualenvwrapper

[flask] validate a URL

August 23, 2015 Jabba Laci

Problem
In a Flask application I wanted to verify if a user-given URL is valid.

Solution
I found a simple validator package for that called validators (see it on GitHub).

Sample usage:

$ pip install validators
$ python
>>> import validators
>>> url = "http://index.hu"
>>> validators.url(url)
True
>>> url = "http://index.h/"
>>> validators.url(url)
ValidationFailure(func=url, args={'value': 'http://index.h/', 'require_tld': True})

The ValidationFailure class implements the __bool__ method, so you can easily check if validation failed:

if not validators.url(url):
    flash("Error: you must provide a valid URL!")
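
To see why the "if not ..." test works, here is a tiny sketch of the mechanism. It is not the code of the validators package: the Failure class and the check_scheme function are hypothetical stand-ins, and the check itself is a toy.

```python
class Failure(object):
    """Hypothetical stand-in for a failure object; it is always falsy."""

    def __init__(self, reason):
        self.reason = reason

    def __bool__(self):      # Python 3 calls this for truth tests
        return False

    __nonzero__ = __bool__   # Python 2 looks for this name instead


def check_scheme(url):
    """Toy check: return True for http(s) URLs, else a falsy Failure."""
    if url.startswith(("http://", "https://")):
        return True
    return Failure("unsupported scheme: " + url)


result = check_scheme("ftp://example.com")
if not result:
    print("Error:", result.reason)
```

The nice part of this design is that the falsy object can still carry details about what went wrong.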

Categories: flask, python Tags: valid url, validator

[flask] show render time on a page

August 22, 2015 Jabba Laci

Problem
For debugging purposes, I wanted to see the rendering time of each page.

Solution
I found a solution here. Here are the links:

show request time in template @Gist (nice and simple, just copy/paste)
Flask-DebugToolbar, a port of the famous django-debug-toolbar (the docs are excellent, easy to configure)
Flask-DebugToolbar-Mongo; MongoDB support, good docs, easy to configure

Note that Flask-DebugToolbar is very useful but it really slowed down my application. So I keep it switched off and only enable it when I want to debug something. When the issue is solved I switch it off again.

Categories: flask, python Tags: debug, debugtoolbar, elapsed time, MongoDB, render, rendering

working with zip files

August 21, 2015 Jabba Laci

Problem

In a project of mine I had to deal with folders, where a folder can contain several thousand small text files. I kept this project on Dropbox, so I could use it on all my machines. However, Dropbox is quite slow when trying to synchronize several thousand files. So I decided to put the files of each folder into a zip file.

So the question is: how to deal with zip files? How to do basic operations with them: create zip, delete from zip, list zip, add to zip, move to zip, extract from zip, etc.

Solution

In this project of mine I used the external zip command as well as the zipfile package from the stdlib. Let’s see both of them.

Manipulating zip files from the command-line
Let’s see some examples. Compress every .json file in the current directory except the desc.json file:

zip -9 files.zip *.json -x desc.json

The switch “-9” gives the best compression, files.zip is the output, and “-x” is short for “--exclude”. From Python you can call it as an external command, with os.system() for instance.
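
Here is a sketch of calling it from Python with subprocess instead of os.system(). The function names are made up for this example. Note that without a shell nobody expands *.json for you, so the sketch expands the pattern itself with glob:

```python
import glob
import subprocess


def build_zip_command(archive, pattern, exclude=()):
    """Build the argument list for: zip -9 ARCHIVE FILES... [-x EXCLUDED...]"""
    # expand the wildcard here, since no shell is involved
    files = sorted(glob.glob(pattern))
    cmd = ["zip", "-9", archive] + files
    if exclude:
        cmd += ["-x"] + list(exclude)
    return cmd


def run_zip(archive, pattern, exclude=()):
    """Run zip; raises CalledProcessError if zip exits with an error."""
    subprocess.check_call(build_zip_command(archive, pattern, exclude))


# show the command that would be executed in the current directory
print(build_zip_command("files.zip", "*.json", exclude=["desc.json"]))
```

Passing a list of arguments also means that file names with spaces need no extra quoting.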

The previous example creates a zip file and leaves the original files in place. Now let’s move files into a zip file (the original files are deleted once they have been added successfully to the archive):

zip -9 -m files.zip *.json -x desc.json

Delete a file from an archive:

zip -d files.zip desc.json

It will delete desc.json from the zip file.

List the content of a zip file:

zipinfo files.zip

Add a file to the archive:

zip -g files.zip new.json

Where “-g” means: grow.

Extract just one file from a zip file:

# basic:
unzip files.zip this.json

# extract to a specific folder:
unzip files.zip this.json -d /extract/here/

It will extract this.json from the archive.
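
Most of these operations can also be done with the zipfile module, without calling an external command. Here is a minimal sketch; the function names are made up, and I assume you want the files stored without their directory part:

```python
import os
import zipfile


def create_zip(archive, filenames):
    """Create (or overwrite) archive with the given files, compressed."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for fname in filenames:
            # arcname drops the directory part of the stored name
            zf.write(fname, arcname=os.path.basename(fname))


def add_to_zip(archive, fname):
    """Append one file to an existing archive (compare: zip -g)."""
    with zipfile.ZipFile(archive, "a", zipfile.ZIP_DEFLATED) as zf:
        zf.write(fname, arcname=os.path.basename(fname))


def list_zip(archive):
    """Return the member names (compare: zipinfo)."""
    with zipfile.ZipFile(archive, "r") as zf:
        return zf.namelist()


def extract_one(archive, member, dest="."):
    """Extract a single member into dest (compare: unzip ... -d dest)."""
    with zipfile.ZipFile(archive, "r") as zf:
        return zf.extract(member, path=dest)
```

Unlike "zip -m", create_zip() leaves the original files alone; you would have to remove them yourself with os.remove() after the archive is written.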

Read the content of a zip file in Python
OK, say we have a zip file that contains some files. How to get the filenames? How to read them? I found some nice examples here.

List the file names in a zip file:

import zipfile

# the with block closes the archive when done
with zipfile.ZipFile("files.zip", "r") as zfile:
    for name in zfile.namelist():
        print(name)

Read files in a zip file:

import zipfile

# the with block closes the archive when done
with zipfile.ZipFile("files.zip", "r") as zfile:
    for name in zfile.namelist():
        data = zfile.read(name)  # raw bytes of the member
        print(data)
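
Since read() gives you raw bytes, you usually want to decode them. Here is a sketch that loads every .json file of an archive into a dictionary without extracting anything to disk. The function name is made up, and I assume the files are UTF-8 encoded JSON:

```python
import json
import zipfile


def load_json_members(archive):
    """Return {member name: parsed JSON} for every .json file in archive."""
    result = {}
    with zipfile.ZipFile(archive, "r") as zfile:
        for name in zfile.namelist():
            if name.endswith(".json"):
                # read() returns bytes, so decode before parsing
                result[name] = json.loads(zfile.read(name).decode("utf-8"))
    return result
```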

Links

The zipfile module at effbot.org.

Categories: bash, python Tags: zip, zipfile

trending Python repositories on GitHub

August 9, 2015 Jabba Laci

https://github.com/trending?l=python

“Find what repositories the GitHub community is most excited about today.”

Categories: python Tags: github, trends

Reading (writing) unicode text from (to) files

August 6, 2015 Jabba Laci

Problem
You want to write some special characters to a file (e.g. f.write("voilá")) but you immediately get some unicode error in your face.

Solution
Instead of messing with the encode and decode methods, use the codecs module.

import codecs

# read
with codecs.open(fname, "r", "utf-8") as f:
    text = f.read()

# write
with codecs.open(tmp, "w", "utf-8") as to:
    to.write(text)

As can be seen, its usage is very similar to the well-known open function.

This tip is from here.
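
An alternative worth knowing is io.open, which takes the encoding as a keyword argument. It exists in Python 2.6+ too, and in Python 3 it is the same function as the built-in open. A minimal sketch (the helper names write_text and read_text are made up):

```python
import io


def write_text(fname, text):
    """Write unicode text to fname, encoded as UTF-8."""
    with io.open(fname, "w", encoding="utf-8") as to:
        to.write(text)


def read_text(fname):
    """Read fname as UTF-8 and return unicode text."""
    with io.open(fname, "r", encoding="utf-8") as f:
        return f.read()
```

Under Python 2 the text you pass to write_text() must be a unicode object (e.g. u"voilá"), not a byte string.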

Categories: python Tags: codecs, unicode, utf-8

screenshot.py

July 26, 2015 Jabba Laci

I made a simple wrapper script called screenshot.py that calls phantomjs and convert to do the real job. The advantage of screenshot.py is its very simple usage.

Usage

screenshot.py -full http://reddit.com full.jpg

Screenshot of the entire page (the image can be very tall).

screenshot.py -window http://reddit.com window.jpg

Screenshot of the area that you see in the browser.

screenshot.py -thumb http://reddit.com thumb.jpg

Thumbnail of the area that you see in the browser.

Links

sample screenshots
project’s page: https://github.com/jabbalaci/screenshot.py
this project of mine appeared in ImportPython Weekly Newsletter – Issue No 39

Categories: python Tags: convert, phantomjs, screenshot, thumbnail
