python | Python Adventures

JiVE: A general purpose, cross-platform image viewer with some built-in NSFW support, written in Python 3.6 using PyQt5

June 6, 2018 Jabba Laci

In the past 2-3 weeks I’ve been working on a general purpose, cross-platform image viewer that has some built-in NSFW support. It’s called JiVE and it’s in Python 3.6 using PyQt5. A unique feature of JiVE is that it allows you to browse online images just as if they were local images.

You can find it on GitHub: https://github.com/jabbalaci/JiVE-Image-Viewer. I also wrote detailed documentation for it.

Screenshots

[Screenshot: JiVE in action]

[Screenshot: selecting an NSFW subreddit]

Read the docs for more info.

Categories: python, Qt Tags: cross-platform, image viewer, PyQt5, python3

install PyQt5

May 17, 2018 Jabba Laci

The following is based on this YouTube video.

$ sudo apt install python3-pyqt5
...
$ python3
>>> import PyQt5
>>>
=======================================================================
$ sudo apt install python3-pyqt5.qtsql
...
$ python3
>>> from PyQt5 import QtSql
>>>
=======================================================================
$ sudo apt install qttools5-dev-tools
...
$ ls -al /usr/lib/x86_64-linux-gnu/qt5/bin/designer
lrwxrwxrwx 1 root root 25 ápr   14 09:38 /usr/lib/x86_64-linux-gnu/qt5/bin/designer -> ../../../qt5/bin/designer

I created a symbolic link to designer so that I can launch it easily.
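The import checks in the shell session can also be scripted. The following is only a small sketch: the helper name is_importable is made up for this example, and it merely asks Python whether each module can be located.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate module_name (hypothetical helper)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised for a dotted name when the parent package (e.g. "PyQt5")
        # is missing or cannot be imported.
        return False


# Report on the two modules installed in the steps above.
for name in ("PyQt5", "PyQt5.QtSql"):
    status = "OK" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

If a line says "missing", the corresponding apt package is probably not installed for the interpreter you are running.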

Categories: PyQt5, python

word cloud generator

March 12, 2018 Jabba Laci

See https://github.com/amueller/word_cloud.

Categories: python Tags: wordcloud

unzip: perform the opposite of zip

March 8, 2018 Jabba Laci

zip

>>> a = [1, 2, 3]
>>> b = ["one", "two", "three"]
>>> zip(a, b)
<zip object at 0x7fd30310b508>
>>> list(zip(a, b))
[(1, 'one'), (2, 'two'), (3, 'three')]

unzip
How to perform the opposite of zip? That is, we have [(1, 'one'), (2, 'two'), (3, 'three')], and we want to get back [1, 2, 3] and ["one", "two", "three"].

>>> li = list(zip(a, b))
>>> li
[(1, 'one'), (2, 'two'), (3, 'three')]
>>> a, b = zip(*li)
>>> a
(1, 2, 3)
>>> b
('one', 'two', 'three')

Notice that the results are tuples.
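If you need lists instead of tuples, one possible way is to convert each result with map. This is a small sketch; the variable names are just for illustration.

```python
pairs = [(1, "one"), (2, "two"), (3, "three")]

# zip(*pairs) yields one tuple per "column"; map(list, ...) turns each into a list.
numbers, words = map(list, zip(*pairs))

print(numbers)  # [1, 2, 3]
print(words)    # ['one', 'two', 'three']
```

Note that the unpacking on the left side expects exactly two columns, so it fails on an empty input list.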

More info here.

Categories: python Tags: unzip, zip

sanitizing tweets

February 12, 2018 Jabba Laci

Problem
You have the text of a tweet and you want to get rid of the bullshit (smileys, emojis, etc.)

Solution
See https://github.com/s/preprocessor. It’s customizable: you can select what to remove, e.g. URLs, smileys, etc.
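To show the idea of selectable cleaning steps without any dependency, here is a rough standard-library sketch. It is not the preprocessor library's API or implementation; the function clean_tweet, its keyword arguments and the regular expressions are all assumptions made for this example, and the emoji ranges are far from complete.

```python
import re

# Deliberately simple patterns; real tweets need more careful handling.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
SMILEY_RE = re.compile(r"(?<!\S)[:;=]-?[()DPp](?!\S)")


def clean_tweet(text, urls=True, mentions=True, emojis=True, smileys=True):
    """Remove the selected kinds of tokens and normalize whitespace."""
    if urls:
        text = URL_RE.sub("", text)
    if mentions:
        text = MENTION_RE.sub("", text)
    if emojis:
        text = EMOJI_RE.sub("", text)
    if smileys:
        text = SMILEY_RE.sub("", text)
    return " ".join(text.split())


print(clean_tweet("Great talk :) by @bob https://example.com/x \U0001F600"))
# Great talk by
```

Each keyword argument switches one cleaning step on or off, for example clean_tweet(text, urls=False) keeps the links.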

Categories: python Tags: sanitize, text preprocessing, tweet, twitter

What are the built-in functions?

January 19, 2018 Jabba Laci

Problem
How to figure out the built-in functions in Python? Of course, you can look up the documentation, but now the exercise is to list them in the Python shell.

Solution

In [1]: import builtins

In [2]: dir(builtins)
Out[2]: 
['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'BytesWarning',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'DeprecationWarning',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'FutureWarning',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'ImportWarning',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PendingDeprecationWarning',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'ResourceWarning',
 'RuntimeError',
 'RuntimeWarning',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SyntaxWarning',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecodeError',
 'UnicodeEncodeError',
 'UnicodeError',
 'UnicodeTranslateError',
 'UnicodeWarning',
 'UserWarning',
 'ValueError',
 'Warning',
 'ZeroDivisionError',
 '__IPYTHON__',
 '__build_class__',
 '__debug__',
 '__doc__',
 '__import__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'abs',
 'all',
 'any',
 'ascii',
 'bin',
 'bool',
 'bytearray',
 'bytes',
 'callable',
 'chr',
 'classmethod',
 'compile',
 'complex',
 'copyright',
 'credits',
 'delattr',
 'dict',
 'dir',
 'display',
 'divmod',
 'enumerate',
 'eval',
 'exec',
 'filter',
 'float',
 'format',
 'frozenset',
 'get_ipython',
 'getattr',
 'globals',
 'hasattr',
 'hash',
 'help',
 'hex',
 'id',
 'input',
 'int',
 'isinstance',
 'issubclass',
 'iter',
 'len',
 'license',
 'list',
 'locals',
 'map',
 'max',
 'memoryview',
 'min',
 'next',
 'object',
 'oct',
 'open',
 'ord',
 'pow',
 'print',
 'property',
 'range',
 'repr',
 'reversed',
 'round',
 'set',
 'setattr',
 'slice',
 'sorted',
 'staticmethod',
 'str',
 'sum',
 'super',
 'tuple',
 'type',
 'vars',
 'zip']
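The list above also contains exceptions, constants and dunder names (and, under IPython, a few extras such as get_ipython). If you only want the callables, a filter like the following may help. It is a sketch, and the helper name builtin_callables is made up for this example.

```python
import builtins


def builtin_callables():
    """Names of public built-in callables, without exception and warning classes."""
    names = []
    for name in dir(builtins):
        if name.startswith("_"):
            continue  # skip __import__, __name__, etc.
        obj = getattr(builtins, name)
        if isinstance(obj, type) and issubclass(obj, BaseException):
            continue  # skip ValueError, DeprecationWarning, ...
        if callable(obj):
            names.append(name)
    return names


print(builtin_callables())
```

The result still includes classes such as int and str, since they are callable too; constants like True and None are dropped.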

Categories: python Tags: builtins

BASE64 as URL parameter

January 1, 2018 Jabba Laci

Problem
In a REST API, I wanted to pass a URL as a BASE64-encoded string, e.g. “http://host/api/v2/url/aHR0cHM6...”. It worked well for a while, but then I got an error for one particular URL. As it turned out, a BASE64 string can contain the “/” sign, which then gets interpreted as a path separator, and that caused the problem.

Solution
Replace the “+” and “/” signs with “-” and “_”, respectively. Fortunately, Python has functions for that (see here).

Here are my modified, URL-safe functions:

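import base64

def base64_to_str(b64):
    # decode a URL-safe BASE64 string back to the original text
    return base64.urlsafe_b64decode(b64.encode()).decode()

def str_to_base64(s):
    # encode text using the URL-safe alphabet ("-" and "_" instead of "+" and "/")
    data = base64.urlsafe_b64encode(s.encode())
    return data.decode()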
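To see the difference between the standard and the URL-safe alphabet, here is a small self-contained demonstration. The input “???” is just a convenient test string, chosen because its standard BASE64 form happens to contain a “/”.

```python
import base64


def str_to_base64(s):
    return base64.urlsafe_b64encode(s.encode()).decode()


def base64_to_str(b64):
    return base64.urlsafe_b64decode(b64.encode()).decode()


text = "???"
standard = base64.b64encode(text.encode()).decode()
url_safe = str_to_base64(text)

print(standard)  # Pz8/
print(url_safe)  # Pz8_
print(base64_to_str(url_safe) == text)  # True
```

Similarly, “>>>” gives “Pj4+” with the standard alphabet and “Pj4-” with the URL-safe one.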

You can also quote and unquote a URL instead of using BASE64:

>>> import urllib.parse
>>> url = "https://www.youtube.com/watch?v=V6w24Lg3zTI"
>>> urllib.parse.quote(url)    # notice the "/" signs in the result!
'https%3A//www.youtube.com/watch%3Fv%3DV6w24Lg3zTI'
>>> new = urllib.parse.quote(url, safe='')    # no "/" signs!
>>> new
'https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DV6w24Lg3zTI'
>>> urllib.parse.unquote(new)
'https://www.youtube.com/watch?v=V6w24Lg3zTI'

Categories: python Tags: base64, quote, unquote, urlsafe

convert a file to UTF-8-encoded text

December 16, 2017 Jabba Laci

I wrote a simple script that takes an input file, changes its character encoding to UTF-8, and prints the result to the screen.

It’s actually a wrapper around the Unix commands “file” and “iconv”. The goal was to make its usage as simple as possible. The script is here: to_utf8.py.

Usage:

$ to_utf8.py input.txt

The program tries to detect the encoding of the input file.
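The following is not the actual to_utf8.py, only a minimal sketch of how such a wrapper could look. It assumes that “file -b --mime-encoding” is used for detection and “iconv -f ... -t utf-8” for conversion; the function names are made up for this example.

```python
import subprocess


def detect_encoding(path):
    """Ask `file` for the character encoding of path (e.g. 'iso-8859-1')."""
    result = subprocess.run(
        ["file", "-b", "--mime-encoding", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def iconv_command(path, encoding):
    """Build the iconv command line that converts path to UTF-8."""
    return ["iconv", "-f", encoding, "-t", "utf-8", path]


def to_utf8(path):
    """Return the content of path as text, converted to UTF-8 by iconv."""
    encoding = detect_encoding(path)
    result = subprocess.run(
        iconv_command(path, encoding), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8")


# Usage (requires the `file` and `iconv` commands):
#     print(to_utf8("input.txt"), end="")
```

Detection is only a guess made by “file”. If it reports something like “binary” or “unknown-8bit”, iconv will most likely fail and check=True turns that into an exception.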

Links

Text file encoding (doing the same with some Unix commands)

Categories: bash, python Tags: iconv, utf-8

work in a temp. dir. and delete it when done

December 11, 2017 Jabba Laci

Problem
You want to work in a temp. directory, and delete it completely when you are done. You also need the name of this temp. folder.

Solution
You can write “with tempfile.TemporaryDirectory() as dirpath:”, and the temp. dir. will be removed automatically by the context manager when you leave the with block. Here, dirpath is the name of the temp. folder as a string. Nice and clean.

import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as dirpath:
    fp = Path(dirpath, "data.txt")
    # create fp, process it, etc.

# when you get here, dirpath is removed recursively
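A slightly more complete sketch that actually creates a file and then checks that the directory is gone (the file name and the variable names are only for illustration):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as dirpath:
    fp = Path(dirpath, "data.txt")
    fp.write_text("hello\n")           # create the file inside the temp. dir.
    content = fp.read_text()           # process it while the dir. still exists
    existed_inside = Path(dirpath).is_dir()

# after the with block the directory and its content are removed
removed_after = not Path(dirpath).exists()

print(content, existed_inside, removed_after)
```

The variable dirpath still holds the path after the block, but the folder itself no longer exists.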

More info in the docs.

Categories: python Tags: tempdir, tempfile

extract e-mails from a file

October 10, 2017 Jabba Laci

Problem
You have a text file and you want to extract all the e-mail addresses from it. For research purposes, of course.

Solution

#!/usr/bin/env python3

import re
import sys

def extract_emails_from(fname):
    with open(fname, errors='replace') as f:
        for line in f:
            match = re.findall(r'[\w\.-]+@[\w\.-]+', line)
            for e in match:
                if '?' not in e:
                    print(e)
                    
def main():
    fname = sys.argv[1]
    extract_emails_from(fname)

##############################################################################

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print("Error: provide a text file!", file=sys.stderr)
        exit(1)
    # else
    main()

I had character encoding problems with some lines where the original program died with an exception. Using “open(fname, errors='replace')” will replace problematic characters with a “?“, hence the extra check before printing an e-mail to the screen.

The core of the script is the regex to find e-mails. That tip is from here.
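Here is the regex in isolation, applied to a sample string. The rstrip(".") step is my own addition for this sketch (it is not part of the script above): the pattern also accepts a sentence-ending dot right after an address.

```python
import re

EMAIL_RE = re.compile(r'[\w\.-]+@[\w\.-]+')


def extract_emails(text):
    """Return all e-mail-like substrings, without trailing dots."""
    return [match.rstrip(".") for match in EMAIL_RE.findall(text)]


sample = "Contact: john.doe@example.com, or write to bob@example.com."
print(extract_emails(sample))
# ['john.doe@example.com', 'bob@example.com']
```

The pattern is intentionally loose, so it can also match strings that are not valid addresses.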

Categories: python Tags: email, extract, regex
