jugad2 - Vasudev Ram on software innovation: databases

Showing posts with label databases. Show all posts

Friday, January 18, 2019

Announcing PIaaS - Python Interviewing as a Service

Hello, readers,

Announcing Python Interviewing as a Service:

I'm now officially offering PIaaS - Python Interviewing as a Service. I have done it some earlier, informally, for clients. Recently a couple of companies asked me for help on this again, so I am now adding it to my list of offered services, the others being consulting (software design and development, code review, technology evaluation and recommendation) and software training.

I can help your organization interview and hire Python developer candidates, offloading (some of) that work from your core technical and HR / recruitment staff.

I can also interview on related areas like SQL and RDBMS, and Unix and Linux commands and shell scripting.

I have long-term experience in all the above areas.

To hire me for PIaaS or to learn more about it, contact me via the Gmail address on my site's contact page.

- Vasudev Ram

My Codementor profile: Vasudev Ram on Codementor

Thursday, September 29, 2016

Publish Peewee ORM data to PDF with xtopdf

By Vasudev Ram

Peewee => PDF

Peewee is a small, expressive ORM for Python, created by Charles Leifer.

After trying out Peewee a bit, I thought of writing another application of xtopdf (my Python toolkit for PDF creation), to publish Peewee data to PDF. I used an SQLite database underlying the Peewee ORM, but it also supports MySQL and PostgreSQL, per the docs. Here is the program, in file PeeweeToPDF.py:

# PeeweeToPDF.py
# Purpose: To show basics of publishing Peewee ORM data to PDF.
# Requires: Peewee ORM and xtopdf.
# Author: Vasudev Ram
# Copyright 2016 Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: http://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

from peewee import *
from PDFWriter import PDFWriter

def print_and_write(pw, s):
    print s
    pw.writeLine(s)

# Define the database.
db = SqliteDatabase('contacts.db')

# Define the model for contacts.
class Contact(Model):
    name = CharField()
    age = IntegerField()
    skills = CharField()
    title = CharField()

    class Meta:
        database = db

# Connect to the database.
db.connect() 

# Drop the Contact table if it exists.
db.drop_tables([Contact])

# Create the Contact table.
db.create_tables([Contact])

# Define some contact rows.
contacts = (
    ('Albert Einstein', 22, 'Science', 'Physicist'),
    ('Benjamin Franklin', 32, 'Many', 'Polymath'),
    ('Samuel Johnson', 42, 'Writing', 'Writer')
)

# Save the contact rows to the contacts table.
for contact in contacts:
    c = Contact(name=contact[0], age=contact[1], \
    skills=contact[2], title=contact[3])
    c.save()

sep = '-' * (20 + 5 + 10 + 15)

# Publish the contact rows to PDF.
with PDFWriter('contacts.pdf') as pw:
    pw.setFont('Courier', 12)
    pw.setHeader('Demo of publishing Peewee ORM data to PDF')
    pw.setFooter('Generated by xtopdf: slides.com/vasudevram/xtopdf')
    print_and_write(pw, sep)
    print_and_write(pw, 
        "Name".ljust(20) + "Age".center(5) + 
        "Skills".ljust(10) + "Title".ljust(15))
    print_and_write(pw, sep)

    # Loop over all rows queried from the contacts table.
    for contact in Contact.select():
        print_and_write(pw, 
            contact.name.ljust(20) + 
            str(contact.age).center(5) + 
            contact.skills.ljust(10) + 
            contact.title.ljust(15))
    print_and_write(pw, sep)

# Close the database connection.
db.close()

I could have used Python's namedtuple feature instead of tuples, but did not do it for this small program.

I ran the program with:

python PeeweeToPDF.py

Here is a screenshot of the output as seen in Foxit PDF Reader (click image to enlarge):

- Enjoy.

- Vasudev Ram - Online Python training and consulting

Get updates on my software products / ebooks / courses.

Jump to posts: Python DLang xtopdf

Subscribe to my blog by email

My ActiveState recipes

FlyWheel - Managed WordPress Hosting

Share |

Wednesday, February 25, 2015

Publish SQLite data to PDF using named tuples

By Vasudev Ram

Some time ago I had written this post:

Publishing SQLite data to PDF is easy with xtopdf.

It showed how to get data from an SQLite (Wikipedia) database and write it to PDF, using xtopdf, my open source PDF creation library for Python.

Today I was browsing the Python standard library docs, and so thought of modifying that program to use the namedtuple data type from the collections module of Python, which is described as implementing "High-performance container datatypes". The collections module was introduced in Python 2.4.
Here is a modified version of that program, SQLiteToPDF.py, called SQLiteToPDFWithNamedTuples.py, that uses named tuples:

# SQLiteToPDFWithNamedTuples.py
# Author: Vasudev Ram - http://www.dancingbison.com
# SQLiteToPDFWithNamedTuples.py is a program to demonstrate how to read 
# SQLite database data and convert it to PDF. It uses the Python
# data structure called namedtuple from the collections module of 
# the Python standard library.

from __future__ import print_function
import sys
from collections import namedtuple
import sqlite3
from PDFWriter import PDFWriter

# Helper function to output a string to both screen and PDF.
def print_and_write(pw, strng):
    print(strng)
    pw.writeLine(strng)

try:

    # Create the stocks database.
    conn = sqlite3.connect('stocks.db')
    # Get a cursor to it.
    curs = conn.cursor()

    # Create the stocks table.
    curs.execute('''DROP TABLE IF EXISTS stocks''')
    curs.execute('''CREATE TABLE stocks
                 (date text, trans text, symbol text, qty real, price real)''')

    # Insert a few rows of data into the stocks table.
    curs.execute("INSERT INTO stocks VALUES ('2006-01-05', 'BUY', 'RHAT', 100, 25.1)")
    curs.execute("INSERT INTO stocks VALUES ('2007-02-06', 'SELL', 'ORCL', 200, 35.2)")
    curs.execute("INSERT INTO stocks VALUES ('2008-03-07', 'HOLD', 'IBM', 300, 45.3)")
    conn.commit()

    # Create a namedtuple to represent stock rows.
    StockRecord = namedtuple('StockRecord', 'date, trans, symbol, qty, price')

    # Run the query to get the stocks data.
    curs.execute("SELECT date, trans, symbol, qty, price FROM stocks")

    # Create a PDFWriter and set some of its fields.
    pw = PDFWriter("stocks.pdf")
    pw.setFont("Courier", 12)
    pw.setHeader("SQLite data to PDF with named tuples")
    pw.setFooter("Generated by xtopdf - https://bitbucket.org/vasudevram/xtopdf")

    # Write header info.
    hdr_flds = [ str(hdr_fld).rjust(10) + " " for hdr_fld in StockRecord._fields ]
    hdr_fld_str = ''.join(hdr_flds)
    print_and_write(pw, '=' * len(hdr_fld_str))
    print_and_write(pw, hdr_fld_str)
    print_and_write(pw, '-' * len(hdr_fld_str))

    # Now loop over the fetched data and write it to PDF.
    # Map the StockRecord namedtuple's _make class method
    # (that creates a new instance) to all the rows fetched.
    for stock in map(StockRecord._make, curs.fetchall()):
        row = [ str(col).rjust(10) + " " for col in (stock.date, \
        stock.trans, stock.symbol, stock.qty, stock.price) ]
        # Above line can instead be written more simply as:
        # row = [ str(col).rjust(10) + " " for col in stock ]
        row_str = ''.join(row)
        print_and_write(pw, row_str)

    print_and_write(pw, '=' * len(hdr_fld_str))

except Exception as e:
    print("ERROR: Caught exception: " + e.message)
    sys.exit(1)

finally:
    pw.close()
    conn.close()

This time I've imported print_function so that I can use print as a function instead of as a statement.

Here's a screenshot of the PDF output in Foxit PDF Reader:

- Vasudev Ram - Online Python training and programming

Dancing Bison Enterprises

Signup to hear about new products or services from me.

Posts about Python Posts about xtopdf

Contact Page

Share |

Thursday, January 15, 2015

Publish databases to PDF with PyDAL and xtopdf

By Vasudev Ram

Some days ago, I had blogged about pyDAL, a pure Python Database Abstraction Layer.

Today I thought of writing a program to publish database data to PDF, using PyDAL and xtopdf, my open source Python library for PDF creation from other file formats.

(Here is a good online overview about xtopdf, for those new to it.)

So here is the code for PyDALtoPDF.py:

"""
Author: Vasudev Ram
Copyright 2014 Vasudev Ram - www.dancingbison.com
This program is a demo of how to use the PyDAL and xtopdf Python libraries 
together to publish database data to PDF.
PyDAL is at: https://github.com/web2py/pydal/blob/master/README.md
xtopdf is at: https://bitbucket.org/vasudevram/xtopdf
and info about xtopdf is at: http://slides.com/vasudevram/xtopdf or 
at: http://slid.es/vasudevram/xtopdf
"""

# imports
from pydal import DAL, Field
from PDFWriter import PDFWriter

SEP = 60

# create the database
db = DAL('sqlite://house_depot.db')

# define the table
db.define_table('furniture', \
    Field('id'), Field('name'), Field('quantity'), Field('unit_price')
)

# insert rows into table
items = ( \
    (1, 'chair', 40, 50),
    (2, 'table', 10, 300),
    (3, 'cupboard', 20, 200),
    (4, 'bed', 30, 400)
)
for item in items:
    db.furniture.insert(id=item[0], name=item[1], quantity=item[2], unit_price=item[3])

# define the query
query = db.furniture
# the above line shows an interesting property of PyDAL; it seems to 
# have some flexibility in how queries can be defined; in this case,
# just saying db.table_name tells it to fetch all the rows 
# from table_name; there are other variations possible; I have not 
# checked out all the options, but the ones I have seem somewhat 
# intuitive.

# run the query
rows = db(query).select()

# setup the PDFWriter
pw = PDFWriter('furniture.pdf')
pw.setFont('Courier', 10)
pw.setHeader('     House Depot Stock Report - Furniture Division     '.center(60))
pw.setFooter('Generated by xtopdf: http://google.com/search?q=xtopdf')

pw.writeLine('=' * SEP)

field_widths = (5, 10, 10, 12, 10)

# print the header row
pw.writeLine(''.join(header_field.center(field_widths[idx]) for idx, header_field in enumerate(('#', 'Name', 'Quantity', 'Unit price', 'Price'))))

pw.writeLine('-' * SEP)

# print the data rows
for row in rows:
    # methinks the writeLine argument gets a little long here ...
    # the first version of the program was taller but thinner :)
    pw.writeLine(''.join(str(data_field).center(field_widths[idx]) for idx, data_field in enumerate((row['id'], row['name'], row['quantity'], row['unit_price'], int(row['quantity']) * int(row['unit_price'])))))

pw.writeLine('=' * SEP)
pw.close()

I ran it (on Windows) with:

$ py PyDALtoPDF.py 2>NUL

Here is a screenshot of the output in Foxit PDF Reader:

- Enjoy.

--- Posts about Python --- Posts about xtopdf ---

- Vasudev Ram - Python programming and training

Signup to hear about new products or services from me.

Contact Page

Share |

Tuesday, December 30, 2014

pyDAL, a pure Python Database Abstraction Layer

By Vasudev Ram

pyDAL is a pure Python Database Abstraction Layer. So it seems to be something like the lower layer of SQLAlchemy, i.e. SQLAlchemy Core, the library that is used by the upper layer, SQLAlchemy ORM. See the SQLAlchemy (0.8) documentation.

From the pyDAL site:

[ It dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and your code will be portable among different types of databases.

pyDAL comes from the original web2py's DAL, with the aim of being wide-compatible. pyDAL doesn't require web2py and can be used in any Python context. ]

IOW, pyDAL has been separated out into a different project from web2py, a Python web framework, of which it was originally a part.

The use of an ORM (Object Relational Mapper) vs. writing plain SQL code (vs. using an intermediate option like pyDAL or SQLAlchemy Core), can be controversial; there are at least some pros and cons on both (or all 3) sides. I've read some about this, and have got some experience with using some of these options in different projects, but am not an expert on which is the best approach, and also, it can vary depending on your project's needs, so I'm not getting into that topic in this post.

pyDAL seems to support many popular databases, mostly SQL ones, but also a NoSQL one or two, and even IMAP. Here is a list, from the site: SQLite, PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, Cubrid, Sybase, Teradata, SAPDB, MongoDB, IMAP.

For some of those databases, it uses PyMySQL, pyodbc or fbd, which are all Python database libraries that I had blogged about earlier.

I tried out pyDAL a little, with this simple program, adapted from its documentation:

import sys
import time
from pydal import DAL, Field
db = DAL('sqlite://storage.db')
db.define_table('product', Field('name'))
t1 = time.time()
num_rows = int(sys.argv[1])
for product_number in range(num_rows):
    db.product.insert(name='Product-'.format(str(product_number).zfill(4)))
t2 = time.time()
print "time to insert {} rows = {} seconds".format(num_rows, int(t2 - t1))
query = db.product.name
t1 = time.time()
rows = db(query).select()
for idx, row in enumerate(rows):
    #print idx, row.name
    pass
t2 = time.time()
print "time to select {} rows = {} seconds".format(num_rows, int(t2 - t1))

It worked, and gave this output:

$ python test_pydal2.py 100000
No handlers could be found for logger "web2py"
time to insert 100000 rows = 18 seconds
time to select 100000 rows = 7 seconds

Note: I first ran it with this statement uncommented:

#print idx, row.name

to confirm that it did select the records, and then commented it and replaced it with "pass" in order to time the select without the overhead of displaying the records to the screen.

I'll check out pyDAL some more, for other commonly needed database operations, and may write about it here.
There may be a way to disable that message about a logger.

The timing statements in the code and the time output can be ignored for now, since they are not meaningful without doing a comparison against the same operations done without pyDAL (i.e. just using SQL from Python with the DB API). I will do a comparison later on and blog about it if anything interesting is found.

- Vasudev Ram - Dancing Bison Enterprises - Python training and consulting

Signup to hear about new products or services from me.

Contact Page

Share |

Friday, December 12, 2014

Project Enferno: RAD framework with Python / Flask / MongoDB / Redis

By Vasudev Ram

Just saw this via Python Weekly's email newsletter. Enferno seems to be a RAD (Rapid Application Development) framework of sorts, using Python, Flask, some Flask extensions, MongoDB AND MongoEngine, and Redis.

Enferno framework

I've worked on a project or two using Python, Flask and MongoDB (which is a NoSQL database), and found it a useful combination for developing apps relatively quickly. Enferno may be worth checking out, which I will do sometime later and may then blog about it.

Here is an article about NoSQL by Martin Fowler that may be of interest to newcomers to the field. Fowler and Pramod Sadalage have also written a book, NoSQL Distilled, which is linked to from Fowler's NoSQL article page. I've read Fowler's earlier book UML Distilled, a small and useful summary of UML, and it looks from the title that their NoSQL book may be on the same lines.

- Vasudev Ram - Dancing Bison Enterprises

Signup for email about interesting new stuff from me.

Contact Page

Other Python posts | Posts about xtopdf

Share |

Monday, November 10, 2014

PyMySQL, a pure-Python client library for MySQL

By Vasudev Ram

I came across PyMySQL on github today. Excerpt from the PyMySQL Github page:

[ This package contains a pure-Python MySQL client library. The goal of PyMySQL is to be a drop-in replacement for MySQLdb and work on CPython, PyPy, IronPython and Jython. ]

In other words, PyMySQL is a pure-Python MySQL driver, and as the docs say, it aims to be a drop-in replacement for MySQLdb, which is a Python driver for MySQL, but implemented as a Python C extension. One of the advantages of using a pure-Python library instead of a C extension, is that you do not have to build it from source, and also, potentially, that it may work without any build steps or changes to the code, on various versions of Python, such as CPython, PyPy, etc., as the excerpt above says. Also see this Stack Overflow question:

What actually is pymysql and how it differs from mysqldb?

Here is PyMySQL on PyPI, (the Python Package Index).

The statistics for PyMySQL on PyPI show this:

Downloads (All Versions):
104 downloads in the last day
7168 downloads in the last week
49124 downloads in the last month

So I tried out PyMySQL a bit, on CPython 2.7.8 on Ubuntu Linux.

To install PyMySQL, do:

$ pip install pymysql

Checked whether it was properly installed with:

$ python
>>> import PyMySQL

Oops, didn't work. I then tried:

>>> import pymysql

That did work.
Here is the schema of the MySQL table (called sales, in a database also called sales) that I used to try PyMySQL:

$ mysql -u root
mysql> use sales;

Database changed
mysql> desc sales;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| id         | int(11)     | YES  |     | NULL    |       |
| region     | varchar(6)  | YES  |     | NULL    |       |
| item       | varchar(15) | YES  |     | NULL    |       |
| unit_price | int(11)     | YES  |     | NULL    |       |
| quantity   | int(11)     | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
5 rows in set (0.01 sec)

mysql>

Basically, it is a table to record sales data by region, for items with unit prices and quantities sold.
And here is my Python program to try PyMySQL:

$ cat test_pymysql.py

import pymysql
conn = pymysql.connect(host='localhost', port=3306, user='some_user', passwd='some_password', db='sales')
curs = conn.cursor()
curs.execute('select * from sales')
# Added this to give a proper header for the output, with field names.
field_names = [ item[0] for item in curs.description ]
for field_name in field_names:
    print field_name.rjust(12),
print
for row in curs:
    for col in row:
        print str(col).rjust(12),
    print
print
curs.close()
conn.close()

I ran the program and got the correct output:

$ python test_pymysql.py
id       region         item   unit_price     quantity
1        North        Chair          100            2
2        North        Table          200            3
3        South         Desk          300            1
3        South        Shelf          400            2
$

I also ran this other SQL query with the GROUP BY clause, to get the region-wise sales:

curs.execute('select region, sum(unit_price * quantity) as value from sales group by region order by 1')

and got the expected output:

$ python test_pymysql.py
      region        value
       North          800
       South         1100

The PyMySQL Github page shows how it works: PyMySQL implements the client side of the MySQL Client/Server Protocol. The MySQL database server listens for database requests from clients on port 3306, and the clients (which may be on the same machine or on another machine on the network), send the request to that server at that port. The protocol also supports UNIX domain sockets, so the connect method has a named parameter unix_socket=None.

Related links:

PyPy, a Python interpreter and Just-In-Time compiler

IronPython, Python for .NET

Jython, Python for the JVM

MySQL

- Vasudev Ram - Dancing Bison Enterprises

Signup for email about new products that I create.

Contact Page

Share |

Sunday, September 14, 2014

Read a DBF file's metadata and data with Python and xtopdf

By Vasudev Ram

DBF files (a.k.a. XBASE files), were one of the most widely used data formats for storing structured relational data on PCs, due to the fact that the original products that used DBF files, dBase II and III, were among the most successful database products of their time (the early personal computer era). DBF is probably still very widely used as a format in small and medium-sized desktop-based applications.

DBFReader.py, a program I wrote as part of my xtopdf toolkit for PDF creation using Python, can be used to read the contents of a DBF file, including both the metadata (file and field header information) and the data records, and display them on the screen.

Here is an example invocation of DBFReader.py:

python DBFReader.py test3.dbf | more

This command will read the metadata and data of the specified DBF file and display them on your screen. Here is the output from the above command:

File header :

key: rec_len value: 30
key: ver value: 245
key: hdr_len value: 193
key: last_update value: 02/11/04
key: num_flds value: 5
key: num_recs value: 4

Field headers :

num_flds =  5
       Name |   Type | Length | Decimals
  FIELD1    |       C|       5|       0
  FIELD2    |       N|       5|       0
  FIELD3    |       L|       1|       0
  FIELD4    |       D|       8|       0
  FIELD5    |       M|      10|       0

Data records:

(' ', ['AAAAA', '11111', 'F', '19010101', '          '])
(' ', ['BBBBB', '22222', 'T', '19020202', '          '])
(' ', ['CCCCC', '33333', 'F', '19030303', '          '])
(' ', ['DDDDD', '44444', 'T', '19040404', '          '])

- Vasudev Ram - Python consulting and training - Dancing Bison Enterprises

Contact Page

Share |

Thursday, April 17, 2014

Nothing but the key: Nice mnemonic for database third normal form (3NF)

http://en.m.wikipedia.org/wiki/Third_normal_form

Saw this great mnemonic for 3NF at the above page:

A memorable statement of Codd's definition of 3NF, paralleling the traditional pledge to give true evidence in a court of law, was given by Bill Kent: "[Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key."[6] A common variation supplements this definition with the oath: "so help me Codd".[7]

Monday, January 27, 2014

JackDB, a web-based database client

JackDB - http://www.jackdb.com/ - is a browser-based database client. It requires no installation.

Thursday, January 23, 2014

Publish Berkeley DB data to PDF with xtopdf

By Vasudev Ram

Berkeley DB (sometimes called BDB or BSD DB) is an embedded (*) key-value database with a long history and a huge user base. It is quite fast and supports very large data sizes. Berkeley DB was developed by Sleepycat Software which was acquired by Oracle some years ago.

(*) "embedded", in the sense of, not client-server, it is a database library that gets linked with your application; not "embedded" in the sense of software embedded in hardware devices, although Berkeley DB can also be embedded in the second sense, since it is small in size.

Excerpt from the Wikipedia article about Berkeley DB linked above:

[ Berkeley DB (BDB) is a software library that provides a high-performance embedded database for key/value data. Berkeley DB is written in C with API bindings for C++, C#, PHP, Java, Perl, Python, Ruby, Tcl, Smalltalk, and many other programming languages. BDB stores arbitrary key/data pairs as byte arrays, and supports multiple data items for a single key. Berkeley DB is not a relational database.[1]
BDB can support thousands of simultaneous threads of control or concurrent processes manipulating databases as large as 256 terabytes,[2] on a wide variety of operating systems including most Unix-like and Windows systems, and real-time operating systems. ]

[ Incidentally, Mike Olson, the former CEO of Sleepycat Software, is now the Chief Strategy Officer of Cloudera, which I blogged about here today:

Cloudera's Impala engine - SQL querying of Hadoop data. ]

I've used Berkeley DB off and on, from before Sleepycat Software was acquired by Oracle, and including via at least C, Python and Ruby.

Today I thought of writing a program that enables a user to publish the data in a Berkeley DB database to PDF, using my xtopdf toolkit for PDF creation. Here is the program, BSDDBToPDF.py:

# BSDDBToPDF.py

# Program to convert Berkeley DB (BSD DB) data to PDF.
# Uses Python's bsdd library (deprecated in Python 3),
# and xtopdf.
# Author: Vasudev Ram - http://www.dancingbison.com

import sys
import bsddb
from PDFWriter import PDFWriter

try:
    # Flag 'c' opens the DB read/write and doesn't delete it if it exists.
    fruits_db = bsddb.btopen('fruits.db', 'c')
    fruits = [
            ('apple', 'The apple is a red fruit.'),
            ('banana', 'The banana is a yellow fruit.'),
            ('cherry', 'The cherry is a red fruit.'),
            ('durian', 'The durian is a yellow fruit.')
            ]
    # Add the key/value fruit records to the DB.
    for fruit in fruits:
        fruits_db[fruit[0]] = fruit[1]
    fruits_db.close()

    # Read the key/value fruit records from the DB and write them to PDF.
    with PDFWriter("fruits.pdf") as pw:
        pw.setFont("Courier", 12)
        pw.setHeader("BSDDBToPDF demo: fruits.db to fruits.pdf")
        pw.setFooter("Generated by xtopdf")
        fruits_db = bsddb.btopen('fruits.db', 'c')
        print "FRUITS"
        print
        pw.writeLine("FRUITS")
        pw.writeLine(" ")
        for key in fruits_db.keys():
            print key
            print fruits_db[key]
            print
            pw.writeLine(key)
            pw.writeLine(fruits_db[key])
            pw.writeLine(" ")
        fruits_db.close()

except Exception, e:
    sys.stderr.write("ERROR: Caught exception: " + repr(e) + "\n")
    sys.exit(1)

And here is a screenshot of the PDF output of the program:

- Vasudev Ram - Dancing Bison Enterprises

Contact Page

Share |

Friday, December 27, 2013

Publishing SQLite data to PDF is easy with xtopdf

By Vasudev Ram

This is another in my series of posts about the uses of xtopdf:

SQLite is an embedded database that is widely used.

According to an estimate by the SQLite developers, it may be the most widely deployed SQL database engine in the world.

Excerpt from the above-linked SQLite page that talks about the deployment numbers (emphasis mine):

[ Now let's consider where SQLite is used:

300 million copies of Mozilla Firefox.

20 million Mac computers, each of which contains multiple copies of SQLite

20 million websites run PHP which has SQLite built in. [3] We have no way of estimating what fraction of those sites actively use SQLite, but we think it is a significant fraction.
450 million registered Skype users.

20 million Symbian smartphones shipped in Q3 2007 [5] Newer versions of the SymbianOS have SQLite built in. It is unclear exactly how many Symbian phones actually contain SQLite, so we will use a single quarter's sales as a lower bound.

10 million Solaris 10 installations, all of which require SQLite in order to boot.

Millions and millions of copies of McAfee anti-virus software all use SQLite internally.

Millions of iPhones use SQLite

Millions and millions of other cellphones from manufactures other than Symbian and Apple use SQLite. This has not been publicly acknowledged by the manufactures but it is known to the SQLite developers. ]

An interesting fact about SQLite is that many startups use it as the database in the initial stages of development, because it is so fast to set up - pretty much no configuration or administration required. I've been involved in a few startups that used SQLite like this.

The list of famous SQLite users is interesting, and includes Adobe, Apple, Airbus, Dropbox, Google, Intuit, McAfee, Skype and more.

And last but not least, Python is one of the "organizations" listed that uses SQLIte. Yes, the sqlite3 library for Python is included in the stdlib since Python 2.5.

So, with that background, here is a program (SQLiteToPDF.py) that shows how easy it is to publish SQLite data to PDF using my xtopdf toolkit for PDF creation:

# SQLiteToPDF.py
# Author: Vasudev Ram - http://www.dancingbison.com
# SQLiteToPDF.py is a program to demonstrate how to read 
# SQLite database data and convert it to PDF.

import sys
from PDFWriter import PDFWriter
import sqlite3

try:

    conn = sqlite3.connect('example.db')
    curs = conn.cursor()

    # Create table.
    curs.execute('''DROP TABLE IF EXISTS stocks''')
    curs.execute('''CREATE TABLE stocks
                 (date text, trans text, symbol text, qty real, price real)''')

    # Insert a few rows of data.
    curs.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.0)")
    curs.execute("INSERT INTO stocks VALUES ('2007-02-06','SELL','ORCL',200,25.1)")
    curs.execute("INSERT INTO stocks VALUES ('2008-03-06','HOLD','IBM',200,45.2)")

    # Save the changes.
    conn.commit()

    # Now fetch back the inserted data and write it to PDF.
    curs.execute("SELECT * FROM stocks")
    with PDFWriter("sqlite_data.pdf") as pw:
        pw.setFont("Courier", 12)
        pw.setHeader("SQLite data to PDF")
        pw.setFooter("Generated by xtopdf - https://bitbucket.org/vasudevram/xtopdf")
        for row in curs:
            row_str = ""
            for col in row:
                row_str = row_str + str(col).rjust(6) + " "
            pw.writeLine(row_str)
    conn.close()

except Exception, e:
    print "ERROR: Caught exception: " + repr(e)
    sys.exit(1)

You can run the program like this:

python SQLiteToPDF.py

And here is its output:

Note: The stock price data shown is randomly made up, not real, of course.

Also see this previous post of mine about sqlite3dbm, an SQLite-backed dbm module.

- Vasudev Ram - Dancing Bison Enterprises

Contact Page

Share |

Friday, December 20, 2013

OrientDB, a distributed transactional graph database

By Vasudev Ram

SQL

OrientDB from Orient Technologies (in London, UK :-) is a distributed transactional graph database.

I saw it via this post on Hacker News:

OrientDB, the most versatile database I’ve run across

The article linked in that post is here:

OrientDB. Thanks!!!

An interesting point in that article is that the author, Petter Graff, thinks that OrientDB is versatile, and somewhat unusual in that it seems to be good for graph, document and object-oriented database applications - which may not be common in a single database product.

I scanned some pages on the OrientDB / Orient Technologies sites and got an overview of the product. It has a native Java and a JDBC interface. It also seems to have interfaces, i.e. bindings, to some other popular programming languages or technologies, including JavaScript, Scala, C, PHP. .NET, Python, Node.js, Clojure and Android.

They have both a free Community edition and an Enterprise edition, with professional services. OrientDB customers includes Lufthansa, Pitney Bowes and Cisco, according to their site.

I'm checking out OrientDB (with the Python binding) and will blog again later if I find it interesting.

- Vasudev Ram - Python, Linux and open source consulting

Contact Page

Share |

Tuesday, August 20, 2013

Publish SQLAlchemy data to PDF with xtopdf

By Vasudev Ram

SQLAlchemyToPDF is a demo program that shows how to publish your database data via SQLAlchemy to PDF.

SQLAlchemy is a popular and widely used database toolkit for Python. It includes both a Core, which consists of a sort of DSL (Domain-Specific Language) for SQL, written in Python, and an ORM (Object Relational Mapper) which is built on top of the Core.

SQLAlchemyToPDF, after some improvement, will become a part of my xtopdf toolkit for PDF creation. It will be released under the BSD license, like the rest of xtopdf.

Using the technique shown below (with appropriate modifications), you can publish data, from any of the major databases that SQLAlchemy supports, to PDF. And you can do this using the high-level interface provided by SQLAlchemy's ORM, which means code that is shorter and easier to write.

However, SQLAlchemy also provides you the ability to go to a lower level when needed, to access more of the power of SQL or of a specific database.

Here is the code for SQLAlchemyToPDF.py:

# SQLAlchemyToPDF.py
# Program to read database data via SQLAlchemy 
# and publish it to PDF. This is a demo.
# Author: Vasudev Ram - http://www.dancingbison.com
# Copyright 2013 Vasudev Ram
# Version 0.1

from PDFWriter import PDFWriter
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:', echo=False)

from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

from sqlalchemy import Column, Integer, String

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    email = Column(String)

    def __init__(self, name, fullname, email):
        self.name = name
        self.fullname = fullname
        self.email = email

Base.metadata.create_all(engine) 

a_user = User('A', 'A 1', '[email protected]')
b_user = User('B', 'B 2', '[email protected]')
c_user = User('C', 'C 3', '[email protected]')

from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)

session = Session()

for user in (a_user, b_user, c_user):
    session.add(user)

pw = PDFWriter('users.pdf')
pw.setFont("Courier", 12)
pw.setHeader("SQLAlchemyToPDF - User table report")
pw.setFooter("Generated by xtopdf using Reportlab and Python")

users = session.query(User)
for user in users:
    pw.writeLine(user.name + "|" + user.fullname + "|" + user.email)

pw.savePage()
pw.close()

I used an in-memory SQLite database to keep the code simple. Refer to the SQLAlchemy documentation for how to connect to other databases.

And here is a screenshot of the resulting PDF output:

Read other xtopdf posts on jugad2

Read other python posts on jugad2

- Vasudev Ram - Dancing Bison Enterprises

Contact me

Share |

Wednesday, April 17, 2013

Common DB developer mistake? Using natural instead of surrogate keys

By Vasudev Ram

Just saw this StackOverflow article:

Database development mistakes made by application developers [closed]

via this Twitter conversation between me and @coding_insights.

The article had this as one of the answers:

"3. Using natural rather than surrogate (technical) primary keys"

followed by an explanation of why that person thinks so.

I tend to agree with that comment, based on my own experience working on many database-backed projects.

Though it seems counter-intuitive, using surrogate keys seems to be a better and more robust approach.

The commenter gave many reasons for using surrogate keys instead of natural keys, (and as I said, I've experienced at least some of those reasons), so I won't repeat them here; go take a look at that article.

- Vasudev Ram - Dancing Bison Enterprises

Share |

Wednesday, April 10, 2013

Using xtopdf and pypyodbc to publish MS Access database data to PDF

By Vasudev Ram

I had blogged about pypyodbc, a pure-Python ODBC library, recently.

Using pypyodbc with my xtopdf toolkit for PDF creation, you can publish your MS Access database data to PDF.

Here is some example code to publish MS Access data to PDF:

First, the program create_ppo_mdb.py, shown below, creates an MS Access database called fruits.mdb, then creates a table called fruits in it, and inserts 3 records into the table:

# create_ppo_mdb.py

import pypyodbc 
             
pypyodbc.win_create_mdb('.\\fruits.mdb')
connection_string = 'Driver={Microsoft Access Driver (*.mdb)};DBQ=.\\fruits.mdb'
connection = pypyodbc.connect(connection_string)

SQL = 'CREATE TABLE fruits (id COUNTER PRIMARY KEY, fruit_name VARCHAR(25));'
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO fruits values (1, 'apple');"
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO fruits values (2, 'banana');"
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO fruits values (3, 'orange');"
connection.cursor().execute(SQL).commit()

# Uncomment the 5 lines below make the program also display the data after creating it.

#SQL = 'SELECT * FROM fruits;'
#cursor = connection.cursor().execute(SQL)
#for row in cursor:
#    for col in row:
#        print col,
#    print

cursor.close()
connection.close()

Next, the program MDBtoPDF.py, shown below, reads the data from the fruits table in the MDB database just created above, and publishes the selected records to PDF:

#-------------------------------------------------------------------

# MDBtoPDF.py
# Description: A program to convert MS Access .MDB data to PDF format.
# Author: Vasudev Ram - http://www.dancingbison.com

#-------------------------------------------------------------------

# imports

import sys 
import os
import time
import string
import pypyodbc 
from PDFWriter import PDFWriter
             
#-------------------------------------------------------------------

# globals

##------------------------ usage ---------------------------------------

def usage():

 sys.stderr.write("Usage: python " + sys.argv[0] + " MDB_DSN table_name pdf_file\n")
 sys.stderr.write("where MDB_DSN is the ODBC DSN (Data Source Name) for the\n")
 sys.stderr.write("MDB file, table_name is the name of the table in that MDB,\n")
 sys.stderr.write("whose data you want to convert to PDF, and pdf_file is the\n")
 sys.stderr.write("output PDF filename.\n")
 sys.stderr.write(sys.argv[0] + " reads the table data from the MDB and\n")
 sys.stderr.write("writes it to pdf_file.\n")

##------------------------ main ------------------------------------------

def main():

 '''Main program to convert MDB data to PDF.
 '''

 # check for right num. of args
 if (len(sys.argv) != 4):
  usage()
  sys.exit(1)

 # extract MDB DSN, table name and pdf filename from args
 mdb_dsn = sys.argv[1]
 table_name = sys.argv[2]
 pdf_fn = sys.argv[3]

 print "mdb_dsn =", mdb_dsn
 print "table_name =", table_name
 print "pdf_fn =", pdf_fn

    # build connection string
 connection_string_prefix = 'Driver={Microsoft Access Driver (*.mdb)};DBQ='
 connection_string = connection_string_prefix + mdb_dsn
 print "connection_string =", connection_string
 connection = pypyodbc.connect(connection_string)
 print "connection =", connection

 # create the PDFWriter instance
 pw = PDFWriter(pdf_fn)

 # and set some of its fields

 # set the font
 pw.setFont("Courier", 10)

 # set the page header
 gen_datetime = time.asctime()
 pw.setHeader("Generated by MDBtoPDF: Input: " + mdb_dsn + \
 " At: " + gen_datetime)

 # set the page footer
 pw.setFooter("Generated by MDBtoPDF: Input: " + mdb_dsn + \
 " At: " + gen_datetime)

 # create the separator for logical grouping of output
 sep = "=" * 60

 # print the data records section title
 pw.writeLine("MDB Data Records from MDB: %s, table: %s" % (mdb_dsn, 
  table_name))

 # print a separator line
 pw.writeLine(sep)

 # read the input MDB data and write it to the PDF file

 SQL = 'SELECT * FROM fruits;'

 cursor = connection.cursor().execute(SQL)
 for row in cursor:
  str_row = ""
  for col in row:
   str_row = str_row + str(col) + " "
  pw.writeLine(str_row)

 # close the cursor and connection
 cursor.close()
 connection.close()

 # print a separator line
 pw.writeLine(sep)

 # save current page
 pw.savePage()

 # close the PDFWriter
 pw.close()

##------------------------ Global code -----------------------------------

# invoke main

if __name__ == '__main__':
 main()

##------------------------ EOF - MDBto_PDF.py ---------------

To make the above programs work, you need to have the Reportlab toolkit v1.21 and the xtopdf toolkit installed, in addition to pypyodbc and Python 2.7. (Click on the "Branches" tab on the xtopdf page linked in the previous sentence to download xtopdf.)

I've had an interest in ODBC ever since I first worked, as team leader, on a middleware software product that used ODBC. The middleware was developed at Infosys Technologies, where I worked at the time.

Though ODBC itself had a good architecture, many driver implementations of the time (this was some years ago) were rather slow, so one of the main goals of the product was to improve the performance of client-server or desktop applications (written in Visual Basic or C) that used ODBC for database access.

I remember learning ODBC as part of the project (and teaching it to the team), and reading most of the book "Inside ODBC" by Kyle Geiger, one of the architects of ODBC - it was a fascinating book, that gave a detailed look inside the architecture of ODBC, the reasons for certain design decisions that were made, and so on.

We succeeded in meeting all the goals of the project, and that middleware product was used in many large client-server applications (using VB and Oracle / Sybase) that were developed by Infosys for its clients. I really had a lot of fun working on that project.

Related links:

ODBC entry on Wikipedia

Inside ODBC - the book, on Amazon

eGenix mxODBC Connect, from eGenix, a German Python products company.

eGenix mxODBC

unixODBC

DataDirect ODBC

iODBC

The Microsoft SQL Server ODBC Driver for Linux - it provides native connectivity from Linux to Microsoft SQL Server. (Seems to be 64-bit only).

- Vasudev Ram - Dancing Bison Enterprises

Share |

Tuesday, March 26, 2013

pypyodbc, a pure Python ODBC library

By Vasudev Ram

In my recent post, PyODBC for Linux, Windows and Mac OS X, Ben Rousch commented: "I've had better luck with pure-Python PyPyODBC than with PyODBC."

So I downloaded and tried out pypyodbc.

Features of pypyodbc (from its web page):

Design Goal: Small, Maintainable, Cross Platform, Compatible, Portable

Features
One pure Python script, runs on CPython / IronPython / PyPy , Python 3.3 / 2.4 / 2.5 / 2.6 / 2.7 , Win / Linux / Mac , 32 / 64 bit
Very similar usage as pyodbc ( can be seen like a re-implementation of pyodbc in pure Python )
Simple - the whole module is implemented in a single python script with less than 3000 lines
Built-in functions to create and compress Access MDB files on Windows

I first tried the simple test program given on the pypyodbc page, which just creates a new MS Access database using Python. It worked.

Then I added code to that test program to insert three rows to the table and then and then retrieve the rows from the database and print them.

Here is the code:

# test_ppo.py

import pypyodbc 
             
pypyodbc.win_create_mdb('C:\\vr\\pypyodbc\\db2.mdb')
connection_string = 'Driver={Microsoft Access Driver (*.mdb)};DBQ=C:\\vr\\pypyodbc\\db2.mdb'
connection = pypyodbc.connect(connection_string)

SQL = 'CREATE TABLE saleout (id COUNTER PRIMARY KEY,product_name VARCHAR(25));'
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO saleout values (1, 'apple');"
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO saleout values (2, 'banana');"
connection.cursor().execute(SQL).commit()

SQL = "INSERT INTO saleout values (3, 'orange');"
connection.cursor().execute(SQL).commit()

SQL = 'SELECT * FROM saleout;'
cursor = connection.cursor().execute(SQL)
for row in cursor:
    print row

And I got the expected output:

C:\vr\pypyodbc>python test_ppo.py
(1, u'apple')
(2, u'banana')
(3, u'orange')

- Vasudev Ram - Dancing Bison Enterprises

Share |

Thursday, March 21, 2013

PyODBC for Linux, Windows and Mac OS X

By Vasudev Ram

PyODBC is like the odbc module in the Standard Python library, but claims to work on Linux, Windows and Mac OS X. I've used the odbc module from the stdlib earlier. Using it is straightforward for basic CRUD-type usage.

PyODBC is supposed to "implement the Python Database API Specification v2.0, but additional features have been added to simplify database programming even more." I am trying out PyODBC (have taken an initial look at the examples, seems similar to the stdlib odbc module) and will report on it a little later.

ODBC on Wikipedia.

- Vasudev Ram - Dancing Bison Enterprises

Share |

Saturday, March 2, 2013

Dancing Bison Enterprises - Profile

Dancing Bison Enterprises is a small software company based in Pune, India.

We have done projects for USA-, Europe- and India-based clients in Python, Ruby on Rails, C, Java, PDF and other technologies. We have good skills and experience in Python, C, UNIX/Linux, Java, Ruby, relational databases, PDF creation, and multiple open source technologies, and in software requirements analysis, design and implementation. Development of robust software applications and components is our forte. Open source technologies and UNIX/Linux are two of our key areas of strength - we have been working with UNIX from some time before Linux was first created, and with open source software from before the term "open source" was coined. Have many years of software development experience, including both working with large international and Indian software companies and with small companies, including startups based both abroad and in India.

Please visit our business web site www.dancingbison.com for an overview about us.

The founder of the company, Vasudev Ram, is a nominated / elected member of the Python Software Foundation.

Developed multiple open source software products/projects - please see our Products page , and our Bitbucket page . More such products are in the pipeline.

Packt Publishing of the UK uses our Python product, xtopdf, in their book production workflow. The Software Freedom Law Center of the USA also uses xtopdf as part of their e-discovery work.

Published technical articles on Python, C and Linux; please see:

Using xtopdf (on the Packt Publishing site)

Developing a Linux command-line utility (on the IBM developerWorks site)

A vi quickstart tutorial (in Linux For You magazine)

How Knoppix saved the day (in Linux For You magazine)

Our article on Developing a Linux command-line utility - the 2nd link in the list of articles above - was published on IBM developerWorks, and translated by IBM for the Chinese and Japanese versions of their developerWorks site. More than one organization has used the article as a basis for developing production command-line tools.

We are available for web or non-web application development, open source and related consulting/contract work, and for corporate software training in the areas of our skills.

Please visit our Contact page to get in touch with us or to request a quote.

Sunday, February 24, 2013

vitess - scalable RPC interface to MySQL, used by YouTube

vitess - Scaling MySQL databases for the web - Google Project Hosting

Vitess, used at YouTube, is a scalable RPC interface to MySQL, with sharding, limited ACID support, Python and Go client interfaces, and other features.

.

Pages

Friday, January 18, 2019

Thursday, September 29, 2016

Wednesday, February 25, 2015

Thursday, January 15, 2015

Tuesday, December 30, 2014

Friday, December 12, 2014

Monday, November 10, 2014

Sunday, September 14, 2014

Thursday, April 17, 2014

Monday, January 27, 2014

Thursday, January 23, 2014

Friday, December 27, 2013

Friday, December 20, 2013

SQL

Tuesday, August 20, 2013

Wednesday, April 17, 2013

Wednesday, April 10, 2013

Tuesday, March 26, 2013

Thursday, March 21, 2013

Saturday, March 2, 2013

Sunday, February 24, 2013

Blog Archive

Labels