Showing posts with label tabular-data. Show all posts
Showing posts with label tabular-data. Show all posts

Friday, January 23, 2015

PrettyTable to PDF is pretty easy with xtopdf

By Vasudev Ram




"PrettyTable to PDF is pretty easy with xtopdf."

How's that for some alliteration? :)

PrettyTable is a Python library to help you generate nice tables with ASCII characters as the borders, plus alignment of text within columns, headings, padding, etc.

Excerpt from the site:

[ PrettyTable is a simple Python library designed to make it quick and easy to represent tabular data in visually appealing ASCII tables.
...
PrettyTable lets you control many aspects of the table, like the width of the column padding, the alignment of text within columns, which characters are used to draw the table border, whether you even want a border, and much more. You can control which subsets of the columns and rows are printed, and you can sort the rows by the value of a particular column.

PrettyTable can also generate HTML code with the data in a <table> structure. ]

I came across PrettyTable via this blog post:

11 Python Libraries You Might Not Know.

Then I thought of using it with my PDF creation toolkit, to generate such ASCII tables, but as PDF. Here's a program, PrettyTableToPDF.py, that shows how to do that:
"""
PrettyTableToPDF.py
A demo program to show how to convert the output generated 
by the PrettyTable library, to PDF, using the xtopdf toolkit 
for PDF creation from other formats.
Author: Vasudev Ram - http://www.dancingbison.com
xtopdf is at: http://slides.com/vasudevram/xtopdf

Copyright 2015 Vasudev Ram
"""

from prettytable import PrettyTable
from PDFWriter import PDFWriter

pt = PrettyTable(["City name", "Area", "Population", "Annual Rainfall"])
pt.align["City name"] = "l" # Left align city names
pt.padding_width = 1 # One space between column edges and contents (default)
pt.add_row(["Adelaide",1295, 1158259, 600.5])
pt.add_row(["Brisbane",5905, 1857594, 1146.4])
pt.add_row(["Darwin", 112, 120900, 1714.7])
pt.add_row(["Hobart", 1357, 205556, 619.5])
pt.add_row(["Sydney", 2058, 4336374, 1214.8])
pt.add_row(["Melbourne", 1566, 3806092, 646.9])
pt.add_row(["Perth", 5386, 1554769, 869.4])
lines = pt.get_string()

pw = PDFWriter('Australia-Rainfall.pdf')
pw.setFont('Courier', 12)
pw.setHeader('Demo of PrettyTable to PDF')
pw.setFooter('Demo of PrettyTable to PDF')
for line in lines.split('\n'):
    pw.writeLine(line)
pw.close()

You can run the program with:
$ python PrettyTableToPDF.py
And here is a screenshot of the output PDF in Foxit PDF Reader:


- Enjoy.

--- Posts about Python --- Posts about xtopdf ---

- Vasudev Ram - Dancing Bison Enterprises - Python programming and training

Signup to be informed about my new products or services.

Contact Page

Thursday, December 25, 2014

Create tabular PDF reports with Python, xtopdf and tablib

By Vasudev Ram


Tablib is a Python library that allows you to import, export and manipulate tabular data.

I had come across tablib a while ago. Today I thought of using it with xtopdf, my Python library for PDF creation, to generate PDF output from tabular data. So I wrote a program, TablibToPDF.py, for that. It generates dummy data for student grades (for an examination), then puts that data into a tablib Dataset, and then exports the contents of that Dataset to PDF, using xtopdf. Given the comments in the code, it is mostly self-explanatory. I first wrote the program in an obvious/naive way, and then improved it a little by removing some intermediate variables, and by converting some for loops to list comprehensions, thereby shortening the code by a few lines. Here is the code for TablibToPDF.py:
"""
TablibToPDF.py
Author: Vasudev Ram
Copyright 2014 Vasudev Ram - www.dancingbison.com
This program is a demo of how to use the tablib and xtopdf Python libraries 
to generate tabular data reports as PDF output.
Tablib is at: https://tablib.readthedocs.org/en/latest/
xtopdf is at: https://bitbucket.org/vasudevram/xtopdf
and info about xtopdf is at: http://slides.com/vasudevram/xtopdf or 
at: http://slid.es/vasudevram/xtopdf
"""

import random
import tablib
from PDFWriter import PDFWriter

# Helper function to output a string to both screen and PDF.
def print_and_write(pw, strng):
    print strng
    pw.writeLine(strng)

# Set up grade and result names and mappings.
grade_letters = ['F', 'E', 'D', 'C', 'B', 'A']
results = {'A': 'Pass', 'B': 'Pass', 'C': 'Pass', 
    'D': 'Pass', 'E': 'Pass', 'F': 'Fail'}

# Create an empty Dataset and set its headers.
data = tablib.Dataset()
data.headers = ['ID', 'Name', 'Marks', 'Grade', 'Result']
widths = [5, 12, 8, 8, 12] # Display widths for columns.

# Create some rows of student data and use it to populate the Dataset.
# Columns for each student row correspond to the header columns 
# shown above.

for i in range(20):
    id = str(i).zfill(2)
    name = 'Student-' + id
    # Let's grade them on the curve [1].
    # This examiner doesn't give anyone 100 marks :)
    marks = random.randint(40, 99)
    # Compute grade from marks.
    grade = grade_letters[(marks - 40) / 10]
    result = results[grade]
    columns = [id, name, marks, grade, result]
    row = [ str(col).center(widths[idx]) for idx, col in enumerate(columns) ]
    data.append(row)

# Set up the PDFWriter.
pw = PDFWriter('student_grades.pdf')
pw.setFont('Courier', 10)
pw.setHeader('Student Grades Report - generated by xtopdf')
pw.setFooter('xtopdf: http://slides.com/vasudevram/xtopdf')

# Generate header and data rows as strings; output them to screen and PDF.

separator = '-' * sum(widths)
print_and_write(pw, separator)

# Output headers
header_strs = [ header.center(widths[idx]) for idx, header in enumerate(data.headers) ]
print_and_write(pw, ''.join(header_strs))
print_and_write(pw, separator)

# Output data
for row in data:
    print_and_write(pw, ''.join(row))

print_and_write(pw, separator)
pw.close()

# [1] http://en.wikipedia.org/wiki/Grading_on_a_curve
# I'm not endorsing the idea of grading on a curve; I only used it as a 
# simple algorithm to generate the marks and grades for this example.

You can run it with:
$ python TablibToPDF.py
It sends the tabular output that it generates, to both the screen and to a PDF file named student_grades.pdf.
Here is a screenshot of the generated PDF file, opened in Foxit PDF Reader:


The program that I wrote could actually have been written without using tablib, just with plain Python lists and/or dictionaries. But tablib has some additional features, such as dynamic columns, export to various formats (but not PDF), and more - see its documentation, linked near the top of this post. I may write another blog post later that explores the use of some of those tablib features.

- Enjoy.

Vasudev Ram - Python consulting and training - Dancing Bison Enterprises

Signup to hear about new products or services from me.

Contact Page

Friday, March 8, 2013

Python TableFu aims to become an ORM for spreadsheets

By Vasudev Ram

Python TableFu is an interesting tool I saw recently. It is a Python library that lets you import tabular data from CSV files and then manipulate them in memory, by calling its methods.

From the site:

[ Python TableFu is a tool for manipulating spreadsheet-like tables in Python. It began as a Python implementation of ProPublica's TableFu, though new methods have been added. TableFu allows filtering, faceting and manipulating of data. Going forward, the project aims to create something akin to an ORM for spreadsheets. ]

A CSV file you specify, is used as input to create a TableFu object, and the rows are available as a list, which can be indexed to get a specific row. The columns (of each row) are available as a dictionary, so you can say table[column_name], e.g. table["Author] to get a specific column. The model seems to make sense, since table rows usually don't have specific names, they are treated as "row number so-and-so", while columns do have names - the header of the column, if it is a CSV file, or the column name of a database table, using RDBMS terminology.

Python TableFu allows filtering, "faceting" and manipulation of the table data.

Here is some example Python TableFu code, from the site; I've deleted some bits to keep it short:

>>> from table_fu import TableFu
>>> table = TableFu.from_file('tests/test.csv')
>>> table.columns
['Author', 'Best Book', 'Number of Pages', 'Style']

# get all authors
>>> table.values('Author')
['Samuel Beckett', 'James Joyce', 'Nicholson Baker', 'Vladimir Sorokin']

# total a column
>>> table.total('Number of Pages')
1177.0

# filtering a table returns a new instance
>>> t2 = table.filter(Style='Modernism')
>>> list(t2)
[Row: Samuel Beckett, Malone Muert, 120, Modernism,
 Row: James Joyce, Ulysses, 644, Modernism]


# each TableFu instance acts like a list of rows
>>> table[0]


list(table.rows)
[Row: Samuel Beckett, Malone Muert, 120, Modernism,
 Row: James Joyce, Ulysses, 644, Modernism,
 Row: Nicholson Baker, Mezannine, 150, Minimalism,
 Row: Vladimir Sorokin, The Queue, 263, Satire]

# rows, in turn, act like dictionaries
>>> row = table[1]
>>> print row['Author']
James Joyce

# transpose a table
>>> t2 = table.transpose()
>>> list(t2)
[Row: Best Book, Malone Muert, Ulysses, Mezannine, The Queue,
 Row: Number of Pages, 120, 644, 150, 263,
 Row: Style, Modernism, Modernism, Minimalism, Satire]

>>> t2.columns
['Author',
 'Samuel Beckett',
 'James Joyce',
 'Nicholson Baker',
 'Vladimir Sorokin']

# sort rows
>>> table.sort('Author')
>>> table.rows
[Row: James Joyce, Ulysses, 644, Modernism,
 Row: Nicholson Baker, Mezannine, 150, Minimalism,
 Row: Samuel Beckett, Malone Muert, 120, Modernism,
 Row: Vladimir Sorokin, The Queue, 263, Satire]

(I deleted the angle brackets from the original output (that demarcates Rows) because it was messing up the HTML formatting, due to being interpreted as HTML elements.)

Python TableFu looks like it could be useful to manipulate CSV data in memory, before writing it out to another file or sending it to another process (or function in the same program) for further processing.

- Vasudev Ram - Dancing Bison Enterprises