Moved

Moved. See https://slott56.github.io. All new content goes to the new site. This is a legacy, and will likely be dropped five years after the last post in Jan 2023.

Showing posts with label python3. Show all posts
Showing posts with label python3. Show all posts

Tuesday, July 28, 2020

Modern Python Cookbook 2nd ed -- Advance Copies -- DM me

This is your "why wait" invitation.

Advanced copies will be available.  

IF.

And this is a big "if".

You have to write a blurb. 

I'll be putting you in contact with Packt marketing folks who will get you your advanced copy so you can write blurbs and reviews and -- well -- actually use the content.

It's all updated to Python 3.8. Type hints almost everywhere. F-strings and the walrus operator. Bunches of devops and data science examples. Plus a few personal examples involving sailboat navigation and management.

See me at LinkedIn https://www.linkedin.com/in/steven-lott-029835/ and I'll hook you up with Packt marketing folks.

See https://www.amazon.com/Modern-Python-Cookbook-Updated-programmer/dp/180020745X for the official Amazon Book Link. This is for ordinary "no obligation to write a review" orders.

DM me directly slott56 at gmail to be put into the marketing spreadsheet.

Tuesday, June 20, 2017

NMEA Data Acquisition -- An IoT Exercise with Python

Here's the code: https://github.com/slott56/NMEA-Tools. This is Python code to do some Internet of Things (IoT) stuff. Oddly, even when things connected by a point-to-point serial interface, it's still often called IoT. Even though there's no "Internetworking."

Some IoT projects have a common arc: exploration, modeling, filtering, and persistence. This is followed by the rework to revise the data models and expand the user stories. And then there's the rework conundrum. Stick with me to see just how hard rework can be.

What's this about? First some background. Then I'll show some code.

Part of the back story is here: http://www.itmaybeahack.com/TeamRedCruising/travel-2017-2018/that-leaky-hatch--chartplot.html

In the Internet of Things Boaty (IoT-B) there are devices called chart-plotters. They include GPS receivers, displays, and controls. And algorithms. Most important is the merging of GPS coordinates and an active display. You see where your boat is.

Folks with GPS units in cars and on their phones have an idea core feature set of a chart plotter. But the value of a chart plotter on a boat is orders of magnitude above the value in a car.

At sea, the hugeness and importance of the chartplotter is magnified. The surface of the a large body of water is (almost) trackless. Unless you're really familiar with it, it's just water, generally opaque. The depths can vary dramatically. A shoal too shallow for your boat can be more-or-less invisible and just ahead. Bang. You're aground (or worse, holed.)

A chart -- and knowledge of your position on that chart -- is a very big deal. Once you sail out of sight of land, the chart plotter becomes a life-or-death necessity. While I can find the North American continent using only a compass, I'm not sure I could find the entrance to Chesapeake Bay without knowing my latitude. (Yes, I have a sextant. Would I trust my life to my sextant skills?)

Modern equipment uses modern hardware and protocols. N2K (NMEA 2000), for example, is powered Ethernet connectivity that uses a simplified backbone with drops for the various devices. Because it's Ethernet, they're peers, and interconnection is simplified. See http://www.digitalboater.com for some background.

The Interface Issue

The particularly gnarly problem with chart plotters is the lack of an easy-to-live-with interface.

They're designed to be really super robust, turn-it-on-and-it-works products. Similar to a toaster, in many respects. Plug and play. No configuration required.

This is a two-edged sword. No configuration required bleeds into no configuration possible.

The Standard Horizon CP300i uses NT cards. Here's a reader device. Note the "No Longer Available" admonition. All of my important data is saved to the NT card. But. The card is useless except for removable media backup in case the unit dies.

What's left? The NMEA-0183 interface wiring.

NMEA Serial EIA-422

The good news is that the NMEA wiring is carefully documented in the CP300i owner's manual. There are products like this NMEA-USB Adaptor. A few wire interconnections and we can -- at least in principle -- listen to this device.

The NMEA standard was defined to allow numerous kinds of devices to work together. When it was adopted (in 1983), the idea was that a device would be a "talker" and other devices would be "listeners." The intent was to have a lot of point-to-point conversations: one talker many listeners.

A digital depth meter or wind meter, for example, could talk all day, pushing out message traffic with depth or wind information. A display would be a listener and display the current depth or wind state.

A centralized multiplexer could collect from multiple listeners and then stream the interleaved messages as a talker. Here's an example. This would allow many sensors to be combined onto a single wire. A number of display devices could listen to traffic on the wire, pick out messages that made sense to them, and display the details.

Ideally, if every talker was polite about their time budget, hardly anything would get lost.

In the concrete case of the CP300i, there are five ports. usable in various combinations. There are some restrictions that seem to indicate some hardware sharing among the ports. The product literature describes a number of use cases for different kinds of interconnections including a computer connection.

Since NMEA is EIA-422 is RS-232, some old computer serial ports could be wired up directly. My boat originally had an ancient Garmin GPS and an ancient windows laptop using an ancient DB-9 serial connector. I saved the data by copying files off the hard drive and threw the hardware away.

A modern Macintosh, however, only handles USB. Not direct EAI-422 serial connections. An adaptor is required.

What we will have, then, is a CP300i in talker mode, and a MacBook Pro (Retina, 13-inch, Late 2013) as listener.

Drivers and Infrastructure

This is not my first foray in the IoT-B world. I have a BU-353 GPS antenna. This can be used directly by the GPSNavX application on the Macintosh. On the right-ish side of the BU-353 page are Downloads. There's a USB driver listed here. And a GPS Utility to show position and satellites and the NMEA data stream.

Step 1. Install this USB driver.

Step 2. Install the MAC OS X GPS Utility. I know the USB interface works because I can see the BU-353 device using this utility.

Step 3. Confirm with GPSNavX. Yes. The chart shows the little boat triangle about where I expect to be.

Yay! Phase I of the IoT-B is complete. We have a USB interface. And we can see an NMEA-0183 GPS antenna. It's transmitting in standard 4800 BAUD mode. This is the biggest hurdle in many projects: getting stuff to talk.

In the project Background section on Git Hub, there's a wiring diagram for the USB to NMEA interface.

Also, the Installation section says install pyserial. https://pypi.python.org/pypi/pyserial. This is essential to let Python apps interact with the USB driver.

Data Exploration

Start here: NMEA Reference Manual. This covers the bases for the essential message traffic nicely. The full NMEA standard has lots of message types. We only care about a few of them. We can safely ignore the others.

As noted in the project documentation, there's a relatively simple message structure. The messages arrive more-or-less constantly. This leads to an elegant Pythonic design: an Iterator.

We can define a class which implements the iterator protocol (__iter__() and __next__()) that will consume lines from the serial interface and emit the messages which are complete and have a proper checksum. Since the fields of a message are comma-delimited, might as well split into fields, also.

It's handy to combine this with the context manager protocol (__enter__() and __exit__()) to create a class that can be used like this.

    with Scanner(device) as GPS:
        for sentence_fields in GPS:
            print(sentence_fields) 

This is handy for watching the messages fly past. The fields are kind of compressed. It's a light-weight compression, more like a lack of useful punctuation than proper compression.

Consequently, we'll need to derive fields from the raw sequences of bytes. This initial exploration leads straight to the next phase of the project.

Modeling

We can define a data model for these sentences using a Sentence class hierarchy. We can use a simple Factory function to emit Sentence objects of the appropriate subclass given a sequence of fields in bytes. Each subclass can derive data from the message.

The atomic fields seem to be of seven different types.
  • Text. This is a simple decode using ASCII encoding.
  • Latitude. The values are in degrees and float minutes.
  • Longitude. Similar to latitude.
  • UTC date. Year, month, and day as a triple.
  • UTC time. Hour, minute, float seconds as a triple.
  • float. 
  • int.
Because fields are optional, we can't naively use the built-in float() and int() functions to convert bytes to numbers. We'll have to have a version that works politely with zero-length strings and creates None objects.

We can define a simple field definition tuple, Field = namedtuple('Field', ['title', 'name', 'conversion']). This slightly simplifies definition of a class.

We can define a class with a simple list of field conversion rules.

class GPWPL(Sentence):
    fields = [
        Field('Latitude', 'lat_src', lat),
        Field('N/S Indicator', 'lat_h', text),
        Field('Longitude', 'lon_src', lon),
        Field('E/W Indicator', 'lon_h', text),
        Field("Name", "name", text),        
    ]

The superclass __init__() uses the sequence of field definitions to apply conversion functions (lat(), lon(), text()) to the bytes, populating a bunch of attributes. We can then use s.lat_src to see the original latitude 2-tuple from the message. A property can deduce the actual latitude from the s.lat_src and s.lat_h fields.

For each field, apply the function to the value, and set this as an attribute.

        for field, arg in zip(self.fields, args[1:]):
            try:
                setattr(self, field.name, field.conversion(arg))
            except ValueError as e:
                self.log.error(f"{e} {field.title} {field.name} {field.conversion} {arg}")

This sets attributes with useful values derived from the bytes provided in the arguments.

The factory leverages a cool name-to-class mapping built by introspection.

    sentence_class_map = {
        class_.__name__.encode('ascii'): class_ 
        for class_ in Sentence.__subclasses__()
    }
    class_= self.sentence_class_map.get(args[0])

This lets us map a sentence header (b"GPRTE") to a class (GPRTE) simply. The get() method can use an UnknownSentence subclass as a default.

Modeling Alternatives

As we move forward, we'll want to change this model. We could use a cooler class definition style, something like this. We could then iterate of the keys in the class __dict__ to set the attribute values.

class GPXXX(Sentence):
    lat_src = Latitude(1)
    lat_h = Text(2)
    lon_src = Longitude(3)
    lon_h = Text(4)
    name = Text(5)

The field numbers are provided to be sure the right bunch of bytes are decoded.

Or maybe even something like this:

class GPXXX(Sentence):
    latitude = Latitude(1, 2)
    longitude = Longitude(3, 4)
    name = Text(5)

This would combine source fields to create the useful value. It would be pretty slick. But it requires being *sure* of what a sentence' content is. When exploring, this isn't the way to start. The simplistic list of field definitions comes right off web sites without too much intermediate translation that can lead to confusion.

The idea is to borrow the format from the SiRF reference and start with Name, Example, Unit, and Description in each Field definition. That can help provide super-clear documentation when exploring. The http://aprs.gids.nl/nmea/ information has similar tables with examples. Some of the http://freenmea.net/docs examples only have names.

The most exhaustive seems to be http://www.catb.org/gpsd/NMEA.html. This, also, only has field names and position numbers. The conversions are usually pretty obvious.

Filtering

A talker -- well -- talks. More or less constantly. There are delays to allow time to listen and time for multiplexers to merge in other talker streams.

There's a cycle of messages that a device will emit. Once you've started decoding the sentences, the loop is obvious.

For an application where you're gathering real-time track or performance data, of course, you'll want to capture the background loop. It's a lot of data. At about 80 bytes times 8 background messages on a 2-second cycle, you'll see 320 bytes per second, 19K per minute, 1.1M per hour, 27.6M per day. You can record everything for 38 days to and be under a Gb.

The upper bound for 4800 BAUD is 480 bytes per second. 41M per day. 25 days to record a Gb of raw data.

For my application, however, I want to capture the data not in the background loop.

It works like this.
  1. I start the laptop collecting data.
  2. I reach over to the chartplotter and push a bunch of buttons to get to a waypoint transfer or a route transfer.
  3. The laptop shows the data arriving. The chartplotter claims it's done sending.
  4. I stop collecting data. In the stream of data are my waypoints or routes. Yay!
A reject filter is an easy thing: Essentially it's filter(lambda s: s._name not in reject_set, source). A simple set of names to reject is the required configuration for this filter.

Persistence

How do we save these messages?

We have several choices.
  1. Stream of Bytes. The protocol uses \r\n as line endings. We could (in principle) cat /dev/cu.usbserial-A6009TFG >capture.nmea. Pragmatically, that doesn't always work because the 4800 BAUD setting is hard to implement. But the core idea of "simply save the bytes" works.
  2. Stream of Serialized Objects. 
    1. We can use YAML to cough out the objects. If the derived attributes were all properties, it would have worked out really well. If, however, we leverage __init__() to set attributes, this becomes awkward.
    2. We can work around the derived value problems by using JSON with our own Encoder to exclude the derived fields. This is a bit more complex, than it needs to be. It permits exploration though.
  3. GPX, KML, or CSV. Possible, but. These seems to be a separate problem.
When transforming data, it's essential to avoid "point-to-point" transformation among formats. It's crucial to have a canonical representation and individual converters. In this case, we have NMEA to canonical, persist the canonical, and canonical to GPX (or KML, or CSV.)

Rework

Yes. There's a problem here.  Actually there are several problems.
  1. I got the data I wanted. So, fixing the design flaws isn't essential anymore. I may, but... I should have used descriptors.
  2. In the long run, I really need a three-way synchronization process between computer, new chart plotter and legacy chart plotter. 
Let's start with the first design issue: lazy data processing.

The core Field/Sentence design should have looked like this:

class Field:
    def __init__(self, position, function, description):
        self.position = position
        self.function = function
        self.description = description
    def __get__(self, object, class_):
        print(f"get {object} {class_}")
        transform = self.function
        return transform(object.args[self.position])
        
class Sentence:
    f0 = Field(0, int, "Item Zero")
    f1 = Field(1, float, "Item One")
    def __init__(self, *args):
        self.args = args

This makes all of the properties into lazy computations. It simplifies persistence because the only real attribute value is the tuple of arguments captured from the device.

>>> s = Sentence(b'1', b'2.3')
>>> s.f1
1
>>> s.f2
2.3

That would have been a nicer design because serialization would have been trivial. Repeated access to the fields might have become costly. We have a tradeoff issue here that depends on the ultimate use case. For early IoT efforts, flexibility is central, and the computation costs don't matter. At some point, there may be a focus on performance, where extra code to save time has merit.

Synchronization is much more difficult. I need to pick a canonical representation. Everything gets converted to a canonical form. Differences are identified. Then updates are created: either GPX files for the devices that handle that, or NMEA traffic for the device which updated over the wire.

Conclusion

This IoT project followed a common arc: Explore the data, define a model, figure out how to filter out noise, figure out how to persist the data. Once we have some data, we realize the errors we made in our model.

A huge problem is the pressure to ship an MVP (Minimally Viable Product.) It takes a few days to build this. It's shippable.

Now, we need to rework it. In this case, throw most of the first release away. Who has the stomach for this? It's essential, but it's also very difficult.

A lot of good ideas from this blog post are not in the code. And this is the way a lot of commercial software happens: MVP and move forward with no time for rework.

Tuesday, May 2, 2017

Functional Python, Literate Programming & Trello Board Analysis

The general advice to people using Kanban/Agile Project boards of various kinds is this:

Stop Starting -- Start Finishing


etc.

There's a lot of this advice. Some of it is helpful.

Many tools have various dashboards and metrics computations.

However.

The basic velocity calculations -- starts v. finishes -- is pretty straight-forward. The rules to classify a Trello action as "start" or "finish" are actually nice examples of simple functions or lambdas. Which also means that the basic pipeline required to gather the data can be written as a lazy, functional process.

Which leads to writing a Literate Programming version of a small program that gathers data from a Trello board.

https://github.com/slott56/Trello-Action-Counts

It's a kind of deep-dive into some aspects of Python functional-style programming. It's also a dive into Literate Programming via a longish example. And it has a fair number of type hints. It's not perfectly clean from MyPy-'s analysis. So there's some more to do on that front.

Wednesday, April 19, 2017

AWS "Serverless" Architecture Update -- Python 3.6 News

This: https://aws.amazon.com/about-aws/whats-new/2017/04/aws-lambda-supports-python-3-6/

You can now use Python 3.6 Lambdas. This changes things dramatically. We can now write faster, simpler, less quirky code using the latest-and-greatest Python.

If you want to configure a server in the cloud, consider this: https://wiki.ubuntu.com/Python. Use Ubuntu as the base image. Faster. Cleaner. Less Quirky.

Tuesday, February 14, 2017

Intro to Python CSV Processing for Actual Beginners

I've written a lot about CSV processing. Here are some examples http://slott-softwarearchitect.blogspot.com/search/label/csv.

It crops up in my books. A lot.

In all cases, though, I make the implicit assumption that my readers already know a lot of Python. This is a disservice to anyone who's getting started.

Getting Started

You'll need Python 3.6. Nothing else will do if you're starting out.

Go to https://www.continuum.io/downloads and get Python 3.6. You can get the small "miniconda" version to start with. It has some of what you'll need to hack around with CSV files. The full Anaconda version contains a mountain of cool stuff, but it's a big download.

Once you have Python installed, what next? To be sure things are running do this:
  1. Find a command line prompt (terminal window, cmd.exe, whatever it's called on your OS.)
  2. Enter python3.6 (or just python in Windows.)
  3. If Anaconda installed everything properly, you'll have an interaction that looks like this:

MacBookPro-SLott:Python2v3 slott$ python3.5
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  5 2015, 21:12:44) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

More-or-less. (Yes, the example shows 3.5.1 even though I said you should get 3.6. As soon as the Lynda.com course drops, I'll upgrade. The differences between 3.5 and 3.6 are almost invisible.)

Here's your first interaction.

>>> 355/113
3.1415929203539825

Yep. Python did math. Stuff is happening.

Here's some more.

>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()

Okay. That was fun. But it's not data wrangling. When do we get to the good stuff?

To Script or Not To Script

We have two paths when it comes to scripting. You can write script files and run them. This is pretty normal application development stuff. It works well. 

Or.

You can use a Jupyter Notebook. This isn't exactly a script. But. You can use it like a script. It's a good place to start building some code that's useful. You can rerun some (or all) of the notebook to make it script-like.

If you downloaded Anaconda, you have Jupyter. Done. Skip over the next part on installing Jupyter.

Installing Jupyter

If you did not download the full Anaconda -- perhaps because you used the miniconda -- you'll need to add Jupyter.  You can use the command conda install jupyter for this.

Another choice is to use the PIP program to install jupyter. The net effect is the same. It starts like this


MacBookPro-SLott:Python2v3 slott$ pip3 install jupyter
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl
Collecting ipykernel (from jupyter)
  Downloading ipykernel-4.5.2-py2.py3-none-any.whl (98kB)

    100% |████████████████████████████████| 102kB 1.3MB/s 

It ends like this.

  Downloading pyparsing-2.1.10-py2.py3-none-any.whl (56kB)
    100% |████████████████████████████████| 61kB 2.1MB/s 
Installing collected packages: ipython-genutils, decorator, traitlets, appnope, appdirs, pyparsing, packaging, setuptools, ptyprocess, pexpect, simplegeneric, wcwidth, prompt-toolkit, pickleshare, ipython, jupyter-core, pyzmq, jupyter-client, tornado, ipykernel, qtconsole, terminado, nbformat, entrypoints, mistune, pandocfilters, testpath, bleach, nbconvert, notebook, widgetsnbextension, ipywidgets, jupyter-console, jupyter
  Found existing installation: setuptools 18.2
    Uninstalling setuptools-18.2:
      Successfully uninstalled setuptools-18.2
  Running setup.py install for simplegeneric ... done
  Running setup.py install for tornado ... done
  Running setup.py install for terminado ... done
  Running setup.py install for pandocfilters ... done
Successfully installed appdirs-1.4.0 appnope-0.1.0 bleach-1.5.0 decorator-4.0.11 entrypoints-0.2.2 ipykernel-4.5.2 ipython-5.2.2 ipython-genutils-0.1.0 ipywidgets-5.2.2 jupyter-1.0.0 jupyter-client-4.4.0 jupyter-console-5.1.0 jupyter-core-4.2.1 mistune-0.7.3 nbconvert-5.1.1 nbformat-4.2.0 notebook-4.4.1 packaging-16.8 pandocfilters-1.4.1 pexpect-4.2.1 pickleshare-0.7.4 prompt-toolkit-1.0.13 ptyprocess-0.5.1 pyparsing-2.1.10 pyzmq-16.0.2 qtconsole-4.2.1 setuptools-34.1.1 simplegeneric-0.8.1 terminado-0.6 testpath-0.3 tornado-4.4.2 traitlets-4.3.1 wcwidth-0.1.7 widgetsnbextension-1.2.6



Now you have Jupyter.

What just happened? You installed a large number of Python packages. All of those packages were required to run Jupyter. You can see jupyter-1.0.0 hidden in the list of packages that were installed.

Starting Jupyter

The Jupyter tool does a number of things. We're going to use the notebook feature to save some code that we can rerun. We can also save notes and do other things in the notebook. When you start the notebook, two things will happen.
  1. The terminal window will start displaying the Jupyter console log.
  2. A browser will pop open showing the local Jupyter notebook home page.
Here's what the console log looks like:

MacBookPro-SLott:Python2v3 slott$ jupyter notebook
[I 08:51:56.746 NotebookApp] Writing notebook server cookie secret to /Users/slott/Library/Jupyter/runtime/notebook_cookie_secret
[I 08:51:56.778 NotebookApp] Serving notebooks from local directory: /Users/slott/Documents/Writing/Python/Python2v3
[I 08:51:56.778 NotebookApp] 0 active kernels 
[I 08:51:56.778 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=2eb40fbb96d7788dd05a49600b1fca4e07cd9c8fe931f9af
[I 08:51:56.778 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

You can glance at it to see that things are still working. The "Use Control-C to stop this server" is a reminder of how to stop things when you're done.

Your Jupyter home page will have this logo in the corner. Things are working.


You can pick files from this list and edit them. And -- important for what we're going to do -- you can create new notebooks.

On the right side of the web page, you'll see this:


You can create files and folders. That's cool. You can create an interactive terminal session. That's also cool. More important, though, is that you can create a new Python 3 notebook. That's were we'll wrangle with CSV files.

"But Wait," you say. "What directory is it using for this?"

The jupyter server is using the current working directory when you started it.

If you don't like this choice, you have two alternatives.
  • Stop Jupyter. Change directory to your preferred place to keep files. Restart Jupyter.
  • Stop Jupyter. Include the --notebook-dir=your_working_directory option.
The second choice looks like this:

MacBookPro-SLott:Python2v3 slott$ jupyter notebook --notebook-dir=~/Documents/Writing/Python
[I 11:15:42.964 NotebookApp] Serving notebooks from local directory: /Users/slott/Documents/Writing/Python

Now you know where your files are going to be. You can make sure that your .CSV files are here. You will have your ".ipynb" files here also. Lots of goodness in the right place.

Using Jupyter

Here's what a notebook looks like. Here's a screen shot.


First. The notebook was originally called "untitled" which seemed less than ideal. So I clicked on the name and changed it to "csv_wrestling".

Second. There was a box labeled In [ ]:. I entered some Python code to the right of this label. Then I clicked the run cell icon. (It's similar to this emoji --  ⏯ -- but not exactly.)

The In [ ]: changed to In [1]:. A second box appeared labeled Out [1]:. This annotates our dialog with Python: each input and Python's response is tracked. It's pretty nice. We can change our input and rerun the cell. We can add new cells with different things to run. We can run all of the cells. Lots of things are possible based on this idea of a cell with our command. When we run a cell, Python processes the command and we see the output.

For many expressions, a value is displayed.  For some expressions, however, nothing is displayed. For complete statements, nothing is displayed. This means we'll often have to throw the name of a variable in to see the value of that variable.


The rest of the notebook is published separately. It's awkward to work in Blogger when describing a Jupyter notebook. It's much easier to simply post the notebook in GitHub.

The notebook is published here: slott56/introduction-python-csv. You can follow the notebook to build your own copy which reads and writes CSV files.


Thursday, January 8, 2015

The Python Challenge

See http://www.pythonchallenge.com

Addicting. For folks (like me) who like this kind of thing. For others, perhaps just dumb. Or infuriating.

Years ago -- many, many years ago -- I vaguely remember a similar game with a name like "insanity" or something like that. Now there's http://www.notpron.com and http://www.weffriddles.com. All of these are "show the page source" HTML games. These games are a kind of steganography: the page your browser renders isn't what you need to see.

What's important about the Python Challenge is that it's not specifically about Python. Any programming language would do. Although I suspect that folks who don't know Python will have a difficult time with some of the puzzles. I found that having Pillow was essential for problems 7 and 11. I'm sure there are packages as powerful as PIL/Pillow for other languages.

Also, one of the hints included dated Python 2.7 code. The rest of the problems, however, seem to fit perfectly well with Python 3.4.

I wasted a morning getting to challenge 11. It was a ton of fun.

Challenge 12 was the first of the show-stoppers. The hint "evil1.jpg" is beyond subtle. Let me add this hint: This is the first puzzle where the pictures have digits. Perhaps there are related pictures.

I spent hours studying and rearranging and filtering and enhancing evil1.jpg before I finally broke down and searched for a hint. The hint -- of course -- included the whole solution, so I had to skim the code to figure out what I'd missed.

Challenges 14, 15, and 16 require additional hints, also. 14, for example, needs a reminder that the pixels need to be spiraled. Challenge 15 barely requires minimal programming and a lot of Google searching for famous people's birthdays. Challenge 16's hint is as opaque as the picture. It involves restructuring the image. But. I had to resort to reading more of the http://intelligentgeek.blogspot.com/2006/03/python-challenge-16-ahh-i-finally.html than for other problems.

I have chapters to review. I really shouldn't be playing around with silliness like this.

In spite of that, let me just say, that reading about the "Look-and-Say" sequence was a bunch of fun. See http://oeis.org/A005150. Whatever you do, avoid reading this: http://archive.lib.msu.edu/crcmath/math/math/c/c671.htm; it won't help you with the Python Challenge at all. But it's interesting. And a huge time-waster. This particular challenge was more like Project Euler problems. [Project Euler is back up and running, BTW.]

Here's my variation on the Conway sequence theme:


    def say( digits ):

        def run_lengths(digits):
            d_iter= iter(digits)
            c, d0 = 1, next(d_iter)
            for d in d_iter:
                if d0 == d:
                    c += 1
                else:
                    yield str(c)+d0
                    c, d0= 1, d
            yield str(c)+d0

        return "".join(run_lengths(digits))


I'm a fan of generator functions. A big fan.

The interesting part is that we can do run-length encoding for the look-and-say function relatively simply using the "buffered generator" design pattern.

1. Seed the buffer with the head of the sequence, next(d_iter)
2. For each item in the tail of the sequence:
    a. If it matches, count.
    b. If it doesn't match, yield the interim reduction and reset the counter.
3. Yield the tail reduction.

This design pattern seems to occur in a number of contexts outside games and abstract math.

Thursday, October 16, 2014

Using Bottle as a miniature demo server

Let's talk small.

When writing API's, it sometimes helps to have a small demo web site to show the API in a context that's easy to visualize. API's are sometimes abstract, and without an application to provide some context, it can be unclear why the path looks like that or why the JSON document has those fields.

I want to emphasize the "small" part of the small demo. A small page or two with some HTML forms and a submit button. Really small.

The actual customer-facing apps (mobile, mobile web, and full web site) are being built by lots of other people. Not us. They're big. We build the API's (there are a lot) that support the data structures that support the processing that supports the user experience.

Building fake mobile apps is right out. We're not going to lard on Android SDK or Xcode development environments to our already overburdened laptops. We build backend API's.

Building a fake mobile web or full web site is appealing. What makes it complex is the UX folks are building everything in Angular.js. If we want to properly implement a form, we would have to master Angular just to do a demo for the product owner.

No thanks. Still too far afield for API developers. We're focused on mongo and JSON and performance and scalability. Not Angular.js and the UX.

What we want to do is build a small web server which runs just a few pages plucked out of the UX demo code so that we can show how interactions with a web page put stuff in a database. And vice-versa: stuff in the database will show up on a web page.

"Really?" we get asked. Some folks look askance at us for wanting to put a small demo site together.

"Yes," we answer. "Our product owner has a big vision and we're breaking that into a bunch of little API's. It's not perfectly clear how we're building up to that vision."

It's not perfectly clear how some of this work. Folks outside the scrum team have distracting questions. We want to have a page or two where we can fill in a form and click submit and stuff happens. This is far easier to explain than showing them Postman or SoapUI and claiming that this will support some user stories.

And as we grow toward the epic, the workflow aspects of this will grow. The stuff that admin "A" does after user "U" has made an initial request. Or the stuff that internal user "I" does after external user "X" has done something. But really, it's just a few small web pages. Small.

Imagine the demo. On laptop #1, we'll show user "X". On laptop #2, we're running a Mongo shell to query what's in the db. On laptop #3 we're showing user "I". The focus is really the API's. And how the API's add up to an epic collection of stories.

Serving some HTML pages

Just to make it painful, we can't simply grab the demo web pages out of the UX team's SVN repository. Why not? First, it's an Angular app. We can't just grab some HTML and go. The demo pages are served via node.js with Bower, so it's not even clear (to us) what the complete deployment looks like.

So. We cheated. We took a screen shot. We trimmed the edges of the page as .PNG files. We wrote our own form and cobbled in enough CSS to make it look close. We're not here to fake out the UX. We just want to enter some data and have it tickle our API. (Indeed, we have a "Not The Real Experience" on some pages.)

Initially, some of the team members tried serving these small pages with WebLogic. Then Jetty. It's not bad. But it's Java. It takes forever to build and deploy something after a trivial change. There are a lot of moving parts even with Jetty, and not all of them are obvious.

Since we're building "enterprise" API's, we're deeply enmeshed with every feature of the Spring Framework. Our STS/Eclipse environments are fat with add-ons and features.

While the Spring Framework ideal is to allow a developer to focus on relevant details and have the irrelevant details handled automagically, the magic almost gets in the way. These are small applications that are little more than a few static pages with forms and a submit button. Spring can do it, of course. But we're often testing our the actual API's in a Jetty server (or two). If the demo site requires yet another instance of Jetty with yet another configuration, our ability to cope diminishes.

How can we get back to small?

Python and Bottle

Python has several web servers built-in. We can use http.server. We can use wsgiref. Both of these are almost OK for what we want to do.

We can do better with two small downloads: Bottle and Jinja2. With these, we can build simple HTML pages that show some data. We can build simple servers that collect form data, use http.client to make API requests, and write copious logging details. We can write little bottle apps that handle just GETs and POSTs simply.

This is suitably small.

We can share the module with the Bottle object and the HTML mock-up pages. We can fire up the app in an instant on anyone's laptop, no matter what else they're running. We can tweak the server to adjust the logging or the API request or the form.

We actually run the server from within Idle. Make a change and hit F5 to redeploy after a change. It's small. It's fast. And it doesn't involve the huge complexities associated with Java.

Bottle doesn't do much. But what little it does do is a pretty tidy fit with tiny little demonstrations of super-simple HTML interactions.

Thursday, October 2, 2014

Not sure what went wrong, but...

Read this: http://quantlabs.net/blog/2014/09/here-is-why-i-gave-up-on-python-aka-dogs-breakfeast-of-a-so-called-programming-language/#sthash.Rp7pXObf.dpuf

Not sure what's going on here.

"Script I want to run" seemed clear. http://vispy.org/examples/basics/scene/surface_plot.html#sthash.kIzbd33O.dpuf

The rest seemed like ill-advised trips down numerous ratholes. In particular, anything that involved Python Tools for Visual Studio seems like a waste of time and brain calories.

It's not clear at all what's not working. That's perhaps the most frustrating thing about this kind of post.

The final note, "Decisions decisions..." pointed out a simple confusion that befuddles the technically-minded. Too many details.

The decision between Python2 and 3 is trivial. There are a lot of details, but they're irrelevant for making the decision.

What package are you trying to use? It's hard to tell, but it looks like it's vispy. If so, that's all that matters. vispy works with Python3.3, requires numpy, and a "backend." Install just that and nothing more. In particular, avoid junk like Visual Studio.

The Dog's Breakfast seems to be the result of chasing down lots of details that aren't too relevant. It's hard to tell. But a scatter-shot post claiming "all this is broken" is a hint that the author wasn't simply following the vispy installation instructions. It could be that they turned something simple into a dog's breakfast by chasing irrelevant technologies all around the garden.

Thursday, April 3, 2014

Mastering Object-Oriented Python

See http://www.packtpub.com/mastering-object-oriented-python/book

Coming soon.

This is relatively deep, under-the-hood stuff for folks who want to master the Python feature set.

Here's the overview of what you get:

  • 0 Some Preliminaries 3 examples, 56 lines
  • 1 The __init__() Method 55 examples, 351 lines
  • 2 Integrating Seamlessly with Python: Basic Special Methods 92 examples, 558 lines
  • 3 Attribute Access, Properties, and Descriptors 33 examples, 310 lines
  • 4 The ABC's of Consistent Design 18 examples, 108 lines
  • 5 Using Callables and Contexts 17 examples, 214 lines
  • 6 Creating Containers and Collections 50 examples, 438 lines
  • 7 Creating Numbers 12 examples, 232 lines
  • 8 Decorators And Mixins – Cross Cutting Aspects 39 examples, 233 lines
  • 9 Serializing and Saving: JSON, YAML, Pickle, CSV and XML 77 examples, 648 lines
  • 10 Storing and Retrieving Objects via shelve 34 examples, 272 lines
  • 11 Storing and Retrieving Objects via SQLite 45 examples, 410 lines
  • 12 Transmitting and Sharing Objects 38 examples, 388 lines
  • 13 Configuration Files and Persistence  59 examples, 490 lines
  • 14 The Logging and Warning Modules 46 examples, 343 lines
  • 15 Designing for Testability 38 examples, 393 lines
  • 16 Coping With The Command Line  42 examples, 222 lines
  • 17 Module and Package Design  31 examples, 93 lines
  • 18 Quality and Documentation  42 examples, 269 lines
  • Preface 3 examples, 12 lines
  • Bonus Chapter 1 Archives and Directories  11 examples, 119 lines
  • Bonus Chapter 2 Case Study: Document Analysis  39 examples, 308 lines

824 examples, 6467 lines

Yes. That's a lot of code. It's relentless.