Proto Cache: Flags and Hooks

Today’s Updates

Last week we made our Pub/Sub application use protocol buffer objects for most of its internal state. This week we’ll take advantage of that change by setting startup and shutdown hooks to load state and save state respectively. We will add flags so someone starting up our application can set the load and save files on the command line. We will then package our application into an executable with a new asdf command.

Code Changes

Proto-cache.lisp

Defpackage Updates:

We will use ace.core.hook to implement our load and exit hooks. The code below shows how to use this library to define methods that run at load time and at exit time. In the defpackage we give it the nickname hook. The library is available in the ace.core repository.

We use ace.flag as our command line flag parsing library. This is a command line flag library used extensively at Google for our lisp executables. The library can be found in the ace.flag repository.

Flag definitions:

We define four command line flags:

  • flag::*load-file*
  • flag::*save-file*
  • flag::*new-subscriber* 
    • This flag is used for testing purposes. It should be removed in the future.
  • flag::*help*

The definitions all look the same, so we will look at flag::*load-file* as an example:

(flag:define flag::*load-file* ""
  "Specifies the file from which to load the PROTO-CACHE on start up."
  :type string)
  • We use the flag:define macro to define a flag. Please see the code for complete documentation of this macro (README.md update coming). We only use a small subset of the ace.flag package.
  • flag::*load-file*: This is the global where the parsed command line flag will be stored.
  • The documentation string, which documents the flag. If flag:print-help is called this documentation will be printed:

    --load-file (Determines the file to load PROTO-CACHE from on startup)

     Type: STRING

  • :type : The type of the flag. Here we have a string.

By default, the command line name is the symbol-name string of the global, in lowercase and without the surrounding asterisks.

For example:

  1. flag::*load-file* becomes --load-file
  2. flag::*load_file* becomes --load_file

The :name or :names key in the flag:define macro will let users select their own names for the command line input instead of this default.
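
As a rough illustration of that default naming rule, here is a sketch in plain Common Lisp. The function default-flag-name is a hypothetical helper written for this post, not part of ace.flag, and the real library may normalize names differently:

```lisp
;; Hypothetical helper, not part of ace.flag: mimic the default naming rule
;; described above by trimming the earmuffs and downcasing the symbol name.
(defun default-flag-name (symbol)
  "Return the command line spelling assumed for SYMBOL, e.g. \"--load-file\"."
  (concatenate 'string
               "--"
               (string-downcase (string-trim "*" (symbol-name symbol)))))

(default-flag-name '*load-file*) ; => "--load-file"
(default-flag-name '*load_file*) ; => "--load_file"
```

Note that the underscore is kept as is, which matches the second example above.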

Main definition:

We want to create a binary for our application. Since we have no way to add publishers and subscribers outside of the REPL, we define a dummy main that adds publishers and subscribers for us:

(defun main ()
  (register-publisher "pika" "chu")
  (register-subscriber "pika" flag::*new-subscriber*)
  (update-publisher-any
    "pika" "chu"
    (google:make-any :type-url "a"))
  ;; Sleep to make sure running threads exit.
  (sleep 2))

After running the application we can check for a new subscriber URL in the saved proto-cache application state file. I will show this shortly.

Load/Exit hooks:

We have several pre-made hooks defined in ace.core.hook. Two useful functions are ace.core.hook:at-restart and ace.core.hook:at-exit. As one can imagine, at-restart runs when the lisp image starts up, and at-exit runs when the lisp image is about to exit.

The first thing we do when we start our application is parse our command line:

(defmethod hook::at-restart parse-command-line ()
  "Parse the command line flags."
  (flag:parse-command-line)
  (when flag::*help*
    (flag:print-help)))

You MUST call flag:parse-command-line for the defined command line flags to have non-default values.

We also print a help menu if --help was passed in.

Then we can load our proto if the load-file flag was passed in:

(defmethod hook::at-restart load-proto-cache :after parse-command-line ()
  "Load the command line specified file at startup."
  (when (string/= flag::*load-file* "")
    (load-state-from-file :filename flag::*load-file*)))

We see an :after clause in our defmethod. We want the load-proto-cache method called during start-up but after we have parsed the command line so flag::*load-file* has been properly set. 

Note: The defmethod here uses a special defmethod syntax added in ace.core.hook. Please see the hook-method documentation for complete details.
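
To make the ordering idea concrete, here is a minimal sketch of a hook registry with an :after dependency, written in plain Common Lisp. It is not how ace.core.hook is implemented; the names *startup-hooks*, register-startup-hook, and run-startup-hooks are hypothetical and exist only for this illustration:

```lisp
(defvar *startup-hooks* '()
  "Hypothetical registry of (NAME AFTER FUNCTION) entries; not ace.core.hook.")

(defun register-startup-hook (name function &key after)
  "Register FUNCTION under NAME. AFTER optionally names a hook to run first."
  (push (list name after function) *startup-hooks*)
  name)

(defun run-startup-hooks ()
  "Run every registered hook once and return the names in the order they ran.
No cycle detection: this is only a sketch."
  (let ((done '()))
    (labels ((run (entry)
               (destructuring-bind (name after function) entry
                 (unless (member name done)
                   ;; Run the hook named by AFTER first, if it is registered.
                   (let ((dependency (and after (assoc after *startup-hooks*))))
                     (when dependency (run dependency)))
                   (funcall function)
                   (push name done)))))
      (mapc #'run *startup-hooks*))
    (nreverse done)))

(register-startup-hook 'parse-command-line (lambda () :parsed))
(register-startup-hook 'load-proto-cache (lambda () :loaded)
                       :after 'parse-command-line)
(run-startup-hooks) ; => (PARSE-COMMAND-LINE LOAD-PROTO-CACHE)
```

Even though load-proto-cache sits first in the registry list, its :after dependency forces parse-command-line to run before it.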

Finally we save our image state at exit:

(defmethod hook::at-exit save-proto-cache ()
  "Save the command line specified file at exit."
  (when (string/= flag::*save-file* "")
    (save-state-to-file :filename flag::*save-file*)))

The attentive reader will notice our main function never explicitly called any of these hook functions…

Proto-cache.asd:

We add code to build an executable using asdf:

(defsystem :proto-cache …
  :build-operation "program-op"
  :build-pathname "proto-cache"
  :entry-point "proto-cache:main")

This is a program-op. The executable pathname is relative, so we save the binary as “proto-cache” in the same directory as our proto-cache code. The entry point function is proto-cache:main.

We may then call: 

sbcl --eval "(asdf:operate :build-op :proto-cache)" 

at the command line to create our binary.

Running our binary:

With our binary built we can call:

./proto-cache --save-file /tmp/first.pb --new-subscriber http://www.google.com

Trying cat /tmp/first.pb:

pika'
http://www.google.com
a?pika"chujg

These are serialized values, so the output is not meant to be human-readable. Still, we can see that “http://www.google.com”, “pika”, and “chu” are all saved.

Calling

./proto-cache --load-file /tmp/first.pb --save-file /tmp/first.pb --new-subscriber http://www.altavista.com

And then cat /tmp/first.pb:

I
pikaA
?http://www.altavista.com
http://www.google.com
a?pika"chujg
“

Finally calling ./proto-cache --help

We get:

Flags from ace.flag:

    --lisp-global-flags
     (When provided, allows specifying global and special variables as a flag on the command line.
       The values are NIL - for none, :external - for package external, and T - for all flags.)
     Type: ACE.FLAG::GLOBAL-FLAGS

    --help (Whether to print help) Type: BOOLEAN Value: T

    --load-file (Determines the file to load PROTO-CACHE from on startup)
     Type: STRING
     Value: ""

    --new-subscriber (URL for a new subscriber, just for testing)
     Type: STRING
     Value: ""

    --lisp-normalize-flags
     (When non-nil the parsed flags will be transformed into a normalized form.
       The normalized form contains hyphens in place of underscores, trims '*' characters,
       and puts the name into lower case for flags names longer than one character.)
     Type: BOOLEAN

    --save-file (Determines the file to save PROTO-CACHE from on shutdown)
     Type: STRING
     Value: ""

This shows our provided documentation of the command line flags as expected.

Conclusions:

Today we added command line flags, load and exit hooks, and made our application buildable as an executable. We can build our executable and distribute it as we see fit. We can direct it to load and save the application state to user specified files without updating the code. There is still much to do before it’s done but this is slowly becoming a usable application.

There are a few additions I would like to make, but I have a second child coming soon. This may (or may not) be my last technical blog post for quite some time. I hope this sequence of Proto Cache posts has been useful thus far, and I hope to have more in the future.

Thanks to Ron Gut and Carl Gay for copious edits and comments.

Proto Cache: Saving State

Today’s Updates:

In our last post we implemented a basic Pub Sub application that stores an Any protocol buffer message and a list of subscribers. When the Any protocol buffer message gets updated we send the new Any message in the body of an http request to all of the subscribers in the subscribe-list. 

Today we will update our service to save all of the state in a protocol buffer message. We will also add functionality to save and load the state of the Proto Cache application. 

Note: Viewing the previous post is highly suggested!

Code Updates:

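Note: The original post used red to denote removed code and green to denote added code. In the snippets below, where an old and a new version of a form appear together, the removed code comes first and the added code follows it.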

pub-sub-details.proto

`syntax = proto3`

We will use proto3 syntax. I’ve yet to find a great reason to choose proto3 over proto2, but I’ve also yet to find a great reason to choose proto2 over proto3. The biggest reason to choose proto3 over proto2 is that most people use proto3, but the Any proto will store proto2 or proto3 messages regardless.

import "any.proto"

Our users are publishing Any messages to their clients, so we must store them in our application state. This requires us to include the any.proto file in our proto file.

message PubSubDetails

This contains (almost) all of the state needed for the publish subscribe service for one user:

  • repeated string subscriber_list
  • google.protobuf.Any current_message
    • This is the latest Any message that the publisher has stored in the Proto Cache.
  • string username
  • string password
    • For any kind of production use this should be salted and hashed. 

message PubSubDetailsCache

This message contains one entry, a map from a string (which will be a username for a publisher) to a PubSubDetails instance. The attentive reader will notice that we save the username twice, once in the PubSubDetails message and once in the PubSubDetailsCache map as the key. This will be explained when we discuss changes to the proto-cache.lisp file.

proto-cache.asd

The only difference in proto-cache.asd from all of the other asd files we’ve seen using protocol buffers is the use of a protocol buffer message in a package different from our current package. That is, any.proto resides in the cl-protobufs package but we are including it in the pub-sub-details.proto file in proto-cache.

To allow the protoc compiler to find the any.proto file we give it a :proto-search-path containing the path to the any.proto file. 


...
    :components
    ((:protobuf-source-file "pub-sub-details"
      :proto-pathname "pub-sub-details.proto"
      :proto-search-path ("../cl-protobufs/google/protobuf/"))
...

Note: We use a relative path: “../cl-protobufs/google/protobuf/”, which may not work for you. Please adjust to reflect your set-up.

We don’t need a component in our defsystem to load the any.proto file into our lisp image since it’s already loaded by cl-protobufs. We might still want to add one, just to make the direct dependency on the any.proto file explicit.

proto-cache.lisp

Defpackage updates:

We are adding new user invokable functionality so we export:

  • save-state-to-file
  • load-state-from-file

local-nicknames:

  • cl-protobufs.pub-sub-details as psd
    • This is merely to save typing. The cl-protobufs.pub-sub-details is the package that contains the functionality derived from pub-sub-details.proto.

Globals:

*cache*: This will be a protocol buffer message containing a hash table with string keys and pub-sub-details messages. 

(defvar *cache* (make-hash-table :test 'equal))
(defvar *cache* (psd:make-pub-sub-details-cache))

*mutex-for-pub-sub-details*: Protocol buffer messages can’t store lisp mutexes. Instead, we store the mutex for a pub-sub-details in a new hash-table with string (username) keys.

make-pub-sub-details:

This function makes a psd:pub-sub-details protocol buffer message. It’s almost the same as the previous iteration of pub-sub-details except for the addition of username.


...
  (make-instance 'pub-sub-details :password password))
  (psd:make-pub-sub-details :username username
                            :password password
                            :current-any (google:make-any))
...

(defmethod (setf psd:current-any) (new-value (psd psd:pub-sub-details))

This is really a family of functions:

  • :around: When someone tries to set the current-any value on a pub-sub-details struct we want to write-protect the pub-sub-details entry. We use an :around method, which runs before the main psd:current-any setter. Here we take the username from the pub-sub-details message and write-hold the corresponding mutex in the *mutex-for-pub-sub-details* global hash-table. Then we call call-next-method, which calls the main (setf current-any) method.
(defmethod (setf current-any) (new-value (psd pub-sub-details))
(defmethod (setf psd:current-any) :around (new-value (psd psd:pub-sub-details))
  • (setf psd:current-any): This is the actual defmethod defined in cl-protobufs.pub-sub-details. It sets the current-any slot on the message struct.
  • :after: This occurs after the current-any setter was called. We send an HTTP call to all of the subscribers on the pub-sub-details subscriber list. Apart from the addition of the psd package prefix to the accessor functions of pub-sub-details, this function wasn’t changed.
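
The order in which these methods run can be seen with a small standalone sketch using standard CLOS only. The class sketch-details and the *events* log are hypothetical stand-ins: the real code specializes on psd:pub-sub-details, holds a mutex in the :around method, and sends HTTP requests in the :after method:

```lisp
(defclass sketch-details ()
  ((current-any :initform nil :accessor current-any))
  (:documentation "Stand-in for psd:pub-sub-details, used only in this sketch."))

(defvar *events* '()
  "Most recent event first; records the order in which the methods run.")

(defmethod (setf current-any) :around (new-value (details sketch-details))
  (declare (ignorable new-value))
  ;; The real code write-holds a mutex here; we only record the steps.
  (push :lock-held *events*)
  (unwind-protect (call-next-method)
    (push :lock-released *events*)))

(defmethod (setf current-any) :after (new-value (details sketch-details))
  ;; The real code sends an HTTP request to each subscriber here.
  (push (list :notify new-value) *events*))

(let ((details (make-instance 'sketch-details)))
  (setf (current-any details) "new message")
  (reverse *events*))
;; => (:LOCK-HELD (:NOTIFY "new message") :LOCK-RELEASED)
```

Because the :around method wraps everything else, the :after method runs before the :around method finishes, so in this arrangement subscribers are notified while the lock is still held.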

register-publisher:

The main differences between the last iteration of proto-cache and this one are:

  1. We look up the user with psd:pub-sub-cache-gethash instead of gethash. This *-gethash function is exported by cl-protobufs.pub-sub-details so the user can call gethash on the hash-table in a map field of a protocol buffer message.
    • (gethash username *cache*)
    • (psd:pub-sub-cache-gethash username *cache*)
  2. We add a mutex to the *mutex-for-pub-sub-details* hash-table with the key being the username string sent to register-publisher.
  3. We return t if the new user was registered successfully, nil otherwise.

register-subscriber and update-publisher-any:

  1. The main difference here is the cache lookup:
    1. (gethash publisher *cache*)
    2. (psd:pub-sub-cache-gethash publisher *cache*)
  2. We have to add the psd package prefix to all of the pub-sub-details accessors.

save-state-to-file:

(defun save-state-to-file (&key (filename "/tmp/proto-cache.txt"))
  "Save the current state of the proto cache held in the *cache* global
   to FILENAME as a serialized protocol buffer message."
  (act:with-frmutex-read (*cache-mutex*)
    (with-open-file (stream filename :direction :output
                                     :element-type '(unsigned-byte 8))
      (cl-protobufs:serialize-to-stream stream *cache*))))

This is a function that accepts a filename as a string, opens the file for output, and calls cl-protobufs:serialize-to-stream. This is all we need to do to save the state of our application!
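
The file handling part can be tried on its own, without cl-protobufs. The sketch below writes a vector of octets to a binary stream and reads it back. The helpers save-bytes and load-bytes and the path /tmp/proto-cache-sketch.bin are hypothetical. The :if-exists :supersede argument is an addition of this sketch, not something the function above uses; without it, opening an existing file for output may signal an error, depending on the implementation's defaults:

```lisp
(defun save-bytes (bytes filename)
  "Write BYTES, a vector of octets, to FILENAME, replacing any existing file."
  (with-open-file (stream filename :direction :output
                                   :element-type '(unsigned-byte 8)
                                   :if-exists :supersede)
    (write-sequence bytes stream))
  filename)

(defun load-bytes (filename)
  "Read all of FILENAME back into a fresh vector of octets."
  (with-open-file (stream filename :element-type '(unsigned-byte 8))
    (let ((bytes (make-array (file-length stream)
                             :element-type '(unsigned-byte 8))))
      (read-sequence bytes stream)
      bytes)))

;; Round trip: the bytes read back are the bytes that were written.
(load-bytes (save-bytes #(10 4 112 105 107 97) "/tmp/proto-cache-sketch.bin"))
```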

load-state-from-file:

We need to do three things:

  1. Open a file for reading and deserialize the Proto Cache state saved by save-state-to-file
  2. Create a new map containing the mutexes for each username.
  3. Set the new state into the *cache* global and the new mutex hash-table in *mutex-for-pub-sub-details*.
    1. We do write-hold the *cache-mutex*, but I would suggest only loading the saved state when Proto Cache is started.

(defun load-state-from-file (&key (filename "/tmp/proto-cache.txt"))
  "Load the saved *cache* globals from FILENAME. Also creates
   all of the fr-mutexes that should be in *mutex-for-pub-sub-details*."
  (let ((new-cache
          (with-open-file (stream filename :element-type '(unsigned-byte 8))
            (cl-protobufs:deserialize-from-stream
              'psd:pub-sub-details-cache :stream stream)))
        (new-mutex-for-pub-sub-details (make-hash-table :test 'equal)))
    (loop for key being the hash-keys of (psd:pub-sub-cache new-cache)
          do
             (setf (gethash key new-mutex-for-pub-sub-details)
                   (act:make-frmutex)))
    (act:with-frmutex-write (*cache-mutex*)
      (setf *mutex-for-pub-sub-details* new-mutex-for-pub-sub-details
            *cache* new-cache))))
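
The second step, building a parallel table with one fresh value per key, can be sketched with plain hash-tables. The function make-companion-table is a hypothetical helper, and a list stands in for (act:make-frmutex) so the example needs no libraries:

```lisp
(defun make-companion-table (cache make-value)
  "Return a fresh EQUAL hash-table with the keys of CACHE, each mapped to the
result of calling MAKE-VALUE with no arguments."
  (let ((table (make-hash-table :test 'equal)))
    (loop for key being the hash-keys of cache
          do (setf (gethash key table) (funcall make-value)))
    table))

;; A list stands in for (act:make-frmutex) in this sketch.
(let ((cache (make-hash-table :test 'equal)))
  (setf (gethash "pika" cache) :details-for-pika
        (gethash "eevee" cache) :details-for-eevee)
  (hash-table-count (make-companion-table cache (lambda () (list :mutex)))))
;; => 2
```

Calling make-value once per key matters: every username gets its own mutex rather than sharing one.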

Conclusion:

The main update we made today was defining pub-sub-details in a .proto file instead of a Common Lisp defclass form. The biggest downside is the requirement to save the pub-sub-details mutex in a separate hash-table. For this cost, we:

  1. Gained the ability to save our application state with one call to cl-protobufs:serialize-to-stream.
  2. Gained the ability to load our application with little more than one call to cl-protobufs:deserialize-from-stream.

We were also able to utilize the setf methods defined in cl-protobufs to create :around and :after methods.

Note: Nearly all services will be amenable to storing their state in protocol buffer messages.

I hope the reader has gained some insight into how they can use cl-protobufs in their application even if their application doesn’t make http-requests. Being able to save the state of a running program and load it for later use is very important in most applications, and protocol buffers make this task simple.

Thank you for reading!

Thanks to Ron, Carl, and Ben for edits!

Proto Cache: Implementing Basic Pub Sub

Today’s Updates

In our last post we saw some of the features of the ace.core.defun and ace.core.thread libraries by creating a thread-safe cache of the Any protocol buffer object. Today we are going to update the proto-cache repository to implement publisher/subscriber features. This will allow a publisher to publish a feed of Any messages and a subscriber to subscribe to such a feed.

It is expected (but not required) that the reader has read the previous post Proto Cache: A Caching Story. That post details some of the functions and objects you will see in today’s code.

Note: This is a basic implementation, not one ready for production use. This will serve as our working project going forward.

Code Updates

Proto-cache.asd

We want subscribers to be able to get new versions of an Any protocol buffer message. On the web, the usual way to receive messages is over HTTP. We use the Drakma HTTP client. You can see we added :drakma to the depends-on list in the defsystem.

Proto-cache.lisp

There are three major regions to this code. The first region is the global objects that make up the cache. The second is the definition of a new class, pub-sub-details. Finally the actual publisher-subscriber functions are at the bottom of the page.

Global objects:

The global objects section looks much like it did in our previous post. We update the *cache* hash-table to use equal as its test function and we are going to make the keys to this cache be username strings.

Pub-sub-details class:

The pub-sub-details class contains the data we need to keep track of the publisher and subscriber features:

  • subscriber-list: This will be a list of the HTTP endpoints to send the Any messages to after the Any message is updated. Currently, we only allow for an HTTP message string. Future implementations should allow for security functionality on those endpoints.
  • current-any: The current Any message that the publisher has supplied.
  • mutex: A fr-mutex to protect the current-any slot. This should be read-held to get the current-any and it should be write-held to set a new current-any message.
  • password: The password for the publisher, held as a string.

We shouldn’t be saving the password as a string in the pub-sub-details class. At a minimum we should be salting and hashing this value. In the future we should implement an account system for readers and subscribers giving access to reading and updating the pub-sub-details. As this is only instructional and not production-ready code, I feel okay leaving it as is for the moment.

We create a make-pub-sub-details function that will create a pub-sub-details object with a given password. The register function doesn’t allow the user to set an Any message at creation time, and none of the other slots are useful to the publisher.

We create an accessor method to set the any-message value slot. We also create an :after method to send the Any message to any listening subscribers by iterating through the subscriber list and calling a drakma:http-request. We wrap this in unwind-protect so an IO failure doesn’t stop other subscribers from getting the message.

Finally we add a setter function for the subscriber list.

Function definitions:

Register-publisher:

This function is the registration point for a new publisher. It is almost the same as set-in-cache from our previous post except it checks that an entry in the cache for the soon-to-be-registered publisher doesn’t already exist. It would be bad to let a new publisher overwrite an existing publisher.

Register-subscriber:

Here we use a new macro, ace.core.etc:clet from the etc package in ace.core.

(defun register-subscriber (publisher address)
  "Register a new subscriber to a publisher."
  (ace:clet ((ps-struct
               (act:with-frmutex-read (*cache-mutex*)
                 (gethash publisher *cache*)))
             (ps-mutex (mutex ps-struct)))
    (act:with-frmutex-write (ps-mutex)
      (push address (subscriber-list ps-struct)))))

In the code above we search the cache for a user entry. If the entry is found then ps-struct will be non-nil and we can evaluate the body, adding the subscriber to the list. If the publisher is not found we return nil.
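
To show the kind of control flow described here, the sketch below defines a small macro that binds variables in order and gives up with nil as soon as one of them is nil. This is only an approximation based on the behavior described in this post. The name when-let-chain is hypothetical, it is not the ace.core.etc:clet implementation, and unlike the real macro it does not accept declarations in the body:

```lisp
(defmacro when-let-chain (bindings &body body)
  "Bind each (VAR FORM) in order; return NIL as soon as one FORM yields NIL.
Later forms may refer to earlier variables. Sketch only, not ace's clet."
  (if (null bindings)
      `(progn ,@body)
      (destructuring-bind ((var form) &rest more) bindings
        `(let ((,var ,form))
           (when ,var
             (when-let-chain ,more ,@body))))))

(defvar *sketch-users* (make-hash-table :test 'equal))
(setf (gethash "pika" *sketch-users*) (list :subscribers))

(defun sketch-subscribe (publisher address)
  "Add ADDRESS to PUBLISHER's entry, or return NIL for an unknown publisher."
  (when-let-chain ((entry (gethash publisher *sketch-users*)))
    (push address (cdr entry))))

(sketch-subscribe "pika" "http://www.google.com")   ; => ("http://www.google.com")
(sketch-subscribe "nobody" "http://www.google.com") ; => NIL
```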

Update-publisher-any:

(defun update-publisher-any (username password any)
  "Updates the google:any message for a publisher
   with a specified username and password.
   The actual subscriber calls happen in a separate thread
   but 'T is returned to the user to indicate the any
   was truly updated."
  (ace:clet ((ps-class
              (act:with-frmutex-read (*cache-mutex*)
                (gethash username *cache*)))
             (correct-password (string= (password ps-class)
                                        password)))
    (declare (ignore correct-password))
    (act:make-thread
     (lambda (ps-class)
       (setf (current-any ps-class) any))
     :arguments (list ps-class))
    t))

In the update-publisher-any code we use clet to verify that the publisher exists and that the password is correct. We never use the correct-password binding in the body, though, so we declare it ignored.

We don’t want the publisher to be thread-blocked while we send the new message to all of the subscribers so we update the current-any in a separate thread. To do this we use the ace.core.thread function make-thread. A keen reader will see for SBCL this calls the sbcl make-thread function, otherwise it calls bordeaux-threads make-thread function.

If we are able to find a publisher with the correct password we return T to show success.

Conclusion

In today’s post we have made a basic publisher-subscriber library that will send an Any protocol buffer message to a list of subscribers. We have detailed some new functions that we used in ace.core. We have also listed some of the problems with this library. The code has evolved substantially from the previous post but it still has a long way to go before being production-ready.

Thank you for reading!


Ron Gut, Carl Gay, and Ben Kuehnert gave comments and edits to this post.

Proto Cache: A Caching Story

What is Proto-Cache?

I’ve been working internally at Google to open source several libraries including cl-protobufs and a series of utility libraries we call “ace”. I wrote several blog posts making an HTTP server that takes in either protocol buffers or JSON strings and responds in kind. I think I have worked enough on Mortgage Server and wish to work on a different project.

Proto-cache will grow up to be a pub-sub system that takes in google.protobuf:any protos and sends them to users over HTTP requests. I’m developing it to showcase the ace.core library and the Any proto well-known-type. In this post we create a cache system which stores google.protobuf.any messages in a hash-table keyed off of a symbol.

The current incarnation of Proto Cache:

The code can be found here: https://github.com/Slids/proto-cache

Proto-cache.asd:

This is remarkable in-as-much as cl-protobufs isn’t required for the defsystem! It’s not required at all, but we do require the cl-protobufs.google.protobuf:any protocol buffer message object. Right now we are only adding and getting it from the cache. This allows us to store a protocol buffer message object that any user system can parse by calling unpack-any. We never have to understand the message inside.

Proto-cache.lisp:

The actual implementation. We give three different functions:

  • get-from-cache
  • set-in-cache
  • remove-from-cache

We also have a:

  • fast-read mutex
  • hash-table

Note: The ace.core library can be found at: https://github.com/cybersurf/ace.core

Fast-read mutex (fr-mutex):

The first interesting thing to note is the fast-read mutex. This can be found in the ace.core.thread package included in the ace.core utility library. This allows for mutex-free reads of a protected region of code. One has to call:

  • (with-frmutex-read (fr-mutex) body)
  • (with-frmutex-write (fr-mutex) body)

If the body of with-frmutex-read is finished with nobody calling with-frmutex-write then the value is returned. If someone calls with-frmutex-write while another thread is in with-frmutex-read then the body of with-frmutex-read has to be re-run. One should be careful to not modify state in the with-frmutex-read body.
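
The retry behavior can be illustrated with a single-threaded sketch built around a version counter. This is not how ace.core.thread implements fr-mutexes, and it has none of the atomicity a real implementation needs; versioned-cell, write-cell, and optimistic-read are hypothetical names, and the "concurrent" write is simulated by having the reader function perform it once:

```lisp
(defstruct versioned-cell
  (version 0)
  (value nil))

(defun write-cell (cell new-value)
  "Bump the version, then store NEW-VALUE. Stands in for a write-held update."
  (incf (versioned-cell-version cell))
  (setf (versioned-cell-value cell) new-value))

(defun optimistic-read (cell reader)
  "Call READER on CELL until no write happened during the call.
Returns the value read and the number of attempts it took."
  (loop for attempts from 1
        do (let* ((before (versioned-cell-version cell))
                  (result (funcall reader cell)))
             (when (= before (versioned-cell-version cell))
               (return (values result attempts))))))

;; Simulate a writer sneaking in during the first read attempt.
(let ((cell (make-versioned-cell :value :old))
      (interrupted nil))
  (optimistic-read cell
                   (lambda (c)
                     (let ((seen (versioned-cell-value c)))
                       (unless interrupted
                         (setf interrupted t)
                         (write-cell c :new))
                       seen))))
;; => :NEW, 2
```

The first attempt sees :old but is thrown away because the version changed underneath it. This is also why the read body should not modify state: it may run more than once.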

Discussion About the Individual Functions

get-from-cache:

(acd:defun* get-from-cache (key)
  "Get the any message from cache with KEY."
  (declare (acd:self (symbol) google:any))
  (act:with-frmutex-read (cache-mutex)
    (gethash key cache)))


This function uses the defun* form from ace.core.defun. It looks the same as a standard defun except that it has a new declare statement. The declare statement takes the form

(declare (acd:self (lambda-list-type-declarations) output-declaration))

In this function we state that the input KEY must be a symbol and the return value is going to be a google:any protobuf message. The output declaration is optional. For all of the options please see the macro definition for ace.core.defun:defun*.
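
For comparison, a roughly comparable declaration in standard Common Lisp uses a declaim with an ftype. The sketch below is an assumption about the intent of the acd:self declaration, not a statement of what defun* expands to. The names *sketch-cache* and sketch-get-from-cache are hypothetical, and check-type is added so the argument type is enforced at run time regardless of compiler settings:

```lisp
(defvar *sketch-cache* (make-hash-table :test 'eq)
  "Hypothetical cache keyed by symbols, used only in this sketch.")

;; Argument types first, then the return type, much like the acd:self form.
;; GETHASH returns two values: the stored value and a found-p boolean.
(declaim (ftype (function (symbol) (values t boolean)) sketch-get-from-cache))

(defun sketch-get-from-cache (key)
  "Get the value stored under KEY, which must be a symbol."
  (check-type key symbol)
  (gethash key *sketch-cache*))

(setf (gethash 'greeting *sketch-cache*) "hello")
(sketch-get-from-cache 'greeting) ; => "hello", T
```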

The with-frmutex-read macro is also being used.

Note in the macro’s body we only do a simple accessor call into a hash-table. Safety is not guaranteed, only consistency.

set-in-cache:

(acd:defun* set-in-cache (key any)
  "Set the ANY message in cache with KEY."
  (declare (acd:self (symbol google:any) google:any))
  (act:with-frmutex-write (cache-mutex)
    (setf (gethash key cache) any)))

We see that the new defun* call is used. In this case we have two inputs: KEY will be a symbol and ANY will be a google:any proto message. We also see that we will return a google:any proto message.

The with-frmutex-write macro is being used. The only thing that is done in the body is setting a cache value. If one thread tries to get a message from the cache while another sets a message in the cache, it is possible the reader will have to read multiple times. In systems where readers are more common than writers, fr-mutexes and spinlocking are much faster than having readers lock a mutex for every read.

remove-from-cache:

We omit this function in this write-up for brevity.

Conclusion:

Fast-read mutexes like the one found in ace.core.thread are incredibly useful tools. Having to access a mutex can be slow even in cases where that mutex is never locked. I believe this is one of the more useful additions in the ace.core library.

The new defun* macro found in ace.core.defun for creating function definitions is more mixed. I find a lack of clarity in mapping the lambda list s-expression in the defun statement to the s-expression in the declaration. Others may find it provides nicer syntax and the clarity is more obvious.

Future posts will show the use of the any protocol buffer message.

As usual Carl Gay gave copious edits and suggestions.

2021

Greetings Everyone.

It’s been a weird year for everyone. I don’t understand the idea that the 2020 year ending will make everything better, but it seems to be a popular idea. Please remember, the virus has gotten worse (UK variant). Please be careful, stay inside, isolate, and take the vaccine as soon as you can.


Okay, now onto some of my hopes and plans for 2021. First I have a reading list. Some of these have been started, but I hope to finish them in 2021:

  1. The Common Lisp Condition System.
    • Written by Michal Herda it’s a new book on the Lisp Condition System.
    • I’ll have a review out when I’m finished.
  2. Site Reliability Engineering: How Google Runs Production Systems and The Site Reliability Workbook: Practical Ways to Implement SRE.
    • These books go together in a complementary way. I believe all SWEs should have knowledge and experience trying to keep their production systems working.
  3. Completely Bounded Maps and Operator Algebras
    • Book by Vern Paulsen, interestingly important in Quantum Information Theory.

Will Lyra make it through PCL in 2021?

Next, I’m buying a house! The house needs quite a bit of work, but it’s in a really nice neighborhood in Belmont MA. A town with great schools, nice parks, and some really cool Googlers! Take a look here:

https://www.zillow.com/homedetails/410-Pleasant-St-Belmont-MA-02478/56411531_zpid/

Wenwen, Dick, and Lyra leaving our new house.

Most importantly, I hope to have a great year with my family. Lyra is growing REALLY big. She runs around and plays, a perfect little one. Her sister Faye will be born at the end of January or beginning of February. Wenwen is tired.


There are some things I know I’ll miss in 2021. I won’t see my next intern, just like I never met my previous intern (Ben) in person. Due to the pandemic Google has interns working at home. Thankfully Ben got a job at Google so I may see him yet.

The European Lisp Symposium will be held online in 2021. I miss seeing Lispers from all kinds of backgrounds working in academia, startups, small businesses all the way to corporate monoliths. I miss the gathering, the online meeting isn’t the same.


Finally, I want to leave this year with something I learned.

Wenwen teaching online,

In 2020, as software engineers, we learned how to work in isolation. I don’t think remote work will or should be the norm, though I know lots of different engineers disagree. We learn, make connections, gain understanding, and advance by meeting, discussing with, and learning from people. This will never be as constructive online as it will be in person.

I know from my wife’s work that students learn better in person. I hope 2020 will show us that online work and education can and will be part of the future of education, but it will not be the future of work and education.


I hope everyone has a fantastic New Year.

Merry Christmas!

Greetings everyone!

Lyra on a box.

This will not be a programming post, or really a post of any technical or mathematical interest. I’m not entirely sure what the next technical post I will make is, but I am thinking.

As it is Christmas, I wanted to say some thanks.

Lyra’s first Christmas tree at home.

First, to Carl Gay, my coworker and mentor at Google. He’s been the person I’ve talked to the most from work over these past 9 months (and probably well before that as well). He’s much farther down his career than I am, but he’s been an amazingly helpful and kind friend.

I have been blessed with many great co-workers at Google. Ron, Ted, Stephen, Rujith, etc. Thank you for making this strange work year as great as it was.

Also Google. They’ve given me months off to take care of my daughter and allowed my wife to continue working without strain on childcare. I know people have a lot of misgivings about Big Tech, but I truly believe Google always tries to do what’s right.

Lyra and Me at Google last year!

Next, my parents. It was a tough year. We stayed at my mom’s for a bit in the summer, which allowed Lyra to play in giant fields and moo at giant cows. Sadly, we did not get to see my dad and Melinda. We miss them very much and look forward to seeing them in 2021.

Lyra, Grandma, Cows, and Me.

Finally, to my wife Wenwen and daughter Lyra, for making this year. For making our condo a home.

Again, Merry Christmas and if I don’t post again this year have a Happy New Year!

Lyra and Wenwen drawing.

Mortgage Server on a Raspberry Pi

In the last post we discussed creating a server to calculate an amortization schedule that takes and returns both protocol buffer messages and JSON. In this post we will discuss hosting this server on a Raspberry Pi. There are some pitfalls, and the story isn’t complete, but it’s still fairly compelling.

What We Will Use:

Hardware:

We will use a Raspberry Pi 3 Model B as our server, running the stock operating system, Raspbian. Its SoC has a quad-core 64-bit processor with on-chip floating point. The operating system itself is 32-bit, which makes the processor run in 32-bit mode.

Software:

We will be using SBCL as our Common Lisp, CL-PROTOBUFS as our protocol buffer and JSON library, and Hunchentoot as our web server.

Problems

1. SBCL on Raspbian

When trying to run the mortgage-info server on Raspbian the first error I got was an inability to load the lisp file generated by protoc. On contacting Doug Katzman he noted I was running an old version of SBCL. The Raspbian apt-get repository has an old version of SBCL. If someone desires to run SBCL on a Raspberry Pi they should follow the binary installation instructions here: http://www.sbcl.org/getting.html.

2. CL-Protobufs on a 32-Bit OS

The cl-protobufs library has been optimized to run on a 64-bit x86 platform. The Raspberry Pi environment is 32-bit arm. As noted before, the 32-bit arm environment is supported by SBCL. I don’t think anyone has attempted to run cl-protobufs on the 32-bit arm environment running SBCL. After modifying cl-protobufs.asd to have float-bits.lisp loaded on SBCL not running in 64-bit we could quickload mortgage-info into a repl.

3. Bugs in the mortgage-info repo  

There were several bugs I fixed in my very limited testing of the mortgage-info repo, as well as some bugs that still exist.

  1. When trying to set numbers in the proto message structs I had to coerce them to double-float. I’m not sure why… This works on SBCL running on the x86-64 without the coercions.
  2. A division by 0 bug if the entered interest rate is 0.
  3. The possibility of having 0 as the number of repayment periods. I added an assertion so we will return a 500 stating the assertion was hit. We should have a more graceful error message than a stack trace, but this is currently only a proof of concept.
  4. The mortgage.proto file had interest as an integer, but interest is usually a float that is a multiple of 0.125.
  5. We have rounding problems if the interest rate is too high (say 99%). We only ever pay interest and the amount never goes down, at least with a 300 payment period. This is most likely due to rounding, since we do not accept fractional pennies. This is okay; if the national interest rate went anywhere near 99% we would have BIG problems.

CL-protobufs on the Pi

I have cl-protobufs running on SBCL on the Raspberry Pi, but some of the tests don’t pass. I’m not sure if it would work on a 64-bit OS on the Raspberry Pi, and I don’t have the inclination to get a 64-bit OS for my Pi. If you do, please tell me what happens!

I wasn’t able to get CCL on arm32 to load cl-protobufs. It gives an error saying it doesn’t have asdf 3.1. After quickloading asdf I get an undefined function error for version<=. If any CCL folks have an idea about what’s going on, please send me a message.

Trying to run ABCL led me to yet another bug: https://github.com/armedbear/abcl/issues/359

Running Server

My Raspberry Pi is running at: http://65.96.161.53:4242/mortgage-info

Feel free to send either JSON or protobuf messages to the server.

Example JSON:

{
 "interest":3,
 "loan_amount":380000,
 "num_periods":300
}

I don’t know how long I will keep it running. If it goes down and you are interested in sending it messages please send me an email.


Ron, Carl, and Ben edited this post (as usual). Doug provided a great deal of help with SBCL on ARM 32.

Lisp Mortgage Calculator Proto with JSON

I’ve finally found a house! Like many Googlers from Cambridge I will be moving to Belmont MA. With that being said, I have to get a mortgage. My wife noticed we don’t know much about mortgages, so she decided to do some research. I, being a mathematician and a programmer, decided to make a basic mortgage calculator that will tell you how much you will pay on your mortgage per month, and give you an approximate amortization schedule. Due to rounding it’s impossible to give an exact amortization schedule for every bank.

This post should explain three things:

  1. How to calculate your monthly payment given a fixed rate loan.
  2. How to create an amortization schedule.
  3. How to create an easy acceptor in Hunchentoot that takes either application/json or application/octet-stream.

Mathematical Finance

The actual formulas here come from the Pre Calculus for Economic Students course my wife teaches. The book is:

Applied Mathematics for the Managerial, Life, and Social Sciences, Soo T. Tan, Cengage Learning, Jan 1, 2015 – Mathematics – 1024 pages

With that out of the way we come to the Periodic Payment formula. We will assume you pay monthly and the interest rate is quoted for the year but calculated monthly. 

 Example:
 Interest rate of 3%
 Loan Amount $100,000
 First Month Interest = $100,000*(.03/12) = $100,000*.0025= $250. 

 MonthlyPayment = \frac{LoanAmount * \frac{InterestRate}{12}} {1 - (1 + \frac{InterestRate}{12})^{-NumberOfMonths}} 

I am not going to prove this, though the proof is not hard. I refer to the cited book section 4.3.
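As a quick check on the formula, here is a minimal sketch in Lisp. The function name monthly-payment and its argument names are made up for illustration and are not taken from the mortgage-info code. It assumes the rate is passed as a fraction (0.03 for 3%), and it assumes that a 0% loan is simply split evenly across the months, which also avoids dividing by zero.

```lisp
(defun monthly-payment (loan-amount annual-rate num-months)
  "Return the fixed monthly payment for a loan.
ANNUAL-RATE is a fraction (0.03d0 for 3%), compounded monthly."
  (let ((rate (/ (coerce annual-rate 'double-float) 12))
        (amount (coerce loan-amount 'double-float)))
    (if (zerop rate)
        ;; Assumption: with no interest the principal is split evenly.
        (/ amount num-months)
        ;; LoanAmount * r / (1 - (1 + r)^(-NumberOfMonths))
        (/ (* amount rate)
           (- 1 (expt (+ 1 rate) (- num-months)))))))
```

For the example above, (monthly-payment 100000 0.03d0 360) comes out at roughly 421.60 for a 30 year loan.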

With this we can compute the amortization schedule iteratively. The interest paid for the first month is:

I_{1} = LoanAmount * \frac{InterestRate}{12}

The payment toward principal for the first month is:

PTP_{1} = MonthlyPayment - I_{1}

The interest paid for month j is:

I_{j} = \frac{InterestRate}{12}*(LoanAmount - \sum_{i=1}^{j-1}PTP_{i})

The payment toward principal for month j is:

PTP_{j} = MonthlyPayment - I_{j}

Since I_{j} relies on only the PTP_{i} for 0<i<j and PTP_{1} is defined, we can compute them for any value we wish!
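The recurrence can be sketched as a loop that keeps a running remaining principal, which is LoanAmount minus the sum of the earlier PTP values. This is only an illustration: the function name and the row layout are hypothetical, the monthly payment is passed in by the caller, and no rounding to cents is done here.

```lisp
(defun amortization-schedule (loan-amount annual-rate monthly-payment num-months)
  "Return a list of (month interest payment-toward-principal remaining) rows.
ANNUAL-RATE is a fraction (0.03d0 for 3%). Values are not rounded."
  (let ((rate (/ (coerce annual-rate 'double-float) 12))
        (remaining (coerce loan-amount 'double-float)))
    (loop for month from 1 to num-months
          ;; I_j = r * (LoanAmount - sum of earlier PTP_i)
          for interest = (* rate remaining)
          ;; PTP_j = MonthlyPayment - I_j
          for ptp = (- monthly-payment interest)
          do (decf remaining ptp)
          collect (list month interest ptp remaining))))
```

With a $100,000 loan at 3% and a payment of 421.60, the first row has about 250 in interest and about 171.60 toward principal.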

Creating the Mortgage Calculator

We will be creating a Hunchentoot server that will receive either JSON or octet-stream Protocol Buffer messages and return either JSON or octet-stream Protocol Buffer messages. My previous posts discussed creating Hunchentoot Acceptors and integrating Protocol Buffer Messages into a Lisp application. For a refresher please visit my Proto Over HTTPS.

mortgage.proto

When defining a system that sends and receives protocol buffers you must tell your consumers what those messages will be. We expect requests to be in the form of the mortgage_information_request message and we will respond with a mortgage_information message.

Note: With the cl-protobufs.json package we can send JSON requests that look like the protocol buffer message. So sending in:

{
 "interest":"3",
 "loan_amount":"380000",
 "num_periods":"300"
}

we can parse it into a mortgage_information_request message. We will show how to do this shortly.

mortgage-info.lisp

Server Code:

There are two main portions of this file, the server creation section and the mortgage calculator section. We will start by discussing the server creation section by looking at the define-easy-handler macro.

We get the post body by calling (raw-post-data). This can either be in JSON or serialized protocol buffer format so we inspect the content-type http header with 

(cdr (assoc :content-type (headers-in *request*)))

If this header is “application/json” we turn the body into a string and call cl-protobufs.json:parse-json:

(let ((string-request
        (flexi-streams:octets-to-string request)))
  (cl-protobufs.json:parse-json
   'mf:mortgage-information-request
   :stream (make-string-input-stream
            string-request)))

Otherwise we assume it’s a serialized protocol buffer message and we call cl-protobufs:deserialize-from-stream.
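One detail worth keeping in mind is that clients often send the header with extra parameters, such as "application/json; charset=utf-8", so an exact string comparison can miss JSON requests. A small helper along these lines could do the check. The name json-content-type-p is hypothetical and is not part of Hunchentoot or of the mortgage-info code; it only uses standard Common Lisp.

```lisp
(defun json-content-type-p (content-type)
  "Return true when CONTENT-TYPE names the application/json media type.
Parameters after a semicolon (such as charset) are ignored."
  (and (stringp content-type)
       (let ((media-type
               (string-trim " "
                            ;; POSITION returns NIL when there is no
                            ;; semicolon, so SUBSEQ keeps the whole string.
                            (subseq content-type 0
                                    (position #\; content-type)))))
         ;; STRING-EQUAL compares case-insensitively.
         (string-equal media-type "application/json"))))
```

The value of (cdr (assoc :content-type (headers-in *request*))) could then be passed straight to this predicate, including the case where the header is missing and the value is nil.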

The application code is the same either way; we will briefly discuss this later.

Finally, if we received a JSON object we return a JSON object. This can be done by calling cl-protobufs.json:print-json on the response object:

(setf (hunchentoot:content-type*) "application/json")
(let ((out-stream (make-string-output-stream)))
   (cl-protobufs.json:print-json response
      :stream out-stream)
   (get-output-stream-string out-stream))

Otherwise we return the response serialized to an octet vector using cl-protobufs:serialize-to-bytes.

Application Code:

For the most part, the application code is just the formulas described in the mathematical finance section but written in Lisp. The only problem is that representing currency as double-precision floating point is terrible. We make two simplifying assumptions:

  1. The currency uses two digits after the decimal.
  2. We floor to two digits after the decimal.

When we make our final amortization line we pay off the remaining principal. This means the final payment may differ from the payment for every other month, but it removes rounding errors. We may want to make a currency message for users to send us which specifies its own rounding and decimal places, or we could use the Google one that is not a well known type here. The ins-and-outs of currency programming weren’t part of this blog post so please pardon the crudeness.
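The flooring assumption can be sketched in one line. The name floor-to-cents is hypothetical and may not match what mortgage-info.lisp actually does.

```lisp
(defun floor-to-cents (amount)
  "Drop any fraction of a cent from AMOUNT and return a double-float."
  ;; Scale to cents, floor to an integer, then scale back to dollars.
  (/ (floor (* amount 100)) 100d0))
```

Because binary floating point cannot represent most decimal fractions exactly, an amount that looks like a whole number of cents can occasionally floor down by one cent (0.29d0 times 100 is slightly less than 29). Keeping amounts as integer cents would avoid this, at the cost of a different message definition.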

We create the mortgage_info message with the call to populate-mortgage-info:

  (let (...
         (response (populate-mortgage-info
                    (mf:loan-amount request)
                    (mf:interest request)
                    (mf:num-periods request)))) …)

We showed in the previous section how we convert JSON text or the serialized protocol buffer message into a protocol buffer message in lisp memory. This message was stored in the request variable. We also showed in the last section how the response variable will be returned to the caller as either a JSON string or a serialized protocol buffer message.


The author would like to thank Ron Gut, Carl Gay, and Ben Kuehnert.

The Secretary Problem

I’ve been looking for houses lately. The general problem with house hunting is that there is a time limit which dictates how many houses you will see, and there will probably be a close to total order on your opinions of the houses. In layman’s terms, for all of the houses you look at each house will be better than some houses, and worse than the rest. My wife and I have debated how long we should look for a house. Thankfully this is nicely solved in mathematics.

The Secretary Problem:

Suppose you are trying to hire a secretary. You know you will interview 10 possible secretaries and you will have a total order in how much you like them. You must decide whether or not you should hire them at the end of each interview. What is the likelihood of choosing the top ranked secretary?

Problem Description and algorithm description.

To explain further, each possible secretary you interview will have a rank from 1 to 10. When you interview them, you will not know how high they rank but you will know their score in relation to the other secretaries you have interviewed. When you interview candidate 1 you have no information. When you interview candidate 2 you know they are better or worse than candidate 1. When you interview candidate 3 you know how they relate to candidates 1 and 2. More information gives you more knowledge on the ranking, but fewer choices in who to hire.

Obviously there are many algorithms you could use to choose a secretary. You could choose the first secretary to come and interview; your chance of getting the optimal secretary is then 10%. You could choose the first secretary better than the first; this will mean with 90% probability you will avoid the worst secretary!

The optimal probability of selecting the best secretary approaches 1/e (about 37%) as the number of candidates grows. I’m not going to go into the proof; it’s not easy, but if you’re interested please check out the Wikipedia page. The algorithm itself is quite simple. First we will generalize to having n secretaries come to interview.

  1. Check the first n/e applicants.
  2. Choose the next applicant who is better than the first n/e applicants.

Coding Experiment

We will generalize the optimal algorithm thusly:

  1. We will check the first k candidates of the n candidates.
  2. We will choose the first applicant who is better than the first k applicants.

We will create a permutation of {1,…,n} and get the max of the first k candidates. We then take the first later candidate who is higher than that max; if no such candidate exists we take the last candidate. We will return a bool indicating whether the chosen candidate is ranked n.
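The experiment described above can be sketched as follows. This is not the code from the repository: it avoids the cl-permutation dependency by shuffling a vector with a hand-written Fisher-Yates shuffle, and all three function names are made up for this illustration.

```lisp
(defun random-ranking (n)
  "Return a vector holding a uniformly random permutation of 1..N."
  (let ((ranks (make-array n)))
    (dotimes (i n)
      (setf (aref ranks i) (1+ i)))
    ;; Fisher-Yates shuffle: swap each slot with a random earlier-or-equal slot.
    (loop for i from (1- n) downto 1
          do (rotatef (aref ranks i) (aref ranks (random (1+ i)))))
    ranks))

(defun picks-best-p (ranks k)
  "Skip the first K candidates, then take the first one who beats them all.
Falls back to the last candidate. Returns true when the pick has rank N."
  (let* ((n (length ranks))
         (best-seen (reduce #'max ranks :end k :initial-value 0))
         (choice (or (find-if (lambda (rank) (> rank best-seen))
                              ranks :start k)
                     (aref ranks (1- n)))))
    (= choice n)))

(defun success-rate (n k trials)
  "Fraction of TRIALS random rankings in which the rule picks the best."
  (/ (loop repeat trials
           count (picks-best-p (random-ranking n) k))
     trials))
```

Calling (float (success-rate 10 3 100000)) should land close to 0.4, which is in line with the 1/e rule of thumb for small n.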

The code can be found on my github account. We use Robert Smith’s cl-permutation library available on Quicklisp.

We see for 10 candidates and 100000 trials we get:

For 100 candidates and 100000 trials we get:

It’s interesting to note your chances of finding the optimal secretary increase quite quickly while increasing the number of people you interview, and decrease far slower after hitting the optimal stopping bound.

Takeaways:

As mathematics only approximates life, this doesn’t perfectly fit into my house search problem. I don’t know how many houses I will see, and I don’t know if house prices will increase or decrease over time. Also, I often don’t have to make a split-second decision right after I see a house. 

This does however give me a takeaway:

When searching for a house, do your due diligence and look at as many open houses as you can at first. Getting an idea of what you like and don’t like will help you find the house you want. Don’t wait too long though!

I would like to thank Ron, Carl, and Ben for the edits to this article.

Sending Protocol Buffers as an Octet Vector

In our previous posts on using Hunchentoot to send protocol buffer messages we turned them into base64-encoded strings and sent them as parameters in an HTTP post call. This allows us to send multiple protocol buffer messages in a single post call using multiple post parameters. In this post we will show how we can send a single protocol buffer message in the body of a post call as binary data instead of base64 encoding.

Note: I am new to using Hunchentoot, and would have started by sending an octet vector in the body of a post call if I had known how. On reviewing the last blog post, Carl Gay asked why this method wasn’t used, and the answer was lack of knowledge. After learning that one could use `hunchentoot:raw-post-data` to access the post body, I was able to write this simpler method.

Hello-world-client

The changes from our previous post where we turned our octet-vectors into base64 encoded strings to this post where we just send the octet vector can be found here.

ASD file

Since we are sending an octet-vector we no longer need to worry about flexi-streams, cl-base64, and protobuf-utilities. We removed them from the asd file. 

Implementation

This change is a dramatic simplification to our post call. All we have to do is use drakma to call our web server, setting the :content-type to application/octet-stream and :content to the serialized proto message. Since we assume the web server will be sending us application/octet-stream data we can deserialize the reply and be on our way.

(response
 (cl-protobufs:deserialize-from-bytes
  'hwp:response
  (drakma:http-request
   address
   :content-type "application/octet-stream"
   :content (cl-protobufs:serialize-to-bytes
             proto-to-send))))

Hello-world-server

The changes from our previous post where we turned our base64 encoded strings into octet-vectors to this post where we just read the octet vector can be found here.

ASD file

Since we are sending an octet-vector we no longer need to worry about protobuf-utilities. We removed this from the asd file. 

Implementation

This change is a dramatic simplification to our post handler. First we set hunchentoot:content-type* to application/octet-stream so it knows we will return an octet-vector. Then we call raw-post-data and deserialize the result. We do our application logic and create our response. Finally we serialize our reply proto and return the octet-vector. 
The one gotcha in all of this is the inability to either send or receive the empty octet-vector. Either drakma just sends nil, or hunchentoot receives the octet stream as nil. Care should be taken to make sure one doesn’t try to deserialize nil, as that’s a type error. We all know nil is not of type octet-vector!

(define-easy-handler (hello-world :uri "/hello") ()
  (setf (hunchentoot:content-type*)
        "application/octet-stream")
  (let* ((post-request (raw-post-data))
         (request
           (if post-request
               (cl-protobufs:deserialize-from-bytes
                'hwp:request post-request)
               (hwp:make-request)))
         (response
           (hwp:make-response
            :response
            (if (hwp:request.has-name request)
                (format nil "Hello ~a"
                        (hwp:request.name request))
                "Hello"))))
    (cl-protobufs:serialize-to-bytes response)))

Final Thoughts

Sending and receiving protocol buffers through octet-vectors is a simpler way of using cl-protobufs with hunchentoot than trying to use HTTP parameters. Anyone using protocol-buffers will probably send and receive only one message at a time (or wrap multiple messages in one message) so it should be considered the canonical use case. This is how gRPC works. 

I hope you enjoyed this series on cl-protobufs, and hope you enjoy adding it into your own toolbox of useful Lisp packages.


I would like to thank Carl Gay for taking the time to edit the post and provide information on Hunchentoot Web Server.