{"@attributes":{"version":"2.0"},"channel":{"title":"\u2124\u2192\u2124","link":"https:\/\/ztoz.blog\/","description":"Recent content on \u2124\u2192\u2124","generator":"Hugo -- gohugo.io","language":"en-us","lastBuildDate":"Wed, 15 Apr 2026 11:55:22 -0600","item":[{"title":"Plain Text Accounting & Financial Reporting","link":"https:\/\/ztoz.blog\/posts\/plaintext-accounting-reporting\/","pubDate":"Wed, 15 Apr 2026 11:55:22 -0600","guid":"https:\/\/ztoz.blog\/posts\/plaintext-accounting-reporting\/","description":"<p>Over the past year, I&rsquo;ve been acting as the treasurer for a small publishing company. I converted our books from a collection of spreadsheets to a plain text accounting ledger and wrote programs to automate balance, income, and sales reporting using a tool called Beancount. These recipes may be useful to others running businesses, particularly if they have some programming talent.<\/p>\n<h2 id=\"plain-text-accounting\">Plain Text Accounting<\/h2>\n<p><a href=\"https:\/\/plaintextaccounting.org\/\">Plain Text Accounting<\/a> is an approach to accounting that has helped us minimize costs, maximize transparency, and automate reporting. Practitioners record transactions within &ldquo;plain text&rdquo; files that can be easily shared and stored in version control systems (such as <code>git<\/code>). Since the format of the transactions is well-defined, tools can process the files to provide auditing, translation, and reporting. For example, a program can translate a credit card statement into a plain text format to facilitate importing.<\/p>\n<p>For our tooling, I chose <a href=\"https:\/\/beancount.github.io\/\">Beancount v3<\/a>, partially due to its Python API, but competitor programs <a href=\"https:\/\/ledger-cli.org\/index.html\">ledger<\/a> or <a href=\"https:\/\/hledger.org\/\">hledger<\/a> may work better for you. 
Since you control the data, there is very little &ldquo;lock-in&rdquo; to any one of the tools, although each defines its ledger format differently.<\/p>\n<h2 id=\"situational-context\">Situational Context<\/h2>\n<p>Our business sells products through various online storefronts. We are a purely internet-based business. Individual products are developed by ad-hoc product teams who earn royalties based on their individual contributions and team agreements, so the royalty structures are unique per product.<\/p>\n<p>Since we are bootstrapping the business and building revenue takes time, we try to minimize overhead cash expenditures. When we reviewed commercial accounting\/bookkeeping tools, we found that they were either priced outside our budget or did not track cost of goods sold and royalties well enough for our needs. Spreadsheets were sufficient early on, but they were error-prone and, as the number of products grew, were becoming increasingly complex and fragile.<\/p>\n<h2 id=\"double-entry-accounting-and-the-ledger\">Double Entry Accounting and the Ledger<\/h2>\n<p>Double entry accounting is a practice where every transaction is recorded at least twice: against one or more source accounts and against one or more destination accounts. Since the amounts posted to the sources and destinations must sum to zero, the practice has a built-in way to verify its correctness.<\/p>\n<p>All transactions can be recorded in a single ledger. We keep ours in a <code>data\/ledger.txt<\/code> file within our financial git repository. Access to the repository is controlled by our software forge and all changes automatically have an audit trail.<\/p>\n<p>The ledger has <a href=\"https:\/\/beancount.github.io\/docs\/the_double_entry_counting_method.html#types-of-accounts\">five accounts<\/a> at the top level: Assets, Liabilities, Income, Expenses, and Equity. Each ledger will define an account hierarchy under these five accounts. 
Using the beancount syntax, our (anonymized) accounts are:<\/p>\n<pre tabindex=\"0\"><code class=\"language-beancount\" data-lang=\"beancount\">2025-01-01 open Assets:BankName:Checking USD\n2025-01-01 open Assets:StoreFront USD\n2025-01-01 open Liabilities:Royalties:Member1 USD\n2025-01-01 open Liabilities:Royalties:Member2 USD\n2025-01-01 open Income:BankName:Interest USD\n2025-01-01 open Expenses:StoreFront:Overhead USD\n2025-01-01 open Expenses:Govt:TaxFees USD\n2025-01-01 open Expenses:Operations USD\n2025-01-01 open Expenses:UnpaidRoyalties USD  ; for balancing out royalty liabilities\n2025-01-01 open Equity:Member1 USD\n2025-01-01 open Equity:Member2 USD\n\n; Do not allow fake accounts to be used after initialization\n2025-01-01 open Equity:Opening-Balances\n2025-01-02 close Equity:Opening-Balances\n\n; Project specific accounts\n\n2025-01-01 open Income:StoreFront:Prod01 USD\n2025-01-01 open Income:StoreFront:Prod02 USD\n2025-01-01 open Expenses:Prod02 USD\n<\/code><\/pre><p>In Beancount, all entries in the ledger are marked with a date. Accounts must be opened on a certain date and can be closed. If an entry is marked against a closed account, it will be reported as an error. We use the <code>Equity:Opening-Balances<\/code> account to handle initialization of values (since the business was already on-going by the time we started the ledger) and close it the day after to prevent using it accidentally.<\/p>\n<p>We have two assets, a checking account at a bank and the account at our storefront since transfers from the storefront are not automatic. Unpaid royalties are liabilities and we group each member under the general Royalties category. Income comes from product sales, which we group under the store, and interest from the bank. 
There are a variety of sources of expenses, some general, some product specific, and we record equity per member.<\/p>\n<p>As you will see later, we added an attribute to accounts to differentiate between operational (core business related) and non-operational income and expenses. Instead of adding the attribute, we could have used a hierarchical element (e.g. <code>Income:Op:StoreFront:Prod01<\/code> and <code>Income:NonOp:BankName:Interest<\/code>) which would have aligned better with Beancount&rsquo;s query functionality.<\/p>\n<p>A sales transaction demonstrates the double-entry method. A customer purchases Prod02 and pays $3.<\/p>\n<pre tabindex=\"0\"><code class=\"language-beancount\" data-lang=\"beancount\">2025-12-23 txn &quot;StoreFront&quot; &quot;Prod02 Sale&quot;\n        Income:StoreFront:Prod02        -3.00 USD       ; Total (Gross) Sale\n        Expenses:StoreFront:Overhead     0.90 USD       ; Storefront cut\n        Assets:StoreFront                2.10 USD       ; Earnings\n        Liabilities:Royalties:Member1   -0.55 USD       ; Royalty\n        Liabilities:Royalties:Member2   -0.05 USD       ; Royalty\n        Expenses:UnpaidRoyalties         0.60 USD       ; Royalty Float\n<\/code><\/pre><p>The sum of these six postings is zero (-3.00 + 0.90 + 2.10 - 0.55 - 0.05 + 0.60 = 0). The signs may initially seem counter-intuitive, but they follow Beancount&rsquo;s convention: Income, Liabilities, and Equity normally carry negative amounts, while Assets and Expenses normally carry positive ones. The gross sales value, the Income for the sale, is therefore recorded as a negative number. The storefront takes a portion of the sale, in this case $0.90, so that is recorded as an Expense (or cost of goods sold). (In this example, sales tax and payment processor expenses are part of the overhead.) That leaves $2.10 to be stored in our &ldquo;account&rdquo; with the storefront. Due to royalties, we cannot claim full ownership of that $2.10, so we need to account for our liabilities. 
We record two Royalty liabilities and balance their sum with an Expenses line for tracking overall UnpaidRoyalties.<\/p>\n<p>When we pay out royalties, we reduce the asset we are drawing money from and reduce the member&rsquo;s liability by the same amount:<\/p>\n<pre tabindex=\"0\"><code class=\"language-beancount\" data-lang=\"beancount\">2025-01-12 txn &quot;Company&quot; &quot;Member1 Royalty Payment 2024Q4&quot;\n    Assets:BankName:Checking        -17.00 USD         ; Payment\n    Liabilities:Royalties:Member1    17.00 USD         ; Reduction of liability\n<\/code><\/pre><p>Notably, we do not change the <code>Expenses:UnpaidRoyalties<\/code> value. Assets and Liabilities have a value at a given time (similar to a Gauge in telemetry), while Income and Expenses are <em>deltas<\/em> between two time values (similar to a Counter in telemetry). It makes sense to discuss a bank balance at the start or end of a quarter, but a balance is not defined over a three-month period. In contrast, we can discuss sales over a quarter, but not sales at a specific point in time.<\/p>\n<p>Beancount includes a tool, <code>bean-check<\/code>, which validates the ledger, including that every transaction balances. We call <code>bean-check<\/code> as part of our git merge pipeline. In our ledger, we include <code>balance<\/code> checks to verify our accounts match those of the banks and storefronts, e.g.:<\/p>\n<pre tabindex=\"0\"><code class=\"language-beancount\" data-lang=\"beancount\">2025-08-01 balance Assets:BankName:Checking 1670.06 USD\n<\/code><\/pre><p>For populating the ledger, we have Python scripts that translate exported monthly CSV reports into the beancount format. It is also easy to write entries by hand, which is necessary because human judgment is often needed to properly categorize expenses.<\/p>\n<h2 id=\"reporting-using-beancount\">Reporting using Beancount<\/h2>\n<p>The members of the company board expect monthly balance statements, income statements by the month, quarter, and year, and reports of sales by product. 
So far, I haven&rsquo;t been asked to produce any custom reports, but I expect that will come.<\/p>\n<p>To generate the reports, I use Python, the jinja2 templating library, and beancount&rsquo;s beanquery library. On a merge to the main branch, we update a website (via Gitlab Pages) with the latest set of reports. We use Gitlab&rsquo;s authentication system to control access to the website.<\/p>\n<p><code>beanquery<\/code> provides a SQL-like interface to the ledger that implements the Python database interface. Creating a connection is straight-forward:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">\tconn: beanquery<span style=\"color:#f92672\">.<\/span>Connection <span style=\"color:#f92672\">=<\/span> beanquery<span style=\"color:#f92672\">.<\/span>connect(<span style=\"color:#e6db74\">&#39;beancount:data\/ledger.txt&#39;<\/span>)\n<\/code><\/pre><\/div><p>I pass around a Context instance to store the various run-time objects and configuration:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">\t<span style=\"color:#a6e22e\">@dataclass<\/span>\n\t<span style=\"color:#66d9ef\">class<\/span> <span style=\"color:#a6e22e\">RenderContext<\/span>:\n\t    conn: beanquery<span style=\"color:#f92672\">.<\/span>Connection\n\t    env: jinja2<span style=\"color:#f92672\">.<\/span>Environment\n\t    dst: pathlib<span style=\"color:#f92672\">.<\/span>Path\n\t    earliest_ledger: datetime<span style=\"color:#f92672\">.<\/span>date  <span style=\"color:#75715e\"># inclusive transaction date<\/span>\n\t    latest_ledger: datetime<span style=\"color:#f92672\">.<\/span>date  <span style=\"color:#75715e\"># inclusive transaction date<\/span>\n<\/code><\/pre><\/div><p>The earliest and latest 
ledger date values are retrieved via beanquery:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">    earliest <span style=\"color:#f92672\">=<\/span> fetch_one_row(conn, <span style=\"color:#e6db74\">&#39;SELECT date ORDER BY date ASC LIMIT 1&#39;<\/span>)[<span style=\"color:#ae81ff\">0<\/span>]\n    latest <span style=\"color:#f92672\">=<\/span> fetch_one_row(conn, <span style=\"color:#e6db74\">&#39;SELECT date ORDER BY date DESC LIMIT 1&#39;<\/span>)[<span style=\"color:#ae81ff\">0<\/span>]\n<\/code><\/pre><\/div><p>Beanquery returns rich Python objects from queries, which can be unexpected, but also means you will write fewer type conversion functions.<\/p>\n<h3 id=\"balance-sheets\">Balance Sheets<\/h3>\n<p>A balance sheet reports assets, liabilities, and equity on the close of a financial period. In our case, we report monthly. The function <code>render_balance_sheets<\/code> generates a monthly report for every month in our ledger with recorded transactions. 
Beanquery does not support parameterized queries, so we use string interpolation via <code>.format<\/code> instead.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">\t<span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">render_balance_sheets<\/span>(ctx: RenderContext):\n\t    template <span style=\"color:#f92672\">=<\/span> ctx<span style=\"color:#f92672\">.<\/span>env<span style=\"color:#f92672\">.<\/span>get_template(<span style=\"color:#e6db74\">&#39;balance.html&#39;<\/span>)\n\t\n\t    <span style=\"color:#66d9ef\">for<\/span> start_inc, end_exc <span style=\"color:#f92672\">in<\/span> month_pair_iter(ctx<span style=\"color:#f92672\">.<\/span>earliest_ledger, ctx<span style=\"color:#f92672\">.<\/span>latest_ledger):\n\t        query <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">&#34;SELECT account, SUM(position) FROM CLOSE ON <\/span><span style=\"color:#e6db74\">{0}<\/span><span style=\"color:#e6db74\"> CLEAR &#34;<\/span> \\\n\t        \t<span style=\"color:#e6db74\">&#34;WHERE account~&#39;Assets|Liabilities|Equity&#39; ORDER BY account&#34;<\/span><span style=\"color:#f92672\">.<\/span>format(end_exc<span style=\"color:#f92672\">.<\/span>isoformat())\n\t        out <span style=\"color:#f92672\">=<\/span> _render_positions(ctx, template, query, (start_inc, end_exc))\n\t        ctx<span style=\"color:#f92672\">.<\/span>dst<span style=\"color:#f92672\">.<\/span>joinpath(<span style=\"color:#e6db74\">f<\/span><span style=\"color:#e6db74\">&#39;balance-<\/span><span style=\"color:#e6db74\">{<\/span>start_inc<span style=\"color:#f92672\">.<\/span>year<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">-<\/span><span style=\"color:#e6db74\">{<\/span>str(start_inc<span style=\"color:#f92672\">.<\/span>month)<span style=\"color:#f92672\">.<\/span>rjust(<span 
style=\"color:#ae81ff\">2<\/span>, <span style=\"color:#e6db74\">&#34;0&#34;<\/span>)<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">.html&#39;<\/span>) \\\n\t        \t<span style=\"color:#f92672\">.<\/span>write_text(out, encoding<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;utf-8&#39;<\/span>)\n<\/code><\/pre><\/div><p>The function <code>_render_positions<\/code> renders the Jinja2 template with the results from the query. We also translate the raw query results into an <code>AccountPosition<\/code>, which breaks out the top-level account type as the field <code>account_type<\/code> and flattens beancount&rsquo;s complex <code>Inventory<\/code> type into a number. (Although we are explicit with our currencies in the ledger, we only handle USD.)<\/p>\n<p><code>AccountPosition<\/code> is a dataclass defined as:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">\t<span style=\"color:#a6e22e\">@dataclass<\/span>\n\t<span style=\"color:#66d9ef\">class<\/span> <span style=\"color:#a6e22e\">AccountPosition<\/span>:\n\t    account_type: str  <span style=\"color:#75715e\"># Assets, Liabilities, Equity, Income, Expenses<\/span>\n\t    account: str  <span style=\"color:#75715e\"># e.g. 
StoreFront, Royalties:Member1<\/span>\n\t    usd: Decimal\n\t\n\t    <span style=\"color:#75715e\"># @property is_operational described under Income Sheets<\/span>\n\t\n\t    <span style=\"color:#a6e22e\">@staticmethod<\/span>\n\t    <span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">from_tuple<\/span>(record: Tuple[str, Inventory]) <span style=\"color:#f92672\">-&gt;<\/span> <span style=\"color:#e6db74\">&#39;AccountPosition&#39;<\/span>:\n\t        <span style=\"color:#e6db74\">&#34;&#34;&#34;\n<\/span><span style=\"color:#e6db74\">\t        From a beanquery balance record, return an AccountPosition\n<\/span><span style=\"color:#e6db74\">\t\n<\/span><span style=\"color:#e6db74\">\t        Example:\n<\/span><span style=\"color:#e6db74\">\t        &gt; AccountPosition.from_tuple((&#34;&#39;Assets:BankName:Checking&#39;&#34;, &#34;(1055.00 USD)&#34;))\n<\/span><span style=\"color:#e6db74\">\t        AccountPosition(&#39;Assets&#39;, &#39;BankName:Checking&#39;, Decimal(&#39;1055.00&#39;))\n<\/span><span style=\"color:#e6db74\">\t\n<\/span><span style=\"color:#e6db74\">\t        :param record: (account name, Inventory) row from a balance query\n<\/span><span style=\"color:#e6db74\">\t        :return: the account split into type and remainder, with its USD amount\n<\/span><span style=\"color:#e6db74\">\t        &#34;&#34;&#34;<\/span>\n\t        account_type, account <span style=\"color:#f92672\">=<\/span> record[<span style=\"color:#ae81ff\">0<\/span>]<span style=\"color:#f92672\">.<\/span>split(<span style=\"color:#e6db74\">&#39;:&#39;<\/span>, maxsplit<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span>)\n\t        inventory: Inventory <span style=\"color:#f92672\">=<\/span> record[<span style=\"color:#ae81ff\">1<\/span>]\n\t        amt: Amount <span style=\"color:#f92672\">=<\/span> inventory<span style=\"color:#f92672\">.<\/span>get_currency_units(<span style=\"color:#e6db74\">&#39;USD&#39;<\/span>)\n\t        <span style=\"color:#66d9ef\">return<\/span> AccountPosition(account_type, account, amt<span 
style=\"color:#f92672\">.<\/span>number)\n<\/code><\/pre><\/div><p>We excerpt the liabilities table from <code>balance.html<\/code> below. In the report, this would be preceded by the assets table and followed by the equity table. For ease of reading, we exclude accounts with a zero balance from the table and represent liabilities as positive values. In the table footer, Jinja2 filters (chained with the pipe operator) are a good way to compute filtered sums. Since numbers are stored as Decimals, we do not need to worry about floating point errors.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-html\" data-lang=\"html\">    &lt;<span style=\"color:#f92672\">table<\/span> <span style=\"color:#a6e22e\">id<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;liabilities&#34;<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;table-4in&#34;<\/span>&gt;\n        &lt;<span style=\"color:#f92672\">thead<\/span>&gt;\n\t    &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n\t        &lt;<span style=\"color:#f92672\">th<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-left&#34;<\/span>&gt;Liability&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t        &lt;<span style=\"color:#f92672\">th<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;Value ($)&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t    &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n        &lt;\/<span style=\"color:#f92672\">thead<\/span>&gt;\n    \t&lt;<span style=\"color:#f92672\">tbody<\/span>&gt;\n        {% for balance in balances if balance.account_type == &#39;Liabilities&#39; \n        \t\t\t\t\t\t\tand not balance.usd.is_zero() 
%}\n        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span>&gt;{{ balance.account }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right currency&#34;<\/span>&gt;{{ (balance.usd * -1)|round(2) }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n        {% endfor %}\n        &lt;\/<span style=\"color:#f92672\">tbody<\/span>&gt;\n        &lt;<span style=\"color:#f92672\">tfoot<\/span>&gt;\n        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span>&gt;&lt;<span style=\"color:#f92672\">em<\/span>&gt;Total&lt;\/<span style=\"color:#f92672\">em<\/span>&gt;&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right currency&#34;<\/span>&gt;{{ (balances\n \t\t           |selectattr(&#39;account_type&#39;, &#39;eq&#39;, &#39;Liabilities&#39;)\n \t\t           |sum(attribute=&#39;usd&#39;) * -1)|round(2) }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n        &lt;\/<span style=\"color:#f92672\">tfoot<\/span>&gt;\n    &lt;\/<span style=\"color:#f92672\">table<\/span>&gt;\n<\/code><\/pre><\/div><h3 id=\"income-sheets\">Income Sheets<\/h3>\n<p>An income sheet reports income and expenses within a period of time. The function <code>render_income_sheets<\/code> creates files for each month and quarter in our ledger&rsquo;s horizon. 
Since the data represents flows rather than instances in time, the queries include both an <code>OPEN ON<\/code> and a <code>CLOSE ON<\/code>.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">render_income_sheets<\/span>(ctx: RenderContext):\n    template <span style=\"color:#f92672\">=<\/span> ctx<span style=\"color:#f92672\">.<\/span>env<span style=\"color:#f92672\">.<\/span>get_template(<span style=\"color:#e6db74\">&#39;income.html&#39;<\/span>)\n\n    <span style=\"color:#66d9ef\">for<\/span> start_inc, end_exc <span style=\"color:#f92672\">in<\/span> month_pair_iter(ctx<span style=\"color:#f92672\">.<\/span>earliest_ledger, ctx<span style=\"color:#f92672\">.<\/span>latest_ledger):\n        query <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">&#34;SELECT account, SUM(position) FROM OPEN ON <\/span><span style=\"color:#e6db74\">{0}<\/span><span style=\"color:#e6db74\"> CLOSE ON <\/span><span style=\"color:#e6db74\">{1}<\/span><span style=\"color:#e6db74\"> &#34;<\/span> \\\n        \t<span style=\"color:#e6db74\">&#34;WHERE account~&#39;Income|Expenses&#39; GROUP BY 1 ORDER BY 1&#34;<\/span> \\\n        \t<span style=\"color:#f92672\">.<\/span>format(start_inc<span style=\"color:#f92672\">.<\/span>isoformat(), end_exc<span style=\"color:#f92672\">.<\/span>isoformat())\n        out <span style=\"color:#f92672\">=<\/span> _render_positions(ctx, template, query, (start_inc, end_exc))\n        ctx<span style=\"color:#f92672\">.<\/span>dst<span style=\"color:#f92672\">.<\/span>joinpath(<span style=\"color:#e6db74\">f<\/span><span style=\"color:#e6db74\">&#39;income-<\/span><span style=\"color:#e6db74\">{<\/span>start_inc<span style=\"color:#f92672\">.<\/span>year<span style=\"color:#e6db74\">}<\/span><span 
style=\"color:#e6db74\">-<\/span><span style=\"color:#e6db74\">{<\/span>str(start_inc<span style=\"color:#f92672\">.<\/span>month)<span style=\"color:#f92672\">.<\/span>rjust(<span style=\"color:#ae81ff\">2<\/span>, <span style=\"color:#e6db74\">&#34;0&#34;<\/span>)<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">.html&#39;<\/span>) \\\n        \t<span style=\"color:#f92672\">.<\/span>write_text(out, encoding<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;utf-8&#39;<\/span>)\n\n    <span style=\"color:#66d9ef\">for<\/span> quarter_inc, quarter_exc <span style=\"color:#f92672\">in<\/span> quarter_iter(ctx<span style=\"color:#f92672\">.<\/span>earliest_ledger, ctx<span style=\"color:#f92672\">.<\/span>latest_ledger):\n        query <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">&#34;SELECT account, SUM(position) FROM OPEN ON <\/span><span style=\"color:#e6db74\">{0}<\/span><span style=\"color:#e6db74\"> CLOSE ON <\/span><span style=\"color:#e6db74\">{1}<\/span><span style=\"color:#e6db74\"> &#34;<\/span> \\\n        \t<span style=\"color:#e6db74\">&#34;WHERE account~&#39;Income|Expenses&#39; GROUP BY 1 ORDER BY 1&#34;<\/span> \\\n        \t<span style=\"color:#f92672\">.<\/span>format(quarter_inc<span style=\"color:#f92672\">.<\/span>isoformat(), quarter_exc<span style=\"color:#f92672\">.<\/span>isoformat())\n        out <span style=\"color:#f92672\">=<\/span> _render_positions(ctx, template, query, (quarter_inc, quarter_exc))\n        ctx<span style=\"color:#f92672\">.<\/span>dst<span style=\"color:#f92672\">.<\/span>joinpath(<span style=\"color:#e6db74\">f<\/span><span style=\"color:#e6db74\">&#39;income-<\/span><span style=\"color:#e6db74\">{<\/span>quarter_inc<span style=\"color:#f92672\">.<\/span>year<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">-Q<\/span><span style=\"color:#e6db74\">{<\/span>str((quarter_inc<span style=\"color:#f92672\">.<\/span>month <span 
style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#f92672\">\/\/<\/span> <span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>)<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">.html&#39;<\/span>) \\\n        \t<span style=\"color:#f92672\">.<\/span>write_text(out, encoding<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;utf-8&#39;<\/span>)\n<\/code><\/pre><\/div><p>The operational income portion of the income sheet is generated via:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-html\" data-lang=\"html\">    {% for balance in balances if balance.account_type == &#39;Income&#39; and balance.is_operational %}\n        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-left indent-1&#34;<\/span>&gt;{{ balance.account }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right currency&#34;<\/span>&gt;{{ balance.usd|abs|round(2) }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n    {% endfor %}\n    &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n        &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-left indent-1&#34;<\/span>&gt;&lt;<span style=\"color:#f92672\">em<\/span>&gt;Total&lt;\/<span style=\"color:#f92672\">em<\/span>&gt;&lt;\/<span 
style=\"color:#f92672\">td<\/span>&gt;\n        &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right currency&#34;<\/span>&gt;{{ balances\n\t          |selectattr(&#39;account_type&#39;, &#39;eq&#39;, &#39;Income&#39;)\n\t          |selectattr(&#39;is_operational&#39;)\n\t          |sum(&#39;usd&#39;)|round(2)|abs }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n    &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n<\/code><\/pre><\/div><p>The <code>is_operational<\/code> property is defined as:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">    <span style=\"color:#a6e22e\">@property<\/span>\n    <span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">is_operational<\/span>(self):\n        <span style=\"color:#e6db74\">&#34;&#34;&#34;\n<\/span><span style=\"color:#e6db74\">        :return: true if Income\/Expense is part of normal business operations\n<\/span><span style=\"color:#e6db74\">        &#34;&#34;&#34;<\/span>\n        <span style=\"color:#66d9ef\">if<\/span> self<span style=\"color:#f92672\">.<\/span>account_type <span style=\"color:#f92672\">==<\/span> <span style=\"color:#e6db74\">&#39;Income&#39;<\/span>:\n            <span style=\"color:#66d9ef\">return<\/span> self<span style=\"color:#f92672\">.<\/span>account <span style=\"color:#f92672\">!=<\/span> <span style=\"color:#e6db74\">&#39;BankName:Interest&#39;<\/span>\n        <span style=\"color:#66d9ef\">elif<\/span> self<span style=\"color:#f92672\">.<\/span>account_type <span style=\"color:#f92672\">==<\/span> <span style=\"color:#e6db74\">&#39;Expenses&#39;<\/span>:\n            <span style=\"color:#66d9ef\">return<\/span> self<span style=\"color:#f92672\">.<\/span>account <span style=\"color:#f92672\">!=<\/span> <span 
style=\"color:#e6db74\">&#39;Govt:TaxFees&#39;<\/span>\n        <span style=\"color:#66d9ef\">else<\/span>:\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">False<\/span>\n<\/code><\/pre><\/div><p>The <code>selectattr<\/code> filter tests the value of an attribute rather than calling a method, so we use <code>@property<\/code> to let the template read the boolean value as a plain attribute.<\/p>\n<p>From a maintainability perspective, this implementation is fragile if we define new accounts. Fortunately, we rarely define new accounts. (Representing the operational nature of the account in the hierarchy would solve the problem.)<\/p>\n<h3 id=\"sales-by-product\">Sales by Product<\/h3>\n<p>For sales, we provide a table with rows for each product, columns for months within a year, and totals for each row and month.<\/p>\n<p>The table is generated by the template:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-html\" data-lang=\"html\">\t&lt;<span style=\"color:#f92672\">table<\/span> <span style=\"color:#a6e22e\">id<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;gross-sales&#34;<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;table-dense&#34;<\/span>&gt;\n\t    &lt;<span style=\"color:#f92672\">thead<\/span>&gt;\n\t        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-left&#34;<\/span>&gt;Product&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;01&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;02&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t     
       &lt;<span style=\"color:#f92672\">th<\/span>&gt;03&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;04&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;05&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;06&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;07&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;08&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;09&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;10&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;11&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;12&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">th<\/span>&gt;Total&lt;\/<span style=\"color:#f92672\">th<\/span>&gt;\n\t        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n\t    &lt;\/<span style=\"color:#f92672\">thead<\/span>&gt;\n\t    &lt;<span style=\"color:#f92672\">tbody<\/span>&gt;\n\t    {% for sale in sales %}\n\t        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-left&#34;<\/span>&gt;{{ sale.positions[0].account }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t            {% for position in sale.positions %}\n\t                {% if position.usd.is_zero() %}\n\t                &lt;<span style=\"color:#f92672\">td<\/span> 
<span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t                {% else %}\n\t                &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;{{ position.usd|abs|round(0) }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t                {% endif %}\n\t            {% endfor %}\n\t            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;{{ sale.positions|sum(&#39;usd&#39;)|round(0)|abs }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n\t    {% endfor %}\n\t    &lt;\/<span style=\"color:#f92672\">tbody<\/span>&gt;\n\t    &lt;<span style=\"color:#f92672\">tfoot<\/span>&gt;\n\t        &lt;<span style=\"color:#f92672\">tr<\/span>&gt;\n\t            &lt;<span style=\"color:#f92672\">td<\/span>&gt;&lt;<span style=\"color:#f92672\">em<\/span>&gt;Total&lt;\/<span style=\"color:#f92672\">em<\/span>&gt;&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t            {% for month in range(1, 13) %}\n\t            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;{{ (monthly_sums[month]) }}&lt;\/<span style=\"color:#f92672\">td<\/span>&gt;\n\t            {% endfor %}\n\t            &lt;<span style=\"color:#f92672\">td<\/span> <span style=\"color:#a6e22e\">class<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;align-right&#34;<\/span>&gt;{{ monthly_sums.values()|sum }}&lt;\/<span 
style=\"color:#f92672\">td<\/span>&gt;\n\t        &lt;\/<span style=\"color:#f92672\">tr<\/span>&gt;\n\t    &lt;\/<span style=\"color:#f92672\">tfoot<\/span>&gt;\n\t&lt;\/<span style=\"color:#f92672\">table<\/span>&gt;\n<\/code><\/pre><\/div><p>We leverage Beancount&rsquo;s <code>PIVOT<\/code> capability:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">render_gross_sales<\/span>(ctx: RenderContext):\n    <span style=\"color:#e6db74\">&#34;&#34;&#34;\n<\/span><span style=\"color:#e6db74\">    Gross sales by year and month for operational income accounts\n<\/span><span style=\"color:#e6db74\">\n<\/span><span style=\"color:#e6db74\">    :param ctx:\n<\/span><span style=\"color:#e6db74\">    :return:\n<\/span><span style=\"color:#e6db74\">    &#34;&#34;&#34;<\/span>\n    template <span style=\"color:#f92672\">=<\/span> ctx<span style=\"color:#f92672\">.<\/span>env<span style=\"color:#f92672\">.<\/span>get_template(<span style=\"color:#e6db74\">&#39;sales.html&#39;<\/span>)\n\n    <span style=\"color:#66d9ef\">for<\/span> year <span style=\"color:#f92672\">in<\/span> range(ctx<span style=\"color:#f92672\">.<\/span>earliest_ledger<span style=\"color:#f92672\">.<\/span>year, ctx<span style=\"color:#f92672\">.<\/span>latest_ledger<span style=\"color:#f92672\">.<\/span>year <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>):\n        query <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">&#34;SELECT account, month(date) as d, SUM(position) &#34;<\/span> \\\n        \t<span style=\"color:#e6db74\">&#34;WHERE account~&#39;Income&#39; and year(date)=<\/span><span style=\"color:#e6db74\">{0}<\/span><span style=\"color:#e6db74\"> GROUP BY 1, 2 ORDER BY 1 PIVOT BY 1, 2&#34;<\/span> \\\n        \t<span 
style=\"color:#f92672\">.<\/span>format(year)\n        now <span style=\"color:#f92672\">=<\/span> datetime<span style=\"color:#f92672\">.<\/span>date<span style=\"color:#f92672\">.<\/span>today()<span style=\"color:#f92672\">.<\/span>isoformat()\n\n        incomes <span style=\"color:#f92672\">=<\/span> fetch_all_rows(ctx<span style=\"color:#f92672\">.<\/span>conn, query)\n        sales <span style=\"color:#f92672\">=<\/span> [MonthAccountPositionPivot<span style=\"color:#f92672\">.<\/span>from_tuple(i) <span style=\"color:#66d9ef\">for<\/span> i <span style=\"color:#f92672\">in<\/span> incomes]\n        sales <span style=\"color:#f92672\">=<\/span> list(filter(<span style=\"color:#66d9ef\">lambda<\/span> s: s<span style=\"color:#f92672\">.<\/span>positions[<span style=\"color:#ae81ff\">0<\/span>]<span style=\"color:#f92672\">.<\/span>is_operational, sales))\n\n        <span style=\"color:#75715e\"># jinja2 is ignoring a round(0) filter, so we pre-round the monthly sums before the template<\/span>\n        monthly_sums <span style=\"color:#f92672\">=<\/span> defaultdict(Decimal)\n        <span style=\"color:#66d9ef\">for<\/span> sale <span style=\"color:#f92672\">in<\/span> sales:\n            <span style=\"color:#66d9ef\">for<\/span> i, position <span style=\"color:#f92672\">in<\/span> enumerate(sale<span style=\"color:#f92672\">.<\/span>positions):\n               monthly_sums[i<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">1<\/span>] <span style=\"color:#f92672\">=<\/span> monthly_sums[i<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">1<\/span>] <span style=\"color:#f92672\">+<\/span> position<span style=\"color:#f92672\">.<\/span>usd\n        <span style=\"color:#66d9ef\">for<\/span> month <span style=\"color:#f92672\">in<\/span> range(<span style=\"color:#ae81ff\">1<\/span>, <span style=\"color:#ae81ff\">13<\/span>):\n            monthly_sums[month] <span style=\"color:#f92672\">=<\/span> 
round(abs(monthly_sums[month]), <span style=\"color:#ae81ff\">0<\/span>)\n\n        out <span style=\"color:#f92672\">=<\/span> template<span style=\"color:#f92672\">.<\/span>render(now<span style=\"color:#f92672\">=<\/span>now, year<span style=\"color:#f92672\">=<\/span>year, sales<span style=\"color:#f92672\">=<\/span>sales, monthly_sums<span style=\"color:#f92672\">=<\/span>monthly_sums)\n        ctx<span style=\"color:#f92672\">.<\/span>dst<span style=\"color:#f92672\">.<\/span>joinpath(<span style=\"color:#e6db74\">f<\/span><span style=\"color:#e6db74\">&#39;sales-<\/span><span style=\"color:#e6db74\">{<\/span>year<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">.html&#39;<\/span>)<span style=\"color:#f92672\">.<\/span>write_text(out, encoding<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;utf-8&#39;<\/span>)\n<\/code><\/pre><\/div><p><code>MonthAccountPositionPivot<\/code> generates a rectangular grid of values from the query results like so:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">\t<span style=\"color:#a6e22e\">@dataclass<\/span>\n\t<span style=\"color:#66d9ef\">class<\/span> <span style=\"color:#a6e22e\">MonthAccountPositionPivot<\/span>:\n\t    positions: Sequence[AccountPosition]\n\t\n\t    <span style=\"color:#a6e22e\">@staticmethod<\/span>\n\t    <span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">from_tuple<\/span>(record: Tuple[str, <span style=\"color:#f92672\">...<\/span>]) <span style=\"color:#f92672\">-&gt;<\/span> <span style=\"color:#e6db74\">&#39;MonthAccountPositionPivot&#39;<\/span>:\n\t        positions <span style=\"color:#f92672\">=<\/span> []\n\t        acct <span style=\"color:#f92672\">=<\/span> record[<span style=\"color:#ae81ff\">0<\/span>]\n\t        <span style=\"color:#66d9ef\">for<\/span> m <span 
style=\"color:#f92672\">in<\/span> range(<span style=\"color:#ae81ff\">1<\/span>, len(record)):\n\t            inv <span style=\"color:#f92672\">=<\/span> record[m]\n\t            <span style=\"color:#66d9ef\">if<\/span> inv <span style=\"color:#f92672\">is<\/span> <span style=\"color:#66d9ef\">None<\/span>:\n\t                inv <span style=\"color:#f92672\">=<\/span> ZERO_INVENTORY\n\t            positions<span style=\"color:#f92672\">.<\/span>append(AccountPosition<span style=\"color:#f92672\">.<\/span>from_tuple((acct, inv)))\n\t        <span style=\"color:#66d9ef\">for<\/span> zero_months <span style=\"color:#f92672\">in<\/span> range(<span style=\"color:#ae81ff\">13<\/span> <span style=\"color:#f92672\">-<\/span> len(record)):\n\t            positions<span style=\"color:#f92672\">.<\/span>append(AccountPosition<span style=\"color:#f92672\">.<\/span>from_tuple((acct, ZERO_INVENTORY)))\n\t\n\t        <span style=\"color:#66d9ef\">assert<\/span> len(positions) <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">12<\/span>, <span style=\"color:#e6db74\">f<\/span><span style=\"color:#e6db74\">&#34;len(positions) must be 12, is <\/span><span style=\"color:#e6db74\">{<\/span>len(positions)<span style=\"color:#e6db74\">}<\/span><span style=\"color:#e6db74\">&#34;<\/span>\n\t        <span style=\"color:#66d9ef\">return<\/span> MonthAccountPositionPivot(positions)\n<\/code><\/pre><\/div><h2 id=\"conclusion\">Conclusion<\/h2>\n<p>I&rsquo;ve been pleased with adopting plain text accounting and beancount. I have more confidence that the data is correct since the ledger is verified against a history of monthly balances. 
Maintaining royalties is also easier because the logic is stored in one place and it is transparent how they are calculated for each transaction.<\/p>"},{"title":"README: A Bookish History of Computing (Review)","link":"https:\/\/ztoz.blog\/posts\/readme-bookish-history-review\/","pubDate":"Tue, 14 Apr 2026 10:10:27 -0600","guid":"https:\/\/ztoz.blog\/posts\/readme-bookish-history-review\/","description":"<p>While the study of textbooks has proven fruitful for the history of mathematics, little attention has been paid to computing books and their role in the development of computer science, the computer industry, and public understanding. Working to close this gap, W. Patrick McCray has written <em>README<\/em>, a &ldquo;book about books about computing.&rdquo; (pg 3) The scope is &ldquo;a history of computing writ large as seen through the histories of a limited but well-chosen selection of books [&hellip;] and their authors, editors, publishers, and readers.&rdquo; (ibid). McCray selects non-fiction works that influenced the understanding and direction of computing in the latter half of the 20th century.<\/p>\n<figure>\n\t<img src=\"9780262553483.jpg\" style=\"width: 60%\">\n\t<figcaption>Cover of <em>README<\/em><\/figcaption>\n<\/figure>\n<p><em>README: A Bookish History of Computing from Electronic Brains to Everything Machines<\/em>. W. Patrick McCray. The MIT Press. (2025) <a href=\"https:\/\/mitpress.mit.edu\/9780262553483\/readme\/\">Publisher Link<\/a><\/p>\n<p>Organized largely chronologically, the book begins with Edmund Berkeley&rsquo;s <em>Giant Brains<\/em> (1949), an early description of computers for the masses, and ends with Esther Dyson&rsquo;s <em>Release 2.0<\/em> (1997), a volume about the Internet and implications of a digital age. McCray&rsquo;s analysis has a broad scope and each chapter covers the social context that the authors and publishers worked under and how that influenced their works. 
Many chapters cover multiple works, particularly the final chapter, which expounds on popular public works such as the <em>For Dummies<\/em> series and the rise of computing magazines. Two chapters focus on more technical works, <em>TeXbook<\/em> and <em>Introduction to VLSI Systems<\/em>, and their professional impact. Most of the works he describes are more policy- or philosophy-oriented, such as Weizenbaum&rsquo;s (ELIZA) <em>God &amp; Golem, Inc.<\/em> and Theodor Nelson&rsquo;s (hypertext) <em>Computer Lib\/Dream Machines<\/em>. Those interested in computers and society-type questions or the debates on the nature of automation and artificial intelligence will find the content rich and a source of further reading.<\/p>\n<p>Positively, <em>README<\/em> places the books into a greater narrative and connects computing to broader social trends. McCray is able to supply much insight into the individual authors and their backgrounds and how that influenced their work. He also avoids retreading familiar business history.<\/p>\n<p>Of his chosen list of books, we are most familiar with the <em>TeXbook<\/em> and Knuth&rsquo;s writings on layout and typography. We find it strange that the account does not mention Adobe at all (for either PostScript or Type 1 fonts) and barely mentions Xerox. In our opinion, the impact of TeX and METAFONT should be placed in the broader context of the desktop publishing revolution and its competitors. In general, evaluating the impact of books is difficult, particularly in fields such as engineering which often ignore history. As an example, consider the continuing debate over the impact of <em>Liber Abaci<\/em>. 
McCray makes the best case for the impact of <em>Introduction to VLSI Systems<\/em>, but his is not the last word on which books should be included in the list.<\/p>\n<p>We recommend this book to those interested in computer ethics, the history of practice in computing, and the on-going debate on human labor, automation, the concept of deceit in artificial intelligence, and our relation to machines. The book would be an excellent companion volume to a computers and society course. The book is accessible to non-specialists.<\/p>\n<hr>\n<p>The Computer History Museum interviewed McCray about the book during a live event in January 2026:<\/p>\n<iframe width=\"560\" height=\"315\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/reeVfmw4sbI?si=ftMcj9RHHAB_hBv8\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>"},{"title":"Fibonacci's Composed Fractions","link":"https:\/\/ztoz.blog\/posts\/fibonacci-fractions\/","pubDate":"Thu, 26 Mar 2026 12:21:38 -0600","guid":"https:\/\/ztoz.blog\/posts\/fibonacci-fractions\/","description":"<p>Leonardo of Pisa (c1170-1241+), better known as Fibonacci, wrote <em>Liber Abaci<\/em> to introduce and advocate the adoption of &ldquo;Indian figures&rdquo; (Hindu-Arabic numerals) to Italy. The book explains how to perform arithmetic and common business problems using the digits and a positional number system. Leonardo uses a composed fractions (aka composite fractions, aka linked fractions) notation to express mixed radix quantities (i.e. positional numbers where the base varies by position). This notation lasted about 300 years until the adoption of decimal fractions. 
Drawing from his examples, we derive algorithms for the arithmetic operations in recipe form and in formal TypeScript representations.<\/p>\n<figure>\n\t<img src=\"liber-abaci-2v.png\" class=\"fit-image\">\n\t<figcaption><em>Liber Abaci<\/em> page 2v, introducing Indian numbers (Source: <a href=\"https:\/\/bibdig.museogalileo.it\/tecanew\/opera?bid=1072400&seq=10\">Museo Galileo<\/a>)<\/figcaption>\n<\/figure>\n<h2 id=\"background\">Background<\/h2>\n<p>Little is known about Leonardo of Pisa&rsquo;s life. (Also known as Leonardo Pisano, which just means Leonardo of Pisa.) He provides a short biographical note in <em>Liber Abaci<\/em> (an unusual touch) where he notes he traveled to Bugia (today&rsquo;s B\u00e9ja\u00efa, Algeria) to work with his father who was a public official in the customshouse. In Bugia, he learned about Indian figures (Hindu-Arabic numerals), Arabic mathematics, Pythagorean arcs, Euclid&rsquo;s algorithm for computing the greatest common divisor, and Euclidean geometry. Later, he traveled extensively learning more mathematics. Back in Pisa, Leonardo had access to multiple high-quality mathematical works (at least some in Latin translation). He was inspired to write a work introducing the Italian people to these figures and the new methodology.<\/p>\n<p><em>Liber Abaci<\/em> (the Book of Calculation) is a thick manuscript of fifteen chapters (the Sigler translation is over 600 pages) written in Latin. The first edition was written in 1202 (no extant copies) and the second edition in 1228 (nineteen known copies). The first chapter introduces Hindu-Arabic numerals and compares them with Roman numerals (with which Leonardo&rsquo;s audience was used to working). The next four chapters cover multiplication, addition, subtraction, and division of integral numbers. This is followed by two chapters on fractions. Four chapters on business problems then follow, such as bartering, assigning profits to shareholders, and the alloying of metals. 
Chapter 12 contains many miscellaneous problems touching on topics in algebra and linear equations and includes the famous rabbits problem that leads to the Fibonacci sequence (see Aside below). Chapter 13 concerns an iterative method for solving equations. Chapter 14 concerns finding roots of equations and the final chapter discusses geometry.<\/p>\n<p>The book is focused on practical concerns for merchants and other business people, with examples drawn from trade and finance. The book fulfilled a need. Pisa was an international trade hub and, even though it would lose that status soon in a war with Genoa, the various Italian city states were in the middle of a trade revolution. Hundreds of schools and textbooks were created over the next four centuries to teach calculation under the new method. This <em>Abbacus<\/em> School tradition focused on training students in mathematics useful for business and trade. The training used Hindu-Arabic numerals and arithmetic and followed the techniques found in <em>Liber Abaci<\/em>.<\/p>\n<p>While historians differ on the precedence and impact of <em>Liber Abaci<\/em> to the Abbacus tradition and the Arithmetic Revolution (e.g. H\u00f8yrup 2022, Devlin 2011 Chapter 8), of the extant volumes known, <em>Liber Abaci<\/em> is the most comprehensive and mathematically sophisticated: it includes proofs and more advanced problems and topics, and features far fewer errors.<\/p>\n<h2 id=\"notation\">Notation<\/h2>\n<p>Leonardo wrote fractional values in a composed or linked form. This notation is found, with variations, in earlier Arabic sources (Abdeljaouad 2004), so it did not originate with Leonardo. The notation uses a horizontal bar to separate a list of numerator values (in ascending scale or &ldquo;little-endian&rdquo; order) from a list of corresponding denominator values. The denominators represent the base or radix of the number. 
Numerator values include zero and, in &ldquo;proper&rdquo; form (although he did not use the term), range from zero to one less than the base.<\/p>\n<p>For example, \\(\\frac{1\\ 4}{2\\ 7}\\) means, in modern notation, \\(\\frac{1}{2\\times7} + \\frac{4}{7}=\\frac{9}{14}\\). Similarly, \\(\\frac{1\\ 5\\ \\ 7}{2\\ 6\\ 10}\\) equals \\(\\frac{1}{2\\times6\\times10} + \\frac{5}{6\\times10} + \\frac{7}{10} = \\frac{95}{120} = \\frac{19}{24}\\).<\/p>\n<p>Integral values follow the fractional part, such as \\(\\frac{3}{4}15\\) meaning \\(15\\frac{3}{4}\\). Ordering numbers in ascending scale may have been influenced by Arabic&rsquo;s right-to-left reading order, but this is conjectural. Modern readers may interpret these numbers as being multiplied by each other, but Leonardo used the sum as the implicit operator, rather than today&rsquo;s multiplication (i.e. \\(\\frac{1}{3}\\frac{1}{4}=\\frac{1}{3}+\\frac{1}{4}\\), note the space between the two fractions rather than a single horizontal bar).<\/p>\n<p>Decimal fractions, such as \\(\\frac{271,828}{100,000}\\) can be represented as a special case of composed fractions with repeated bases of 10: \\(\\frac{8\\ \\ 2\\ \\ \\ 8\\ \\ \\ 1\\ \\ \\ 7}{10\\ 10\\ 10\\ 10\\ 10}2\\). Leonardo includes examples where the bases are the same number, such as 4, 8, and 10 (Sigler 2002, e.g. 439-443), but these are examples and he does not advocate for them generally.<\/p>\n<p>Decimal fractions, which retain the place-value advantages already found in the integral portion of the numbers, began to be adopted and popularized in the 15th century. 
(The decimal point also largely dates to the 15th century, although there are earlier examples (Van Brummelen 2024).)<\/p>\n<p>The &ldquo;delay&rdquo; in adopting decimal fractions may have less to do with the unpredictability of the discovery process and more to do with the fact that practical uses of decimals were quite rare in the Renaissance world, where mixed radix numbers were the norm.<\/p>\n<p>For example, money. The Pisa monetary system, similar to English currency pre-decimalization, was built on the denaro: twelve denari made a solidus, and twenty solidi made a libra. (Individual cities minted their own money, so conversions were required between Pisa and Genoa, among others.) Money values could thus be represented as:<\/p>\n<p>\\[ \\frac{d\\ s}{12\\ 20}l \\]<\/p>\n<p>Weights and measures. The Venetians&rsquo; gross weight system featured 100 pounds to the hundredweight, 12 ounces to the pound, 194 carats to the ounce, and 4 grains to the carat. This is represented as:<\/p>\n<p>\\[ \\frac{g\\ \\ \\ c\\ \\ \\ o\\ \\ \\ l}{4\\ 194\\ 12\\ 100}h \\]<\/p>\n<p>Time. Since the French attempt at decimalization of the calendar did not catch on, we continue to use a mixed radix approach.<\/p>\n<p>\\[ \\frac{s\\ \\ m\\ \\ h\\ \\ d}{60\\ 60\\ 24\\ 7}w  \\]<\/p>\n<p>In Simon Stevin&rsquo;s pamphlet <em>On Decimal Fractions<\/em> (1585), which advocates for an early decimal notation, he provides example applications for surveying, measuring length, measuring volume, computing with angles, and money. Tellingly, he argues that each of these applications should convert <em>their<\/em> units to decimal units, rather than showing how the values would need to be translated into decimals and back to a mixed radix representation. 
Full adoption of decimals required two revolutions.<\/p>\n<h3 id=\"opposition-and-adoption\">Opposition and Adoption<\/h3>\n<p>Although the growth in Abbacus schools and the proliferation of manuals teaching arithmetic with Hindu-Arabic numerals speaks to the strong demand within Italy, adoption was opposed by several forces.<\/p>\n<p>Roman numerals were considered easier to read and more resistant to fraud than decimals. A Venetian bookkeeping manual stated &ldquo;The old figures alone are used because they cannot be falsified as easily as those of the new art of computation, of which one can, with ease, make one out of another, such as turning the zero into a 6 or a 9, and similarly many others can be falsified.&rdquo; (Devlin 2011, pg 113).<\/p>\n<p>Those trained on the abacus also resisted because the productivity gains of place-value arithmetic threatened the value proposition of the abacus. The Guild of Money Changers in Florence forbade their members from using the new figures in 1299 (ibid.).<\/p>\n<p>As a measure of adoption, Hindu-Arabic numerals start appearing in the Medici account books in 1406 within the &ldquo;description&rdquo; column. In 1439, they begin to be used in the primary accounts and all but one ledger are converted by 1482. In 1494, all ledgers use the numerals (ibid, pg 115).<\/p>\n<h2 id=\"aside-fibonacci-sequence\">Aside: Fibonacci Sequence<\/h2>\n<p>Leonardo is best known for the <a href=\"https:\/\/mathworld.wolfram.com\/FibonacciNumber.html\">Fibonacci Sequence<\/a> (1, 1, 2, 3, 5, 8, 13, &hellip;; <a href=\"https:\/\/oeis.org\/A000045\">OEIS A000045<\/a>). The sequence appears in an example problem in chapter 12 of the work (Sigler 2002, pg. 404-405) about rabbits. 
He poses the question of how many pairs of rabbits will exist by year end, if you start with one breeding pair:<\/p>\n<blockquote>\n<p>A certain man had one pair of rabbits together in a certain enclosed place, and one wishes to know how many are created from the pair in one year when it is the nature of them in a single month to bear another pair, and in the second month those born to bear also. Because the abovewritten pair in the first month bore, you will double it; there will be two pairs in one month. [&hellip;]<\/p>\n<\/blockquote>\n<p>In a long run-on sentence, Leonardo describes the increase month-by-month, annotating the description with a table (see figure).<\/p>\n<figure>\n  <img src=\"a000045.jpg\">\n  <figcaption>Leonardo's table for the number of rabbit pairs by month (Source: <a href=\"https:\/\/oeis.org\/A000045\">OEIS, photo by David Singmaster<\/a>)<\/figcaption>\n<\/figure>\n<p>He summarizes the process:<\/p>\n<blockquote>\n<p>You can indeed see in the margin how we operated, namely that we added the first number to the second, namely the 1 to the 2, and the second to the third, and the third to the fourth, and the fourth to the fifth, and thus one after another until we added the tenth to the eleventh, namely the 144 to the 233, and we had the abovewritten sum of rabbits, namely 377, and thus you can in order find it for an unending number of months.<\/p>\n<\/blockquote>\n<p>The entire entry is less than a page in length and is sandwiched between problems about finding perfect numbers and a system of linear equations problem (as it is known today). Overall, we would describe chapter 12 as a &ldquo;miscellaneous&rdquo; problem chapter given the range of problems. Leonardo does not dig deeper into numerical sequences or recurrence functions within the book.<\/p>\n<p>The &ldquo;rabbit problem&rdquo; is unusual within the book both because of its nature (a number sequence) and because of its subject matter. 
Leonardo&rsquo;s example problems tend to draw on practical, commercial problems such as calculating prices for various volumes of goods and how to share profits among partners with different stakes. Thus, a question about breeding rabbits is memorable, although not unique as he has a few other whimsical problems.<\/p>\n<p>However, his name does not deserve the link. He was not the earliest mathematician to describe the sequence, nor did he provide much mathematical insight into the specific or general notion of the recurrence function. To Leonardo&rsquo;s credit, he did not claim discovery of the sequence. Linking Leonardo to the sequence, under the name Fibonacci, was the work of Edouard Lucas, a 19th century French number theorist who extensively studied recursive series. Precedence of discovery is a messy basis for naming things and always subject to revision.<\/p>\n<h2 id=\"arithmetic\">Arithmetic<\/h2>\n<p>Leonardo used a series of examples to demonstrate the process or recipe for solving a problem and expected the reader to infer the algorithm. 
Our descriptions below for the arithmetic operations are thus our understanding of the algorithm, drawn from Leonardo&rsquo;s examples and, unavoidably, modern understanding.<\/p>\n<h3 id=\"representation\">Representation<\/h3>\n<p>A composed fraction of the form:<\/p>\n<p>\\[ \\frac{a_1\\ a_2\\ \\cdots a_n}{b_1\\ b_2\\ \\cdots b_n}a_0 \\]<\/p>\n<p>is equivalent to (in modern representation):<\/p>\n<p>\\[ a_0 + \\frac{a_n}{b_n} + \\frac{a_{n-1}}{b_n \\times b_{n-1}} + \\frac{a_{n-2}}{b_n \\times b_{n-1} \\times b_{n-2}} + \\ldots + \\frac{a_1}{\\prod_{i=1}^{n}{b_i}}  \\]<\/p>\n<p>We can represent the composed fraction in TypeScript via an immutable class, with a field for the integer portion (<code>a0<\/code>) and arrays to store the fractional numerators and denominators (<code>as<\/code> and <code>bs<\/code>).<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\"><span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\"> * Fibonacci Composed Fractions\n<\/span><span style=\"color:#75715e\"> *\/<\/span>\n<span style=\"color:#66d9ef\">export<\/span> <span style=\"color:#66d9ef\">class<\/span> <span style=\"color:#a6e22e\">FCF<\/span> {\n\n    <span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\">     * Integer portion of the fraction\n<\/span><span style=\"color:#75715e\">     * @private\n<\/span><span style=\"color:#75715e\">     *\/<\/span>\n    <span style=\"color:#66d9ef\">private<\/span> <span style=\"color:#66d9ef\">readonly<\/span> <span style=\"color:#a6e22e\">a0<\/span>: <span style=\"color:#66d9ef\">bigint<\/span>\n\n    <span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\">     * Numerators of the composed fraction in little-endian order. 
First element is a1.\n<\/span><span style=\"color:#75715e\">     * @private\n<\/span><span style=\"color:#75715e\">     *\/<\/span>\n    <span style=\"color:#66d9ef\">private<\/span> <span style=\"color:#66d9ef\">readonly<\/span> <span style=\"color:#66d9ef\">as<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">bigint<\/span>[]\n\n    <span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\">     * Denominators (bases or radix) of the composed fraction in little-endian order. First\n<\/span><span style=\"color:#75715e\">     * element is b1. Since a0 is always integral, b0 is not required.\n<\/span><span style=\"color:#75715e\">     * @private\n<\/span><span style=\"color:#75715e\">     *\/<\/span>\n    <span style=\"color:#66d9ef\">private<\/span> <span style=\"color:#66d9ef\">readonly<\/span> <span style=\"color:#a6e22e\">bs<\/span>: <span style=\"color:#66d9ef\">bigint<\/span>[]\n\n    <span style=\"color:#66d9ef\">constructor<\/span>(<span style=\"color:#a6e22e\">a0<\/span>: <span style=\"color:#66d9ef\">bigint<\/span>, <span style=\"color:#66d9ef\">as<\/span><span style=\"color:#f92672\">?:<\/span> <span style=\"color:#66d9ef\">bigint<\/span>[], <span style=\"color:#a6e22e\">bs?<\/span>: <span style=\"color:#66d9ef\">bigint<\/span>[]) {\n        <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">a0<\/span>;\n        <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#66d9ef\">as<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">as<\/span> <span style=\"color:#f92672\">||<\/span> [];\n        <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">bs<\/span> <span style=\"color:#f92672\">||<\/span> [];\n\n        <span style=\"color:#66d9ef\">if<\/span> (<span 
style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#66d9ef\">as<\/span>.<span style=\"color:#a6e22e\">length<\/span> <span style=\"color:#f92672\">!==<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>.<span style=\"color:#a6e22e\">length<\/span>) {\n            <span style=\"color:#66d9ef\">throw<\/span> <span style=\"color:#66d9ef\">new<\/span> Error(<span style=\"color:#e6db74\">&#34;Length of as must equal the length of bs&#34;<\/span>);\n        }\n    }\n<\/code><\/pre><\/div><h3 id=\"multiplication\">Multiplication<\/h3>\n<p>Leonardo introduces multiplication before addition, so we follow his lead. His approach for multiplying whole numbers by each other is effectively the same as the modern recipe, so we will not repeat it. For fractions, he converts each number to a singular fraction (our term), with a single number in the numerator, a single number in the denominator, and no integral portion. The result is the product of the two numerators, divided by the product of the denominators, simplified.<\/p>\n<p>In recipe form:<\/p>\n<ol>\n<li>If both numbers are whole, multiply them via the place-value method.<\/li>\n<li>If either number has a fractional component:\n<ol>\n<li>Transform each to a fraction with a single numerator and denominator<\/li>\n<li>Multiply the numerators together for the new numerator<\/li>\n<li>Multiply the denominators together for the new denominator<\/li>\n<li>Simplify as required<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>More formally in TypeScript (without the simplification step):<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\">    <span style=\"color:#a6e22e\">mult<\/span>(<span style=\"color:#a6e22e\">b<\/span>: <span style=\"color:#66d9ef\">FCF<\/span>)<span style=\"color:#f92672\">:<\/span> <span 
style=\"color:#a6e22e\">FCF<\/span> {\n        <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">isIntegralOnly<\/span>() <span style=\"color:#f92672\">&amp;&amp;<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">isIntegralOnly<\/span>()) {\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">a0<\/span>);\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">m1<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">asSingularFraction<\/span>();\n            <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">m2<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">asSingularFraction<\/span>();\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>, [<span style=\"color:#a6e22e\">m1<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">m2<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>]], [<span style=\"color:#a6e22e\">m1<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">m2<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#ae81ff\">0<\/span>]]);\n        }\n    }\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" 
style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\">    <span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\">     * Return a singular fraction (a\/b) in the form FCF(0, [a], [b])\n<\/span><span style=\"color:#75715e\">     *\/<\/span>\n    <span style=\"color:#a6e22e\">asSingularFraction<\/span>()<span style=\"color:#f92672\">:<\/span> <span style=\"color:#a6e22e\">FCF<\/span> {\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">a<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#66d9ef\">as<\/span>.<span style=\"color:#a6e22e\">reduceRight<\/span>((<span style=\"color:#a6e22e\">prev<\/span>, <span style=\"color:#a6e22e\">curr<\/span>, <span style=\"color:#a6e22e\">idx<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">curr<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">prev<\/span> <span style=\"color:#f92672\">*<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#a6e22e\">idx<\/span>], <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span>);\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">b<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>.<span style=\"color:#a6e22e\">reduce<\/span>((<span style=\"color:#a6e22e\">a<\/span>, <span style=\"color:#a6e22e\">b<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">a<\/span> <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">b<\/span>, <span style=\"color:#ae81ff\">1<\/span><span style=\"color:#a6e22e\">n<\/span>);\n        <span style=\"color:#66d9ef\">return<\/span> <span 
style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>, [<span style=\"color:#a6e22e\">a<\/span>], [<span style=\"color:#a6e22e\">b<\/span>]);\n    }\n<\/code><\/pre><\/div><h3 id=\"addition\">Addition<\/h3>\n<p>Similar to multiplication of whole numbers, Leonardo&rsquo;s approach to adding whole numbers is effectively the same as the modern schoolchild method. If the two numbers contain fractions of differing bases, Leonardo finds the factors of the denominators in order to find the least common multiple (LCM). He then transforms the numbers into simple mixed fractions with an integer portion and a single numerator and denominator, with the denominators of each number being the same.<\/p>\n<p>In recipe form:<\/p>\n<ol>\n<li>If both numbers are whole, sum them via the place-value method.<\/li>\n<li>If either number has a fractional component:\n<ol>\n<li>Find the least common multiple of the denominators<\/li>\n<li>Transform each fraction to have the LCM as its sole denominator<\/li>\n<li>Sum the fractions and integers<\/li>\n<li>Simplify as required<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>Leonardo does not include any examples of adding two fractional numbers with the same bases, but we are confident that he would use a place-value approach, as with whole numbers. 
In the Typescript form, the differing bases case is the third clause.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\">    <span style=\"color:#a6e22e\">add<\/span>(<span style=\"color:#a6e22e\">b<\/span>: <span style=\"color:#66d9ef\">FCF<\/span>)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#a6e22e\">FCF<\/span> {\n        <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">isIntegralOnly<\/span>() <span style=\"color:#f92672\">&amp;&amp;<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">isIntegralOnly<\/span>()) {\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">a0<\/span>);\n        } <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">equalsBase<\/span>(<span style=\"color:#a6e22e\">b<\/span>)) {\n            <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">carry<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>;\n            <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">r_as<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">zip<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#66d9ef\">as<\/span>, <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#66d9ef\">as<\/span>).<span style=\"color:#a6e22e\">map<\/span>((([<span 
style=\"color:#a6e22e\">ai<\/span>, <span style=\"color:#a6e22e\">bi<\/span>], <span style=\"color:#a6e22e\">idx<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> {\n                <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">sum<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">ai<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">bi<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">carry<\/span>;\n                <span style=\"color:#a6e22e\">carry<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">sum<\/span> <span style=\"color:#f92672\">\/<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#a6e22e\">idx<\/span>];  <span style=\"color:#75715e\">\/\/ side effect mutation\n<\/span><span style=\"color:#75715e\"><\/span>                <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#a6e22e\">sum<\/span> <span style=\"color:#f92672\">%<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#a6e22e\">idx<\/span>];  <span style=\"color:#75715e\">\/\/ remainder\n<\/span><span style=\"color:#75715e\"><\/span>            }));\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">carry<\/span>, <span style=\"color:#a6e22e\">r_as<\/span>, <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>);\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            <span style=\"color:#66d9ef\">const<\/span> [<span 
style=\"color:#a6e22e\">std_a<\/span>, <span style=\"color:#a6e22e\">std_b<\/span>] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">stddenom<\/span>(<span style=\"color:#a6e22e\">b<\/span>);\n            <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">a0<\/span>, [<span style=\"color:#a6e22e\">std_a<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">std_b<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>]], [<span style=\"color:#a6e22e\">std_a<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#ae81ff\">0<\/span>]]);\n        }\n    }\n<\/code><\/pre><\/div><p>Leonardo does not use the term &ldquo;least common multiple&rdquo; but his use of factors implies the same concept.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\">    <span style=\"color:#66d9ef\">protected<\/span> <span style=\"color:#a6e22e\">stddenom<\/span>(<span style=\"color:#a6e22e\">b<\/span>: <span style=\"color:#66d9ef\">FCF<\/span>)<span style=\"color:#f92672\">:<\/span> [<span style=\"color:#a6e22e\">FCF<\/span>, <span style=\"color:#a6e22e\">FCF<\/span>] {\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">lcm_denom<\/span>: <span style=\"color:#66d9ef\">bigint<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">chain<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span 
style=\"color:#a6e22e\">bs<\/span>, <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">bs<\/span>).<span style=\"color:#a6e22e\">reduce<\/span>((<span style=\"color:#a6e22e\">acc<\/span>, <span style=\"color:#a6e22e\">denom<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">lcm<\/span>(<span style=\"color:#a6e22e\">acc<\/span>, <span style=\"color:#a6e22e\">denom<\/span>));\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">a_num_lcm<\/span>: <span style=\"color:#66d9ef\">bigint<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#66d9ef\">as<\/span>.<span style=\"color:#a6e22e\">reduce<\/span>((<span style=\"color:#a6e22e\">acc<\/span>, <span style=\"color:#a6e22e\">num<\/span>, <span style=\"color:#a6e22e\">idx<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">acc<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">num<\/span> <span style=\"color:#f92672\">*<\/span> (<span style=\"color:#a6e22e\">lcm_denom<\/span> <span style=\"color:#f92672\">\/<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#a6e22e\">idx<\/span>]), <span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>);\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">b_num_lcm<\/span>: <span style=\"color:#66d9ef\">bigint<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#66d9ef\">as<\/span>.<span style=\"color:#a6e22e\">reduce<\/span>((<span style=\"color:#a6e22e\">acc<\/span>, <span style=\"color:#a6e22e\">num<\/span>, <span style=\"color:#a6e22e\">idx<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">acc<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#a6e22e\">num<\/span> <span 
style=\"color:#f92672\">*<\/span> (<span style=\"color:#a6e22e\">lcm_denom<\/span> <span style=\"color:#f92672\">\/<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#a6e22e\">idx<\/span>]), <span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>);\n        <span style=\"color:#66d9ef\">return<\/span> [<span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">a0<\/span>, [<span style=\"color:#a6e22e\">a_num_lcm<\/span>], [<span style=\"color:#a6e22e\">lcm_denom<\/span>]), <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">a0<\/span>, [<span style=\"color:#a6e22e\">b_num_lcm<\/span>], [<span style=\"color:#a6e22e\">lcm_denom<\/span>])];\n    }\n<\/code><\/pre><\/div><h3 id=\"subtraction\">Subtraction<\/h3>\n<p>Subtraction is a mirror of addition. In all of Leonardo&rsquo;s examples, subtraction is always a greater number minus a lesser number, so his work does not discuss negative numbers. Negative numbers were first explained in print in Luca Pacioli&rsquo;s <em>Summa de arithmetica, geometria, proportioni et proportionalita<\/em> in 1494. Like <em>Liber Abaci<\/em>, Pacioli&rsquo;s work is both theoretical and practical, focused on business problems, and notably describes double-entry accounting (Devlin 2011, pg. 106).<\/p>\n<h3 id=\"division\">Division<\/h3>\n<p>The process for dividing whole numbers is similar to long division except that Leonardo uses a box notation that is compact but harder to follow. 
For fractions, division mirrors multiplication but inverts the divisor.<\/p>\n<ol>\n<li>If both numbers are whole, divide them via the box method.<\/li>\n<li>If either number has a fractional component:\n<ol>\n<li>Transform each to a fraction with a single numerator and denominator<\/li>\n<li>Multiply the left&rsquo;s numerator with the right&rsquo;s denominator for the new numerator<\/li>\n<li>Multiply the left&rsquo;s denominator with the right&rsquo;s numerator for the new denominator<\/li>\n<li>Simplify as required<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>In TypeScript:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\">    <span style=\"color:#a6e22e\">divide<\/span>(<span style=\"color:#a6e22e\">b<\/span>: <span style=\"color:#66d9ef\">FCF<\/span>)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#a6e22e\">FCF<\/span> {\n        <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">isZero<\/span>()) {\n            <span style=\"color:#66d9ef\">throw<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">RangeError<\/span>(<span style=\"color:#e6db74\">&#34;Division by zero&#34;<\/span>);\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">dividend<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">this<\/span>.<span style=\"color:#a6e22e\">asSingularFraction<\/span>();\n            <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">divisor<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">b<\/span>.<span style=\"color:#a6e22e\">asSingularFraction<\/span>();\n            <span style=\"color:#66d9ef\">return<\/span> <span 
style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">FCF<\/span>(<span style=\"color:#ae81ff\">0<\/span><span style=\"color:#a6e22e\">n<\/span>, [<span style=\"color:#a6e22e\">dividend<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">divisor<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#ae81ff\">0<\/span>]], [<span style=\"color:#a6e22e\">dividend<\/span>.<span style=\"color:#a6e22e\">bs<\/span>[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">divisor<\/span>.<span style=\"color:#66d9ef\">as<\/span>[<span style=\"color:#ae81ff\">0<\/span>]]);\n        }\n    }\n<\/code><\/pre><\/div><p>We were unable to find any mention of division by zero in <em>Liber<\/em> but assume Leonardo would treat it as an error.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>The composed fraction notation met the needs of the time by matching the reality of non-decimalization measures used in almost every aspect of business. The notation is explicit and suggests a universality. Writing the currency value of \u00a32 3s 4d as\n\\( \\frac{4\\ \\ 3}{12\\ 20}2 \\) is not only explicit in the proportions between librae, solidi, and denarii but also suggests quantities exist independent of their units. Place value arithmetic, described comprehensively and rigorously by Leonardo in the book, was superior to Roman numerals and calculating via the abacus.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Abdeljaouad 2004) Abdeljaouad, Mahdi. 2004. &ldquo;The Eight Hundred Anniversary of the Death of Ibn al-Yasamin: Bilaterality as part of his thinking and practice&rdquo; in <em>Huiti\u00e8me Colloque Maghr\u00e9bin sur l\u2019histoire des math\u00e9matiques arabes<\/em> (Tunis).<\/p>\n<p>(Devlin 2011) Devlin, Keith. 2011. 
<em>The Man of Numbers: Fibonacci&rsquo;s Arithmetic Revolution<\/em>. Walker &amp; Company.<\/p>\n<p>(H\u00f8yrup 2022) H\u00f8yrup, Jens. 2022. <em>The World of the Abbaco: Abbacus mathematics analyzed and situated historically between Fibonacci and Stifel<\/em>. (2022-03-01 version, manuscript in progress).<\/p>\n<p>(Sigler 2002) Sigler, L.E. 2002. <em>Fibonacci&rsquo;s Liber Abaci: Leonardo Pisano&rsquo;s Book of Calculation<\/em>. Springer-Verlag.<\/p>\n<p>(Van Brummelen 2024) Van Brummelen, Glen. 2024. &ldquo;Decimal fractional numeration and the decimal point in 15th-century Italy&rdquo; in <em>Historia Mathematica<\/em> (Volume 66, Pages 1-13). <a href=\"https:\/\/doi.org\/10.1016\/j.hm.2024.01.001\">https:\/\/doi.org\/10.1016\/j.hm.2024.01.001<\/a><\/p>"},{"title":"The Life and Work of George Boole (Review)","link":"https:\/\/ztoz.blog\/posts\/life-work-george-boole-review\/","pubDate":"Thu, 19 Feb 2026 10:30:14 -0700","guid":"https:\/\/ztoz.blog\/posts\/life-work-george-boole-review\/","description":"<p>George Boole (1815-1864) was a mathematician best known as the progenitor of Boolean algebra, the system of logic that acts as a fundamental layer within digital systems.  Although he never gained a college degree, he spent many productive years as the chair of mathematics at Queen&rsquo;s College, Cork. Desmond MacHale&rsquo;s <em>The Life and Work of George Boole<\/em> is a well-written exploration of Boole&rsquo;s life, personality, and impact.<\/p>\n<figure>\n\t<img src=\"cover.jpg\" class=\"fit-image\">\n\t<figcaption><em>The Life and Work of George Boole<\/em> Cover<\/figcaption>\n<\/figure>\n<p><em>The Life and Work of George Boole: A Prelude to the Digital Age<\/em> by Desmond MacHale. 2014. First published in 1985 by Boole Press Limited. Cork University Press. 
<a href=\"https:\/\/www.corkuniversitypress.com\/9781782050049\/the-life-and-work-of-george-boole\/\">Publisher Link<\/a><\/p>\n<p>Organized largely chronologically, the book covers the entirety of Boole&rsquo;s life, details on his parents, and features a chapter detailing his wife&rsquo;s life after his death and the lives of his children, many of whom led adventurous lives. Boole&rsquo;s mathematics and contributions are covered at a general reader level; MacHale explicitly does not try to convey Boole&rsquo;s innovations at a professional level. That said, the book does not shy from equations (e.g. the key \\(x^2=x\\)) and Boole&rsquo;s contributions are described with precision. Boole is covered as a &ldquo;full&rdquo; human being and MacHale covers Boole&rsquo;s social attitudes, marriage and family relationships, academic relations and politics, and attitudes towards religion.<\/p>\n<p>MacHale&rsquo;s biography is not a hagiography and he includes criticism of Boole and places his work into context. For example, George showed an early passion for Greek and Latin texts and was largely self-taught as a translator. When George was 14, his father sent George&rsquo;s translation of <em>Ode to the Spring<\/em> by Meleager to the <em>Lincoln Herald<\/em>. A reader of the <em>Herald<\/em> objected that this supposed youth must have had help or cribbed the translation from someone else, which led to a series of letters to the editor about this supposed plagiarism. While there is no evidence of plagiarism and MacHale could have let the anecdote stand as a humorous controversy, he submitted the translation to several experts to ascertain its quality. The expert review helps cast Boole&rsquo;s &ldquo;boy genius&rdquo; exploits into a more grounded account.<\/p>\n<p>While Boole made important contributions to differential equations and difference equations, his work on logic has merited him the most fame. 
His first publication on logic was his book, <em>Mathematical Analysis of Logic<\/em> (1847). The book was published hurriedly and features some vague writing and a poor choice of notation, which led to a poor reception and minimal impact. Boole returned to the subject in 1854 with <em>The Laws of Thought<\/em>, which was far better received. As MacHale notes, however, Boole&rsquo;s algebra is not the Boolean algebra we know today, but some further development of notation and terminology (e.g. Boole used &lsquo;or&rsquo; to mean &lsquo;exclusive-or&rsquo;, while some readers assumed &lsquo;inclusive-or&rsquo;; later mathematicians made the choice of &lsquo;or&rsquo; explicit) made the concepts clearer. The development of mathematics is often messy.<\/p>\n<p>The book is well-written and MacHale manages to keep the reader interested even when the subject is conflicts about university management. Boole, after all, was an educator who was mostly home-bound and avoided controversy. Chapters on Boole&rsquo;s religion, social work, and poetry all add insight to his character. The book features about two dozen photographs and illustrations, although half date to after Boole&rsquo;s death.<\/p>\n<p>Negatively, with the subtitle &ldquo;a prelude to the digital age,&rdquo; we would have expected a more explicit connection between Boole&rsquo;s work and digital systems. We are not disputing the existence of that connection, but rather the author seems to treat it as a given. Victorian mathematics fell behind mathematics on the continent, but Boole was one of the few mathematicians to have name recognition across the Channel. 
We would have liked to see this explored, but the book focuses far more on relations within the United Kingdom (including an intriguing chapter on why Boole and Sir William Rowan Hamilton, both in Ireland, never collaborated).<\/p>\n<p>We recommend this book, particularly to those with an interest in the history of mathematics.<\/p>\n<p>For additional context on Victorian mathematics, we recommend <a href=\"https:\/\/global.oup.com\/academic\/product\/mathematics-in-victorian-britain-9780199601394\"><em>Mathematics in Victorian Britain<\/em><\/a> by Raymond Flood, Adrian Rice, and Robin Wilson (Oxford University Press). Additionally, <a href=\"https:\/\/www.corkuniversitypress.com\/9781782052906\/new-light-on-george-boole\/\"><em>New Light on George Boole<\/em><\/a> is a 2018 book from Desmond MacHale and Yvonne Cohen that is a supplement to <em>Life and Work<\/em>. At the time of this review, we have ordered a copy of this book but have not yet read it.<\/p>"},{"title":"OREGON (Trail): BASIC Game Design from the Teletype Era","link":"https:\/\/ztoz.blog\/posts\/oregon\/","pubDate":"Tue, 20 Jan 2026 10:55:48 -0700","guid":"https:\/\/ztoz.blog\/posts\/oregon\/","description":"<p>In 1971, three college roommates developed an educational computer game. Titled OREGON, it was a simulation of westward migration along the Oregon Trail and proved to be a hit among the students. One of the designers, Don Rawitsch, later carried a copy of the source code to his new employer, Minnesota Educational Computing Consortium (MECC). From 1975 to 1983, OREGON was MECC&rsquo;s most popular courseware on their timeshare service. 
When MECC decided a new version was merited, which became <em>The Oregon Trail<\/em>, they mandated the design team to &ldquo;preserve whatever made the original so popular.&rdquo; With the availability of the source code for the 1975 and 1978 versions, we explore the design and implementation of OREGON and why it was popular and memorable.<\/p>\n<p><strong>Contents<\/strong><\/p>\n<ul>\n<li><a href=\"#timeline\">Timeline<\/a><\/li>\n<li><a href=\"#background-and-developers\">Background and Developers<\/a><\/li>\n<li><a href=\"#development-and-production-environment\">Development and Production Environment<\/a><\/li>\n<li><a href=\"#oregon-1975-version\">OREGON: 1975 Version<\/a><\/li>\n<li><a href=\"#oregon-1978--creative-computing-version\">OREGON: 1978 \/ Creative Computing Version<\/a><\/li>\n<li><a href=\"#the-journey-to-1985s-the-oregon-trail\">The Journey to 1985&rsquo;s <em>The Oregon Trail<\/em><\/a><\/li>\n<li><a href=\"#references\">References<\/a><\/li>\n<\/ul>\n<h2 id=\"timeline\">Timeline<\/h2>\n<dl>\n<dt>1967<\/dt>\n<dd>Minneapolis-St. Paul schools form TIES, <em>Total Information for Educational Systems<\/em>, to provide timesharing services to schools. TIES acquires an HP 2000A computer.<\/dd>\n<dt>1970<\/dt>\n<dd>Honeywell launches EDINET, Educational Instructional Network. Unlimited timeshare access for $1000\/month, although schools had to supply their own teleprinters, modems, and pay for phone calls. The 1970s sees rapid growth of commercial and non-commercial timesharing solutions.<\/dd>\n<dt>1971<\/dt>\n<dd>Don Rawitsch, Bill Heinemann, and Paul Dillenberger design and implement the first version of OREGON within two weeks. Developed during off-hours, they use a teletype at a local high school connected to the TIES network. 
Students prove enthusiastic, but the program is deleted from the computer at the end of the semester to save space.<\/dd>\n<dt>1973<\/dt>\n<dd>MECC founded to rationalize computer use and spending within Minnesota schools.<\/dd>\n<dt>1974<\/dt>\n<dd>Don Rawitsch joins MECC, bringing a hard-copy of OREGON. Don makes some updates to the design and source code.<\/dd>\n<dt>1975<\/dt>\n<dd>MECC releases OREGON on the MECC timesharing network. The game immediately becomes MECC&rsquo;s most popular timesharing program and stays that way until 1983.<\/dd>\n<dd>After a competitive bid, MECC works to replace their HP 2000A with a Univac 1110. The transition is rocky as the Univac team misses capacity milestones.<\/dd>\n<dt>1977<\/dt>\n<dd>MECC publishes a user guide\/teacher&rsquo;s manual for OREGON. Detailing the game design, historical sources, and advice on integrating the game into a curriculum, the sample run includes the difficulty adjustment of the shooting sub-system seen in the 1978 version.<\/dd>\n<dt>1978<\/dt>\n<dd>CDC provides a pair of Cyber 73s to MECC to replace the Univac 1110.<\/dd>\n<dd>Rawitsch publishes the source code to OREGON in <em>Creative Computing<\/em>. The source code has been ported to the CDC Cyber 70\/73-26 BASIC 3.1 language. The article claims that MECC is serving 1,100 interactive terminals.<\/dd>\n<dd>MECC purchases five Apple II computers anticipating the market switch from timesharing to personal computers.<\/dd>\n<dt>1979<\/dt>\n<dd>MECC issues a competitive bid for educational-use microcomputers. Apple wins. The other bids are either non-compliant or miss the deadline.<\/dd>\n<dd>MECC places an Apple II port of OREGON on their timesharing network for downloading.<\/dd>\n<dt>1980<\/dt>\n<dd>MECC publishes <em>Elementary Volume 6<\/em>, a collection of five titles including OREGON, on Apple II diskettes. 
This version of OREGON (same as 1979) adds a graphical map, eliminates scrolling text, and features a limited graphical shooting game.<\/dd>\n<dt>1983<\/dt>\n<dd>MECC shuts down its timesharing network.<\/dd>\n<dt>1984 October to 1985 July<\/dt>\n<dd>MECC develops a new version of OREGON from scratch since OREGON has become obsolete. This version becomes <em>The Oregon Trail<\/em> and MECC&rsquo;s most popular school and home product.<\/dd>\n<\/dl>\n<h2 id=\"background-and-developers\">Background and Developers<\/h2>\n<p>In 1971, Don Rawitsch, Bill Heinemann, and Paul Dillenberger are college roommates attending Carleton College in Minnesota. Don is a history major while Bill and Paul are math majors, but all are participating in the Carleton Urban Teaching Program as student teachers (Dillenberger et al. 1972).<\/p>\n<p>As part of his student teacher training, Don is expected to teach an eighth-grade module on westward expansion in three weeks&rsquo; time. Seeking ways to better engage with the students, Don spends a week designing a board game to simulate a voyage along the Oregon Trail. While playtesting the game with his roommates, Bill suggests making it a computer game. While Bill and Paul have some experience programming, Don does not, and two weeks seems too short a time to write a new program. They decide to risk it.<\/p>\n<p>The three split the work. Don acts as designer and incorporates historical aspects based on his research. Bill provides the majority of the programming and program planning, including allocating line numbers for major sections. Paul writes subroutines and contributes OREGON&rsquo;s trademark humor as well as testing the program (Wong 2017). (Incidentally, the 1975 OREGON source code only credits Don while the 1978 version credits both Don and Bill.)<\/p>\n<p>Bill&rsquo;s experience with programming comes from a five-week course at Carleton that he took while a junior. 
He was evidently successful as he became a lab assistant the next trimester (Wong 2017). (This course was likely MATH 13, an introduction to computer programming. In the 1969\/1970 academic year, students ran programs on an IBM 1620 and used Fortran II. In the 1971\/1972 year, they switched to using BASIC or FOCARL (sic? FOCAL?) on a time sharing system. (Carleton 1969, pg 70))  Bill later became a professional programmer and held that position for 32 years (Hendricks 2020). Paul had similar training.<\/p>\n<p>Two weeks later, in December 1971, Don introduces the first draft of the program to his students at Jordan Junior High. Since the school has only a single teletype machine, Don separates the class into teams and each team attempts a journey in turn. Although the program is buggy and many teams &ldquo;die&rdquo; during the journey, students are eager to play again. Paul, at Bryant Junior High, permits his math students to play the game after class. It is similarly a hit with students lining up for their turn (Bouchard 2016, 59).<\/p>\n<p>At the end of the semester, Don deletes the program. However, the three keep a hard copy of the source code.<\/p>\n<p>Although they were interviewed after their student teaching and collectively wrote a letter describing their experiences (Dillenberger et al. 1972), neither source mentions OREGON directly. However, the interview, written by an unnamed staff writer of the student newspaper, may allude to the experiment. Karl Ruge, who was part of the Carleton teaching program and lived with the three, is quoted &ldquo;The term gimmicks has a bad connotation, but it shouldn&rsquo;t. Why do so many people flock to the computer lab? Working with those machines is enjoyable, and it leads people to explore things for themselves.&rdquo; Don followed up with &ldquo;Maybe &lsquo;variety&rsquo; is a better term than &lsquo;gimmick&rsquo; for the kind of teaching change we have in mind. 
By &lsquo;variety&rsquo; I mean continuous experimentation in classrooms to discover the most effective way to get things across.&rdquo; (Staff 1972, 3)<\/p>\n<p>This apparent attitude, along with the lack of publicity for their work, supports their later claims that they did not consider commercializing the game. Rather, they saw it as an educational experiment and part of a larger effort to interest students in the material and make the material relevant to the students.<\/p>\n<p>A historical simulation as a computer game was ambitious for 1971, although computer games were in the zeitgeist at Carleton. &ldquo;Games People Play&rdquo;, an article in the student newspaper from May that year (Wrede 1971), writes about &ldquo;demos&rdquo; as a means to encourage interest in computers. Carleton had recently acquired a DEC computer and it included a number of demos or games. The article notes that students and professors have written their own games to supplement those that came with the computer. While a pinball simulator had been banned due to students damaging the joysticks, a snowball fight simulator, sailing simulator, and a Star Trek battle game lived on.<\/p>\n<p>In 1974, Don joined MECC. MECC saw its mission as both providing computing resources through its own timesharing network and serving as a source of courseware and other software specific to education. MECC was thus eager to expand their catalog with Don&rsquo;s copy of OREGON. An updated version of OREGON was available on the MECC network by 1975.<\/p>\n<h2 id=\"development-and-production-environment\">Development and Production Environment<\/h2>\n<p>Bill Heinemann and Paul Dillenberger used a teleprinter at Bryant Junior High during off-hours to type in the OREGON program. The terminal was connected via phone to the TIES educational network, a timesharing network of local schools connected to an HP 2000 series minicomputer running the 2000A timeshare system. 
Since access time was limited, they first wrote code on paper.<\/p>\n<p>HP Timesharing BASIC (also called Access BASIC) is an expanded version of Dartmouth BASIC, adding extensions such as a string datatype and computed goto.<\/p>\n<p>The 2000A timeshare system features a private workspace for each user to develop their own BASIC programs. Users can also share their programs with a group or the entire system with various protections in place. Thus, the developers could share OREGON with all students to run, but students would be unable to modify the program (HP 1975, section 8-6). (The system also supports protections such as prohibiting listing of the source code, but it is unclear if these additional controls were used.)<\/p>\n<p>The 2000A limits program names to six characters in length, hence &lsquo;OREGON&rsquo;.<\/p>\n<h2 id=\"oregon-1975-version\">OREGON: 1975 Version<\/h2>\n<p>The source code for the 1975 version of OREGON comes via <a href=\"https:\/\/mecc.co\/software\/timeshare\/\">MECC.Co<\/a>, a MECC software repository run by volunteers. When Don Rawitsch joined MECC in 1974, he carried a print-out of the 1971 version. Retyping the code from the print-out, Don made some software updates, including updating &ldquo;hostile Indians&rdquo; to &ldquo;hostile riders&rdquo; (Bouchard 2016, pg 74) and likely updating event probabilities based on further research. Comments in the source code date this version to March 27, 1975.<\/p>\n<p>Physically, the source code is 626 lines long and fewer than 17,900 characters in length. All characters are in upper-case. HP 2000A supported 7-bit ASCII (HP 1975, section 11-3), but many terminals did not support printing lower-case characters (e.g. <a href=\"https:\/\/www.pdp8online.com\/asr33\/asr33.shtml\">ASR 33<\/a>). Each line fits within 72 columns which was a common teletypewriter limit.<\/p>\n<p>A print-out on fan paper is about ten pages long. 
In <em>Creative Computing<\/em>, the listing of the 1978 version (686 lines) requires six pages with two columns per page. As a comparison, the three longest programs in <em>101 BASIC Computer Games<\/em> (1973), CANAM, SPCWAR, and YAHTZE, have 416, 539, and 558 lines respectively. This is evidence of the team&rsquo;s ambition and the complexity of their simulation versus contemporary games.<\/p>\n<h3 id=\"design-and-implementation\">Design and Implementation<\/h3>\n<p>The BASIC language of the time is unstructured, but the developers impose some structure by labeling sections of code. The source code features 15 sections delimited by comments (<code>REM<\/code> statements):<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Section<\/th>\n<th style=\"text-align:right\">Length (Lines)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Instructions<\/td>\n<td style=\"text-align:right\">43<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Initial Purchases<\/td>\n<td style=\"text-align:right\">43<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Setting Date<\/td>\n<td style=\"text-align:right\">42<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Beginning Each Turn<\/td>\n<td style=\"text-align:right\">54<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Stopping at Fort<\/td>\n<td style=\"text-align:right\">25<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Hunting<\/td>\n<td style=\"text-align:right\">20<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Eating<\/td>\n<td style=\"text-align:right\">14<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Riders Attack<\/td>\n<td style=\"text-align:right\">68<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Selection of Events<\/td>\n<td style=\"text-align:right\">115<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Mountains<\/td>\n<td style=\"text-align:right\">35<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Dying<\/td>\n<td 
style=\"text-align:right\">34<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Final Turn<\/td>\n<td style=\"text-align:right\">66<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Shooting Sub-routine<\/td>\n<td style=\"text-align:right\">9<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Illness Sub-routine<\/td>\n<td style=\"text-align:right\">18<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Variable Documentation<\/td>\n<td style=\"text-align:right\">30<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The program&rsquo;s front-matter accounts for the remaining lines of code.<\/p>\n<p>A section does not necessarily act as a &ldquo;procedure entry-point&rdquo;; while the code often <code>GOTO<\/code>s to the start of the section, some sections, notably Dying, have multiple entry-points depending on the manner of the avatar&rsquo;s death.<\/p>\n<p>OREGON is a resource-management game with a minor typing sub-game for shooting. Players manage food, clothing, medicine, miscellaneous supplies, and ammunition resources while attempting to cover 2,040 miles before winter arrives. Each turn, random events may  hinder or help their progress. The teacher&rsquo;s manual (Rawitsch 1977) is surprisingly detailed. It includes the probability distributions for events and graphically denotes the game-play loop. For our purposes, however, we found it useful to build an activity diagram of the game logic, using marked sections in the code as the activities:<\/p>\n<figure>\n\t<img src=\"OREGON.drawio.svg\" class=\"fit-image\">\n\t<figcaption>Activity Diagram of OREGON<\/figcaption>\n<\/figure>\n<p>Each game loop iteration represents two weeks, although the game calculates a partial two week period for the final turn. True to the memes of dying along the Oregon Trail, there are many paths to the Dying section. 
Players may choose to stop at a fort (which allows them to purchase supplies), hunt (which invokes the shooting sub-routine and may provide extra food), or continue (which avoids penalties to their movement speed). Players choose among three levels of eating, which control the consumption of food and adjust their tolerance to disease. Riders may attack, with a probability based on the current mileage; an attack invokes the shooting sub-routine. Selection of Events selects a random narrative event, with probabilities based on real traveler diaries. Narrative events often draw down resources. If the player has reached the mountains, mountain events come into play. Mountain events are often fatal. A player who successfully travels 2,040 miles sees a congratulatory message.<\/p>\n<h3 id=\"oregon-illustrator\">OREGON Illustrator<\/h3>\n<p>To aid in studying OREGON, we built a limited interpreter for HP Timesharing BASIC with debugger-like visibility and controls. Called <a href=\"https:\/\/oregon-illustrator-6a986b.gitlab.io\/\">OREGON Illustrator<\/a>, it lets you play OREGON while watching the source code execute in parallel.<\/p>\n<figure>\n\t<img src=\"OREGON_Illustrator_screenshot.png\" class=\"fit-image\">\n\t<figcaption>Screenshot of OREGON Illustrator<\/figcaption>\n<\/figure>\n<p>We know of the following limitations\/alterations in OREGON Illustrator versus a real HP Timesharing environment:<\/p>\n<ol>\n<li>Although Illustrator slows down the pace of execution to facilitate understanding of the business logic, the pace does not match the 110 baud limit of a Model 33 nor is there an attempt to match the processing speed limits of an HP 2000 series minicomputer.<\/li>\n<li>PRINT output is misaligned (column-wise) in some cases.<\/li>\n<li>We edited lines 405 and 2005 to use an ahistorical <code>MULTSET<\/code> command to initialize multiple variables rather than implementing the ambiguous <code>=<\/code> assignment\/equality operator.<\/li>\n<li>We removed 
&ldquo;bells&rdquo; (auditory signals) from lines 1755, 4015, and 4020.<\/li>\n<li>We&rsquo;ve made no attempt to emulate HP 2000 series floating point arithmetic; Illustrator uses JavaScript&rsquo;s standard floating point arithmetic.<\/li>\n<\/ol>\n<p>OREGON Illustrator is a TypeScript Preact application using <a href=\"https:\/\/github.com\/EoinDavey\/tsPEG\">tsPEG<\/a> to generate the BASIC parser.<\/p>\n<h3 id=\"use-of-basic-language-and-techniques\">Use of BASIC Language and Techniques<\/h3>\n<p>Of the 626 lines, 33% are <code>PRINT<\/code> statements (n=210) and 23% are <code>LET<\/code> statements (n=141, implicit and explicit). Of the 50 comments, 29 are spent documenting the variables and 15 marking section boundaries. None of the comments are used to describe techniques or the game design. However, the prevalence of <code>PRINT<\/code> statements provides a form of documentation since most of the business logic and state changes are reported to the player.<\/p>\n<p>A breakdown of statements by line count:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Statement Kind<\/th>\n<th style=\"text-align:right\">Count<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">PRINT<\/td>\n<td style=\"text-align:right\">210<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LET<\/td>\n<td style=\"text-align:right\">141<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">GOTO \/ Computed GOTO<\/td>\n<td style=\"text-align:right\">98<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">IF-THEN<\/td>\n<td style=\"text-align:right\">94<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">REM<\/td>\n<td style=\"text-align:right\">50<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">INPUT<\/td>\n<td style=\"text-align:right\">14<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">GOSUB<\/td>\n<td style=\"text-align:right\">9<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">RETURN<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:left\">STOP<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><code>DATA<\/code>, <code>DIM<\/code>, <code>END<\/code>, <code>ENTER<\/code>, <code>READ<\/code>, and <code>RESTORE<\/code> are each used once.<\/p>\n<p><code>ENTER<\/code> is one of HP&rsquo;s extensions to BASIC. <code>ENTER<\/code> operates similarly to <code>INPUT<\/code>, but the programmer can place a deadline on input and <code>ENTER<\/code> records how long the operator took before hitting enter. In the 1978 version, which did not target the HP Timesharing system, <code>ENTER<\/code> is replaced by calls to record the system clock before and after an <code>INPUT<\/code> to measure operator responsiveness.<\/p>\n<p>With the exception of the <code>ENTER<\/code> statement and the string variables, OREGON could have been ported to the original <a href=\"https:\/\/www.dartmouth.edu\/basicfifty\/commands.html\">Dartmouth BASIC<\/a>. Since the program does not read any files nor call other programs, it places few requirements on the environment. While the game was ambitious, the implementation technique was not.<\/p>\n<h3 id=\"observations-by-section\">Observations by Section<\/h3>\n<h4 id=\"start-of-game-loop-and-setting-date\">Start of Game Loop and Setting Date<\/h4>\n<p>Line 700 acts as the start of the game loop. If the mileage has reached 2,040 miles or the turn number has exceeded 17, we jump into the Final Turn. 
(The turn number check is eliminated in the 1978 version.)<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">700  <span style=\"color:#66d9ef\">IF<\/span> M <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">2040<\/span> <span style=\"color:#f92672\">OR<\/span> D3<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">17<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">4000<\/span>\n709  <span style=\"color:#75715e\">REM ***SETTING DATE***<\/span>\n710  D3<span style=\"color:#f92672\">=<\/span>D3<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">1<\/span>\n715  <span style=\"color:#66d9ef\">PRINT<\/span>\n720  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;MONDAY &#34;<\/span>;\n725  <span style=\"color:#66d9ef\">IF<\/span> D3<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">10<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">735<\/span>\n730  <span style=\"color:#66d9ef\">GOTO<\/span> D3 OF <span style=\"color:#ae81ff\">740<\/span>,<span style=\"color:#ae81ff\">750<\/span>,<span style=\"color:#ae81ff\">760<\/span>,<span style=\"color:#ae81ff\">770<\/span>,<span style=\"color:#ae81ff\">780<\/span>,<span style=\"color:#ae81ff\">790<\/span>,<span style=\"color:#ae81ff\">800<\/span>,<span style=\"color:#ae81ff\">810<\/span>,<span style=\"color:#ae81ff\">820<\/span>,<span style=\"color:#ae81ff\">830<\/span>\n735  <span style=\"color:#66d9ef\">GOTO<\/span> D3<span style=\"color:#ae81ff\">-10<\/span> OF <span style=\"color:#ae81ff\">840<\/span>,<span style=\"color:#ae81ff\">850<\/span>,<span style=\"color:#ae81ff\">860<\/span>,<span style=\"color:#ae81ff\">870<\/span>,<span style=\"color:#ae81ff\">880<\/span>,<span style=\"color:#ae81ff\">890<\/span>,<span 
style=\"color:#ae81ff\">900<\/span>\n740  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;APRIL 12 &#34;<\/span>;\n744  <span style=\"color:#66d9ef\">GOTO<\/span> 910\n750  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;APRIL 26 &#34;<\/span>;\n755  <span style=\"color:#66d9ef\">GOTO<\/span> 910\n<span style=\"color:#f92672\">...<\/span>\n890  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;NOVEMBER 8 &#34;<\/span>;\n895  <span style=\"color:#66d9ef\">GOTO<\/span> 910\n900  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;NOVEMBER 22 &#34;<\/span>;\n910  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;1847&#34;<\/span>\n915  <span style=\"color:#66d9ef\">PRINT<\/span>\n<\/code><\/pre><\/div><p>The developers use a computed goto, using HP BASIC&rsquo;s unusual <code>GOTO ... OF<\/code> syntax, to display the start of a two-week period. Presumably for readability and to stay within the 72 column limit, the switch on the value of D3 is split into two lines.<\/p>\n<h4 id=\"hunting-and-shooting-subroutine\">Hunting and Shooting Subroutine<\/h4>\n<p>OREGON&rsquo;s action scenes were very popular. Shooting is part of hunting, defending from riders, and defending from wild animal attacks. While later versions embellished the shooting subgame with graphics, the original version requires very few lines and straightforwardly leverages HP&rsquo;s <code>ENTER<\/code> statement.<\/p>\n<p>Players are given the option to hunt every turn. This is somewhat ahistorical as the diaries re-printed in <em>Overland in 1846<\/em>, a source used in the game design, note how difficult it was to find game once they passed the buffalo ranges. Dale Morgan, the editor, explicitly notes that hunting was an unreliable source of food based on multiple accounts (Morgan 1963, pgs 210-211, 406-407). 
(The documentation for the later <em>Westward Ho!<\/em> port shows a graph of &ldquo;hunting yield&rdquo; versus mileage, but the code does not use mileage to adjust food gained by hunting.)<\/p>\n<p>If players type &lsquo;BANG&rsquo; within the deadline (a random time between zero and seven seconds), they gain some food at the cost of some mileage and bullets. If the players type &lsquo;BANG&rsquo; in less than a second, they gain more food and expend fewer bullets. Don noted that students self-organized teams around the fastest typists. The 1978 version varies the word that needs to be typed to increase the challenge.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">1699  <span style=\"color:#75715e\">REM ***HUNTING***<\/span>\n1700  <span style=\"color:#66d9ef\">IF<\/span> B<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">39<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1715<\/span>\n1705  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;TOUGH---YOU NEED MORE BULLETS TO GO HUNTING&#34;<\/span>\n1710  <span style=\"color:#66d9ef\">GOTO<\/span> 1310\n1715  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-45<\/span>\n1720  <span style=\"color:#66d9ef\">GOSUB<\/span> 4500\n1725  <span style=\"color:#66d9ef\">IF<\/span> B1 <span style=\"color:#f92672\">&lt;=<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1755<\/span>\n1730  <span style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#ae81ff\">100<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#ae81ff\">13<\/span><span 
style=\"color:#f92672\">*<\/span>B1 <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1780<\/span>\n1735  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">48-2<\/span><span style=\"color:#f92672\">*<\/span>B1\n1740  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;NICE SHOT--RIGHT THROUGH THE NECK--FEAST TONIGHT!!&#34;<\/span>\n1745  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-10-3<\/span><span style=\"color:#f92672\">*<\/span>B1\n1750  <span style=\"color:#66d9ef\">GOTO<\/span> 1800\n1752  <span style=\"color:#75715e\">REM **BELLS IN LINE 1755**<\/span>\n1755  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;RI&#34;<\/span><span style=\"color:#75715e\">&#39;7&#34;GHT BETWEE&#34;&#39;7&#34;N THE EYE&#34;&#39;7&#34;S---YOU GOT A&#34;&#39;7&#34; BIG ONE!!&#34;&#39;7&#34;!!&#34;<\/span>\n1765  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">52<\/span><span style=\"color:#f92672\">+<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">6<\/span>\n1770  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-10<\/span><span style=\"color:#f92672\">-<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">4<\/span>\n1775  <span style=\"color:#66d9ef\">GOTO<\/span> 1800\n1780  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;SORRY---NO LUCK TODAY&#34;<\/span>\n1800  <span style=\"color:#66d9ef\">IF<\/span> F <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">13<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1900<\/span>\n1805  <span 
style=\"color:#66d9ef\">GOTO<\/span> 3500\n<\/code><\/pre><\/div><p>The shooting subroutine reads the player&rsquo;s input via the <code>ENTER<\/code> statement. <code>ENTER<\/code> is an HP variant on <code>INPUT<\/code> which includes a deadline on input (<code>B2<\/code> in the code below), returns the time taken by the user (<code>B1<\/code>), and records the user&rsquo;s port number (<code>P<\/code>). The variable <code>P<\/code> was previously used to sum all the starting equipment purchases and is now clobbered, but at this point <code>P<\/code> is a write-only variable.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">4499  <span style=\"color:#75715e\">REM ***SHOOTING SUB-ROUTINE***<\/span>\n4500  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;TYPE BANG&#34;<\/span>;\n4505  B2<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">7<\/span>\n4510  C$<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;&#34;<\/span>\n4515  ENTER <span style=\"color:#960050;background-color:#1e0010\">#<\/span>P,B2,B1,C$\n4520  <span style=\"color:#66d9ef\">PRINT<\/span>\n4525  <span style=\"color:#66d9ef\">IF<\/span> C$<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;BANG&#34;<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">4535<\/span>\n4530  B1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">7<\/span>\n4535  <span style=\"color:#66d9ef\">RETURN<\/span>\n<\/code><\/pre><\/div><h4 id=\"eating\">Eating<\/h4>\n<p>The Eating section demonstrates the tedious nature of input validation in BASIC. The player is given three options for eating&mdash;poorly, moderately, or well. The food level (<code>F<\/code>) is decremented accordingly. 
Validation requires comparing the value to a range, twice, each of which can loop, and then converting to an integer. Given how many programs were menu-driven, we are surprised that a higher-level &ldquo;choice&rdquo; <code>INPUT<\/code> statement was not included in the early BASICs. Oddly, the Eating section also updates mileage and clears the blizzard and clothing flags, an example of how sections were not always independent concerns.<\/p>\n<p>The game can soft-lock if the food level is less than 13. If <code>F<\/code> is 12, then line 1930 will cause <code>F<\/code> to become -1, even if the player is eating poorly. Line 1940 will undo the food consumption and then the game will loop back to 1900 on line 1950.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">1899  <span style=\"color:#75715e\">REM ***EATING***<\/span>\n1900  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;DO YOU WANT TO EAT (1) POORLY  (2) MODERATELY&#34;<\/span>\n1902  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;OR (3) WELL&#34;<\/span>;\n1905  <span style=\"color:#66d9ef\">INPUT<\/span> E\n1910  <span style=\"color:#66d9ef\">IF<\/span> E<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1900<\/span>\n1915  <span style=\"color:#66d9ef\">IF<\/span> E<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">1900<\/span>\n1920  <span style=\"color:#66d9ef\">LET<\/span> E<span style=\"color:#f92672\">=<\/span><span style=\"color:#66d9ef\">INT<\/span>(E)\n1930  <span style=\"color:#66d9ef\">LET<\/span> F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#ae81ff\">-8-5<\/span><span 
style=\"color:#f92672\">*<\/span>E\n1935  <span style=\"color:#66d9ef\">IF<\/span> F <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2000<\/span>\n1940  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">8<\/span><span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">5<\/span><span style=\"color:#f92672\">*<\/span>E\n1945  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU CAN&#39;T EAT THAT WELL&#34;<\/span>\n1950  <span style=\"color:#66d9ef\">GOTO<\/span> 1900\n2000  <span style=\"color:#66d9ef\">LET<\/span> M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">200<\/span><span style=\"color:#f92672\">+<\/span>(A<span style=\"color:#ae81ff\">-220<\/span>)<span style=\"color:#f92672\">\/<\/span><span style=\"color:#ae81ff\">5<\/span><span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">10<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2005  L1<span style=\"color:#f92672\">=<\/span>C1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">0<\/span>\n<\/code><\/pre><\/div><p>Eating levels are also used in the Illness Subroutine, where they adjust the probability of severe illness.<\/p>\n<p>The pattern of updating a variable and, if the result is negative, undoing that update with a &ldquo;negative&rdquo; version of the computation, is visible several times in the code. This appears to be evidence of the authors&rsquo; inexperience with code maintainability, as this pattern requires duplicating the same logic in two places. The developers could have potentially used user-defined functions to centralize the business logic. 
User-defined functions are supported in Access BASIC and CDC&rsquo;s BASIC 3.1.<\/p>\n<h4 id=\"selection-of-events\">Selection of Events<\/h4>\n<p>Selection of Events, the longest section in the code base, performs a weighted random choice of a narrative event. Narrative events are drawn from diary entries in Morgan (1963), specifically the first three diaries. Methodologically, the first three diaries are not truly independent of each other and Rawitsch could have incorporated additional diaries for greater variety (although not all diaries in the book are relevant to the simulation). Quibbles aside, the narratives add flavor to the game and Rawitsch is commendably transparent about his approach and the source of data in the teacher&rsquo;s manual.<\/p>\n<p>Programmatically, the probability distribution is stored as a <code>DATA<\/code> array on line 2535. Although the <code>READ<\/code> (line 2525) is before the <code>DATA<\/code> line (line 2535), this is not an issue because BASIC interpreters scanned all lines for <code>DATA<\/code> blocks on start-up. The code generates a random number on line 2510 and then tests the number against the cumulative probability values in the <code>DATA<\/code> block, one by one. Based on the index, the code then jumps to the specific event via computed gotos (lines 2540 and 2545).<\/p>\n<p>The developers either forgot to document <code>D<\/code> as a variable or felt it was unnecessary to document as it is only used as temporary storage.<\/p>\n<p>Conceptually, the narrative events could have been stored in a separate file, allowing a &ldquo;data-driven&rdquo; means to expand the game. While most events add or subtract resources, there is quite a bit of variety and some events invoke the shooting subroutine. 
Thus, it would have been difficult to codify the events strictly in data.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">2499  <span style=\"color:#75715e\">REM ***SELECTION OF EVENTS***<\/span>\n2500  <span style=\"color:#66d9ef\">LET<\/span> D1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">0<\/span>\n2505  <span style=\"color:#66d9ef\">RESTORE<\/span>\n2510  R1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">100<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(TIM(<span style=\"color:#ae81ff\">0<\/span>))\n2515  <span style=\"color:#66d9ef\">LET<\/span> D1<span style=\"color:#f92672\">=<\/span>D1<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">1<\/span>\n2520  <span style=\"color:#66d9ef\">IF<\/span> D1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">16<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">3020<\/span>\n2525  <span style=\"color:#66d9ef\">READ<\/span> D\n2530  <span style=\"color:#66d9ef\">IF<\/span> R1<span style=\"color:#f92672\">&gt;<\/span>D <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2515<\/span>\n2535  <span style=\"color:#66d9ef\">DATA<\/span> <span style=\"color:#ae81ff\">6<\/span>,<span style=\"color:#ae81ff\">11<\/span>,<span style=\"color:#ae81ff\">13<\/span>,<span style=\"color:#ae81ff\">15<\/span>,<span style=\"color:#ae81ff\">17<\/span>,<span style=\"color:#ae81ff\">22<\/span>,<span style=\"color:#ae81ff\">32<\/span>,<span style=\"color:#ae81ff\">35<\/span>,<span style=\"color:#ae81ff\">37<\/span>,<span style=\"color:#ae81ff\">42<\/span>,<span style=\"color:#ae81ff\">44<\/span>,<span style=\"color:#ae81ff\">54<\/span>,<span style=\"color:#ae81ff\">64<\/span>,<span style=\"color:#ae81ff\">69<\/span>,<span 
style=\"color:#ae81ff\">95<\/span>\n2537  <span style=\"color:#66d9ef\">IF<\/span> D1<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">10<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2545<\/span>\n2540  <span style=\"color:#66d9ef\">GOTO<\/span> D1 OF <span style=\"color:#ae81ff\">2550<\/span>,<span style=\"color:#ae81ff\">2570<\/span>,<span style=\"color:#ae81ff\">2590<\/span>,<span style=\"color:#ae81ff\">2615<\/span>,<span style=\"color:#ae81ff\">2630<\/span>,<span style=\"color:#ae81ff\">2645<\/span>,<span style=\"color:#ae81ff\">2660<\/span>,<span style=\"color:#ae81ff\">2690<\/span>,<span style=\"color:#ae81ff\">2785<\/span>,<span style=\"color:#ae81ff\">2810<\/span>\n2545  <span style=\"color:#66d9ef\">GOTO<\/span> D1<span style=\"color:#ae81ff\">-10<\/span> OF <span style=\"color:#ae81ff\">2825<\/span>,<span style=\"color:#ae81ff\">2860<\/span>,<span style=\"color:#ae81ff\">2885<\/span>,<span style=\"color:#ae81ff\">2970<\/span>,<span style=\"color:#ae81ff\">2990<\/span>,<span style=\"color:#ae81ff\">3020<\/span>\n2550  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WAGON BREAKS DOWN--LOSE TIME AND SUPPLIES FIXING IT&#34;<\/span>\n2555  <span style=\"color:#66d9ef\">LET<\/span> M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-15-5<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2560  <span style=\"color:#66d9ef\">LET<\/span> M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-8<\/span>\n2565  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2570  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;OX INJURES LEG---SLOWS YOU DOWN REST OF TRIP&#34;<\/span>\n2575  <span style=\"color:#66d9ef\">LET<\/span> M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-25<\/span>\n2580  <span 
style=\"color:#66d9ef\">LET<\/span> A<span style=\"color:#f92672\">=<\/span>A<span style=\"color:#ae81ff\">-20<\/span>\n2585  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2590  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;BAD LUCK---YOUR DAUGHTER BROKE HER ARM&#34;<\/span>\n2595  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU HAD TO STOP AND USE SUPPLIES TO MAKE A SLING&#34;<\/span>\n2600  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-5-4<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2605  M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-2-3<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2610  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2615  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;OX WANDERS OFF---SPEND TIME LOOKING FOR IT&#34;<\/span>\n2620  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-17<\/span>\n2625  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2630  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOUR SON GETS LOST---SPEND HALF THE DAY LOOKING FOR HIM&#34;<\/span>\n2635  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-10<\/span>\n2640  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2645  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;UNSAFE WATER--LOSE TIME LOOKING FOR CLEAN SPRING&#34;<\/span>\n2650  <span style=\"color:#66d9ef\">LET<\/span> M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-10<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#ae81ff\">-2<\/span>\n2655  <span 
style=\"color:#66d9ef\">GOTO<\/span> 3100\n2660  <span style=\"color:#66d9ef\">IF<\/span> M<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">950<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2935<\/span>\n2665  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;HEAVY RAINS---TIME AND SUPPLIES LOST&#34;<\/span>\n2670  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#ae81ff\">-10<\/span>\n2672  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-500<\/span>\n2675  M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-15<\/span>\n2680  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-10<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#ae81ff\">-5<\/span>\n2685  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2690  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;BANDITS ATTACK&#34;<\/span>\n2700  <span style=\"color:#66d9ef\">GOSUB<\/span> 4500\n2705  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-20<\/span><span style=\"color:#f92672\">*<\/span>B1\n2715  <span style=\"color:#66d9ef\">IF<\/span> B <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2735<\/span>\n2720  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU RAN OUT OF BULLETS---THEY GET LOTS OF CASH&#34;<\/span>\n2725  T<span style=\"color:#f92672\">=<\/span>T<span style=\"color:#f92672\">\/<\/span><span style=\"color:#ae81ff\">3<\/span>\n2730  <span style=\"color:#66d9ef\">GOTO<\/span> 2740\n2735  <span style=\"color:#66d9ef\">IF<\/span> B1 <span style=\"color:#f92672\">&lt;=<\/span> <span style=\"color:#ae81ff\">1<\/span> <span 
style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2770<\/span>\n2740  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU GOT SHOT IN THE LEG AND THEY TOOK ONE OF YOUR OXEN&#34;<\/span>\n2745  K8<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span>\n2750  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;BETTER HAVE A DOC LOOK AT YOUR WOUND&#34;<\/span>\n2755  M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-5<\/span>\n2760  A<span style=\"color:#f92672\">=<\/span>A<span style=\"color:#ae81ff\">-20<\/span>\n2765  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2770  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;QUICKEST DRAW OUTSIDE OF DODGE CITY!!!&#34;<\/span>\n2775  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU GOT &#39;EM!&#34;<\/span>\n2780  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2785  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;THERE WAS A FIRE IN YOUR WAGON--FOOD AND SUPPLIES DAMAGED&#34;<\/span>\n2790  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#ae81ff\">-40<\/span>\n2792  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-400<\/span>\n2795  <span style=\"color:#66d9ef\">LET<\/span> M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#f92672\">-<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">8-3<\/span>\n2800  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-15<\/span>\n2805  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2810  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;LOSE YOUR WAY IN HEAVY FOG---TIME IS LOST&#34;<\/span>\n2815  M<span style=\"color:#f92672\">=<\/span>M<span 
style=\"color:#ae81ff\">-10-5<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2820  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2825  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU KILLED A POISONOUS SNAKE AFTER IT BIT YOU&#34;<\/span>\n2830  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-10<\/span>\n2835  M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-5<\/span>\n2840  <span style=\"color:#66d9ef\">IF<\/span> M1 <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2855<\/span>\n2845  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU DIE OF SNAKEBITE SINCE YOU HAVE NO MEDICINE&#34;<\/span>\n2850  <span style=\"color:#66d9ef\">GOTO<\/span> 3600\n2855  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2860  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WAGON GETS SWAMPED FORDING RIVER--LOSE FOOD AND CLOTHES&#34;<\/span>\n2865  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#ae81ff\">-30<\/span>\n2870  C<span style=\"color:#f92672\">=<\/span>C<span style=\"color:#ae81ff\">-20<\/span>\n2875  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-20-20<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)\n2880  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2885  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WILD ANIMALS ATTACK!&#34;<\/span>\n2887  <span style=\"color:#66d9ef\">GOSUB<\/span> 4500\n2889  <span style=\"color:#66d9ef\">IF<\/span> B<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">39<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span 
style=\"color:#ae81ff\">2895<\/span>\n2890  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU WERE TOO LOW ON BULLETS--&#34;<\/span>\n2891  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;THE WOLVES OVERPOWERED YOU&#34;<\/span>\n2892  K8<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span>\n2893  <span style=\"color:#66d9ef\">GOTO<\/span> 3555\n2895  <span style=\"color:#66d9ef\">IF<\/span> B1<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2910<\/span>\n2900  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;NICE SHOOTIN&#39; PARDNER---THEY DIDN&#39;T GET MUCH&#34;<\/span>\n2905  <span style=\"color:#66d9ef\">GOTO<\/span> 2915\n2910  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;SLOW ON THE DRAW---THEY GOT AT YOUR FOOD AND CLOTHES&#34;<\/span>\n2915  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-20<\/span><span style=\"color:#f92672\">*<\/span>B1\n2920  C<span style=\"color:#f92672\">=<\/span>C<span style=\"color:#f92672\">-<\/span>B1<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">4<\/span>\n2925  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#f92672\">-<\/span>B1<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">8<\/span>\n2930  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2935  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;COLD WEATHER---BRRRRRRR!---YOU &#34;<\/span>;\n2940  <span style=\"color:#66d9ef\">IF<\/span> C<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">22<\/span><span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">4<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>) 
<span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">2955<\/span>\n2945  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;DON&#39;T &#34;<\/span>;\n2950  C1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span>\n2955  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;HAVE ENOUGH CLOTHING TO KEEP YOU WARM&#34;<\/span>\n2960  <span style=\"color:#66d9ef\">IF<\/span> C1<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">3100<\/span>\n2965  <span style=\"color:#66d9ef\">GOTO<\/span> 4700\n2970  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;HAIL STORM---SUPPLIES DAMAGED&#34;<\/span>\n2975  M<span style=\"color:#f92672\">=<\/span>M<span style=\"color:#ae81ff\">-5<\/span><span style=\"color:#f92672\">-<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">10<\/span>\n2977  B<span style=\"color:#f92672\">=<\/span>B<span style=\"color:#ae81ff\">-200<\/span>\n2980  M1<span style=\"color:#f92672\">=<\/span>M1<span style=\"color:#ae81ff\">-4<\/span><span style=\"color:#f92672\">-<\/span><span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">3<\/span>\n2985  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n2990  <span style=\"color:#66d9ef\">IF<\/span> E<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">4700<\/span>\n2995  <span style=\"color:#66d9ef\">IF<\/span> E<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">3010<\/span>\n3000  <span 
style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">.25<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">4700<\/span>\n3005  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n3010  <span style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#66d9ef\">RND<\/span>(<span style=\"color:#ae81ff\">0<\/span>)<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#ae81ff\">.5<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">4700<\/span>\n3015  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n3020  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;HELPFUL INDIANS SHOW YOU WHERE TO FIND MORE FOOD&#34;<\/span>\n3025  F<span style=\"color:#f92672\">=<\/span>F<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">14<\/span>\n3030  <span style=\"color:#66d9ef\">GOTO<\/span> 3100\n<\/code><\/pre><\/div><h4 id=\"dying\">Dying<\/h4>\n<p>You can&rsquo;t die of dysentery in OREGON 1975. You can, however, die of starvation, pneumonia, or injuries. Historically, many died of cholera or &ldquo;mountain fever,&rdquo; but this simulation did not attempt to be exhaustive. Family members and tombstones were introduced in the 1985 version. (The <a href=\"https:\/\/web.archive.org\/web\/20090810141840\/http:\/\/www.geocities.com\/meccalumni\/Timeline803.html\">MECC Alumni Timeline<\/a> claims that a Wisconsin teacher asked at the 1982 conference &ldquo;Is there a way to delete names and messages entered by students on the onscreen tombstones in The Oregon Trail?&rdquo;, but this date, and others on the timeline, conflict with other accounts.)<\/p>\n<p>The Dying section has multiple entry points and callers may have jumped to line 3500, 3520, 3550, or 3555. 
Given how line numbers from 3500 to 3600 increment erratically, we suspect this section was adjusted multiple times to accommodate all the paths to death.<\/p>\n<p>Setting <code>T<\/code> (cash available) to zero on line 3520 is meaningless because <code>T<\/code> is no longer used. There may have been some intent to use <code>T<\/code> as part of a scoring mechanism or possibly as a way to influence the funeral arrangements.<\/p>\n<p>While the game could have used a very simple &ldquo;you have died&rdquo; message, the section includes some narrative dark humor and prolongs the end of the game by asking questions about burial preferences. This embellishment was due to Paul Dillenberger (Wong 2017) and we believe it adds significantly to the program&rsquo;s charm.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-basic\" data-lang=\"basic\">3499  <span style=\"color:#75715e\">REM ***DYING***<\/span>\n3500  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU RAN OUT OF FOOD AND STARVED TO DEATH&#34;<\/span>\n3505  <span style=\"color:#66d9ef\">GOTO<\/span> 3600\n3520  <span style=\"color:#66d9ef\">LET<\/span> T<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">0<\/span>\n3525  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU CAN&#39;T AFFORD A DOCTOR&#34;<\/span>\n3530  <span style=\"color:#66d9ef\">GOTO<\/span> 3555\n3550  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU RAN OUT MEDICAL SUPPLIES&#34;<\/span>\n3555  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOU DIED OF &#34;<\/span>;\n3560  <span style=\"color:#66d9ef\">IF<\/span> K8<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">3575<\/span>\n3565  
<span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;PNEUMONIA&#34;<\/span>\n3570  <span style=\"color:#66d9ef\">GOTO<\/span> 3600\n3575  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;INJURIES&#34;<\/span>\n3600  <span style=\"color:#66d9ef\">PRINT<\/span>\n3602  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;DO TO YOUR UNFORTUNATE SITUATION, THERE ARE A FEW&#34;<\/span>\n3605  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;FORMALITIES WE MUST GO THROUGH&#34;<\/span>\n3610  <span style=\"color:#66d9ef\">PRINT<\/span>\n3615  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WOULD YOU LIKE A MINISTER?&#34;<\/span>\n3620  <span style=\"color:#66d9ef\">INPUT<\/span> C$\n3630  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WOULD YOU LIKE A FANCY FUNERAL?&#34;<\/span>\n3635  <span style=\"color:#66d9ef\">INPUT<\/span> C$\n3650  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WOULD YOU LIKE US TO INFORM YOUR NEXT OF KIN?&#34;<\/span>\n3652  <span style=\"color:#66d9ef\">INPUT<\/span> C$\n3654  <span style=\"color:#66d9ef\">IF<\/span> C$<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#34;YES&#34;<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#ae81ff\">3670<\/span>\n3656  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;YOUR AUNT NELLIE IN ST. 
LOUIS IS ANXIOUS TO HEAR&#34;<\/span>\n3658  <span style=\"color:#66d9ef\">PRINT<\/span>\n3670  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;WE THANK YOU FOR THIS INFORMATION AND WE ARE SORRY YOU&#34;<\/span>\n3675  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;DIDN&#39;T MAKE IT TO THE GREAT TERRITORY OF OREGON&#34;<\/span>\n3680  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#e6db74\">&#34;BETTER LUCK NEXT TIME&#34;<\/span>\n3685  <span style=\"color:#66d9ef\">PRINT<\/span>\n3690  <span style=\"color:#66d9ef\">PRINT<\/span>\n3695  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#66d9ef\">TAB<\/span>(<span style=\"color:#ae81ff\">30<\/span>);<span style=\"color:#e6db74\">&#34;SINCERELY&#34;<\/span>\n3700  <span style=\"color:#66d9ef\">PRINT<\/span>\n3705  <span style=\"color:#66d9ef\">PRINT<\/span> <span style=\"color:#66d9ef\">TAB<\/span>(<span style=\"color:#ae81ff\">17<\/span>);<span style=\"color:#e6db74\">&#34;THE OREGON CITY CHAMBER OF COMMERCE&#34;<\/span>\n3710  <span style=\"color:#66d9ef\">STOP<\/span>\n<\/code><\/pre><\/div><h3 id=\"popularity\">Popularity<\/h3>\n<p>In the mid-70s, educational software tended to be fairly simple quizzes or drills. Even the contemporary PLATO network, which supported graphics and multiple users, advertised individual &ldquo;lessons&rdquo; that tended to be <a href=\"https:\/\/umsi580.lsait.lsa.umich.edu\/s\/PLATOs-Citizens\/page\/promotional-materials\">straightforward exercises<\/a> or simple physical simulations. In contrast, OREGON is replayable, as narrative events change each playthrough, and the iterative game loop allows students to test different strategies. Furthermore, the game features an actual, developed theme which helps engage the player&rsquo;s imagination (particularly important given the lack of graphics).<\/p>\n<p>Popular entertainment also contributed to students&rsquo; interest. 
Although the American West was less popular as a television setting than in the 1960s, which featured <em>Wagon Train<\/em>, <em>Bonanza<\/em>, and <em>Gunsmoke<\/em>, Western movies were still a staple in theaters and the pioneer drama <em>Little House on the Prairie<\/em> started its nine-year run in 1974.<\/p>\n<p>The mid-1980s saw large investments in educational games, and new franchises were born, such as <em>Math Blaster!<\/em> (1983), <em>Reader Rabbit<\/em> (1983), and <em>Where in the World is Carmen Sandiego<\/em> (1985). However, historical simulation\/roleplaying educational games appear to have been rare. <a href=\"https:\/\/serious.gameclassification.com\/EN\/games\/1233-Seven-Cities-of-Gold\/index.html\"><em>Seven Cities of Gold<\/em> (1984)<\/a> is a notable counter-example but was sold as an edutainment title, not educational software. Since OREGON had a teacher&rsquo;s manual and advice for integrating it into a lesson plan, OREGON was solidly positioned as courseware or educational software, even if the home market broadened MECC&rsquo;s market positioning.<\/p>\n<p>Thus, even up to the 1985 re-design, OREGON filled a niche with few competitors, featured a popular theme, and could be approached as a game rather than a computerized drill.<\/p>\n<h2 id=\"oregon-1978--creative-computing-version\">OREGON: 1978 \/ Creative Computing Version<\/h2>\n<h3 id=\"provenance\">Provenance<\/h3>\n<p>Using a scan of the May\/June issue of <em>Creative Computing<\/em> (Rawitsch 1978), we extracted a textual version of the source code via Google Gemini. The scan was not high-quality. We then visually checked the extracted source code against the scan and the 1975 version. We found 24 errors in the output. About two-thirds of the errors were character-level&mdash;individual characters swapped with similar-looking characters or digits. The remaining errors included broad logic changes (e.g. 
<code>3120 M=M+20<\/code> rather than the correct <code>3120 GOSUB 4120<\/code>, <code>3560 D1=D1+1<\/code> versus <code>3560 RESTORE<\/code>, and <code>INT(D3)<\/code> rather than <code>;D3;<\/code>). Twenty-four errors out of 686 lines is a per-line error rate of about 3.5%. This is a reminder that researchers should be cautious about LLM transcriptions of scanned historical source code.<\/p>\n<p>The source code front-matter states a version of <code>01\/01\/78<\/code>. Since the 1977 user manual has a sample run that includes features seen in the 1978 version, we know that some of the alterations were made prior to this date. Since MECC ran a Univac 1110 for a number of years, we suspect many of the changes were first made on a now-lost Univac BASIC version.<\/p>\n<h3 id=\"delta-from-1975\">Delta from 1975<\/h3>\n<p>We used a <a href=\"OREGON_Parallel_Reading.pdf\">parallel reading<\/a> approach to compare the 1975 and 1978 versions. The 1978 version&rsquo;s lines have been renumbered, so a straightforward file diff was not an option. The 1978 version targets &ldquo;CDC CYBER 70\/73-26 BASIC 3.1&rdquo;.<\/p>\n<p>The game design, probability functions, and narrative events are unchanged between 1975 and 1978. The update includes some minor changes:<\/p>\n<ul>\n<li>Extends date handling into December<\/li>\n<li>Corrects spelling<\/li>\n<li>Replaces HP&rsquo;s unusual <code>GOTO OF<\/code> syntax with the more common <code>ON GOTO<\/code> syntax<\/li>\n<li>Drops the <code>LIN(2)<\/code> print function and instead uses two <code>PRINT<\/code> statements<\/li>\n<li>Changes the arguments passed to the <code>RND<\/code> function to reflect CDC semantics<\/li>\n<li>Adjusts some player messaging (<code>PRINT<\/code> statements added or dropped)<\/li>\n<\/ul>\n<p>The most significant change is the shooting sub-routine. At the start of the program, the player is asked how good a shot they are. 
When asked to shoot, the player is no longer always told to type &lsquo;BANG&rsquo;; instead, the prompt is chosen at random from &lsquo;BANG&rsquo;, &lsquo;BLAM&rsquo;, &lsquo;POW&rsquo;, or &lsquo;WHAM&rsquo;. CDC BASIC did not include the <code>ENTER<\/code> statement, so the system clock is checked before and after an <code>INPUT<\/code> to record user latency. Measured latency is adjusted by the user&rsquo;s chosen shooting skill, providing an additional challenge to the game. The source code includes a helpful comment describing how the approach may be tailored depending on the computing environment.<\/p>\n<h3 id=\"why-was-this-published\">Why was this published?<\/h3>\n<p>Commercially, it seems odd that MECC would allow its most popular program to be shared via a magazine article. At this point in history, though, MECC was not a commercial organization, but rather a quasi-public consortium. MECC was expected to provide public goods and this was an inexpensive way to share one of their successes.<\/p>\n<p>Furthermore, MECC&rsquo;s revenue was based on a subscription model paid by their client schools. While a competitor could add OREGON to their own timesharing network, the marginal value of a single program was unlikely to woo subscribers away. Client schools saw MECC as both a &ldquo;production&rdquo; system, since MECC provided space and language interpreters for students and staff to develop new programs, and a &ldquo;consumption&rdquo; system with existing courseware.<\/p>\n<h3 id=\"westward-ho\">Westward Ho!<\/h3>\n<p>Independent of MECC, David Ahl remade the 1978 version of OREGON as &ldquo;Westward Ho!&rdquo;, one of ten simulation games published in his 1986 book <em>BASIC Computer Adventures<\/em>. Ahl updated the program code to Microsoft BASIC syntax and redesigned some of the logic, shortening the code. (HP Timesharing BASIC did not allow compound statements.) 
However, the majority of the game is the same, including the narrative events and their probabilities. The existence of ten simulation games, many with historical context, is a counter-example to the claim of OREGON&rsquo;s uniqueness. However, these games were not commercially competitive.<\/p>\n<p>Later re-released in a 25th anniversary edition and ported to Small BASIC, these type-in books served as intermediate training books in programming. The text explains the game design and program design and, through this form, many programmers may have improved their craft by studying the implementation. Without sales data, however, we have a difficult time estimating the impact.<\/p>\n<h2 id=\"the-journey-to-1985s-the-oregon-trail\">The Journey to 1985&rsquo;s <em>The Oregon Trail<\/em><\/h2>\n<p>By the late 1970s, a Minnesota school might have had a small number of teletypes, enough for classes to form teams. However, both students and schools wanted each student to have their own computer interface. MECC saw that microcomputers would replace terminals in the near future and re-aligned the company to that vision.<\/p>\n<p>In 1978, MECC purchased five Apple II computers for experimenting with the new platform (LaFrenz 1995, pg 26). Anticipating large purchases, they issued an RFP to the prominent microcomputer manufacturers of the day, including Apple, Radio Shack, Atari, and Commodore. Apparently seeing the market as too small, Radio Shack replied with a non-compliant bid for the TRS-80 and refused to correct it. As MECC was subject to government acquisition law, they rejected the bid. Atari and Commodore failed to respond by the deadline. Apple submitted a compliant bid within the deadline, so they won. MECC then focused on creating software for the Apple II (ibid, pg 27).<\/p>\n<p>In 1979, MECC finishes porting OREGON to the Apple II and uploads a copy to their timeshare network. In 1980, OREGON is one of five titles included on <em>Elementary Volume 6<\/em>. 
The game has been updated with a graphical map, the scrolling text is now fixed text on a screen, and the shooting game features some animation. MECC is mixing the Apple II&rsquo;s support for graphics and text modes, but they aren&rsquo;t yet taking full advantage of the machine&rsquo;s capabilities (Bouchard 2016, pg 78).<\/p>\n<p>In 1983, MECC shuts down their timesharing network. OREGON continues to live on via the Apple II port.<\/p>\n<p>With increasing competition, MECC decides to refresh the game from scratch rather than incrementally improving the shooting subgame yet again. In 1984, a new development team, with Philip Bouchard acting as the designer, implements a new game, <em>The Oregon Trail<\/em>. We recommend reading Bouchard&rsquo;s book <em>You Have Died of Dysentery<\/em> for a thorough discussion of that game&rsquo;s design, implementation, and business history.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Bouchard 2016) Bouchard, R. Philip. 2016. You Have Died of Dysentery: The Creation of The Oregon Trail \u2013 the Iconic Educational Game of the 1980s. <a href=\"http:\/\/www.amazon.com\/dp\/B01B8JMKMC\">http:\/\/www.amazon.com\/dp\/B01B8JMKMC<\/a>.<\/p>\n<p>(Carleton 1969) \u201cCarleton College Academic Catalog 1969-1970.\u201d 1969. July. <a href=\"https:\/\/contentdm.carleton.edu\/digital\/collection\/ACAT\/id\/59245\/rec\/21\">https:\/\/contentdm.carleton.edu\/digital\/collection\/ACAT\/id\/59245\/rec\/21<\/a>.<\/p>\n<p>(Dillenberger et. al. 1972) Dillenberger, Paul, Ken Ruge, Bill Heinemann, and Don Rawitsch. 1972. \u201cCity Teaching vs Mother Carleton.\u201d Letters. The \u2019Tonian, January 20. <a href=\"https:\/\/archive.carleton.edu\/Detail\/collections\/146482\">https:\/\/archive.carleton.edu\/Detail\/collections\/146482<\/a><\/p>\n<p>(Hendricks 2020) Hendricks, Kevin D. 2020. \u201cBill Heinemann: Oregon Trail, Creativity, &amp; Chess.\u201d Interviews. West St. Paul Reader, December 14. 
<a href=\"https:\/\/weststpaulreader.com\/2020\/12\/14\/bill-heinemann-oregon-trail\/\">https:\/\/weststpaulreader.com\/2020\/12\/14\/bill-heinemann-oregon-trail\/<\/a>.<\/p>\n<p>(HP 1975) HP 2000\/Access BASIC Reference Manual. 1975. Hewlett-Packard Company. <a href=\"https:\/\/bitsavers.org\/pdf\/hp\/2000TSB\/22687-90001_AccessBasic9-75.pdf\">https:\/\/bitsavers.org\/pdf\/hp\/2000TSB\/22687-90001_AccessBasic9-75.pdf<\/a>.<\/p>\n<p>(LaFrenz 1995) LaFrenz, Dale. 1995. \u201cAn Interview with DALE LAFRENZ.\u201d April 13. <a href=\"https:\/\/hdl.handle.net\/11299\/107423\">https:\/\/hdl.handle.net\/11299\/107423<\/a>.<\/p>\n<p>(Morgan 1963) Morgan, Dale. 1963. Overland in 1846: Diaries and Letters of the California-Oregon Trail. Vol. 1. University of Nebraska Press.<\/p>\n<p>(Rawitsch 1977) Rawitsch, Don. 1977. Oregon: User Manual to Accompany Programs in Timeshare Library. MECC (Minnesota Educational Computing Consortium). <a href=\"https:\/\/mecc.co\/software\/timeshare\/oregontrail_manual_timeshar.pdf\">https:\/\/mecc.co\/software\/timeshare\/oregontrail_manual_timeshar.pdf<\/a>.<\/p>\n<p>(Rawitsch 1978) Rawitsch, Don. 1978. \u201cOregon Trail (1978).\u201d Creative Computing, June. <a href=\"https:\/\/archive.org\/details\/creativecomputing-1978-05\/page\/132\">https:\/\/archive.org\/details\/creativecomputing-1978-05\/page\/132<\/a><\/p>\n<p>(Staff 1972) Staff. 1972. \u201cCarls\u2019 Teaching Experiences Verify Irony.\u201d The \u2019Tonian, January 13. Volume 91, Number 8 Edition. <a href=\"https:\/\/archive.carleton.edu\/Detail\/collections\/146481\">https:\/\/archive.carleton.edu\/Detail\/collections\/146481<\/a><\/p>\n<p>(Wong 2017) Wong, Kevin. 2017. \u201cThe Forgotten History of \u2018The Oregon Trail,\u2019 As Told By Its Creators.\u201d Vice, February 15. 
<a href=\"https:\/\/www.vice.com\/en\/article\/the-forgotten-history-of-the-oregon-trail-as-told-by-its-creators\/\">https:\/\/www.vice.com\/en\/article\/the-forgotten-history-of-the-oregon-trail-as-told-by-its-creators\/<\/a>.<\/p>\n<p>(Wrede 1971) Wrede, Jim. 1971. \u201cGames People Play.\u201d The \u2019Tonian, May 20. Vol 90, Number 18 Edition. <a href=\"https:\/\/archive.carleton.edu\/Detail\/collections\/146471\">https:\/\/archive.carleton.edu\/Detail\/collections\/146471<\/a><\/p>"},{"title":"Design of a Wild Shape\/Polymorph Decision Aid","link":"https:\/\/ztoz.blog\/posts\/wildshape-decision\/","pubDate":"Fri, 17 Oct 2025 09:22:41 -0600","guid":"https:\/\/ztoz.blog\/posts\/wildshape-decision\/","description":"<p>In Fifth Edition (5e) <em>Dungeons &amp; Dragons<\/em>\u2122 (D&amp;D), druids can transform into certain Beast forms using the <em>Wild Shape<\/em> ability. Similarly, casters of the <em>Polymorph<\/em> spell can transform allies and enemies into Beasts. Players often find selecting a Beast form onerous (there are 84 forms defined in the System Reference Document (SRD) alone) as this is a multi-dimensional decision problem. We created an <a href=\"https:\/\/owlettepublications.com\/tool\/wildshape-polymorph-aid\/\">online tool<\/a> to aid in the decision process. Our tool presents two synthetic metrics to capture offensive and defensive strengths in combat and assigns beasts to one of four labels (combat, explore, water, and sky) based on the results of machine clustering. In this article, we discuss the user interface and the data science used by the tool.<\/p>\n<h2 id=\"motivation\">Motivation<\/h2>\n<p>Game designer Sid Meier has <a href=\"https:\/\/www.gamedeveloper.com\/design\/gdc-2012-sid-meier-on-how-to-see-games-as-sets-of-interesting-decisions\">stated<\/a> that \u201cGames are a series of interesting decisions\u201d. 
In D&amp;D, transforming a creature into another form, as the druid&rsquo;s power of <em>Wild Shape<\/em> or the spell <em>Polymorph<\/em> allow, is an interesting decision by Meier&rsquo;s definition. The decision is situational, where a player needs to consider the current environment and other constraints, how the new form will impact the group&rsquo;s ability to explore and combat foes, and if the transformation is worth the resource trade-off.<\/p>\n<p>The choice can be difficult. While both <em>Wild Shape<\/em> and <em>Polymorph<\/em> restrict the transformed form into a Beast, a creature type that encompasses real-world natural creatures such as spiders, wolves, dinosaurs, and giant-sized variants, this is not very restrictive in practice as players will have dozens of potential Beast forms to choose among. Further, while the pool of <em>Wild Shape<\/em> forms has to be selected ahead of time (four to eight slots, with one slot swappable per Long Rest), <em>Polymorph<\/em> forms are selected at the time of casting. Selecting a form can thus slow down the game, particularly during combat, as the caster looks through their options.<\/p>\n<p>As players, we have experienced this difficulty and wanted to make the selection faster and with greater confidence. Commonly, beginning players follow the rule book&rsquo;s suggestion of Rat, Riding Horse, Spider, and Wolf (SRD, pg. 42), but as we discuss near the end of the article, we believe this recommendation can be improved. Experienced players tend to create spreadsheets to help them make these decisions and have the statblocks ready at hand during play. Although creating spreadsheets is a signal of engagement, we do not want the process to require this level of commitment.<\/p>\n<p>We decided to build a tool. Our objectives:<\/p>\n<ul>\n<li>Useful for the 84 beasts in the SRD, but methodologically expandable to other bestiaries. 
Many authors have created listicles ranking Beast forms, but we would like a solution that can scale to the 400+ products on DM&rsquo;s Guild with new Beast forms or the 1200+ Homebrew Beasts on D&amp;D Beyond with a rating of 2+.<\/li>\n<li>Require fewer rules lookups and implicit filters than <a href=\"https:\/\/www.dndbeyond.com\/monsters\">D&amp;D Beyond<\/a>.<\/li>\n<li>Reduce time and effort spent by players by amplifying the distinctions between Beasts.<\/li>\n<\/ul>\n<h3 id=\"examples\">Examples<\/h3>\n<p>In the D&amp;D movie <em>Honor Among Thieves<\/em>, the druid Doric transforms into a horse to hide in plain sight among a group of mounted knights. The horse form allows Doric to infiltrate the group, whereas other forms might arouse suspicion. In this case, the form is highly situational, but normally the druid will need to select forms that suit many potential scenarios.<\/p>\n<h4 id=\"wild-shape-combat-selection\"><em>Wild Shape<\/em> Combat Selection<\/h4>\n<p>A player of a 3rd level druid is swapping out a <em>Wild Shape<\/em> form ahead of a trip into a large, natural cave network. The party is in pursuit of a bandit gang. The party consists mostly of spell-casters and the druid is expected to hold the front-line in combat.<\/p>\n<p>Using the tool, the player first selects &ldquo;Druid, Level 2-3&rdquo; under Show Beasts. This filters the list to Beasts with challenge ratings between 0 and 0.25 and removes from the list any Beasts with a Fly speed (per the game&rsquo;s rules for <em>Wild Shape<\/em>). The player looks at the Combat Fitness plot.<\/p>\n<figure><img src=\"combat_fitness_druid_lvl2.png\"\n         alt=\"Combat Fitness Chart: Beasts available for Level 2 Druids\"\/><figcaption>\n            <p>Combat Fitness Chart: Beasts available for Level 2 Druids<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>For this druid, dealing damage is less important than holding the line, so the player looks at options with higher defensiveness scores (Tank). 
The top four options by Tank are the Giant Crab, Giant Lizard, Giant Badger, and the Giant Frog (we present the data using an inline table rather than a screenshot for accessibility):<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:center\">Beast Name<\/th>\n<th style=\"text-align:center\">CR<\/th>\n<th style=\"text-align:center\">Role<\/th>\n<th style=\"text-align:center\">DPR<\/th>\n<th style=\"text-align:center\">Tank<\/th>\n<th style=\"text-align:center\">Senses<\/th>\n<th style=\"text-align:center\">Actions<\/th>\n<th style=\"text-align:center\">Specials<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:center\">Giant Crab<\/td>\n<td style=\"text-align:center\">\u215b<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">2.80<\/td>\n<td style=\"text-align:center\">52.00<\/td>\n<td style=\"text-align:center\">Blindsight 30 ft.<\/td>\n<td style=\"text-align:center\">Claw<\/td>\n<td style=\"text-align:center\">Amphibious, Stealth +3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Giant Lizard<\/td>\n<td style=\"text-align:center\">\u00bc<\/td>\n<td style=\"text-align:center\">Explore<\/td>\n<td style=\"text-align:center\">4.50<\/td>\n<td style=\"text-align:center\">47.50<\/td>\n<td style=\"text-align:center\">Darkvision 60 ft.<\/td>\n<td style=\"text-align:center\">Bite<\/td>\n<td style=\"text-align:center\">Spider Climb<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Giant Badger<\/td>\n<td style=\"text-align:center\">\u00bc<\/td>\n<td style=\"text-align:center\">Fight<\/td>\n<td style=\"text-align:center\">4.80<\/td>\n<td style=\"text-align:center\">42.96<\/td>\n<td style=\"text-align:center\">Darkvision 60 ft.<\/td>\n<td style=\"text-align:center\">Bite<\/td>\n<td style=\"text-align:center\">Perception +3, Resists: Poison<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Giant Frog<\/td>\n<td style=\"text-align:center\">\u00bc<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td 
style=\"text-align:center\">3.50<\/td>\n<td style=\"text-align:center\">40.00<\/td>\n<td style=\"text-align:center\">Darkvision 30 ft.<\/td>\n<td style=\"text-align:center\">Bite, Swallow<\/td>\n<td style=\"text-align:center\">Amphibious, Standing Leap, Perception +2, Stealth +4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The player chooses the Giant Lizard. Although all four forms have something to recommend them, the Giant Lizard has Spider Climb, which allows it to climb along the walls and ceilings of the cave. The Lizard is well suited to operating in the dark environment of the cave, although if the bandits were known to use obscuration, the Giant Crab&rsquo;s blindsight might be preferred. Similarly, if the bandits were known for using poison, the Giant Badger&rsquo;s resistance would be a great asset.<\/p>\n<h4 id=\"polymorph-selection\"><em>Polymorph<\/em> Selection<\/h4>\n<p>A seventh-level wizard&rsquo;s party is exploring an underwater kelp forest. The wizard considers casting <em>Polymorph<\/em> on their sixth-level ranger to explore the area faster. Within the tool, the player chooses &ldquo;Beasts with max CR 6&rdquo; (the level of the ranger) and sets &ldquo;Require Movement Speeds&rdquo; to Swim.<\/p>\n<figure><img src=\"combat_fitness_cr6_swim.png\"\n         alt=\"Combat Fitness Chart: Beasts with CR 6 and below with Swim\"\/><figcaption>\n            <p>Combat Fitness Chart: Beasts with CR 6 and below with Swim<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Since the purpose is exploration rather than pure combat, the player looks at a range of beasts, sorting by challenge rating and focusing on senses. The Killer Whale pops out of the list with a Blindsight range of 120 ft, which will allow it to rapidly map the underwater terrain, even with obscuration from the kelp plants. If the Killer Whale encounters combat, it is still one of the more deadly underwater creatures even though it only has a CR of 3. 
Unlike <em>Wild Shape<\/em>, <em>Polymorph<\/em> uses the Beast&rsquo;s stat block skills so the Perception +3 and Stealth +4 also assist the purpose of exploration.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:center\">Beast Name<\/th>\n<th style=\"text-align:center\">CR<\/th>\n<th style=\"text-align:center\">Role<\/th>\n<th style=\"text-align:center\">DPR<\/th>\n<th style=\"text-align:center\">Tank<\/th>\n<th style=\"text-align:center\">Senses<\/th>\n<th style=\"text-align:center\">Actions<\/th>\n<th style=\"text-align:center\">Specials<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:center\">Giant Crocodile<\/td>\n<td style=\"text-align:center\">5<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">37.05<\/td>\n<td style=\"text-align:center\">283.33<\/td>\n<td style=\"text-align:center\">-<\/td>\n<td style=\"text-align:center\">Multiattack, Bite, Tail<\/td>\n<td style=\"text-align:center\">Hold Breath, Stealth +5<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Giant Shark<\/td>\n<td style=\"text-align:center\">5<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">22.00<\/td>\n<td style=\"text-align:center\">262.86<\/td>\n<td style=\"text-align:center\">Blindsight 60 ft.<\/td>\n<td style=\"text-align:center\">Multiattack, Bite<\/td>\n<td style=\"text-align:center\">Water Breathing, Perception +3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Archelon<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">11.90<\/td>\n<td style=\"text-align:center\">600.00<\/td>\n<td style=\"text-align:center\">-<\/td>\n<td style=\"text-align:center\">Multiattack, Bite<\/td>\n<td style=\"text-align:center\">Amphibious, Stealth +5<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Hippopotamus<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td 
style=\"text-align:center\">14.40<\/td>\n<td style=\"text-align:center\">273.33<\/td>\n<td style=\"text-align:center\">-<\/td>\n<td style=\"text-align:center\">Multiattack, Bite<\/td>\n<td style=\"text-align:center\">Hold Breath, Perception +3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Killer Whale<\/td>\n<td style=\"text-align:center\">3<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">17.85<\/td>\n<td style=\"text-align:center\">225.00<\/td>\n<td style=\"text-align:center\">Blindsight 120 ft.<\/td>\n<td style=\"text-align:center\">Bite<\/td>\n<td style=\"text-align:center\">Hold Breath, Perception +3, Stealth +4<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Giant Constrictor Snake<\/td>\n<td style=\"text-align:center\">2<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">9.35<\/td>\n<td style=\"text-align:center\">150.00<\/td>\n<td style=\"text-align:center\">Blindsight 10 ft.<\/td>\n<td style=\"text-align:center\">Multiattack, Bite, Constrict<\/td>\n<td style=\"text-align:center\">Perception +2, Stealth +4<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Hunter Shark<\/td>\n<td style=\"text-align:center\">2<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">11.90<\/td>\n<td style=\"text-align:center\">112.50<\/td>\n<td style=\"text-align:center\">Blindsight 60 ft.<\/td>\n<td style=\"text-align:center\">Bite<\/td>\n<td style=\"text-align:center\">Water Breathing, Perception +2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:center\">Plesiosaurus<\/td>\n<td style=\"text-align:center\">2<\/td>\n<td style=\"text-align:center\">Water<\/td>\n<td style=\"text-align:center\">9.35<\/td>\n<td style=\"text-align:center\">194.29<\/td>\n<td style=\"text-align:center\">-<\/td>\n<td style=\"text-align:center\">Bite<\/td>\n<td style=\"text-align:center\">Hold Breath, Perception +3, Stealth +4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"a-note-on-references\">A 
Note on References<\/h3>\n<p>Wizards of the Coast publishes the D&amp;D rules in two different ways: a set of commercially available books and the <a href=\"https:\/\/www.dndbeyond.com\/srd\">System Reference Document<\/a>. The former contain rules and flavor which make D&amp;D a recognizable set of fantasy worlds while the latter solely enumerates mechanics and &ldquo;generic&rdquo; spells and monsters. For example, beholders are an iconic monster in D&amp;D and are detailed in the commercial books but are not included in the SRD. Wizards of the Coast licenses the SRD under the Creative Commons (specifically, CC-BY-4.0), allowing other publishers to produce products compatible with the ruleset. Thus, following the license, we reference the SRD, not the Player&rsquo;s Handbook or Monster Manual.<\/p>\n<h3 id=\"a-note-on-editions-and-versions\">A Note on Editions and Versions<\/h3>\n<p>The D&amp;D rules have evolved over many editions. Fifth edition is the current edition at the time of writing. The first version of Fifth Edition was released in 2014 and is covered in the SRD version 5.1. A second version of Fifth Edition was released in 2024 and is covered in the SRD version 5.2. Our tool uses the mechanics and stat blocks described in SRD version 5.2.1.<\/p>\n<h2 id=\"user-interface\">User Interface<\/h2>\n<p>The tool&rsquo;s user interface is displayed in the figure below. Filter controls are featured along the top and below the fold is a visualization of combat fitness and a table with the matching beasts.<\/p>\n<p>When a player interacts with the tool, we expect they come either oriented as a druid character or as a caster of <em>Polymorph<\/em>. Thus, the first and primary filter (&ldquo;Show Beasts&rdquo;) either operates to filter by the <em>Wild Shape<\/em> criteria for a druid&rsquo;s level or the challenge rating filter for  <em>Polymorph<\/em>. 
Since subclasses, homebrew, and other spells (such as <em>Animal Shapes<\/em>) may modify what is considered a valid Beast target, we provide some additional options to cover likely cases. However, the list is not exhaustive. For example, <em>Animal Shapes<\/em> restricts forms to Large and smaller in size, but none of our controls filter by size. Because <em>Animal Shapes<\/em> is an 8th level spell, few players will ever cast it, and we have tried to balance likelihood of use versus flexibility.<\/p>\n<p>The existence of a movement speed operates as the secondary filter. Movement speeds act as a proxy for environment (creatures without a Climb or Swim speed can do both, but at a penalty) and, in our experience, druids will often reserve a slot for access to an unusual environment.<\/p>\n<p>The third filter is a free-form field that restricts by textual prefix matched against a creature&rsquo;s list of actions, traits, and similar fields. Our hypothesis is that the third criterion players use will tend to focus on specific abilities. Since the universe of &ldquo;specials&rdquo; is small, we could use a drop-down, but we have left it as an open field since we may need a more flexible filter, such as one that allows exclusive filters (&ldquo;not&rdquo;).<\/p>\n<figure><img src=\"wildshape-selection-ui-zoom67.png\"\n         alt=\"User Interface at 67% Zoom\"\/><figcaption>\n            <p>User Interface at 67% Zoom<\/p>\n        <\/figcaption>\n<\/figure>\n<p>The Combat Fitness Plot (shown expanded in the screenshot) positions beasts matching the filter by their DPR (damage per round) and Tank (hit points scaled by difficulty of being hit) scores, or roughly their offensive and defensive strength. (See the &ldquo;Interpreting Combat Effectiveness&rdquo; section below for more details on these metrics.) 
The plot can be hidden for three reasons: 1) players may not be interested in the combat fitness scores, 2) most displays will require vertical scrolling to show the current filter, plot, and table, and 3) hiding it encourages players to focus on the table. We have kept the plot because, in our experience, it quickly summarizes the basic combat effectiveness of creatures and helps players focus on differentiating factors.<\/p>\n<p>Our table has columns for challenge rating, suggested role label, DPR score, tank score, list of unusual senses, list of actions, bonus actions, and reactions, and a list of specials which includes skills, traits, and similar fields. In comparison, the leading D&amp;D data source, D&amp;D Beyond, displays columns for challenge rating, creature type, size, alignment, habitat, and tags (which is usually blank). Both tables include links to full descriptions of a creature. D&amp;D Beyond&rsquo;s columns (which are fixed in their UI) are meant for high-level encounter design. A game master building an encounter may be interested in monster alignment and habitat to fit world-building flavor, and is not restricted to Beast as a creature type, but these fields are irrelevant to a druid or caster. (Also, from a license perspective, creature habitat is not in the SRD.)<\/p>\n<h2 id=\"interpreting-combat-effectiveness-dpr-and-tank-scores\">Interpreting Combat Effectiveness: DPR and Tank Scores<\/h2>\n<p>Combat in 5e is highly tactical, incorporating positioning, cover, constrained movement, choice of attacks, and varied spells and special abilities. However, Beasts, being modeled after real-world creatures, tend to have more straightforward combat tactics than, say, a Gibbering Mouther, which creates difficult terrain, can mind control opponents to act randomly, and has a rechargeable area of effect attack. 
Thus, our hypothesis is that we can capture a useful portion of a Beast&rsquo;s combat effectiveness with two metrics.<\/p>\n<p>We propose two metrics: Damage Per Round (DPR) and Tank. DPR is the expected damage to an opponent with AC 10 if the Beast uses all their attacks against the same foe. Tank is the creature&rsquo;s hit points adjusted by their AC as if they were being attacked by an average human (Attack +0). Both metrics ignore any situational adjustments or special abilities. For instance, in combat Giant Frogs and Giant Toads will attempt to swallow opponents, rendering them Grappled and thus impaired on their attacks, but in our model, they never do. Similarly, apes are unusual among Beasts because they have a ranged attack (throwing rocks), but this model does not consider the advantages of attacking at range.<\/p>\n<h3 id=\"challenge-rating\">Challenge Rating<\/h3>\n<p>Challenge Rating is a pre-existing metric that serves to &ldquo;summarize the threat&rdquo; (SRD, pg 178) of a monster. While useful (we include it in the tool&rsquo;s table), we feel it is insufficient as a metric of combat effectiveness. First, it insufficiently differentiates monsters. A 2nd level Druid will have access to Beasts with three challenge ratings: 0, \u215b, and \u00bc with roughly a dozen Beasts in each category. However, a Giant Crab and a Venomous Snake, both CR \u215b, operate differently in combat, with the Giant Crab acting as a defensive sponge of attacks, while the Venomous Snake is more of a glass cannon. CR is too imprecise.<\/p>\n<h3 id=\"damage-per-round-dpr\">Damage Per Round (DPR)<\/h3>\n<p>Damage Per Round models a creature&rsquo;s melee and direct damage attacks using their attack bonuses and expected damage scores. This score underestimates the deadliness of creatures with special abilities and unusual traits, while sometimes overestimating the damage output. For instance, many snakes inflict both Piercing and Poison damage and thus measure highly on DPR. 
However, if the opponent has Poison resistance, a snake will be of little threat. Plotting DPR against Challenge Rating shows a large variance of scores within a single CR category. The Pearson correlation score between the two metrics, across all monsters, is 0.59, a moderately positive correlation.<\/p>\n<figure><img src=\"dpr_by_low_cr_boxplot.png\"\n         alt=\"Boxplot of DPR by Challenge Rating\"\/><figcaption>\n            <p>Boxplot of DPR by Challenge Rating<\/p>\n        <\/figcaption>\n<\/figure>\n<p>(The CR-1 outlier is the &ldquo;Pirate&rdquo;, a non-Beast monster whose attack is written in an unusual format which does not match our Beast-oriented stat block regular expression. Calculating DPR for monsters, particularly higher challenge-rating monsters, is difficult due to the variety of attacks and special defenses.)<\/p>\n<h3 id=\"tank\">Tank<\/h3>\n<p>Tank is the equivalent hit points of a creature if we adjusted their AC to 10. This model ignores other defensive traits like resistances and immunities, but these are rare among Beasts. 5e&rsquo;s design incorporates a concept called &ldquo;bounded accuracy&rdquo; where the curve of improvements in AC is suppressed so more attacks will hit. To compensate, hit points are increased compared to previous editions. When we plot Tank scores by Challenge Rating, we can see that the percentiles follow a curve. 
Across all monsters in the SRD, the Tank score has a Pearson correlation score of 0.88 with the monster&rsquo;s Challenge Rating, a strong, positive correlation.<\/p>\n<figure><img src=\"tank_by_low_cr_boxplot.png\"\n         alt=\"Boxplot of Tank by Challenge Rating\"\/><figcaption>\n            <p>Boxplot of Tank by Challenge Rating<\/p>\n        <\/figcaption>\n<\/figure>\n<p>(The CR-1 Tank outlier is &ldquo;Animated Armor,&rdquo; a Construct made of Plate Armor that is tough and has a high armor class.)<\/p>\n<h2 id=\"clustering-beasts-by-role\">Clustering Beasts by Role<\/h2>\n<p>Early users provided feedback that it would be useful if beasts were labeled with a specialty or role. Although manually assigning labels is certainly feasible for the 84 beasts in the SRD, we wanted a method that could scale to the far greater number of beasts that exist in homebrew and in non-core rulebooks. Thus, we decided to use a machine clustering approach to discover existing but implicit groupings of Beasts.<\/p>\n<p>Based on our experiences and Internet commentary, we expected the algorithm to find three roles: a combat role (with mostly carnivores), a reconnaissance role (with flyers, burrowers, and stealthy creatures like cats), and a mobility or utility role (with horses, mules, and other beasts of burden). Instead, the algorithm found four clusters (roles). Two of the roles reasonably match the combat and reconnaissance roles, but the other two roles were environment-based: water-based creatures and flying creatures.<\/p>\n<p>As a contrast, the 2016 <a href=\"https:\/\/media.wizards.com\/2016\/dnd\/downloads\/UA_Druid11272016_CAWS.pdf\">Unearthed Arcana: Druid<\/a> document has &ldquo;combat, climbing, stealth, and serving as a mount&rdquo; as the list of suggested roles. We similarly found a cluster for combat. 
Our Explore cluster contains most of the &ldquo;climbing&rdquo; beasts and those that can serve as stealthy explorers, although Stealth by itself was too common a skill to be a differentiator between clusters. (Since Stealth is a skill, the Wild Shaped creature will have the greater Stealth score of the original form and the beast form.) Mechanically, a mount is any creature at least one size larger than the rider and with &ldquo;appropriate anatomy&rdquo; (SRD, pg. 15). Our algorithm did not find a &ldquo;mount&rdquo; cluster because &ldquo;appropriate anatomy&rdquo; is flavor, not a quantitative element found in the monster&rsquo;s stat block. The lists of mounts in the SRD are tied to historical examples, such as mastiffs, horses, and camels. However, would a game master really rule that a character couldn&rsquo;t ride a (willing) polar bear because it lacked an &ldquo;appropriate anatomy&rdquo;? Thus, &ldquo;serving as a mount&rdquo; seems to be a role only if a game master requires it.<\/p>\n<p>Philosophically, is an unsupervised approach more valid than a supervised approach where we specified the clusters and let the computer fit a classification function? Unsupervised is a bit of a misnomer because we chose the features that went into the algorithm and iterated until the results seemed &ldquo;good,&rdquo; based on both our understanding of the beasts and quality metrics. Although there is some bias, we believe our methodology is defensible as evidenced by the fact that the results were not what we expected initially. (One bias, we will admit, is a focus on the mechanical stat block aspects of each Beast. A mobility role may appear if we included features such as &ldquo;saddles for this beast can be bought&rdquo; or carrying capacity.)<\/p>\n<p>If we had chosen a supervised approach, we would need validated labels for each beast in the training set. However, although there are suggested roles in the rulebooks, there are no canonical roles for each Beast. 
Each guide approaches the decision differently. We could create and validate labels under some role schema, but we would then be bending the data to our labels, rather than letting labels emerge from the data.<\/p>\n<p>Thus, we think our clusters, delivered through an unsupervised approach, are more indicative of the reality of the rules than ad-hoc clusters imposed on the beasts.<\/p>\n<h3 id=\"methodology\">Methodology<\/h3>\n<p>Through iteration, we tested a variety of encodings for beasts, evaluating the results visually using a two-dimensional principal components analysis (PCA) plot and using the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.metrics.silhouette_score.html\">silhouette metric<\/a>. The groupings of columns were chosen based on their similar nature within the game. Some traits were very common so they became their own columns, such as &ldquo;Pack Tactics&rdquo;. The Oddball column may be seen as a violation of the approach, but beasts with Oddball traits did not congregate to their own cluster.<\/p>\n<p>After experimentation, we ended up with the following set of twelve columns:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">COLUMNS <span style=\"color:#f92672\">=<\/span> [<span style=\"color:#e6db74\">&#39;Has Burrow&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Has Climb&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Has Fly&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Has Swim&#39;<\/span>, \n\t\t<span style=\"color:#e6db74\">&#39;Combat&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Nimble&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Pack&#39;<\/span>, <span style=\"color:#e6db74\">&#39;LandWater&#39;<\/span>, \n\t\t<span style=\"color:#e6db74\">&#39;Oddball&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Jumping&#39;<\/span>, <span 
style=\"color:#e6db74\">&#39;Web&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Water Breathing&#39;<\/span>]\n<\/code><\/pre><\/div><p>Notably, neither Challenge Rating and Tank (as both a score by itself or as an &ldquo;outsized&rdquo; binary value) were found to be useful features. For <code>is_outsized_dpr<\/code>, we treated a challenge rating greater than 1 as always outsized, while we measured the lower challenge ratings versus the median values within the SRD. Because there are so few examples of Beasts with higher challenge ratings, we found that the error in the measured median was too great. Furthermore, Perception and Stealth skills were too common to usefully differentiate clusters.<\/p>\n<p>Our encoding logic:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">is_outsized_dpr<\/span>(m) <span style=\"color:#f92672\">-&gt;<\/span> bool:\n    <span style=\"color:#66d9ef\">if<\/span> m[<span style=\"color:#e6db74\">&#39;challenge_rating&#39;<\/span>] <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#ae81ff\">1<\/span>:\n        <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">True<\/span>\n    <span style=\"color:#66d9ef\">else<\/span>:\n        med_dpr <span style=\"color:#f92672\">=<\/span> {<span style=\"color:#ae81ff\">0.0<\/span>: <span style=\"color:#ae81ff\">0.75<\/span>, <span style=\"color:#ae81ff\">0.125<\/span>: <span style=\"color:#ae81ff\">3.0<\/span>, <span style=\"color:#ae81ff\">0.25<\/span>: <span style=\"color:#ae81ff\">4.5<\/span>, <span style=\"color:#ae81ff\">0.5<\/span>: <span style=\"color:#ae81ff\">4.2<\/span>, <span style=\"color:#ae81ff\">1.0<\/span>: <span style=\"color:#ae81ff\">6.4<\/span>}\n        <span style=\"color:#66d9ef\">return<\/span> m[<span 
style=\"color:#e6db74\">&#39;damageness&#39;<\/span>] <span style=\"color:#f92672\">&gt;=<\/span> med_dpr[m[<span style=\"color:#e6db74\">&#39;challenge_rating&#39;<\/span>]]\n\n<span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">intersection_size<\/span>(a, b) <span style=\"color:#f92672\">-&gt;<\/span> int:\n    set_a <span style=\"color:#f92672\">=<\/span> set(a)\n    set_b <span style=\"color:#f92672\">=<\/span> set(b)\n    <span style=\"color:#66d9ef\">return<\/span> len(set_a<span style=\"color:#f92672\">.<\/span>intersection(set_b))\n\n<span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">monster2array<\/span>(m: dict):\n    arr <span style=\"color:#f92672\">=<\/span> np<span style=\"color:#f92672\">.<\/span>zeros(len(COLUMNS))\n\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Has Burrow&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Burrow&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;speeds&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Has Climb&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Climb&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;speeds&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Has Fly&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Fly&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;speeds&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Has Swim&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Swim&#39;<\/span> <span style=\"color:#f92672\">in<\/span> 
m[<span style=\"color:#e6db74\">&#39;speeds&#39;<\/span>])\n\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Combat&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(is_outsized_dpr(m)) \n    \t<span style=\"color:#f92672\">+<\/span> int(<span style=\"color:#e6db74\">&#39;Multiattack&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;actions_summary&#39;<\/span>]) \n    \t<span style=\"color:#f92672\">+<\/span> int(<span style=\"color:#e6db74\">&#39;Bloodied Fury&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Nimble&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> intersection_size(m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>], [<span style=\"color:#e6db74\">&#39;Agile&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Flyby&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Pack&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Pack Tactics&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;LandWater&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> intersection_size(m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>], [<span style=\"color:#e6db74\">&#39;Amphibious&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Hold Breath&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Oddball&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> intersection_size(m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>], [<span style=\"color:#e6db74\">&#39;Beast of 
Burden&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Compression&#39;<\/span>, \n    \t<span style=\"color:#e6db74\">&#39;Illumination&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Mimicry&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Resists: Cold&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Resists: Fire&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Resists: Poison&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Jumping&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> intersection_size(m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>], [<span style=\"color:#e6db74\">&#39;Jumper&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Running Leap&#39;<\/span>, \n    \t<span style=\"color:#e6db74\">&#39;Standing Leap&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Web&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> intersection_size(m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>], [<span style=\"color:#e6db74\">&#39;Spider Climb&#39;<\/span>, <span style=\"color:#e6db74\">&#39;Web Walker&#39;<\/span>])\n    arr[COLUMNS<span style=\"color:#f92672\">.<\/span>index(<span style=\"color:#e6db74\">&#39;Water Breathing&#39;<\/span>)] <span style=\"color:#f92672\">=<\/span> int(<span style=\"color:#e6db74\">&#39;Water Breathing&#39;<\/span> <span style=\"color:#f92672\">in<\/span> m[<span style=\"color:#e6db74\">&#39;specials&#39;<\/span>])\n\n    <span style=\"color:#66d9ef\">return<\/span> arr\n<\/code><\/pre><\/div><p>In order to render each factor of equal weight, we linearly scaled each column independently to values between 0 and 1, inclusive. 
Linear scaling yielded higher-quality clusters than standardizing values.<\/p>\n<p>We ran the resulting matrix through <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.decomposition.PCA.html\">PCA<\/a> and then used <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.cluster.DBSCAN.html#sklearn.cluster.DBSCAN\">DBSCAN<\/a> to find clusters within the PCA output. We also evaluated K-Means and HDBSCAN, but found DBSCAN to be superior in quality.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">computePY<\/span>(X):\n    pca <span style=\"color:#f92672\">=<\/span> PCA(n_components<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">2<\/span>, whiten<span style=\"color:#f92672\">=<\/span><span style=\"color:#66d9ef\">True<\/span>)\n    reduced <span style=\"color:#f92672\">=<\/span> pca<span style=\"color:#f92672\">.<\/span>fit_transform(X)\n    dbscan <span style=\"color:#f92672\">=<\/span> DBSCAN()\n    Y <span style=\"color:#f92672\">=<\/span> dbscan<span style=\"color:#f92672\">.<\/span>fit_predict(reduced)\n    <span style=\"color:#66d9ef\">return<\/span> reduced, Y\n<\/code><\/pre><\/div><p>With a silhouette score of 0.81, DBSCAN found four clusters and one noise cluster (figure below). The first cluster (lavender) features tigers, wolves, triceratops, and other dinosaurs. We labeled that cluster the &ldquo;Fight&rdquo; cluster due to their higher damage output and greater resilience. The second cluster (white) features cats, lizards, and spiders. We labeled this cluster the &ldquo;Explore&rdquo; cluster due to their stealth and enhanced movement (many have a climb speed). The third cluster (light green) is the &ldquo;Water&rdquo; cluster due to the creatures living in or near the water. 
The fourth cluster (dark green) is the &ldquo;Sky&rdquo; cluster where each beast has a flying speed. The final cluster (labeled -1 by DBSCAN and colored dark red) is a noise cluster. The Black Bear is an intermediary, having both a climb speed and a swim speed and an outsized damage output. Mathematically, the DBSCAN algorithm prefers rejecting the Black Bear from existing clusters rather than weakening a cluster.<\/p>\n<figure><img src=\"beast_clusters_dbscan.png\"\n         alt=\"Clusters of Beasts using DBSCAN\"\/><figcaption>\n            <p>Clusters of Beasts using DBSCAN<\/p>\n        <\/figcaption>\n<\/figure>\n<p>We believe the resulting clusters are reasonable. The algorithm certainly found movement speeds as a highly explanatory feature. The banana shape of the Water cluster suggests the cluster could be broken up, and, in fact, the &ldquo;Water Breathing&rdquo; limiting trait divides the group between the purely water-based beasts and the amphibious beasts. The &ldquo;Sky&rdquo; cluster is highly cohesive because the beasts are very similar to each other. In the 2024 edition of the rules, most of the &ldquo;Giant&rdquo; Beast variants were re-categorized as &ldquo;Celestials&rdquo;, which reduced the mechanical diversity. With more examples, the existing clusters might be further divided. 
Ultimately, validating that this scheme scales to other collections of beasts will require stress-testing.<\/p>\n<h3 id=\"table-beasts-by-role\">Table: Beasts by Role<\/h3>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Fight<\/th>\n<th style=\"text-align:left\">Explore<\/th>\n<th style=\"text-align:left\">Water<\/th>\n<th style=\"text-align:left\">Sky<\/th>\n<th style=\"text-align:left\">Noise \/ Hybrid<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Allosaurus<\/td>\n<td style=\"text-align:left\">Ape<\/td>\n<td style=\"text-align:left\">Archelon<\/td>\n<td style=\"text-align:left\">Bat<\/td>\n<td style=\"text-align:left\">Black Bear<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Ankylosaurus<\/td>\n<td style=\"text-align:left\">Baboon<\/td>\n<td style=\"text-align:left\">Constrictor Snake<\/td>\n<td style=\"text-align:left\">Blood Hawk<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Badger<\/td>\n<td style=\"text-align:left\">Brown Bear<\/td>\n<td style=\"text-align:left\">Crab<\/td>\n<td style=\"text-align:left\">Eagle<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Boar<\/td>\n<td style=\"text-align:left\">Cat<\/td>\n<td style=\"text-align:left\">Crocodile<\/td>\n<td style=\"text-align:left\">Giant Bat<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Camel<\/td>\n<td style=\"text-align:left\">Giant Ape<\/td>\n<td style=\"text-align:left\">Frog<\/td>\n<td style=\"text-align:left\">Giant Wasp<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Deer<\/td>\n<td style=\"text-align:left\">Giant Centipede<\/td>\n<td style=\"text-align:left\">Giant Constrictor Snake<\/td>\n<td style=\"text-align:left\">Hawk<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Dire Wolf<\/td>\n<td style=\"text-align:left\">Giant Fire Beetle<\/td>\n<td 
style=\"text-align:left\">Giant Crab<\/td>\n<td style=\"text-align:left\">Owl<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Draft Horse<\/td>\n<td style=\"text-align:left\">Giant Goat<\/td>\n<td style=\"text-align:left\">Giant Crocodile<\/td>\n<td style=\"text-align:left\">Pteranodon<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Elephant<\/td>\n<td style=\"text-align:left\">Giant Lizard<\/td>\n<td style=\"text-align:left\">Giant Frog<\/td>\n<td style=\"text-align:left\">Raven<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Elk<\/td>\n<td style=\"text-align:left\">Giant Rat<\/td>\n<td style=\"text-align:left\">Giant Octopus<\/td>\n<td style=\"text-align:left\">Vulture<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Giant Badger<\/td>\n<td style=\"text-align:left\">Giant Spider<\/td>\n<td style=\"text-align:left\">Giant Seahorse<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Giant Boar<\/td>\n<td style=\"text-align:left\">Giant Weasel<\/td>\n<td style=\"text-align:left\">Giant Shark<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Giant Hyena<\/td>\n<td style=\"text-align:left\">Giant Wolf Spider<\/td>\n<td style=\"text-align:left\">Giant Toad<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Giant Scorpion<\/td>\n<td style=\"text-align:left\">Goat<\/td>\n<td style=\"text-align:left\">Giant Venomous Snake<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Hyena<\/td>\n<td style=\"text-align:left\">Lizard<\/td>\n<td style=\"text-align:left\">Hippopotamus<\/td>\n<td 
style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Jackal<\/td>\n<td style=\"text-align:left\">Panther<\/td>\n<td style=\"text-align:left\">Hunter Shark<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Lion<\/td>\n<td style=\"text-align:left\">Rat<\/td>\n<td style=\"text-align:left\">Killer Whale<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Mammoth<\/td>\n<td style=\"text-align:left\">Spider<\/td>\n<td style=\"text-align:left\">Octopus<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Mastiff<\/td>\n<td style=\"text-align:left\">Weasel<\/td>\n<td style=\"text-align:left\">Piranha<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Mule<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\">Plesiosaurus<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Pony<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\">Polar Bear<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Rhinoceros<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\">Reef Shark<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Riding Horse<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\">Seahorse<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Saber-Toothed Tiger<\/td>\n<td style=\"text-align:left\"><\/td>\n<td 
style=\"text-align:left\">Venomous Snake<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Scorpion<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Tiger<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Triceratops<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Tyrannosaurus Rex<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Warhorse<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Wolf<\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"is-rat-riding-horse-spider-and-wolf-a-good-recommendation\">Is Rat, Riding Horse, Spider, and Wolf a Good Recommendation?<\/h2>\n<p>The rulebook suggests a level 2 druid adopt the Rat, Riding Horse, Spider, and Wolf as their initial four <em>Wild Shape<\/em> forms (SRD, pg. 42). Using our system, how does this recommendation hold up?<\/p>\n<p>Based on our cluster-based roles, the Rat and Spider belong to the Explore role, while the Riding Horse and Wolf belong to the Fight Role. 
Since second-level druids cannot take a Beast form with Flying movement, the Sky role is off-limits. However, there are many beasts in the Water role that are permitted.<\/p>\n<p>The Rat and Spider are very similar: both have a challenge rating of 0, are Tiny in size, have equivalent Walk and Climb speeds, and have Darkvision to 30 ft. The Spider has a superior DPR, although a Rat can use Agile to move through combat. Thus, there seems little reason to spend slots on both at the same time.<\/p>\n<p>Within the Explore role, the Giant Wolf Spider has a strong Stealth score, has 40 ft movement in both Walk and Climb, has one of the best DPRs of any form available at this level, and has both Blindsight and Darkvision. Being Medium in size, it cannot fit through spaces as tight as a Tiny or Small creature can, which may count strongly against it in certain caves or lairs.<\/p>\n<p>In the Fight role, both the Riding Horse and Wolf have a challenge rating of \u00bc, and the fitness chart (below) shows them scoring similarly, in the middle of the available options. The Wolf&rsquo;s DPR of 3.75 may be deceptively low, since among allies it can use its Pack Tactics trait, but the Panther, which is more aligned with the Explore role, is a more dangerous opponent one-on-one.<\/p>\n<figure><img src=\"combat_fitness_druid_lvl2.png\"\n         alt=\"Combat Fitness Chart: Beasts available for Level 2 Druids\"\/><figcaption>\n            <p>Combat Fitness Chart: Beasts available for Level 2 Druids<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Although horses can have situational uses, there are many forms that will deal more damage and last longer in combat than the horse and the wolf. Counting Water-role beasts that can operate outside of water (i.e. lack Water Breathing), the Giant Crab is tough and hard to hit, and the Giant Venomous Snake and the Giant Wolf Spider both hit hard with Poison damage.<\/p>\n<p>Thus, the game&rsquo;s recommendations seem too similar to each other. 
We believe slots should be chosen with greater diversity. We recommend:<\/p>\n<ol>\n<li>Rat <em>or<\/em> Spider, Tiny Beasts that can explore extensively while not arousing attention<\/li>\n<li>Giant Wolf Spider, a Medium Beast that can perform reconnaissance in force<\/li>\n<li>Giant Lizard, a Large Beast that can climb along ceilings and is one of the top fighters (alternatively, Giant Badger, a Medium Beast with a Burrow speed)<\/li>\n<li>Riding Horse (or Draft Horse or Mule), Large Beasts (or Medium in the case of the Mule) that provide mobility, carrying capacity, and brute strength. Horses also blend into most social encounters.<\/li>\n<\/ol>\n<p>If water is anticipated, the Venomous Snake or Crab can be exchanged for the Spider. The Venomous Snake has greater combat ability and movement than the Crab, but the Crab has Blindsight at 30 ft (versus 10 ft for the snake) so crabs explore faster and are better able to avoid enemies. Crabs are also less threatening, but might be pursued for food.<\/p>\n<h2 id=\"implementation\">Implementation<\/h2>\n<p>For the data analysis, we used <a href=\"https:\/\/scikit-learn.org\">scikit-learn<\/a> and <a href=\"https:\/\/plotly.com\/\">Plotly<\/a> within a <a href=\"https:\/\/jupyter.org\/\">Jupyter<\/a> notebook. Given the small size of the data, we did not attempt any neural net-based approaches. For the web-based tool, we used <a href=\"https:\/\/www.typescriptlang.org\/\">TypeScript<\/a>, <a href=\"https:\/\/react.dev\">React<\/a>, and <a href=\"https:\/\/plotly.com\/\">Plotly<\/a>, with <a href=\"https:\/\/vite.dev\/\">Vite<\/a>\/<a href=\"https:\/\/vitest.dev\/\">Vitest<\/a> for building and testing, respectively.<\/p>\n<p>We used commercial LLMs to extract structured data from the SRD PDF into JSON and then transformed the data via scripts into a more convenient form. 
We found the LLMs would stop processing the PDF after a non-deterministic number of pages, regardless of instructions in the prompt, which required us to split the SRD PDF into sections of roughly ten pages. While the SRD monster descriptions typically follow very regular sentence formulations, and thus are amenable to processing via regular expressions, we did need special casing for some aberrant examples. We considered using pre-existing extracts of the SRD but, as of September 2025, were unable to find any high-quality extracts that cover SRD version 5.2, released in April 2025.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>Compared to other online guides, we find that Beasts can be grouped into Combat and Exploration roles, but that, mechanically, there does not seem to be an identifiable Mobility or Utility role. Movement speeds are a more prominent differentiator in the data. Our tool, by labeling Beasts using four roles, by featuring rules-relevant columns rather than flavor-based columns in the user interface, and by clarifying combat effectiveness through the DPR and Tank scores, allows players to efficiently select Beast forms during play.<\/p>\n"},{"title":"Emergency Software: Software Development Lessons from EMISARI","link":"https:\/\/ztoz.blog\/posts\/emisari\/","pubDate":"Thu, 18 Sep 2025 10:19:16 -0600","guid":"https:\/\/ztoz.blog\/posts\/emisari\/","description":"<p>On August 15th, 1971, President Nixon declared a 90-day freeze on wages, rents, and prices. The next day, the Office of Emergency Preparedness (OEP) was charged with implementing the Freeze. OEP launched an administrative crash program, leveraging the expansive power of the federal government, and also took the risky step of developing novel software to facilitate coordination of hundreds of federal employees. Originally called the Emergency Partyline, the software was soon known as <abbr title=\"Emergency Management Information System and Reference Index\">EMISARI<\/abbr>. 
EMISARI aided the OEP with internal communications, reporting, and conduct of their mission. Leveraging early time-sharing technology and a dialect of the BASIC language, EMISARI supported communications in the styles of chat, forums, and email as well as providing a flexible data collection and reporting workflow system. We explore the design and implementation of EMISARI and PARTY-LINE, the adjunct synchronous chat program, how EMISARI compared to the other software systems used by the OEP during the Emergency, and what lessons may be applied to modern startups and crash programs.<\/p>\n<h2 id=\"before-the-emergency\">Before the Emergency<\/h2>\n<p>The OEP, when not involved in managing a crisis, created emergency response plans and cultivated a list of subject matter experts who might be called upon to provide aid and advice. The OEP had an expansive remit of emergencies, including traditional emergencies like hurricanes and droughts, industrial actions (e.g. strikes, supply chain interruptions), and civil defense (e.g. telecommunication resilience in the event of a nuclear war).<\/p>\n<p>One of the techniques used to explore disaster scenarios was a Delphi exercise or the Delphi method. Originally invented by <a href=\"https:\/\/www.rand.org\/topics\/delphi-method.html\">RAND<\/a>, Delphi is a structured group communication process intended to drive clarity on a complex issue. A standard exercise involved a small monitor team sending a questionnaire out to a respondent group. The monitor team would summarize the responses from the group and send out the summary along with a new questionnaire, diving deeper into the issue. The monitor group may also distribute background material. Through iteration, the disparate group would ideally standardize on terminology and find aspects of agreement as well as contention. 
<em>The Delphi Method: Techniques and Applications<\/em> (1975) by Linstone and Turoff contains a series of essays on the method and provides insight into how the process was understood at the time.<\/p>\n<p>The Delphi method is a more structured process than a modern internet debate via social media, email, or a forum. Delphi exercises have controls in place to prevent individuals from dominating the conversation or unfairly influencing the outcome. For instance, individuals may be capped in how frequently they can contribute to give others a chance to speak. Some Delphi conferences include anonymous or pseudonymous features to prevent rank or pre-existing prestige from influencing the results. These features often required extensive work by the monitor team, so many practitioners were interested in using computers to automate the method.<\/p>\n<p>Within the OEP, the System Evaluation Division (SED) was formed in the late 1960s to seek technologies useful for emergency management. The division conducted Delphi studies, used network optimization to improve the resilience of gas pipelines and the telephone system, and developed early computer conferencing software (Hiltz Turoff 1978, pg 47). Murray Turoff, the later system designer for EMISARI, was hired by the SED in 1968 and studied various potential emergencies, often using the Delphi method.<\/p>\n<p>While training as a physicist (PhD 1965, Brandeis University), Turoff developed a strong interest in computing. His thesis was a computer simulation of a planetary nebula using FORTRAN on an IBM 704. Each case study of the model required about three hours of computer time. Restricted to only using the computer on Friday and Saturday evenings, he saw interactive computing as the path to greatly enhanced productivity (Subramanian 2012). 
He also grew interested in the social aspect of connecting computer users together which fueled his interest in computer conferencing and later computer-mediated social networks.<\/p>\n<p>In the spring of 1970, Turoff surreptitiously launched a computer conferencing experiment with twenty participants distributed across the United States. Hall describes the technical design and implementation of this system, or something close to it, in (Hall 1971). The paper also contains representations of the user interface from Turoff. According to Turoff, the experiment had to be kept quiet at the time because:<\/p>\n<ol>\n<li>The organizational culture demanded that failures were to be punished,<\/li>\n<li>The official software development group wanted to maintain a monopoly on all development activity and Turoff was not a member of that group, and<\/li>\n<li>Twenty unauthorized individuals using the system was an embarrassing security violation (Hiltz Turoff 1978, pg 48).<\/li>\n<\/ol>\n<p>After seven weeks, the computer operators discovered the unauthorized access and traced it to Turoff&rsquo;s work. The head of the SED, Robert Kupperman, interceded and allowed the experiment to complete, but Turoff had to be punished. His terminal was taken away. However, since his <em>accounts<\/em> were not disabled, Turoff simply found a spare terminal the next day.<\/p>\n<p>The experiment was successful &mdash; a distributed set of participants had asynchronously  discussed a subject usefully over the computer. 
The OEP had a new capability to use for emergencies, although one without broad support.<\/p>\n<h2 id=\"the-1971-wage-price-emergency\">The 1971 Wage Price Emergency<\/h2>\n<p>In President Nixon&rsquo;s <a href=\"https:\/\/www.presidency.ucsb.edu\/documents\/address-the-nation-outlining-new-economic-policy-the-challenge-peace\">&ldquo;The Challenge of Peace&rdquo;<\/a> address, he outlined three policies intended to increase employment, reduce inflation, and mitigate international speculation on the dollar. To reduce inflation, his first step was ordering a 90 day freeze on all prices and wages throughout the United States. On the same day, Nixon issued <a href=\"https:\/\/www.presidency.ucsb.edu\/documents\/executive-order-11615-providing-for-stabilization-prices-rents-wages-and-salaries\">EO 11615<\/a>, which created a Cost of Living Council that was delegated the powers and responsibility to &ldquo;stabilize prices, rents, wages, and salaries&rdquo; via the <a href=\"https:\/\/www.congress.gov\/91\/statute\/STATUTE-84\/STATUTE-84-Pg796.pdf\">Economic Stabilization Act of 1970<\/a>. In turn, the Council delegated implementation of their policy decisions to the Office of Emergency Preparedness.<\/p>\n<p>George A. Lincoln, director of the OEP, had been told the previous day (Saturday) to return to Washington D.C. in anticipation of an address by the President. On Tuesday, August 17th, the OEP was formally delegated powers. To manage the Wage-Price Freeze, it would (Yoshpe 1972, pg. 
24-25):<\/p>\n<ol>\n<li>Provide overall management and direction<\/li>\n<li>Develop, in conjunction with other agencies, critical policy questions and suggest answers to the Council<\/li>\n<li>Respond to national and local queries on the application and interpretation of policy<\/li>\n<li>Ensure and enforce compliance<\/li>\n<li>Receive and consider requests for exceptions and exemptions<\/li>\n<li>Keep the public informed<\/li>\n<li>Monitor the progress and maintain a full record<\/li>\n<li>Collaborate with other agencies in planning the post-freeze policy<\/li>\n<\/ol>\n<figure><img src=\"Inflation-Flow-of-Communication.png\"\n         alt=\"Flow of Communication during the Emergency. Source Yoshpe 1972, pg 62\"\/><figcaption>\n            <p>Flow of Communication during the Emergency. Source Yoshpe 1972, pg 62<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Although the OEP was experienced in managing and coordinating emergency responses, the office did not have any specific plans in place to manage an economic emergency of this nature. However, the OEP had an extensive list of subject matter experts and could pull personnel from other organizations. At the beginning of the Emergency, the national headquarters had 230 staff. By the end of the first week, staff increased to 381. By October, staff increased to 415 (ibid, pg 44). Staffing within OEP&rsquo;s regional offices similarly increased by 300 by October (ibid, pg 45). In three months, the OEP had doubled in size.<\/p>\n<p>Successful administration of the Emergency required rapid formulation of plans, clear and consistent communication, both internal and external, and coordination of a large, newly organized staff, many on temporary details. Director Lincoln was open to innovative approaches, particularly those which aligned with Nixon&rsquo;s wish to keep the temporary bureaucracy lean. 
Seizing an opportunity to test their computer conferencing ideas during a real emergency, his System Evaluation Division head, Robert Kupperman, suggested adapting the recently developed software for the emergency&rsquo;s needs. Lincoln approved.<\/p>\n<p>EMISARI was rapidly developed, with a prototype available within a week, and used to handle many communication and reporting needs of the Emergency.<\/p>\n<p>Ninety days later, on November 14, the Freeze was over and &ldquo;Phase II&rdquo; of the economic stabilization program began under the management of the Internal Revenue Service. The IRS adopted IRMIS, a slightly modified version of EMISARI, to administer their phase. EMISARI would continue to be used in later national emergencies handled by the OEP.<\/p>\n<p>Within this article, we are not concerned with evaluating the government&rsquo;s economic policies. Instead, we are concerned with computing&rsquo;s contribution to the administration of the Freeze and how these lessons might be applied to modern situations where schedule and resources are particularly tight.<\/p>\n<h2 id=\"emisari-design-and-development\">EMISARI Design and Development<\/h2>\n<h3 id=\"development-history\">Development History<\/h3>\n<p>With the Emergency to last 90 days, each week represented roughly 8% of the duration. George Lincoln was told by Kupperman that software to aid in administering the Emergency could be available in two weeks. After Lincoln approved, Kupperman told the team that the software was needed in a week (Hiltz Turoff 1978, pg 52).<\/p>\n<p>The first version of the software, dubbed the Emergency Party-Line, became operative during the second week of the Emergency (Renner 1973, pg 30). This version included \u201cagencies, contacts, messages, estimates, programs, update choices, and a description and explanation section\u201d and required 2,500 lines of XBASIC (ibid). 
In classic start-up fashion, Turoff coded the prototype by taking a terminal home and working continuously for four days (Hiltz Turoff 1978, pg 52). Turoff did have the advantage of access to the previous computer conferencing software developed by Thomas Hall and designed by himself, which at least provided reusable &ldquo;design principles and software techniques&rdquo; (Renner 1973, pg 3) if not reusable code.<\/p>\n<p>Two weeks after version one, the Emergency Party-Line was renamed EMISARI and the program gained Rulings, Exceptions\/Exemptions, and the Bulletin Board. \u201cIn three weeks of intensive work, over 12,000 lines of XBASIC code, including revised and rewritten sections, were produced by the three programmers.\u201d (ibid, pg 32) These features included a form of free-form text search. The total lines of code rose from 2,500 to 6,000 (ibid, pg 32).<\/p>\n<p>Timeline-wise, the PARTY-LINE software, which was an adjunct to EMISARI but independent in execution, was used in Lincoln&rsquo;s third weekly status meeting, so it must have been developed prior to the release of version 2 of EMISARI. Since PARTY-LINE lacks any dependencies on EMISARI data, it may have been developed as part of the 1970 computer conferencing experiment, although we have no evidence to support an earlier development.<\/p>\n<p>The rapid development of EMISARI took a toll and the next version required six weeks before being released. This version became the baseline for the IRS&rsquo;s IRMIS system and stood at 6,900 lines of code (ibid, pg 32).<\/p>\n<p>Functionally, the first version provided necessary features for users to report required data items (estimates and programs), identify who they were and how to contact others (agencies and contacts), provide context on their estimates and communicate with others (messages), and had a basic section for questions and answers. 
Smartly, the features necessary to record rulings, exceptions, and exemptions waited until the second version when the OEP started actively administering the Emergency. The third version was focused on resolving technical debt.<\/p>\n<p>Notably, the feature lists for the various versions do not include tables. Tables took estimates and programs and rendered them for print. We suspect tables were implemented on an &ldquo;on demand&rdquo; basis, rather than release-based, to align with emerging management needs.<\/p>\n<p>Timeline (all dates 1971):<\/p>\n<ul>\n<li>08-15: Nixon declares the Emergency as part of a national address.<\/li>\n<li>08-17: OEP delegated powers to administer the Emergency.<\/li>\n<li>08-22: Emergency Party-Line (EMISARI version 1) released sometime this week.<\/li>\n<li>09-05: EMISARI (version 2) released sometime this week.<\/li>\n<li>09-15: Weekly Reports to the Council switch to a computerized report (Wed-Tue reporting cycle) from previous manual compilation.<\/li>\n<li>09-25: EMISARI table outputs within the Weekly Report take their final form.<\/li>\n<li>10-10: EMISARI (version 3) released sometime this week.<\/li>\n<li>11-13: Final Weekly Report published.<\/li>\n<li>11-14: Phase 1 of the Emergency ends.<\/li>\n<\/ul>\n<h3 id=\"operating-environment\">Operating Environment<\/h3>\n<p>The OEP operated a Univac 1108 mainframe running Exec 8, which was Univac&rsquo;s first time-sharing, interactive operating system. The 1108 was a 36-bit system and used a 6-bit character encoding called Fieldata. Units could be equipped with 65,536 words to 262,144 words of memory. Based on comments about planned software improvements, the developers considered memory the most constrained resource.<\/p>\n<p>Users connected to the machine via terminals, most likely Teletype Model 37s or Execuports (OEP-SED, General Correspondence &amp; Directives, Coordination meeting with CLC Staff 1971-08-31). 
These terminals did not have a &ldquo;glass&rdquo; interface, so all interactions were printed to paper.<\/p>\n<figure><a href=\"https:\/\/commons.wikimedia.org\/wiki\/File:Teletype_Model_37.jpg\"><img src=\"Teletype_Model_37.jpg\"\n         alt=\"Teletype Model 37. Source MBlairMartin CC BY-SA 4.0\"\/><\/a><figcaption>\n            <p>Teletype Model 37. Source MBlairMartin CC BY-SA 4.0<\/p>\n        <\/figcaption>\n<\/figure>\n\n<h3 id=\"design-philosophy\">Design Philosophy<\/h3>\n<p>Although the development of EMISARI was obviously time and resource-constrained, the development team anticipated using the software for future emergencies and built the application without tying it specifically to Phase 1 requirements. To fulfill this broader vision, the team adopted these design philosophies as (Kupperman 1972) reports:<\/p>\n<ol>\n<li>An effective management information system requires communication upward, downward, and laterally. Without this, reporters will lack feedback on the quality and usefulness of their data.<\/li>\n<li>Requests of data must differentiate between different degrees of timeliness (from emergency, reply at once, to store data for later historical analysis) so reporters can effectively prioritize their work.<\/li>\n<li>Emergency responses must be flexible to changing user requirements and environmental changes.<\/li>\n<\/ol>\n<p>The first statement was achieved through the flexible interface between reporting and messaging. A reporting requirement, or Estimate, is delegated to a single reporter. The reporter could respond to the request by linking the Estimate to a Message, which could then be read by anyone. Estimates also stored the &ldquo;workflow&rdquo; status, so a reviewer could see when the respondent expected to report the value. If an Estimate seemed incorrect, a reviewer could attach a Message to that Estimate with their questions, or a reviewer could send a Message to the reporter directly. 
Using the Agency and Contacts table, anyone could search for other users.<\/p>\n<p>For the second philosophical statement, Estimates were tied to Programs. Programs could communicate expected deadlines and periodicity by the naming and design of the rows. Furthermore, Messages sent by the requester could alert reporters to the urgency of a request or tell them that a data value was no longer required.<\/p>\n<p>The team&rsquo;s choice of programming language and data model demonstrates how they planned to keep the application flexible. The team believed that XBASIC, as opposed to assembly, was more amenable to change. (Internally, there also seems to be a bias that XBASIC wasn&rsquo;t a real programming language and thus normal users could be trusted with writing programs in it, expanding the development base.) Architecturally, they also divided the program into highly focused sub-programs with limited interfaces to other sub-programs. This reduced the effort required to understand a program before modifying it.<\/p>\n<p>Furthermore, the Text Files (searchable lists) were the mechanism to record news, policy decisions, frequently asked questions, and such. These could be easily tailored to the particular emergency or left out (as NEWS was dropped for IRMIS). All users could read and respond to items, although certain files permitted only certain users to edit their contents. Although files could contain structured data (e.g. People), most were relatively flat. Thus, the files were understood through their title and how users came to use them, rather than requiring development effort to impose a certain use.<\/p>\n<h3 id=\"data-model\">Data Model<\/h3>\n<p>The EMISARI data model was intended to support future emergencies and thus be highly flexible. As shown in the figure below, there were four groupings of primary objects: users and groups (green), reporting (blue), commentary and messaging (purple), and searchable text (orange). Physically, the data was stored in data blocks 28 words long. 
Thomas Hall, who wrote the computer conferencing system Turoff used in 1970 and contributed during the Emergency, describes many of the software techniques necessary to work within the Univac 1108 in (Hall 1971).<\/p>\n<figure><img src=\"EMISARI-EPT-ERD.svg\"\n         alt=\"EMISARI Conceptual Entity Relationship Diagram\"\/><figcaption>\n            <p>EMISARI Conceptual Entity Relationship Diagram<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>(Renner 1973) is the primary source on the data model for EMISARI as it existed at the end of the Emergency, while (Hall 1971) provides insights into how it was likely physically implemented.<\/p>\n<h4 id=\"users-and-groups-contacts-and-agencies\">Users and Groups (Contacts and Agencies)<\/h4>\n<p>Contacts served as both a Rolodex to find individuals as well as representing users of the system. A contact was expected to be &ldquo;responsible for gathering or acting on information&rdquo; (Renner 1973, pg 9), so the entity was only necessary for adding or modifying data. Each contact had an access code, assigned by the monitor, which would permit the modification of data. Otherwise, users could view any data within the system.<\/p>\n<p>An Agency is a grouping of contacts. We do not find any indications agencies were used for permissions or served more than an informational role within the system.<\/p>\n<h4 id=\"reporting\">Reporting<\/h4>\n<p>The gathering and aggregation of data, based on shifting needs, was the core responsibility of EMISARI. Data collection was focused solely on numerical data.<\/p>\n<p>The Estimate was the fundamental entity for reporting. \u201cAn estimate is a specific item of data that a particular contact is responsible for reporting.\u201d (Renner 1973, pg 12) An example estimate would be the number of inquiries made in region 3 during the week of September 5th. 
Consisting of either a number or a code, Estimates also had an associated descriptive title, an internal label, designated owner, and a modification date. If a number was not available, respondents could use the following codes:<\/p>\n<ul>\n<li>&lsquo;U&rsquo; for unknown and will not be available for this report<\/li>\n<li>&lsquo;T&rsquo; (default), unknown but is expected to be provided<\/li>\n<li>&lsquo;X&rsquo; not required for the report<\/li>\n<\/ul>\n<p>A Program was a collection of Estimates, semantically like a row or vector. In most cases, a Program would be a time series representing the value per week.<\/p>\n<p>Tables consisted of one to five columns and one to 21 rows, along with associated titles for each. The data for a table would come from Programs, but could also consist of computed rows and columns, such as sums. Tables could also be derived from other Tables, for instance each OEP region could have a detailed table and then a national table could combine the regions. Tables could also be archived for historical data preservation.<\/p>\n<h4 id=\"commentary-and-messaging\">Commentary and Messaging<\/h4>\n<p>Fulfilling the design philosophy that a management information system should provide effective upward, downward, and lateral communication, the commentary and messaging data entities provided flexible means to communicate requests and context.<\/p>\n<p>A Message contained a title and approximately five lines of free-form text content. A Message could be sent from a Contact to another Contact, or from a Contact to everyone. (This feature was leveraged by the Monitor for administrative notifications.) Alternatively, a Message could be attached to an Estimate, Program, or Table. Similar to a comment in a spreadsheet, anyone looking at the reported data could examine the attached comments for additional context or explanation.<\/p>\n<p>A Letter stored up to 100 lines of text and was sent from one Contact to a group of Contacts. 
Letters could reference or transclude other items within the system. For example, the link <code>&amp;TABLE 20,30<\/code> would insert tables 20 and 30 into the Letter. Similarly, <code>&amp;NEWS ITEMS 53,54<\/code> would insert news items into the Letter. The transclusion was performed at render time so the viewer would always see the latest version (Renner 1973, pg 10-12).<\/p>\n<p>Both Messages and Letters were asynchronous styles of communication. Contacts would see new messages and letters when they opened the &ldquo;home&rdquo; screen. There were no notifications to interrupt their flow. Synchronous communication was supported via the PARTY-LINE program, which is discussed later.<\/p>\n<h4 id=\"searchable-text\">Searchable Text<\/h4>\n<p>EMISARI provided multiple &ldquo;text files&rdquo; to store content on various topics. The final list of topics was Bulletin Board, Policy and Guidance, Actions, News, Information, People, and Explanations (Renner 1973, pg 17-18). These files were used to report items, summarize relevant media reports, and document policies. The contents were widely used. For example, one of Lincoln&rsquo;s secretaries was abstracting articles from five newspapers and adding the summaries into the News file. When she was out sick one day, twenty calls came in complaining about the delay in updates (Hiltz Turoff 1978).<\/p>\n<p>Each file consisted of control blocks, which were small and could be contained in memory, and data blocks, which were separated into &ldquo;line units&rdquo;. Each line unit could contain up to six lines and a data block could contain one to five line units.<\/p>\n<p>EMISARI maintained a small, adaptive index for each file. The index stored a limited number of search terms and a list of matching entries. The index was updated as users made searches, with older and longer keys replaced by newer and shorter keys. Search used a substring prefix matching approach. 
(Renner 1973, pg 34)<\/p>\n<p>A typical use of this feature was to prepare for public meetings. A team member preparing to talk to an insurance consortium would search for &ldquo;insurance&rdquo; to pull the latest relevant policy decisions, inquiries, and news to that group (Kupperman 1972).<\/p>\n<p>(Kupperman 1972) states that the index terms were also examined by the team to determine what the public was interested in. If a search term was not finding relevant documents, then the team might invest in creating documents for that term. The index&rsquo;s ability to shrink queries to a common root helped identify the broadest query.<\/p>\n<h3 id=\"architecture\">Architecture<\/h3>\n<p>Architecturally, EMISARI was composed of multiple XBASIC programs. Programs read and wrote to files using locks for synchronization. Each program was focused on a specific set of functionality and could invoke another program via XBASIC&rsquo;s <code>chain<\/code> command. (This technique was not unique to XBASIC.) Scalar variables could be passed between programs to retain user context. (Renner 1973, pg 32) claims this design &ldquo;eases the coordination of the programming effort and limits the amount of computer resources [&hellip;]&rdquo;. 
Page 33 provides a breakdown of the lines of code for each section (below) which closely follows EMISARI&rsquo;s functional boundaries.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Section<\/th>\n<th style=\"text-align:right\">Lines of Code<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">INITIAL<\/td>\n<td style=\"text-align:right\">266<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Agency-Contact<\/td>\n<td style=\"text-align:right\">312<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Program-Estimate<\/td>\n<td style=\"text-align:right\">334<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Message-Letter<\/td>\n<td style=\"text-align:right\">309<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Table<\/td>\n<td style=\"text-align:right\">239<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">File Retrieval and Explanation<\/td>\n<td style=\"text-align:right\">491<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">People<\/td>\n<td style=\"text-align:right\">539<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Description<\/td>\n<td style=\"text-align:right\">292<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Update (Messages and Estimates)<\/td>\n<td style=\"text-align:right\">536<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Table Update<\/td>\n<td style=\"text-align:right\">259<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">File Update<\/td>\n<td style=\"text-align:right\">427<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>SUBTOTAL<\/strong><\/td>\n<td style=\"text-align:right\">4004<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Monitor Programs<\/td>\n<td style=\"text-align:right\">2290<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Special IRMIS Programs<\/td>\n<td style=\"text-align:right\">606<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>TOTAL<\/strong><\/td>\n<td style=\"text-align:right\">6900<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Notably, the programs with 
&ldquo;read-write&rdquo; capabilities are separate from the programs that merely &ldquo;read&rdquo; (e.g. Table vs. Table Update). Although we might see this separation as a security feature, similar to programs dropping <code>suid<\/code> privileges, (Hall 1971) presents the advantage as &ldquo;saving some amount of memory space&rdquo; by not having to load the file editing code.<\/p>\n<p>Given the limited source control tools available at the time, another advantage would have been to allow the developers to work on independent files concurrently, similar to microservices being independently deployable. Renner&rsquo;s claim that the design &ldquo;eases the coordination of the programming effort&rdquo; may have been referencing this aspect.<\/p>\n<h3 id=\"role-of-the-monitor\">Role of the Monitor<\/h3>\n<p>Delphi-style conferences appointed staff members, sometimes named monitors, to help guide and administer the proceedings. For instance, monitors were responsible for summarizing written responses and identifying conflicts within the responses and turning them into new questions.<\/p>\n<p>Within a computerized conference, the computer ideally took over many of the monitor tasks. Prior to the Emergency, the vision for the monitor&rsquo;s role was seemingly focused on system administration. The monitor had four roles (Hall 1971):<\/p>\n<ol>\n<li>Setting up a conference - establishing list of respondents (with their access codes), providing an abstract of the conference&rsquo;s purpose, and possibly initializing some of the data items<\/li>\n<li>Modifying the conference - cleaning and codifying data items, changing respondents as necessary, and broadcasting necessary messages to the participants<\/li>\n<li>Analyzing the conference - summarizing voting results<\/li>\n<li>Deleting the conference - cleaning up resources after the conference is complete<\/li>\n<\/ol>\n<p>The monitor&rsquo;s role broadened during the Emergency. 
Although the monitor was still involved in creating accounts and assigning access codes, they were also involved in training, community management, and influencing the product roadmap. (Renner 1973, pg 25-27) has a similar list of core responsibilities for the monitor as Hall but lists as additional monitor roles:<\/p>\n<ol>\n<li>Public relations - meeting with people about the system, generate interest in joining<\/li>\n<li>Anticipatory trouble shooter (e.g. conflict in data requirements)<\/li>\n<li>Participate in improving system design<\/li>\n<li>Improve \/ tailor training for specific user groups (e.g. new users vs. advanced users)<\/li>\n<\/ol>\n<p>This list of responsibilities presages later roles such as community relations manager or customer success. Perhaps due to the professional setting, the monitor is not ascribed any moderation responsibilities such as we see in a forum moderation role. The monitor was expected to edit posts made by others, but this is usually cast in terms of eliminating duplicate posts or reducing disk space by trimming obsolete information, not defusing flame wars. 
Murray Turoff was explicitly interested in the sociological impact of computer use and computer conferences, but EMISARI does not seem to have generated a community like the later PLATO system.<\/p>\n<h3 id=\"comparison-to-the-census-system\">Comparison to the Census System<\/h3>\n<p>As supplied in a list to the Cost of Living Council, the OEP used four management information systems during the emergency (OEP-SED, Weekly Report, 09-15):<\/p>\n<ol>\n<li>EMISARI<\/li>\n<li>Census<\/li>\n<li>Record of incoming mail to the national office and action taken<\/li>\n<li>Record of inquiries to the national office requiring further guidance<\/li>\n<\/ol>\n<p>Within the weekly reports, EMISARI produced Appendix A (&ldquo;Summary of Inquiries&rdquo;), while the Census system produced Appendix B (&ldquo;Census Bureau Alleged Violations&rdquo;) and Appendix C (&ldquo;Census Bureau Exceptions and Exemptions&rdquo;).<\/p>\n<p>In the Director&rsquo;s After Action Report (OEP-SED, <em>After Action Reports: Wage-Price Freeze<\/em> (Director\u2019s Memo 10\/27\/71), dated November 8, 1971), he praises EMISARI, stating that it &ldquo;achieved a level of reliability and completeness such that by mid-October it was accepted as the sole vehicle for statistical reporting within OEP&rdquo;. In contrast, he criticizes that the Census system &ldquo;could not be modified rapidly enough to accommodate changes such as the shift in compliance emphasis from complainant orientation to violator orientation.&rdquo; He attributes this rigidity to its programming in assembly language and &ldquo;complex and widely-distributed data input network&rdquo;.<\/p>\n<p>Additional evidence for EMISARI&rsquo;s reliability and the OEP&rsquo;s struggles with the Census system can be seen in the successive weekly reports. None of the weekly reports (there are nine from September 15th through November 13th) include any corrections to EMISARI&rsquo;s Appendix A outputs. 
The Census system&rsquo;s Appendix B outputs are similarly unmarked, but the introduction of Appendix C on October 9 notes that the current week totals are actually cumulative totals. Although this error was apparently corrected by the next week, from October 16th through November 13th Appendix C is marred by a note that the weekly sums are actually two-week sums.<\/p>\n<p>While the Census system was criticized, it was apparently necessary. In an October 26, 1971 memo to the IRS, who were to administer &ldquo;Phase 2&rdquo; of the Emergency, Kupperman describes the Census system as a necessary source for certain analytical data. Although necessary, the Census system was not sufficient as Kupperman also pushed the IRS to adopt an EMISARI-like system, termed IRMIS (Internal Revenue Management Information System), rather than developing a new system from scratch. (The IRS adopted IRMIS.)<\/p>\n<h2 id=\"party-line-design-and-development\">PARTY-LINE Design and Development<\/h2>\n<p>Termed an &ldquo;adjunct&rdquo; program to EMISARI, PARTY-LINE is a group-oriented chat program. At a designated time, all participants would log in to the Univac 1108 and invoke the program. Within the user interface during a chat, participants would either be writing a message or reading and retrieving the latest messages. Internally, the program appended messages to a sequential list stored in common storage (using locks for synchronization) which would be discarded at the end of the conference. While seemingly ephemeral, given the nature of the terminals, all participants would have a hard copy.<\/p>\n<p>PARTY-LINE supported 15 concurrent participants, although Turoff claimed this limit was arbitrary and the program could support 30 to 50 individuals. (The number of phone lines connected to the mainframe would have been a constraint.) During the Emergency, 15 participants were sufficient for meetings with the ten regional offices and the national headquarters staff. 
The (Turoff 1972) paper, which includes an example post-Emergency conversation using the program, implies that the program supported multiple numbered groups at the same time. Participants could join an existing group or create a new group. In contrast, PLATO&rsquo;s Talk-o-matic, CompuServe&rsquo;s CB Simulator, and BITNET Relay all had a fixed number of channels. Additionally, at least by 1972, the program supported confidential participant-to-participant messages.<\/p>\n<p>Notably, participants could be pseudonymous within the chat, choosing a name separate from their login name. This feature was used early in the Emergency. As reported by Turoff (Hiltz Turoff 1978, pg 63), there was a regular meeting of the 10 regional offices and the national office. During the first two weeks, the regional offices repeatedly communicated that all was well. For the third week, they were ordered to use PARTY-LINE but connect to the chat using a fake name. This time, over the next three hours, the regional offices communicated many of the challenges and issues they were facing, finally providing transparency to management.<\/p>\n<p>This anecdote aligns with Turoff&rsquo;s theories that features of computer conferencing can shape and <em>improve<\/em> human communication, rather than merely providing another channel.<\/p>\n<p>In a memo dated October 5th to Turoff, Richard H. Wilcox suggested some conventions for using PARTY-LINE such that conferences \u201cdo not degenerate into chaotic collections of uncoordinated input\u201d. His suggestions:<\/p>\n<ul>\n<li>Participants log-in at most 5 minutes prior to scheduled start<\/li>\n<li>Organizer arrives early and broadcasts the meeting\u2019s purpose and rules<\/li>\n<li>Monitor initiates the first question and allows only a certain amount of time for an answer. 
Further questions come every 2-3 minutes.<\/li>\n<li>Monitor asks a general question before conference ends<\/li>\n<\/ul>\n<p>Turoff claims PARTY-LINE was coded in a single day by Thomas Hall for the Freeze (Turoff 1972), but by the 1972 paper an additional two man months of development had been applied to OEP&rsquo;s computer conferencing systems (which includes PARTY-LINE and DISCUSSION, a forum-system). PARTY-LINE, like EMISARI, was developed in XBASIC. The paper states that the &ldquo;program&rdquo; is about 350 lines in length, but we are unclear if the author was referencing PARTY-LINE, DISCUSSION, or both.<\/p>\n<h2 id=\"organization\">Organization<\/h2>\n<p>(Hiltz Turoff 1978, pg 47) lists the principal contributors to EMISARI and their roles and affiliations:<\/p>\n<ul>\n<li>Supervisory Member of Executive Staff: Robert Kupperman<\/li>\n<li>Project Manager, System Architecture: Richard Wilcox<\/li>\n<li>System Designer, Programming Manager: Murray Turoff<\/li>\n<li>Operations Manager: Nancy Goldstein<\/li>\n<li>News File Manager: A. Lee Canfield<\/li>\n<li>Exceptions\/Exemptions File Manager: Albert Martin (Naval Research Lab.)<\/li>\n<\/ul>\n<p>Programming Consultants:<\/p>\n<ul>\n<li>Thomas Hall (Language and Systems Development, Inc.)<\/li>\n<li>George Olmacker (Language and Systems Development, Inc.)<\/li>\n<li>Ruth Anderson (National Bureau of Standards)<\/li>\n<\/ul>\n<p>Programmers (OEP Mathematics and Computation Laboratory):<\/p>\n<ul>\n<li>Robert Renner<\/li>\n<li>Ronald L. Wynn<\/li>\n<li>David Marbray<\/li>\n<li>Robert Bechtold<\/li>\n<li>Charles Clark<\/li>\n<\/ul>\n<p>(Affiliations are with OEP unless noted otherwise)<\/p>\n<h2 id=\"after-emergency-phase-1\">After Emergency Phase 1<\/h2>\n<p>When the IRS took over &ldquo;Phase 2&rdquo; of the Emergency, they adopted IRMIS, a slightly modified version of EMISARI, to fulfill a similar role. 
Since the IRS&rsquo;s computers were IBM models and thus could not run EMISARI directly, the IRS team continued to use OEP&rsquo;s 1108. Kupperman pushed for the IRS to take this route, arguing that rewriting the system would take six to eight months (OEP-SED, Post-freeze Information and Reporting System for IRS, October 26, 1971). During (at least) the first week of Phase 2, the IRS had operational issues with IRMIS, which required Kupperman to defend the software and the (unplanned and unreimbursed) support from OEP&rsquo;s Mathematics and Computation Laboratory (OEP-SED, MCL Support to IRS, 1971-11-30).<\/p>\n<p>The OEP was dissolved in 1973 and re-organized under the General Services Administration. Kupperman, who had long been a champion of EMISARI, left the organization. Murray Turoff, Richard Wilcox, and Nancy Goldstein soon followed.<\/p>\n<p>EMISARI continued to be used in national emergencies (e.g. 1973&rsquo;s Voluntary Petroleum Allocation Program and 1974&rsquo;s National Trucker&rsquo;s Strike) but saw little investment and was dead by 1975 (Subramanian 2012). The adjunct portions of EMISARI, as well as the pre-EMISARI computer conferencing system, saw continued use in some university computer services departments (ibid).<\/p>\n<p>Language and Systems Development made an effort to sell PARTY-LINE and FORUM commercially as part of their XBASIC language (Leavitt 1972). They continued to support XBASIC for Univac 1100 machines until at least 1980 when they released an <a href=\"https:\/\/archive.org\/details\/sim_computerworld_1980-02-11_14_6\/mode\/2up\">ASCII compatible version<\/a>.<\/p>\n<h2 id=\"was-emisari-effective-impactful\">Was EMISARI effective? 
Impactful?<\/h2>\n<p>Did EMISARI contribute to Phase 1&rsquo;s success?<\/p>\n<p>While the official history of the Emergency is generally positive about EMISARI&rsquo;s role, its summary of the impact is underwhelming: &ldquo;EMISARI was used extensively by those who availed themselves of its special capabilities, but was not widely adopted as a person-to-person communication system.&rdquo; (Yoshpe 1972, pg. 65). The criticism may be unfair in that, although additional terminals were installed to support the Emergency and the Council Staff were allocated a portable Execuport terminal (OEP-SED, Coordination meeting with CLC Staff (08-31)), most line workers would have shared a terminal rather than having a personal machine to themselves. Furthermore, there was cultural resistance to using a computer directly as typing was seen as a secretarial job.<\/p>\n<p>During the Emergency, there was concern that EMISARI would be considered unused. (Hiltz Turoff 1978, pg 59) quotes Richard Wilcox: \u201cFortunately, Murray and his troops, about a third of the way through the freeze, had put in a monitor, which showed how many times different parts of the system had been used. This helped to refute the charges that nobody had used the system, and helped in its continued life&hellip;\u201d.<\/p>\n<p>These metrics are reprinted in (Kupperman 1972, Table 1). The table shows user accesses to various program sections over two periods. We are cautious about interpreting the data, as the table&rsquo;s title states that the timeline is from September 2, 1971 to November 8, 1971, while the column headers say the timeline starts on September 27th. Overall, the table shows 29,090 items retrieved during the time period, with the Bulletin Board file being the most active with 9,672 items retrieved. Since printouts could be shared, the usage data may understate consumption because a team might have a single user perform all the searches and share the printouts with the group. 
The usage data is somewhat corroborated by a billing report on use of the 1108 which shows a dramatic increase in CPU seconds consumed starting with the Emergency (OEP-SED).<\/p>\n<p>To the broader government, though, success was not measured by the usage of some internal software tool, but by OEP&rsquo;s administration of the Emergency. (The Freeze itself was considered successful due to improving measures of inflation.) Lincoln said consistency in policy application was a necessary and important factor in their success, as fulfilled by an efficient communication network and close-knit operations between the headquarters and field staff (Yoshpe 1972, pg 203). Although the first week of the Freeze was characterized as chaotic with the field offices complaining that headquarters was failing to provide guidance, conditions quickly improved. The timeline, at least, correlates with the introduction of EMISARI and its functionality for reporting and aligning policy. However, management used more tools than just EMISARI.<\/p>\n<p>OEP&rsquo;s continued use of EMISARI for the next few years suggests it was useful and flexible for various types of emergencies, although management may have mandated its use over user feedback. Design decisions, like the use of the XBASIC language, which was proprietary to a vendor, as well as implementation decisions that made it likely difficult to port to non-36-bit systems, worked against a longer life. Furthermore, although there were attempts to regularize use of EMISARI, it was only expected to be used in emergencies, so it lacked the &ldquo;stickiness&rdquo; that software used daily possesses. Given repeated bureaucratic reorganizations and changes to how emergency management was envisioned, we would not expect any asset to last long.<\/p>\n<p>Citation-wise, the academic community largely ignored the small web of EMISARI-related papers. As an &ldquo;application,&rdquo; we think this is expected as &ldquo;theory&rdquo; is easier to build upon. 
Turoff&rsquo;s later book, <em>The Network Nation: Human Communication via Computer<\/em>, co-authored with his wife, Starr Hiltz, was far more influential.<\/p>\n<p>Like many pioneering software systems, EMISARI might have been too early to be well-remembered. Economically, terminals were too expensive to be commonplace and, at 30 characters per second, remote interfaces were slow and awkward. Within the next decade, many of the features would be implemented and scaled within academic and commercial networks.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>The OEP used the PARTY-LINE and EMISARI software to effectively administer the 90-day Phase 1 Wage Price Freeze Emergency. Functionally, the software performed roles similar to those of a modern knowledge base, forum, online chat, and multi-user spreadsheet. Development was rapid (a minimally viable version within one calendar week) and used a small, motivated team. The user base was distributed across eleven sites and there were about a thousand potential users, although actual concurrent users were far fewer.<\/p>\n<p>We believe there are four lessons generalizable to other teams facing similar challenges:<\/p>\n<h3 id=\"schedule-matters-more-than-performance\">Schedule matters more than performance<\/h3>\n<p>The team used XBASIC because it provided a means to create interactive programs in less time than in assembly. Although Turoff was experienced in multiple programming languages and could have programmed the tool in assembly, programming in XBASIC was faster and operations were &ldquo;fast enough&rdquo; on the 1108. He also could reuse code from the earlier Delphi experiment which was also written in XBASIC. 
Although the team mentions a desire to port the program into assembly to reduce memory usage, memory constraints never seem to have blocked operations.<\/p>\n<h3 id=\"broaden-value-generation\">Broaden Value Generation<\/h3>\n<p>The reporting philosophy of encouraging upwards, downwards, and lateral communication created a community effect where all users saw value in the system. Individuals were not siloed, but could view any post, see what others were reporting and how information was aggregated, and communicate with anyone. Had EMISARI consisted solely of the reporting functions without the forums and messaging, the low-level staff would have seen EMISARI as a data entry chore. Everyone benefited from aligning the distributed team.<\/p>\n<h3 id=\"management-has-levers\">Management Has Levers<\/h3>\n<p>Management made a few decisions to promote success. First, Turoff was given the job to write the first version. Turoff was heavily invested in making the software a success and had the necessary skills and background. The official development team in the company might have had more skillful developers, but they were unlikely to volunteer to spend four days manically writing code to meet the schedule.<\/p>\n<p>Secondly, management chose to &ldquo;eat their own dog food.&rdquo; Director Lincoln could have chosen to continue to receive news updates directly from his secretary, but by having the secretary type the updates into EMISARI, others could see value from that work and it demonstrated he was willing to log in and do work within EMISARI. Further, using PARTY-LINE for the staff meeting showed commitment to trying new things rather than continuing to use the telephone. The fact that the experiment also broke down a wall was pure bonus.<\/p>\n<h3 id=\"track-usage\">Track Usage<\/h3>\n<p>The OEP staff did not all agree that using and investing in EMISARI was worthwhile. 
Opponents charged that nobody was using it, but the inclusion of usage metrics gave the team the necessary ammunition to refute these charges. Novel programs should, at least, provide a way to track usage if not value generated.<\/p>\n<h2 id=\"acknowledgments\">Acknowledgments<\/h2>\n<p>I&rsquo;d like to thank the archivists at the National Archives in College Park, Maryland for their assistance in researching OEP records.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Hall 1971) Hall, Thomas W. 1971. \u201cImplementation of an Interactive Conference System.\u201d Proceedings of the May 18-20, 1971, Spring Joint Computer Conference, 217\u201329. <a href=\"https:\/\/doi.org\/10.1145\/1478786.1478818\">https:\/\/doi.org\/10.1145\/1478786.1478818<\/a>.<\/p>\n<p>(Hiltz Turoff 1978) Hiltz, Starr Roxanne, and Murray Turoff. 1978. The Network Nation: Human Communication via Computer. Addison-Wesley Publishing Company, Inc. <a href=\"https:\/\/archive.org\/details\/networknationhum00hiltrich\">https:\/\/archive.org\/details\/networknationhum00hiltrich<\/a>.<\/p>\n<p>(Kupperman 1972) Kupperman, Robert H, and Richard H Wilcox. 1972. \u201cEMISARI - An On-Line Management System in a Dynamic Environment.\u201d The First International Conference on Computer Communication, 117\u201320. <a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102714088\">https:\/\/www.computerhistory.org\/collections\/catalog\/102714088<\/a>.<\/p>\n<p>(Leavitt 1972) Leavitt, Dan. 1972. \u201c\u2018XBASIC\u2019 Backs Delphi Conferencing.\u201d Computerworld, January 26, 17.<\/p>\n<p>(OEP-SED) Record Group 396, Subject Files Relating to Science and Technology [Office Of Science And Technology Subject Files, 1967-73]<\/p>\n<p>(Renner 1973) Renner, Rod L, Robert M. Bechtold, Charles W. Clark, David O. Marbray, Ronald L. Wynn, and Nancy H. Goldstein. 1973. 
\u201cEMISARI: A Management Information System Designed to Aid and Involve People.\u201d Paper presented at Fourth International Symposium on Computers and Information Science (COINS-72). February. <a href=\"https:\/\/archives.njit.edu\/vhlib\/cccc-materials\/njit-cccc-tm-230\/njit-cccc-tm-230.pdf\">https:\/\/archives.njit.edu\/vhlib\/cccc-materials\/njit-cccc-tm-230\/njit-cccc-tm-230.pdf<\/a>.<\/p>\n<p>(Subramanian 2012) Subramanian, Ramesh. 2012. \u201cMurray Turoff: Father of Computer Conferencing.\u201d IEEE Annals of the History of Computing 34 (1): 92\u201398. <a href=\"https:\/\/doi.org\/10.1109\/MAHC.2012.12\">https:\/\/doi.org\/10.1109\/MAHC.2012.12<\/a>.<\/p>\n<p>(Turoff 1972) Turoff, Murray. 1972. \u201c\u2018Party-Line\u2019 and \u2018Discussion\u2019 Computerized Conference Systems.\u201d The First International Conference on Computer Communication, 161\u201371. <a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102714088\">https:\/\/www.computerhistory.org\/collections\/catalog\/102714088<\/a>.<\/p>\n<p>(Yoshpe 1972) Yoshpe, Harry B., John F. Allums, Joseph E. Russell, and Barbara A. Atkin. 1972. Stemming Inflation\u202f: The Office of Emergency Preparedness and the 90-Day Freeze. U.S. Government Printing Office. <a href=\"https:\/\/fraser.stlouisfed.org\/title\/198\">https:\/\/fraser.stlouisfed.org\/title\/198<\/a>.<\/p>"},{"title":"Interview with John Bridges (PCPaint, GRASP, GL Pro)","link":"https:\/\/ztoz.blog\/posts\/interview-john-bridges\/","pubDate":"Fri, 11 Jul 2025 10:52:00 -0600","guid":"https:\/\/ztoz.blog\/posts\/interview-john-bridges\/","description":"<p>John Bridges is a software engineer that co-developed PCPaint, which in 1984 was one of the earliest digital painting programs and helped spur the use of mice within the IBM PC market. 
He supported the growing number of PC video developers by freely releasing the VGAKIT and TGA utilities, tools that helped developers understand and compensate for the many hardware and software quirks of the era. He was also on the forefront of presentation software, developing GRASP, GL Pro, and AfterGrasp, which supported the commercial development of interactive games, screen savers, and demos of graphical algorithms.<\/p>\n<p>In this interview, we seek to understand the engineering craft during the pioneering days of the 1980s and 1990s, how Bridges approached design and implementation choices, and the impact of a programmer outside academia and the big technology firms.<\/p>\n<figure><a href=\"https:\/\/archive.org\/details\/pc-paint-plus-version-2.0\"><img src=\"pcpaint_2_emulator_screenshot.png\"\n         alt=\"screenshot of PCPaint 2 w\/ palettes of tools and settings around a blank content area\"\/><\/a><figcaption>\n            <p>PCPaint Plus 2.0 (Credit archive.org)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<h2 id=\"q-what-led-you-to-software-development-what-was-your-first-computer\">Q: What led you to software development? What was your first computer?<\/h2>\n<p>I was excited about computers from an early age, but didn&rsquo;t get to touch one until my father brought home a new Texas Instruments TI-59 programmable calculator with printer when I was 13. It was programmed in something like a simple floating point assembly language, and I really enjoyed exploring its limits. I had an interest in prime numbers then, and tried some experiments of calculating as many as possible in the memory by packing multiple integer values in each floating point variable. Over the next few years I had access to an IBM 1130 programming Fortran at Stuyvesant High School, and a Tektronix 4052 at an internship at NYU Medical Center. 
My first &ldquo;real&rdquo; computer I owned was an Apple II+ computer with floppy drive where I started programming in Basic and Assembly language. I pored over a bound printout I made of the infamous DOS 3.3 Disassembly to figure out how to speed up reading by 500% and wrote an article in Hardcore magazine entitled <a href=\"https:\/\/archive.org\/details\/hardcore-computing-3\">HyperDOS<\/a>.<\/p>\n<p>My first commercial software was the RAM Disk and Task Swapping\/Hacking utilities included with the Know-Drive, made by Abacus Enterprises. I didn&rsquo;t get to make a lot of money, but it was enormous fun!  That&rsquo;s eventually how I did the screen captures of the Apple MousePaint program that we used to mock up the first PCPaint screens.<\/p>\n<p>My first real job programming was the summer after high school at CCM (Classroom Consortia Media), a small company founded by teachers to make educational software. I was recommended by my 8th grade math teacher whose son attended the same high school as me, and had told his father about my computer prowess.<\/p>\n<p>At first it was on Monroe color computers where I mostly wrote a drawing program that became the early version of what would become SuperDraw. CCM then made a deal with IBM to produce educational science software as part of their efforts to get IBM PCs into schools. IBM provided a bunch of PCs, and I soon got an early IBM XT to take home so I could work from both home and the office. At CCM we decided to go with C, which I knew almost nothing about. My first tasks were to create all the tools we would need, including graphics routines, utilities to create image libraries, and a drawing program to create\/edit images which eventually became a commercial product as &ldquo;SuperDraw&rdquo;.<\/p>\n<p>Since I was already comfortable with assembly language it was easy for me to pick up x86 assembly on the PC. 
Because I knew nothing about C or the terminology of C programming (like pointers\/structures\/indirection\/parameter passing), I often fell back on looking at a disassembly of the compiled C code to see what was really going on. That allowed me to directly translate my ideas of how the computer worked into higher level C constructs, and interface my assembly routines with the C calls.<\/p>\n<p>It also meant a lot of my early C code tended to have a lot of global variables, a habit that took me many years to break! Because of a fondness towards Forth, and the whole philosophy of breaking code into small functions, I developed a style of using tons of tiny little functions. Whenever I started to see code repeating I&rsquo;d break it out into a generalized function, and nest this sort of process. It often led to major refactoring, but I found it easier to try to isolate logic as much as possible.<\/p>\n<p>As for which compiler and assembler, we initially used the Computer Innovations C compiler for DOS, then eventually switched to Manx Aztec C, then quite a few years after that to Microsoft C++ when Manx faded away. For assembly I used Microsoft MASM until I moved to inline assembly as soon as that was available. When we were porting our educational software to Apple II computers we also used the Aztec 6502 C compiler. The relationship with Manx developed to the point that they were selling SuperDraw for us for a while.<\/p>\n<h2 id=\"q-one-of-your-earliest-commercial-projects-c-1983-was-a-graphics-library-for-classroom-consortia-media-ccm-who-was-working-with-ibm-to-develop-educational-software-what-challenges-did-you-face-how-did-ibm-contribute\">Q: One of your earliest commercial projects (c. 1983) was a graphics library for Classroom Consortia Media (CCM), which was working with IBM to develop educational software. What challenges did you face? How did IBM contribute?<\/h2>\n<p>In the early days I didn&rsquo;t deal with IBM. 
It was all closed-door meetings and dog-and-pony shows.<\/p>\n<p>Our company had already shown early versions of the educational software we had made on Monroe color computers, so IBM wanted that on PCs so they could sell IBM PCs to schools. At some point I was in a meeting at IBM discussing what our ideal machine would be for educational software. I pushed hard for 1 byte per pixel so we could get a lot more colors, and much simpler graphics programming. 320x200 was a practical limit at 1 byte per pixel because the CPU could only address 64 KB at a time.<\/p>\n<p>Months after that meeting we got early access to a prototype PCjr, and although the graphics were a WONDERFUL step up, the rest of it was quite disappointing. It was like they had intentionally crippled the machine, and it was a flop. CCM had rapidly expanded, moving to two floors of a larger building across the street, and hiring more management, office staff, and developers. The PCjr flop further increased financial problems. IBM couldn&rsquo;t entirely fund CCM. We were expecting software sales, but the PC and PCjr never became popular for schools, particularly elementary and middle schools, which were our focus. CCM went through multiple layoffs and reorganizations until they moved to a much smaller office with only a handful of people in 1987.<\/p>\n<p>At that time I made an offer to IBM to show off their replacement for the PCjr, the IBM Model 30. I had a proposal to do motion video, one quarter screen, unheard of then on a slow 8 MHz PC. Unfortunately we couldn&rsquo;t use IBM hardware to make this demo because IBM had tried to push the entire industry away from the standard PC platform to the new PS\/2 platform using a new interface slot. The problem was that no video capture card of the kind we needed would work in any IBM machine. So IBM paid for me to purchase a Compaq 386\/20 PC, and an AT&amp;T Targa16 video capture card. 
They also paid for an industrial controlled laserdisc player, and for having the demo video pressed onto a laserdisc. It had to be laserdisc because the Targa16 could only capture still frames, and laserdisc was the only way to get single step random access to high quality video frames.<\/p>\n<p>The biggest challenge of this project was basically inventing all the software\/algorithms from scratch. This was before the internet and easy research, so I basically made it all up. To scale hi color images down, I would combine pixels with percentages of other pixels; to scale up, I&rsquo;d use percentages of nearby pixels to make new pixels. That became the TGASCALE utility. It wasn&rsquo;t ideal for scaling up, but worked nicely for scaling down. For reducing 15-bit hi color down to palette images with 256 or fewer colors, I started with simple popularity, building a table of all colors used, sorting by how often each was used, then picking the most popular for the palette. This worked fine for large smooth gradients, but left details with no decent color choices. I started combining colors, so very similar colors would be replaced with the most popular until enough color details could be rendered. I mostly came up with optimal default values for combine level vs total colors by using a set of test images and my eyes. For example, I had some shots of Johnny Carson on The Tonight Show, which were reminiscent of the educational shots from IBM. All that color reduction code became the TGAPIC utility.<\/p>\n<p>To reduce the huge amount of data required for video, I created some differential image compression code, where I stored only the pixels that changed between frames, using run-length compression for the empty parts. For lots of complex parts without long skip runs, I used a bitmask to define which pixels changed. Some of this code eventually became the DFF animation format used in GRASP. 
Here is a letter I wrote to Paul Mace Software about the <a href=\"https:\/\/www.fileformat.info\/format\/grasp\/spec\/681209269aa74f7d90e826987f73ec75\/text.htm\">DFF format<\/a>.<\/p>\n<p>The playback software also ran into issues with the slow speed of palette and video memory writes, so I couldn&rsquo;t change palettes without extreme color flash. My workaround was to only use 120 colors for the video, reserving 8 colors for text and surround. Then I&rsquo;d alternate using the top 128 vs the bottom 128 colors so I&rsquo;d never change the palette for any visible colors. All this got used later for the National Geographic Mammals CD-ROM.<\/p>\n<h2 id=\"q-in-the-spring-of-1984-you-and-doug-wolfram-develop-pcpaint-which-would-be-distributed-by-mouse-systems-with-their-mice-at-the-time-you-lived-in-new-york-and-doug-in-california-how-did-you-handle-geographically-distributed-development\">Q: In the spring of 1984, you and Doug Wolfram developed PCPaint, which would be distributed by Mouse Systems with their mice. At the time, you lived in New York and Doug in California. How did you handle geographically distributed development?<\/h2>\n<p>Doug ran an Apple II based BBS which I was a member of at the time. It did not support files, only messages. Unfortunately we have no kind of archive of that BBS. During development we mostly communicated by telephone.<\/p>\n<p>This was before we had internet access, or CompuServe. So we generally sent files back and forth via modem. I used a Hayes 1200 baud Smartmodem with terminal programs like Qmodem.<\/p>\n<p>It was too painful moving files between Apple and IBM at that time, so I generally did PCPaint work and communications on the PC. Doug does not recall having a hard drive on his PC during PCPaint 1.0, so I mostly sent him final builds since compiling on floppies was grim. 
We had a friend do the fonts, and some of the graphics.<\/p>\n<p>PCPaint 1.5 was done the same way with me working on Staten Island at my parents&rsquo; house, and Doug in his home on Serang Place in Costa Mesa, California.<\/p>\n<p>In the fall of 1985 PCPaint 2.0 was running late. To speed up final development, the agreement with Mouse Systems was that I would go work onsite at Mouse Systems if they covered costs. I ended up working at the Mouse Systems offices every day for over a month, taking long drives at night in a light blue rental Mercedes convertible. Great car! Sadly I had to give back the Mercedes because they had it reserved for James Belushi, and I had long overrun the original rental agreement. I ended up with an awful Nissan nx200 that overheated. I drove it down to Doug&rsquo;s house in Southern California when PCPaint 2.0 shipped.<\/p>\n<p>That development became such a grind that PCPaint 2.0 was probably the most bug-free commercial software I&rsquo;ve ever worked on. Such a grind that one of the testers died shortly after completion!<\/p>\n<p>A year later, I bought Doug&rsquo;s house on Serang Place, and he moved half an hour further south. I wasted a hunk of the PCPaint money on a new car. Was tempted by a Mercedes convertible, but interior space was too limited. I went with a white Porsche 928S4, which I foolishly crashed driving too fast in an empty damp part of Pacific Coast Highway after a few months.<\/p>\n<p>Glad I wasn&rsquo;t hurt.<\/p>\n<p>Once I had moved into Costa Mesa, I started a GRASP support BBS running on PCBoard, and used that for sending files. That was all dial-up modems. That BBS eventually ran on the Compaq 386\/20 IBM paid a fortune for! I didn&rsquo;t get broadband of any kind until the late 1990s when Pacific Bell offered IDSL, which was 128 kbps for $110 a month. I stayed in that house over 24 years, including after I met my wife on Match.com. This was before smartphones or &ldquo;swipe right&rdquo;. 
It was more like old fashioned matchmaking by correspondence mail. We left shortly after our first daughter was born for rural New England, and the people who purchased the house tore it down except one tiny corner.<\/p>\n<p>Most upsetting, I accidentally left behind my last Apple II computer with all the accessories. It was in the corner of the guest closet, and we somehow missed it as we frantically filled and closed boxes as the movers carried them away.<\/p>\n<p>On the same evening after the movers left we boxed up our last clothes and medical records, shipped them via the post office to ourselves. Except my wife was exhausted, and misunderstood where we were shipping the box. She shipped it to ourselves at our defunct California address where it sadly sat outside for days until some random person took it.<\/p>\n<p>My advice for moving out of a house after 24 years is rent a dumpster, and be brutal. We didn&rsquo;t, and I still have untouched old PCs we paid a fortune to move.<\/p>\n<h2 id=\"q-you-used-the-c86-c-compiler-from-computer-innovations-for-pcpaint-why-did-you-choose-c-how-did-you-find-and-select-your-programming-tools-in-the-1980s\">Q: You used the C86 C compiler from Computer Innovations for PCPaint. Why did you choose C? How did you find and select your programming tools in the 1980s?<\/h2>\n<p>Choosing C was a joint decision at Classroom Consortia Media. We had a manager who had experience with Bell Labs and recommended C. I remember preferring Basic at first because our developers were all familiar with Basic, but when we realized the poor state of Basic on IBM PCs we went with C. I was already very familiar with assembly language on several different CPUs, so 8086 Assembly was easy for me, and since C is really like a higher level 16bit assembly language I picked it up quickly.<\/p>\n<p>Why Computer Innovations? In 1983, it was the only reliable C compiler we found! 
I actually ran into a couple of bugs in the compiler, but the main limits were the lack of inline assembly and of support for other CPUs. We eventually needed to compile for 6502 to produce Apple II educational software. With Manx Aztec C, I could port over my graphics library and all the command line tools. We had so many problems doing development using Apple II computers with unreliable hard drives and quirky software, that we eventually made the Apple II only a target. So we&rsquo;d do all the development and graphics on PCs, and then just do final visual testing on Apple II machines. I remember creating a special scaled mode in SuperDraw to simulate the strange aspect ratio and resolution of Apple graphics.<\/p>\n<p>I don&rsquo;t have exact records of which compilers were used for which versions of PCPaint. I know I switched to Microsoft C 6.0 exclusively around 1990 since that&rsquo;s what was used for all the example code I made public like VGAKIT. As for other programming tools: In the early days I used WordStar for code editing under CP\/M, even using a Z80 card on my Apple II, then the DOS version. I stuck with the old version, avoiding the newer WordStar versions because I was comfortable with the keyboard layout and program design. I then switched to Brief for editing around 1986. I used a modified keyboard layout simulating the WordStar keys. 
For graphics editing, measuring\/scaling, I mostly used my own tools including SuperDraw.<\/p>\n<h2 id=\"q-to-develop-pcpaint-you-and-doug-would-need-to-understand-how-to-interface-with-a-dozen-or-so-video-modes-read-data-from-the-mouse-and-implement-graphical-primitives-such-as-curves-flood-fills-and-magnification-of-images-how-did-you-learn-or-develop-these-skills-and-stay-current-in-the-industry-for-instance-i-count-17-pc-compatible-video-hardware-standards-introduced-between-1981-and-1991\">Q: To develop PCPaint, you and Doug needed to understand how to interface with a dozen or so video modes, read data from the mouse, and implement graphical primitives such as curves, flood fills, and magnification of images. How did you learn or develop these skills and stay current in the industry? For instance, I count 17 PC-compatible video hardware standards introduced between 1981 and 1991.<\/h2>\n<p>The first PCPaint was CGA graphics only, no unusual video modes, and those modes could be selected with BIOS calls.<\/p>\n<p>I don&rsquo;t believe Doug had much to do with PCPaint after the first couple of versions except delivering code to Mouse Systems, and handling the contracts. Doug was busy with his new company &ldquo;GRAFX Group&rdquo; which used GRASP to produce multimedia projects since 1984.<\/p>\n<p>DOS based graphics, even including SVGA graphics in high resolution in the early 1990s, used NO hardware assist. All the graphics were drawn by writing to a frame buffer.<\/p>\n<p>I believe the first hardware assist SVGA was from S3 in 1991, and was only for Windows acceleration.<\/p>\n<p>To figure out the layout of the frame buffer, and how to program the chipset into different modes, I made contact with the largest chip makers and built up an ongoing relationship with companies like Tseng Labs, Hercules, Chips &amp; Tech, ATI, Genoa, NCR, Compaq, Cirrus, Everex, Trident and others. 
I&rsquo;d meet them at Comdex, and make sure they sent me test hardware and documentation.<\/p>\n<p>This was only for a few years because they finally came together as VESA to create the standardized VESA BIOS Extensions (VBE) to detect, switch video modes, and provide details on the frame buffer. All this code was made public in VGAKIT, so the chip makers were generally enthusiastic about getting my support.<\/p>\n<p>For the other video modes in PCPaint, I just created sets of icons\/graphics\/borders for each major resolution, like 640x480, 800x600, and so on. For the mouse, I used the standard Mouse Systems and Microsoft DOS mouse drivers to talk to the mouse. For tablets I had to talk to the tablets directly via a serial port because there was no standard at that time for tablet drivers.<\/p>\n<p>As for all the graphics primitives, I wrote them myself pretty much making it up as I went along, and sometimes my work was not ideal. For example, my early circle\/ellipse code was kind of crude with rough edges. For animation in video modes where pixels were not byte aligned, like 2 color, 4 color, and 16 color, I would create routines to pre-shift the graphics into each byte alignment. That way I only had to mask the edges, and then copy bytes without doing all the shifting in real time (very slow).<\/p>\n<p>I remember struggling with the performance of my area fill. I made up several algorithms until I had one fast enough that I was satisfied. My earliest was a simple recursion drawing points; it was slow and it could hang in complex areas since it didn&rsquo;t know which areas had been filled yet. I tried a list of scan lines, but that was slow on complex areas, so my final code had to use a bitmap memory of what areas had been filled to avoid bugs with complex patterns that included the same colors already in an area. 
The final code ended up with a dialog box to pick different kinds of fill including gradients.<\/p>\n<p>For scaling\/zoom, it was a simple pixel duplication, so I just wrote some special code to handle it quickly.<\/p>\n<p>The text editing for proportionally spaced text was surprisingly complicated in later versions of PCPaint.  And we always had problems with italic fonts and proportional spacing since you couldn&rsquo;t use the scanned pixel width. I had to eventually change the font format to include width tables to fix that.<\/p>\n<p>Keep in mind I failed trig in high school when I hated memorizing formulas and slacked off. Second time I took the class, a different teacher emphasized how everything was derived, and I easily got an A.<\/p>\n<p>As long as I could figure out the roots of something so that I understood how it was arrived at, I could take a crack at solving anything on my own.<\/p>\n<h3 id=\"q-can-you-provide-an-example-of-an-algorithm-you-developed\">Q: Can you provide an example of an algorithm you developed?<\/h3>\n<p>This algorithm I needed was for an IMGWARP utility that eventually became a command in GRASP\/GLPRO. I don&rsquo;t have the original source for either (still looking), but I do remember the story: It basically took an image, and mapped it to any four sided polygon.<\/p>\n<figure><img src=\"WarpedPeppers.png\"\n         alt=\"Picture of peppers stretched and skewed\"\/><figcaption>\n            <p>Screenshot of Peppers Image Warped (Orig. 4.2.07 in USC-SIPI)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Ideally, it would be all integer, no complex math since this would sometimes need to be done in real time in a running app on a slow computer running DOS. I wrote a few test algorithms such as simple ratios, but they looked strange and badly distorted, and fell apart in complex cases. 
I consulted with my older brother Tom who is a math professor, and he fiddled around with it, but the results didn&rsquo;t look right, and the math was too complex. I tried some math solving software by writing it all out long form, and asking it to simplify it. Didn&rsquo;t get anywhere.<\/p>\n<p>Finally asked a smart close friend of mine to do it as a project where I&rsquo;d pay him. I don&rsquo;t remember the amount of money, and sadly he passed a few years ago. He puzzled over it for a few weeks until he was very excited to show me his very simple answer. At one of our regular Indian restaurant meals he drew it out on a sheet of paper. He extended each of the two sides until they met, so it became a simple triangle, then did simple ratio based on that area. Special cased for parallel sides. When I described it to my brother he was a bit shocked at how simple it really was.<\/p>\n<figure><img src=\"quadexample.png\"\n         alt=\"quadrilateral x1,y1 to x4,y4, containing point xx,yy and triangle extending from x1,y1 x2,y2 to unnamed third point enclosing xx,yy\"\/><figcaption>\n            <p>Recreation of triangle diagram<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p><strong>Editor&rsquo;s note: although the original code is lost, Bridges believes the code below is very similar to the algorithm developed in the mid-80s.<\/strong><\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\"><span 
style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;math.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;stdio.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;stdlib.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#66d9ef\">typedef<\/span> <span style=\"color:#66d9ef\">struct<\/span> { <span style=\"color:#66d9ef\">int<\/span> x; <span style=\"color:#66d9ef\">int<\/span> y; } point;\n\n<span style=\"color:#66d9ef\">int<\/span> imgWidth <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1000<\/span>;\n<span style=\"color:#66d9ef\">int<\/span> imgHeight <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1000<\/span>;\n\n<span style=\"color:#75715e\">\/\/ Define quad corners (screen space)\n<\/span><span style=\"color:#75715e\"><\/span>point quad[<span style=\"color:#ae81ff\">4<\/span>] <span style=\"color:#f92672\">=<\/span> {\n    { <span style=\"color:#ae81ff\">200<\/span>, <span style=\"color:#ae81ff\">100<\/span> },  <span style=\"color:#75715e\">\/\/ Top-left\n<\/span><span style=\"color:#75715e\"><\/span>    { <span style=\"color:#ae81ff\">600<\/span>, <span style=\"color:#ae81ff\">150<\/span> },  <span style=\"color:#75715e\">\/\/ Top-right\n<\/span><span style=\"color:#75715e\"><\/span>    { <span style=\"color:#ae81ff\">550<\/span>, <span style=\"color:#ae81ff\">500<\/span> },  <span style=\"color:#75715e\">\/\/ Bottom-right\n<\/span><span style=\"color:#75715e\"><\/span>    { <span style=\"color:#ae81ff\">180<\/span>, <span style=\"color:#ae81ff\">450<\/span> }   <span style=\"color:#75715e\">\/\/ Bottom-left\n<\/span><span style=\"color:#75715e\"><\/span>};\n\n<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">TriangleArea<\/span>(point 
a, point b, point c) {\n    <span style=\"color:#66d9ef\">return<\/span> abs((a.x<span style=\"color:#f92672\">*<\/span>(b.y<span style=\"color:#f92672\">-<\/span>c.y) <span style=\"color:#f92672\">+<\/span> b.x<span style=\"color:#f92672\">*<\/span>(c.y<span style=\"color:#f92672\">-<\/span>a.y) <span style=\"color:#f92672\">+<\/span> c.x<span style=\"color:#f92672\">*<\/span>(a.y<span style=\"color:#f92672\">-<\/span>b.y)));\n}\n\n<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">PointInTriangle<\/span>(point p, point a, point b, point c) {\n    <span style=\"color:#66d9ef\">int<\/span> areaTotal <span style=\"color:#f92672\">=<\/span> TriangleArea(a, b, c);\n    <span style=\"color:#66d9ef\">int<\/span> area1 <span style=\"color:#f92672\">=<\/span> TriangleArea(p, b, c);\n    <span style=\"color:#66d9ef\">int<\/span> area2 <span style=\"color:#f92672\">=<\/span> TriangleArea(a, p, c);\n    <span style=\"color:#66d9ef\">int<\/span> area3 <span style=\"color:#f92672\">=<\/span> TriangleArea(a, b, p);\n    <span style=\"color:#66d9ef\">return<\/span> abs(areaTotal <span style=\"color:#f92672\">-<\/span> (area1 <span style=\"color:#f92672\">+<\/span> area2 <span style=\"color:#f92672\">+<\/span> area3)) <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n}\n\n<span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#a6e22e\">SinglePixel<\/span>(<span style=\"color:#66d9ef\">int<\/span> x, <span style=\"color:#66d9ef\">int<\/span> y, <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>rx, <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>ry) {\n    <span style=\"color:#75715e\">\/\/ Map screen pixel (x, y) inside the quad to image coordinates (*rx, *ry)\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#75715e\">\/\/ using ratios of triangle areas (barycentric weights).\n<\/span><span style=\"color:#75715e\"><\/span>\n    point p <span style=\"color:#f92672\">=<\/span> { x, y };\n    <span style=\"color:#75715e\">\/\/ First 
triangle: quad[0], quad[1], quad[2]\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">if<\/span> (PointInTriangle(p, quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>])) {\n\n        <span style=\"color:#66d9ef\">int<\/span> areaABC <span style=\"color:#f92672\">=<\/span> TriangleArea(quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPBC <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPCA <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">0<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPAB <span style=\"color:#f92672\">=<\/span> areaABC <span style=\"color:#f92672\">-<\/span> (areaPBC <span style=\"color:#f92672\">+<\/span> areaPCA);\n\n        <span style=\"color:#f92672\">*<\/span>rx <span style=\"color:#f92672\">=<\/span> ((areaPCA <span style=\"color:#f92672\">+<\/span> areaPAB) <span style=\"color:#f92672\">*<\/span> (imgWidth <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n        <span style=\"color:#f92672\">*<\/span>ry <span style=\"color:#f92672\">=<\/span> (areaPAB <span style=\"color:#f92672\">*<\/span> (imgHeight <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n\n    }\n    <span style=\"color:#75715e\">\/\/ Second triangle: quad[0], quad[2], quad[3]\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">else<\/span> {\n        <span 
style=\"color:#66d9ef\">int<\/span> areaABC <span style=\"color:#f92672\">=<\/span> TriangleArea(quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">3<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPBC <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">3<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPCA <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">3<\/span>], quad[<span style=\"color:#ae81ff\">0<\/span>]);\n        <span style=\"color:#66d9ef\">int<\/span> areaPAB <span style=\"color:#f92672\">=<\/span> areaABC <span style=\"color:#f92672\">-<\/span> (areaPBC <span style=\"color:#f92672\">+<\/span> areaPCA);\n\n        <span style=\"color:#f92672\">*<\/span>rx <span style=\"color:#f92672\">=<\/span> (areaPCA <span style=\"color:#f92672\">*<\/span> (imgWidth <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n        <span style=\"color:#f92672\">*<\/span>ry <span style=\"color:#f92672\">=<\/span> ((areaPCA <span style=\"color:#f92672\">+<\/span> areaPAB) <span style=\"color:#f92672\">*<\/span> (imgHeight <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n    }\n}\n\n<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">main<\/span>() {\n\n    <span style=\"color:#75715e\">\/\/ Point inside quad\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">int<\/span> xx <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">200<\/span>;\n    <span style=\"color:#66d9ef\">int<\/span> yy <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">250<\/span>;\n\n    <span 
style=\"color:#66d9ef\">int<\/span> rx, ry;\n\n    SinglePixel(xx, yy, <span style=\"color:#f92672\">&amp;<\/span>rx, <span style=\"color:#f92672\">&amp;<\/span>ry);\n    printf(<span style=\"color:#e6db74\">&#34;Mapped point GOOD: (%d, %d)<\/span><span style=\"color:#ae81ff\">\\n<\/span><span style=\"color:#e6db74\">&#34;<\/span>, rx, ry);\n    <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n}\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>Sadly, I don&rsquo;t use any of that code anymore. Instead, in 2003 I built a much faster set of code that recursively divides each successive area into quarters until the pieces are one pixel in size. I modified that in late 2004 to anti-alias by oversampling, which looked much nicer. It appears to be based on someone else&rsquo;s algorithm that I copied, since its comments are not in my style (they are formatted and don&rsquo;t match the actual code), but a web search on those comments doesn&rsquo;t turn up the original.<\/p>\n<p>I did find my old polygon fill with line matching code. That&rsquo;s the original code from PCPaint, since it had to handle complex, many-segment polygons with overlap\/crossovers. 
I was able to use stdlib qsort instead of my ancient &ldquo;myqsort&rdquo;.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">  9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 17\n<\/span><span style=\"margin-right:0.4em;padding:0 
0.4em 0 0.4em;color:#7f7f7f\"> 18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 33\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 34\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 35\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 36\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 37\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 38\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 39\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 40\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\"> 41\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 42\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 43\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 44\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 45\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 46\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 47\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 48\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 49\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 50\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 51\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 52\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 53\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 54\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 55\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 56\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 57\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 58\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 59\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 60\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 61\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 62\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 63\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\"> 64\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 65\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 66\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 67\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 68\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 69\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 70\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 71\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 72\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 73\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 74\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 75\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 76\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 77\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 78\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 79\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 80\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 81\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 82\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 83\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 84\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 85\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 86\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\"> 87\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 88\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 89\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 90\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 91\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 92\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 93\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 94\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 95\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 96\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 97\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 98\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 99\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">100\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">101\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">102\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">103\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">104\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">105\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">106\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">107\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">108\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">109\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">110\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">111\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">112\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">113\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">114\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">115\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">116\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">117\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">118\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">119\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">120\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">121\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">122\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">123\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">124\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">125\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">126\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">127\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">128\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">129\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">130\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">131\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">132\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">133\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">134\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">135\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">136\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">137\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">138\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">139\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">140\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">141\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">142\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">143\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">144\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">145\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">146\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">147\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">148\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">149\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">150\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">151\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">152\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">153\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">154\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">155\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">156\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">157\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">158\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">159\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">160\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">161\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">162\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">163\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">164\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">165\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">166\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">167\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">168\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">169\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">170\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">171\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">172\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">173\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">174\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">175\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">176\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">177\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">178\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">179\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">180\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">181\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">182\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">183\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">184\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">185\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">186\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">187\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">188\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">189\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">190\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">191\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">192\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">193\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">194\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">195\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">196\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">197\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">198\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">199\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">200\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">201\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">202\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">203\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">204\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">205\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">206\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">207\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">208\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">209\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">210\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">211\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">212\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">213\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">214\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">215\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">216\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">217\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">218\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">219\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">220\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">221\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">222\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">223\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">224\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">225\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">226\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">227\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">228\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">229\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">230\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">231\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">232\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">233\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">234\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">235\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">236\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">237\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">238\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">239\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">240\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">241\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">242\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">243\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">244\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">245\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">246\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">247\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">248\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">249\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">250\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">251\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">252\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">253\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">254\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">255\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">256\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">257\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">258\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">259\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">260\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">261\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">262\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">263\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">264\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">265\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">266\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">267\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">268\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">269\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">270\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">271\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">272\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">273\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">274\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">275\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">276\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">277\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">278\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">279\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">280\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">281\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">282\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">283\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">284\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">285\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">286\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">287\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">288\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">289\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">290\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">291\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">292\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">293\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">294\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">295\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">296\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">297\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">298\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">299\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">300\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">301\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">302\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">303\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">304\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">305\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">306\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">307\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">308\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">309\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">310\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">311\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">312\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">313\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">314\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">315\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">316\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">317\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">318\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">319\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">320\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">321\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">322\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">323\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">324\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">325\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">326\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">327\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">328\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">329\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">330\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">331\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">332\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">333\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">334\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">335\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">336\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">337\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">338\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">339\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">340\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">341\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">342\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">343\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">344\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">345\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">346\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">347\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">348\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">349\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">350\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">351\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">352\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">353\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">354\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">355\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">356\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">357\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">358\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">359\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">360\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">361\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">362\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">363\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">364\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">365\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">366\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">367\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">368\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">369\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">370\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\"><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&#34;raylib.h&#34;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;math.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;stdio.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&lt;stdlib.h&gt;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">#include<\/span> <span style=\"color:#75715e\">&#34;resource_dir.h&#34;<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#66d9ef\">typedef<\/span> <span style=\"color:#66d9ef\">struct<\/span> { <span style=\"color:#66d9ef\">int<\/span> x; <span style=\"color:#66d9ef\">int<\/span> y; } point;\n\n<span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#66d9ef\">int<\/span> screenWidth <span style=\"color:#f92672\">=<\/span> <span 
style=\"color:#ae81ff\">800<\/span>;\n<span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#66d9ef\">int<\/span> screenHeight <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">600<\/span>;\n\n<span style=\"color:#66d9ef\">static<\/span> Image img;\n<span style=\"color:#66d9ef\">static<\/span> Color <span style=\"color:#f92672\">*<\/span>pixels;\n<span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">int<\/span> imgWidth;\n<span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">int<\/span> imgHeight;\n\n<span style=\"color:#75715e\">\/\/ Define quad corners (screen space)\n<\/span><span style=\"color:#75715e\"><\/span>point quad[<span style=\"color:#ae81ff\">4<\/span>] <span style=\"color:#f92672\">=<\/span> {\n\t{ <span style=\"color:#ae81ff\">200<\/span>, <span style=\"color:#ae81ff\">100<\/span> },  <span style=\"color:#75715e\">\/\/ Top-left\n<\/span><span style=\"color:#75715e\"><\/span>\t{ <span style=\"color:#ae81ff\">600<\/span>, <span style=\"color:#ae81ff\">150<\/span> },  <span style=\"color:#75715e\">\/\/ Top-right\n<\/span><span style=\"color:#75715e\"><\/span>\t{ <span style=\"color:#ae81ff\">550<\/span>, <span style=\"color:#ae81ff\">500<\/span> },  <span style=\"color:#75715e\">\/\/ Bottom-right\n<\/span><span style=\"color:#75715e\"><\/span>\t{ <span style=\"color:#ae81ff\">180<\/span>, <span style=\"color:#ae81ff\">450<\/span> }   <span style=\"color:#75715e\">\/\/ Bottom-left\n<\/span><span style=\"color:#75715e\"><\/span>};\n\n\n<span style=\"color:#75715e\">\/\/ Helper: Calculate twice the area of a triangle (the factor of 2 cancels in ratios)\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">TriangleArea<\/span>(point a, point b, point c) {\n\t<span style=\"color:#66d9ef\">return<\/span> abs((a.x<span style=\"color:#f92672\">*<\/span>(b.y<span style=\"color:#f92672\">-<\/span>c.y) <span style=\"color:#f92672\">+<\/span> b.x<span 
style=\"color:#f92672\">*<\/span>(c.y<span style=\"color:#f92672\">-<\/span>a.y) <span style=\"color:#f92672\">+<\/span> c.x<span style=\"color:#f92672\">*<\/span>(a.y<span style=\"color:#f92672\">-<\/span>b.y)));\n}\n\n<span style=\"color:#75715e\">\/\/ Helper: Check if a point is inside a triangle\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">bool<\/span> <span style=\"color:#a6e22e\">PointInTriangle<\/span>(point p, point a, point b, point c) {\n\t<span style=\"color:#66d9ef\">int<\/span> areaTotal <span style=\"color:#f92672\">=<\/span> TriangleArea(a, b, c);\n\t<span style=\"color:#66d9ef\">int<\/span> area1 <span style=\"color:#f92672\">=<\/span> TriangleArea(p, b, c);\n\t<span style=\"color:#66d9ef\">int<\/span> area2 <span style=\"color:#f92672\">=<\/span> TriangleArea(a, p, c);\n\t<span style=\"color:#66d9ef\">int<\/span> area3 <span style=\"color:#f92672\">=<\/span> TriangleArea(a, b, p);\n\t<span style=\"color:#66d9ef\">return<\/span> abs(areaTotal <span style=\"color:#f92672\">-<\/span> (area1 <span style=\"color:#f92672\">+<\/span> area2 <span style=\"color:#f92672\">+<\/span> area3)) <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n}\n\n<span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#a6e22e\">SinglePixel<\/span>(<span style=\"color:#66d9ef\">int<\/span> x, <span style=\"color:#66d9ef\">int<\/span> y)  {\n\t<span style=\"color:#66d9ef\">int<\/span> tx;\n\t<span style=\"color:#66d9ef\">int<\/span> ty;\n\n\tpoint p <span style=\"color:#f92672\">=<\/span> { x, y };\n\t<span style=\"color:#75715e\">\/\/ First triangle: quad[0], quad[1], quad[2]\n<\/span><span style=\"color:#75715e\"><\/span>\t<span style=\"color:#66d9ef\">if<\/span> (PointInTriangle(p, quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span 
style=\"color:#ae81ff\">2<\/span>])) {\n\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaABC <span style=\"color:#f92672\">=<\/span> TriangleArea(quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPBC <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">1<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPCA <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">0<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPAB <span style=\"color:#f92672\">=<\/span> areaABC <span style=\"color:#f92672\">-<\/span> (areaPBC <span style=\"color:#f92672\">+<\/span> areaPCA);\n\n\t\ttx <span style=\"color:#f92672\">=<\/span> ((areaPCA <span style=\"color:#f92672\">+<\/span> areaPAB) <span style=\"color:#f92672\">*<\/span> (imgWidth <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n\t\tty <span style=\"color:#f92672\">=<\/span> (areaPAB <span style=\"color:#f92672\">*<\/span> (imgHeight <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n\n\t}\n\t<span style=\"color:#75715e\">\/\/ Second triangle: quad[0], quad[2], quad[3]\n<\/span><span style=\"color:#75715e\"><\/span>\t<span style=\"color:#66d9ef\">else<\/span> {\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaABC <span style=\"color:#f92672\">=<\/span> TriangleArea(quad[<span style=\"color:#ae81ff\">0<\/span>], quad[<span style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">3<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPBC <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span 
style=\"color:#ae81ff\">2<\/span>], quad[<span style=\"color:#ae81ff\">3<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPCA <span style=\"color:#f92672\">=<\/span> TriangleArea(p, quad[<span style=\"color:#ae81ff\">3<\/span>], quad[<span style=\"color:#ae81ff\">0<\/span>]);\n\t\t<span style=\"color:#66d9ef\">int<\/span> areaPAB <span style=\"color:#f92672\">=<\/span> areaABC <span style=\"color:#f92672\">-<\/span> (areaPBC <span style=\"color:#f92672\">+<\/span> areaPCA);\n\n\t\ttx <span style=\"color:#f92672\">=<\/span> (areaPCA <span style=\"color:#f92672\">*<\/span> (imgWidth <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n\t\tty <span style=\"color:#f92672\">=<\/span> ((areaPCA <span style=\"color:#f92672\">+<\/span> areaPAB) <span style=\"color:#f92672\">*<\/span> (imgHeight <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>)) <span style=\"color:#f92672\">\/<\/span> areaABC;\n\t}\n\t\n\t<span style=\"color:#66d9ef\">if<\/span>(tx<span style=\"color:#f92672\">&gt;=<\/span><span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#f92672\">&amp;&amp;<\/span> tx<span style=\"color:#f92672\">&lt;<\/span>imgWidth <span style=\"color:#f92672\">&amp;&amp;<\/span> ty<span style=\"color:#f92672\">&gt;=<\/span><span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#f92672\">&amp;&amp;<\/span> ty<span style=\"color:#f92672\">&lt;<\/span>imgHeight) {\n\t\tColor color <span style=\"color:#f92672\">=<\/span> pixels[ty <span style=\"color:#f92672\">*<\/span> imgWidth <span style=\"color:#f92672\">+<\/span> tx];\n\t\tDrawPixel(x, y, color);\n\t}\n}\n\n<span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#a6e22e\">fillrect<\/span>(<span style=\"color:#66d9ef\">int<\/span> x1, <span style=\"color:#66d9ef\">int<\/span> y1, <span style=\"color:#66d9ef\">int<\/span> x2, <span style=\"color:#66d9ef\">int<\/span> y2) {\n\t<span 
style=\"color:#66d9ef\">int<\/span> x;\n\t<span style=\"color:#66d9ef\">int<\/span> y;\n\n\t<span style=\"color:#66d9ef\">if<\/span> (x1<span style=\"color:#f92672\">&gt;<\/span>x2) {\n\t\t\t<span style=\"color:#66d9ef\">int<\/span> xtmp <span style=\"color:#f92672\">=<\/span> x1;\n\t\t\tx1 <span style=\"color:#f92672\">=<\/span> x2;\n\t\t\tx2 <span style=\"color:#f92672\">=<\/span> xtmp;\n\t}\n\n\t<span style=\"color:#66d9ef\">if<\/span> (y1<span style=\"color:#f92672\">&gt;<\/span>y2) {\n\t\t\t<span style=\"color:#66d9ef\">int<\/span> ytmp <span style=\"color:#f92672\">=<\/span> y1;\n\t\t\ty1 <span style=\"color:#f92672\">=<\/span> y2;\n\t\t\ty2 <span style=\"color:#f92672\">=<\/span> ytmp;\n\t}\n\n\t<span style=\"color:#66d9ef\">for<\/span> (y <span style=\"color:#f92672\">=<\/span> y1; y<span style=\"color:#f92672\">&lt;=<\/span>y2; <span style=\"color:#f92672\">++<\/span>y) {\n\t\t<span style=\"color:#66d9ef\">for<\/span> (x <span style=\"color:#f92672\">=<\/span> x1; x<span style=\"color:#f92672\">&lt;=<\/span>x2; <span style=\"color:#f92672\">++<\/span>x) {\n\t\t\tSinglePixel(x, y);\n\t\t}\n\t}\n}\n\n<span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#a6e22e\">mtline<\/span>(<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>x1, <span style=\"color:#66d9ef\">int<\/span> y1, <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>x2, <span style=\"color:#66d9ef\">int<\/span> y2, <span style=\"color:#66d9ef\">int<\/span> matchy)\n{\n\t<span style=\"color:#66d9ef\">int<\/span> deltax, deltay;\n\t<span style=\"color:#66d9ef\">int<\/span> dirx, diry;\n\t<span style=\"color:#66d9ef\">int<\/span> acc;\n\t<span style=\"color:#66d9ef\">int<\/span> cnt;\n\t<span style=\"color:#66d9ef\">int<\/span> xx, yy;\n\n\tacc <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n\n\txx <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">*<\/span>x1;\n\tyy <span 
style=\"color:#f92672\">=<\/span> y1;\n\tdeltax <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">*<\/span>x2 <span style=\"color:#f92672\">-<\/span> xx;\n\t<span style=\"color:#66d9ef\">if<\/span> (deltax <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n\t\t<span style=\"color:#66d9ef\">return<\/span>;\n\n\tdirx <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n\t<span style=\"color:#66d9ef\">if<\/span> (deltax <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n\t{\n\t\tdeltax <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">-<\/span>deltax;\n\t\tdirx <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>;\n\t}\n\n\tdeltay <span style=\"color:#f92672\">=<\/span> y2 <span style=\"color:#f92672\">-<\/span> yy;\n\tdiry <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n\t<span style=\"color:#66d9ef\">if<\/span> (deltay <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n\t{\n\t\tdeltay <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">-<\/span>deltay;\n\t\tdiry <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>;\n\t}\n\n\t<span style=\"color:#66d9ef\">if<\/span> (deltay <span style=\"color:#f92672\">&lt;<\/span> deltax)\n\t{\n\t\tcnt <span style=\"color:#f92672\">=<\/span> deltax;\n\t\t<span style=\"color:#66d9ef\">while<\/span> (yy <span style=\"color:#f92672\">!=<\/span> matchy)\n\t\t{\n\t\t\t<span style=\"color:#66d9ef\">while<\/span> (<span style=\"color:#f92672\">--<\/span>cnt <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n\t\t\t{\n\t\t\t\txx <span style=\"color:#f92672\">+=<\/span> dirx;\n\t\t\t\tacc <span style=\"color:#f92672\">+=<\/span> deltay;\n\t\t\t\t<span 
style=\"color:#66d9ef\">if<\/span> (deltax <span style=\"color:#f92672\">&lt;=<\/span> acc)\n\t\t\t\t\t<span style=\"color:#66d9ef\">break<\/span>;\n\t\t\t}\n\t\t\tacc <span style=\"color:#f92672\">-=<\/span> deltax;\n\t\t\tyy <span style=\"color:#f92672\">+=<\/span> diry;\n\t\t}\n\n\t\t<span style=\"color:#f92672\">*<\/span>x1 <span style=\"color:#f92672\">=<\/span> xx;\n\t\t<span style=\"color:#66d9ef\">while<\/span> (<span style=\"color:#f92672\">--<\/span>cnt <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n\t\t{\n\t\t\txx <span style=\"color:#f92672\">+=<\/span> dirx;\n\t\t\tacc <span style=\"color:#f92672\">+=<\/span> deltay;\n\t\t\t<span style=\"color:#66d9ef\">if<\/span> (deltax <span style=\"color:#f92672\">&lt;=<\/span> acc)\n\t\t\t\t<span style=\"color:#66d9ef\">break<\/span>;\n\t\t}\n\t\txx <span style=\"color:#f92672\">-=<\/span> dirx;\n\t\t<span style=\"color:#f92672\">*<\/span>x2 <span style=\"color:#f92672\">=<\/span> xx;\n\t\t<span style=\"color:#66d9ef\">return<\/span>;\n\t}\n\n\n\t<span style=\"color:#66d9ef\">while<\/span> (yy <span style=\"color:#f92672\">!=<\/span> matchy)\n\t{\n\t\tyy <span style=\"color:#f92672\">+=<\/span> diry;\n\t\tacc <span style=\"color:#f92672\">+=<\/span> deltax;\n\t\t<span style=\"color:#66d9ef\">if<\/span> (acc <span style=\"color:#f92672\">&gt;=<\/span> deltay)\n\t\t{\n\t\t\tacc <span style=\"color:#f92672\">-=<\/span> deltay;\n\t\t\txx <span style=\"color:#f92672\">+=<\/span> dirx;\n\t\t}\n\t}\n\t<span style=\"color:#f92672\">*<\/span>x1 <span style=\"color:#f92672\">=<\/span> xx;\n\t<span style=\"color:#f92672\">*<\/span>x2 <span style=\"color:#f92672\">=<\/span> xx;\n\t<span style=\"color:#66d9ef\">return<\/span>;\n}\n\n<span style=\"color:#75715e\">#define isgn(i) ((i) &lt; 0 ? 
-1 : 1)\n<\/span><span style=\"color:#75715e\"><\/span>\n\n<span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">cmpit<\/span>(<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>a, <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>b)\n{\n    <span style=\"color:#66d9ef\">return<\/span> (<span style=\"color:#f92672\">*<\/span>a <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#f92672\">*<\/span>b <span style=\"color:#f92672\">?<\/span> <span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#f92672\">:<\/span> <span style=\"color:#f92672\">*<\/span>a <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">*<\/span>b <span style=\"color:#f92672\">?<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#f92672\">:<\/span> <span style=\"color:#ae81ff\">0<\/span>);\n}\n\n<span style=\"color:#75715e\">\/*************************************************\n<\/span><span style=\"color:#75715e\">fillpoly(xya,numxy,func)\n<\/span><span style=\"color:#75715e\">\n<\/span><span style=\"color:#75715e\">input   : array of coordinates, number of elements in array, fillrect function\n<\/span><span style=\"color:#75715e\">output  : nothing\n<\/span><span style=\"color:#75715e\">utility : fills polygon defined by points in array\n<\/span><span style=\"color:#75715e\">      xya. 
number of points in array is numxy.\n<\/span><span style=\"color:#75715e\">      calls func to fill area.\n<\/span><span style=\"color:#75715e\">**************************************************\/<\/span>\n<span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#a6e22e\">fillpoly<\/span>(point <span style=\"color:#f92672\">*<\/span>xya, <span style=\"color:#66d9ef\">unsigned<\/span> <span style=\"color:#66d9ef\">int<\/span> numxy, <span style=\"color:#66d9ef\">void<\/span> (<span style=\"color:#f92672\">*<\/span>func) (<span style=\"color:#66d9ef\">int<\/span> x1, <span style=\"color:#66d9ef\">int<\/span> y1, <span style=\"color:#66d9ef\">int<\/span> x2, <span style=\"color:#66d9ef\">int<\/span> y2))\n{\n    <span style=\"color:#66d9ef\">int<\/span> i;\n    <span style=\"color:#66d9ef\">unsigned<\/span> <span style=\"color:#66d9ef\">int<\/span> cnt;\n    <span style=\"color:#66d9ef\">int<\/span> hr, hf;\n    <span style=\"color:#66d9ef\">int<\/span> nx1, nx2;\n    <span style=\"color:#66d9ef\">int<\/span> xa[<span style=\"color:#ae81ff\">256<\/span>];\n    <span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#f92672\">*<\/span>xpnt;\n    <span style=\"color:#66d9ef\">int<\/span> df2, df1, df0;\n    <span style=\"color:#66d9ef\">int<\/span> x1, y1, x2, y2, zy;\n    <span style=\"color:#66d9ef\">int<\/span> tpy, bty;\n    <span style=\"color:#66d9ef\">unsigned<\/span> <span style=\"color:#66d9ef\">int<\/span> ni;\n    <span style=\"color:#66d9ef\">int<\/span> box;\n    <span style=\"color:#66d9ef\">int<\/span> eqx, eqy;\n\n    tpy <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n    bty <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">32767<\/span><span style=\"color:#f92672\">*<\/span><span style=\"color:#ae81ff\">32767<\/span>;\n\n    box <span style=\"color:#f92672\">=<\/span> (numxy <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">4<\/span>);\n\n<span 
style=\"color:#75715e\">\/* find minimum and maximum y values *\/<\/span>\n    <span style=\"color:#66d9ef\">for<\/span> (i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> (<span style=\"color:#66d9ef\">int<\/span>) numxy; <span style=\"color:#f92672\">++<\/span>i)\n    {\n\n        zy <span style=\"color:#f92672\">=<\/span> xya[i].y;\n        <span style=\"color:#66d9ef\">if<\/span> (zy <span style=\"color:#f92672\">&lt;<\/span> bty)\n            bty <span style=\"color:#f92672\">=<\/span> zy;\n        <span style=\"color:#66d9ef\">if<\/span> (zy <span style=\"color:#f92672\">&gt;<\/span> tpy)\n            tpy <span style=\"color:#f92672\">=<\/span> zy;\n\n        <span style=\"color:#66d9ef\">if<\/span> (box)\n        {\n            ni <span style=\"color:#f92672\">=<\/span> i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n            <span style=\"color:#66d9ef\">if<\/span> (ni <span style=\"color:#f92672\">&gt;=<\/span> numxy)\n                ni <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n\n            eqx <span style=\"color:#f92672\">=<\/span> (xya[i].x <span style=\"color:#f92672\">==<\/span> xya[ni].x);\n            eqy <span style=\"color:#f92672\">=<\/span> (xya[i].y <span style=\"color:#f92672\">==<\/span> xya[ni].y);\n            <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#f92672\">!<\/span>((eqx <span style=\"color:#f92672\">&amp;&amp;<\/span> <span style=\"color:#f92672\">!<\/span>eqy) <span style=\"color:#f92672\">||<\/span> (<span style=\"color:#f92672\">!<\/span>eqx <span style=\"color:#f92672\">&amp;&amp;<\/span> eqy)))\n                box <span style=\"color:#f92672\">=<\/span> false;\n        }\n    }\n\n\n    <span style=\"color:#66d9ef\">if<\/span> (tpy <span style=\"color:#f92672\">==<\/span> bty)     <span style=\"color:#75715e\">\/* all points on one horizontal 
*\/<\/span>\n    {\n        <span style=\"color:#66d9ef\">for<\/span> (i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> (<span style=\"color:#66d9ef\">int<\/span>) numxy; <span style=\"color:#f92672\">++<\/span>i)\n            (<span style=\"color:#f92672\">*<\/span>func) (xya[i <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>].x, tpy, xya[i].x, tpy);\n        <span style=\"color:#66d9ef\">return<\/span>;\n    }\n\n    <span style=\"color:#66d9ef\">if<\/span> (box)\n    {\n        (<span style=\"color:#f92672\">*<\/span>func) (xya[<span style=\"color:#ae81ff\">0<\/span>].x, xya[<span style=\"color:#ae81ff\">0<\/span>].y, xya[<span style=\"color:#ae81ff\">2<\/span>].x, xya[<span style=\"color:#ae81ff\">2<\/span>].y);\n        <span style=\"color:#66d9ef\">return<\/span>;\n    }\n\n    <span style=\"color:#66d9ef\">for<\/span> (zy <span style=\"color:#f92672\">=<\/span> bty; zy <span style=\"color:#f92672\">&lt;=<\/span> tpy; zy<span style=\"color:#f92672\">++<\/span>)\n    {\n        xpnt <span style=\"color:#f92672\">=<\/span> xa;\n        x1 <span style=\"color:#f92672\">=<\/span> xya[numxy <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>].x;\n        y1 <span style=\"color:#f92672\">=<\/span> xya[numxy <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>].y;\n        df1 <span style=\"color:#f92672\">=<\/span> isgn(zy <span style=\"color:#f92672\">-<\/span> y1);\n        df0 <span style=\"color:#f92672\">=<\/span> isgn(zy <span style=\"color:#f92672\">-<\/span> xya[<span style=\"color:#ae81ff\">0<\/span>].y);\n        <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#f92672\">!<\/span>df0 <span style=\"color:#f92672\">&amp;&amp;<\/span> <span style=\"color:#f92672\">!<\/span>df1)   <span style=\"color:#75715e\">\/* first point in middle of horizontal *\/<\/span>\n        {\n            
i <span style=\"color:#f92672\">=<\/span> numxy <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n            <span style=\"color:#66d9ef\">while<\/span> (i <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">0<\/span>)\n            {\n                <span style=\"color:#f92672\">--<\/span>i;\n                hf <span style=\"color:#f92672\">=<\/span> isgn(zy <span style=\"color:#f92672\">-<\/span> xya[i].y);\n                <span style=\"color:#66d9ef\">if<\/span> (hf)\n                    <span style=\"color:#66d9ef\">break<\/span>;\n            }\n            hr <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n            <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> xya[<span style=\"color:#f92672\">++<\/span>i].x;\n        }\n        <span style=\"color:#66d9ef\">else<\/span>\n            hr <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n\n        <span style=\"color:#66d9ef\">for<\/span> (i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> (<span style=\"color:#66d9ef\">int<\/span>) numxy; i<span style=\"color:#f92672\">++<\/span>)\n        {\n            x2 <span style=\"color:#f92672\">=<\/span> x1;\n            y2 <span style=\"color:#f92672\">=<\/span> y1;\n            df2 <span style=\"color:#f92672\">=<\/span> df1;\n            x1 <span style=\"color:#f92672\">=<\/span> xya[i].x;\n            y1 <span style=\"color:#f92672\">=<\/span> xya[i].y;\n            df1 <span style=\"color:#f92672\">=<\/span> df0;\n\n            ni <span style=\"color:#f92672\">=<\/span> i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n            <span style=\"color:#66d9ef\">if<\/span> (ni <span style=\"color:#f92672\">&gt;=<\/span> numxy)\n                ni <span 
style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n\n            df0 <span style=\"color:#f92672\">=<\/span> isgn(zy <span style=\"color:#f92672\">-<\/span> xya[ni].y);\n\n            <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#f92672\">!<\/span>df1)   <span style=\"color:#75715e\">\/* this vertex on zy *\/<\/span>\n            {\n                <span style=\"color:#66d9ef\">if<\/span> (df2)    <span style=\"color:#75715e\">\/* unique point or beginning of horizontal *\/<\/span>\n                {\n                    <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> x1;\n                    <span style=\"color:#66d9ef\">if<\/span> (<span style=\"color:#f92672\">!<\/span>df0)\n                    {\n                        hf <span style=\"color:#f92672\">=<\/span> df2;\n                        hr <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n                    }\n                    <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#66d9ef\">if<\/span> (df2 <span style=\"color:#f92672\">==<\/span> df0)\n                        <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> x1;\n                }\n            }\n            <span style=\"color:#66d9ef\">else<\/span>\n            {\n                <span style=\"color:#66d9ef\">if<\/span> (hr)\n                {\n                    hr <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n                    <span style=\"color:#66d9ef\">if<\/span> (df1 <span style=\"color:#f92672\">==<\/span> hf)\n                        <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> x2;\n                }\n                <span style=\"color:#66d9ef\">else<\/span> <span 
style=\"color:#66d9ef\">if<\/span> ((df1 <span style=\"color:#f92672\">!=<\/span> df2) <span style=\"color:#f92672\">&amp;&amp;<\/span> df2)\n                {\n                    nx1 <span style=\"color:#f92672\">=<\/span> x1;\n                    nx2 <span style=\"color:#f92672\">=<\/span> x2;\n                    mtline(<span style=\"color:#f92672\">&amp;<\/span>nx2, y2, <span style=\"color:#f92672\">&amp;<\/span>nx1, y1, zy);\n                    <span style=\"color:#66d9ef\">if<\/span> (nx1 <span style=\"color:#f92672\">!=<\/span> nx2)\n                        (<span style=\"color:#f92672\">*<\/span>func) (nx1, zy, nx2, zy);\n                    <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> nx1;\n                }\n            }\n        }\n        cnt <span style=\"color:#f92672\">=<\/span> xpnt <span style=\"color:#f92672\">-<\/span> xa;\n        <span style=\"color:#66d9ef\">if<\/span> (cnt <span style=\"color:#f92672\">&amp;<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n        {\n            <span style=\"color:#f92672\">*<\/span>xpnt<span style=\"color:#f92672\">++<\/span> <span style=\"color:#f92672\">=<\/span> x1;\n            cnt<span style=\"color:#f92672\">++<\/span>;\n        }\n        qsort(xa, cnt, <span style=\"color:#66d9ef\">sizeof<\/span>(xa[<span style=\"color:#ae81ff\">0<\/span>]), (<span style=\"color:#66d9ef\">int<\/span> (<span style=\"color:#f92672\">*<\/span>)(<span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#f92672\">*<\/span>, <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#66d9ef\">void<\/span> <span style=\"color:#f92672\">*<\/span>)) cmpit);\n        xpnt <span style=\"color:#f92672\">=<\/span> xa;\n        <span style=\"color:#66d9ef\">for<\/span> (i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> 
(<span style=\"color:#66d9ef\">int<\/span>) cnt; i <span style=\"color:#f92672\">+=<\/span> <span style=\"color:#ae81ff\">2<\/span>)\n        {\n            (<span style=\"color:#f92672\">*<\/span>func) (xa[i], zy, xa[i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>], zy);\n        }\n    }\n}\n\n<span style=\"color:#66d9ef\">int<\/span> <span style=\"color:#a6e22e\">main<\/span>() {\n\n\tInitWindow(screenWidth, screenHeight, <span style=\"color:#e6db74\">&#34;Manual Quad Texture Mapping&#34;<\/span>);\n\n\tSearchAndSetResourceDir(<span style=\"color:#e6db74\">&#34;resources&#34;<\/span>);\n\n\timg <span style=\"color:#f92672\">=<\/span> LoadImage(<span style=\"color:#e6db74\">&#34;.\/4.2.07.png&#34;<\/span>);\n\tpixels <span style=\"color:#f92672\">=<\/span> LoadImageColors(img);\n\timgWidth <span style=\"color:#f92672\">=<\/span> img.width;\n\timgHeight <span style=\"color:#f92672\">=<\/span> img.height;\n\n\tSetTargetFPS(<span style=\"color:#ae81ff\">6000<\/span>);\n\t\n\t<span style=\"color:#66d9ef\">int<\/span> frames <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">120<\/span>;\n\n\t<span style=\"color:#66d9ef\">while<\/span> (<span style=\"color:#f92672\">!<\/span>WindowShouldClose() <span style=\"color:#f92672\">&amp;&amp;<\/span> <span style=\"color:#f92672\">--<\/span>frames<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#ae81ff\">0<\/span>) {\n\t\tBeginDrawing();\n\t\tClearBackground(RAYWHITE);\n\n\t\tfillpoly(quad, <span style=\"color:#ae81ff\">4<\/span>, fillrect);\n\n\t\t<span style=\"color:#75715e\">\/\/ Optional: Draw the quad outline\n<\/span><span style=\"color:#75715e\"><\/span>\t\t<span style=\"color:#66d9ef\">for<\/span> (<span style=\"color:#66d9ef\">int<\/span> i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">4<\/span>; i<span style=\"color:#f92672\">++<\/span>) 
{\n\t\t\tDrawLine(quad[i].x, quad[i].y, quad[(i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#f92672\">%<\/span> <span style=\"color:#ae81ff\">4<\/span>].x, quad[(i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#f92672\">%<\/span> <span style=\"color:#ae81ff\">4<\/span>].y, BLACK);\n\t\t}\n\n\t\tEndDrawing();\n\t\tquad[<span style=\"color:#ae81ff\">0<\/span>].x <span style=\"color:#f92672\">-=<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n\t\tquad[<span style=\"color:#ae81ff\">2<\/span>].x <span style=\"color:#f92672\">+=<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n\t}\n\n\t<span style=\"color:#75715e\">\/\/ Cleanup\n<\/span><span style=\"color:#75715e\"><\/span>\tUnloadImageColors(pixels);\n\tUnloadImage(img);\n\tCloseWindow();\n\t<span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n}\n\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>[&hellip;] I thought I&rsquo;d mention why the code looks so complex for what you&rsquo;d think wasn&rsquo;t a particularly complex problem. I had numerous issues with the outline matching the fill with no pixel gaps. I remember Mouse Systems being insistent that there be no gaps or holes. In the end, the only foolproof solution was to duplicate the draw line code. I made a version of my line draw that just gave the starting and ending X coordinates that overlapped at a specific Y location so it would match a drawn line exactly with no gaps. That&rsquo;s the <code>mtline<\/code> function (<code>void mtline(int *x1, int y1, int *x2, int y2, int matchy)<\/code>). That&rsquo;s also why my line draw always starts drawing pixels from <code>x1,y1<\/code> instead of swapping endpoints to simplify the code. 
That&rsquo;s also the reason why this warpimage demo won&rsquo;t have perfect matching edges because the drawn quad afterward is using the raylib drawline instead of my own draw line code, so it&rsquo;s not absolutely identical.<\/p>\n<p>The other reason the polygon fill is so complex is that it handles extreme cases like this with multiple crossings and confusing inside vs outside.<\/p>\n<figure><img src=\"PCPaint-CrazyPolygon-Fill.png\"\n         alt=\"Highly convex and overlapping polygon filled with pattern\"\/><figcaption>\n            <p>Complex polygon fill example in PCPaint 3.1 (source John Bridges)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<h2 id=\"q-how-was-the-experience-working-with-early-svga-cards\">Q: How was the experience working with early SVGA cards?<\/h2>\n<p>There were HUGE performance differences between different brands!<\/p>\n<p>I wrote a benchmark VIDSPEED in 1987, which is still <a href=\"https:\/\/vetusware.com\/download\/VIDSPEED%203.0\/?id=17399\">found online<\/a>. For instance, the IBM Model 30 8086 PC used for the early Video project was only around 680 bytes per millisecond in all video modes. That means it took around 1\/10th of a second to clear the entire 320x200 256 color screen, doing nothing else but writing data. That&rsquo;s why I had to do the video 1\/4 screen for the IBM Demo.<\/p>\n<p>Sadly, the IBM Model 70 PS\/2 80386 PC was still only around 611 bytes per millisecond in 320x200x256 color mode. Pretty shockingly poor showing. While a Tseng ET4000 board on a 386 PC ran at over 5000 bytes per millisecond, over 7 times faster.<\/p>\n<p>This made a HUGE difference in animation and game performance. IBM video writes were generally slow for years in my benchmarks. This person has a very nice table of results including some cards made after the leap from ISA to EISA, then PCI, then AGP, and then the now popular PCIe. Their top speed is a Tseng ET6100 PCI card that tests at 33436 bytes per millisecond. 
Since VIDSPEED is a DOS program, it only tests 16-bit writes. I&rsquo;m sure it would test faster on PCI cards using 32-bit or 64-bit writes. <a href=\"https:\/\/thandor.net\/benchmark\/73\">https:\/\/thandor.net\/benchmark\/73<\/a><\/p>\n<p>This person has a 386 running with recent benchmarks. I&rsquo;m surprised they found working hardware since many of my SVGA sample cards eventually failed. <a href=\"https:\/\/www.os2museum.com\/wp\/more-isa-vga-benchmarks\/\">https:\/\/www.os2museum.com\/wp\/more-isa-vga-benchmarks\/<\/a><\/p>\n<p>I forgot the VESA local bus, which was an extension of ISA from 1992 which came before PCI. It was used on fast 486 PCs before Pentium was affordable, so fairly common for a few years even after PCI started to show up.<\/p>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/VESA_Local_Bus\">https:\/\/en.wikipedia.org\/wiki\/VESA_Local_Bus<\/a><\/p>\n<p>Although I have a few VLB cards, I don&rsquo;t have any working 486 machines except an IBM 701C laptop. In fact all my machines from late 1990s through early 2000s were wiped out by the Capacitor Plague, very upsetting at the time.<\/p>\n<h2 id=\"q-pcpaints-native-image-format-was-pic-or-pictor-pc-paint-the-encyclopedia-of-graphics-file-formats-portrays-version-10-of-pic-as-a-small-wrapper-around-bsave-format-or-a-rather-literal-translation-of-the-framebuffer-to-an-array-of-bytes-however-later-versions-add-additional-metadata-and-run-length-encoding-compression-how-did-you-evolve-the-file-format-what-drove-the-changes-and-your-design-choices-was-there-any-formal-interchange-of-ideas-or-lessons-learned-between-the-creators-of-file-formats-at-the-time-eg-pcx-gif\">Q: PCPaint&rsquo;s native image format was PIC or Pictor PC Paint. The <em>Encyclopedia of Graphics File Formats<\/em> portrays version 1.0 of PIC as a small wrapper around BSAVE format, or a rather literal translation of the framebuffer to an array of bytes. 
However, later versions add additional metadata and run-length encoding compression. How did you evolve the file format? What drove the changes and your design choices? Was there any formal interchange of ideas or lessons learned between the creators of file formats at the time (e.g. PCX, GIF)?<\/h2>\n<p>PCPaint saved in BSAVE because that was literally the ONLY standard for images on PCs at that time. Yes it sucked, but particularly in BASIC that was the &ldquo;standard&rdquo;.<\/p>\n<p>One of the reasons BSAVE became useless was EGA graphics where it was no longer a simple buffer, but rather bank switch for planes. You couldn&rsquo;t BSAVE an EGA image!<\/p>\n<p>GIF was created in-house at CompuServe in 1987, and didn&rsquo;t involve outside developers until 1988. TGA was only for high color images, had no compression, and was niche at that time, didn&rsquo;t really come into use until 1986. TIFF came out in 1986, and frankly was far too versatile, trying to be everything to everyone. I never supported more than a subset. I don&rsquo;t know if anyone ever supported all of TIFF. PCX was initially tied to PCPaintBrush, a competitor licensed to Microsoft Mice. PCPaint did eventually support PCX files in PCPaint 3.1. I had no contact with the Zsoft developers.<\/p>\n<p>There was better compression from ARC than PKARC in 1985\/1986, but it wasn&rsquo;t easily integrated into other programs. Internally PCPaint 1.0 already used the run-length compressed images, all the artwork is in the newer format with size\/offset\ninformation, but Mouse Systems didn&rsquo;t want to save them, for good reason, what would you use them with? You could only load them back into PCPaint. Offering other formats to save would have required another dialog box and complexity.<\/p>\n<p>The OVR file included with PCPaint 1.0 is an image library with a 2 byte count of header size, followed by 16 bytes for each image. 
4 bytes for the image file offset, and then 12 bytes for the null terminated filename without extension. So making a RLE compressed PIC format was by far the easiest option at that time &mdash; the code was already in there, it was just changing the save to call a different function.<\/p>\n<p>As for evolving PIC format, I had to add more palette types, and then highcolor, so the header changed slightly. I never added more compression types since GIF\/PNG\/JPEG all filled those roles better. It&rsquo;s funny that GIF is mostly known for animation now since I didn&rsquo;t support any of the animation features until decades later! PICEM, the free image viewer I gave out, didn&rsquo;t support GIF animation at all. I mostly just used it as a better compressing 256 color format.<\/p>\n<h2 id=\"q-starting-in-1984-and-continuing-over-a-series-of-decades-you-developed-a-series-of-presentation-software-programs-grasp-glpro-and-aftergrasp-these-tools-have-created-games-example-1httpsarchiveorgdetailsmsdos_festival_alphabet-and-example-2httpswebarchiveorgweb20110611122609httpwwwasciliteorgauaset-archivesconfsiims1994qzwhitehtml-demoshttpswwwyoutubecomwatchvq464nmiooq0-screensavers-and-commercial-animations-what-has-kept-you-interested-in-this-domain\">Q: Starting in 1984 and continuing over a series of decades, you developed a series of &ldquo;presentation software&rdquo; programs: GRASP, GLPro, and AfterGrasp. These tools have created games (<a href=\"https:\/\/archive.org\/details\/msdos_festival_ALPHABET\">example 1<\/a> and <a href=\"https:\/\/web.archive.org\/web\/20110611122609\/http:\/\/www.ascilite.org.au\/aset-archives\/confs\/iims\/1994\/qz\/white.html\">example 2<\/a>), <a href=\"https:\/\/www.youtube.com\/watch?v=q464Nmiooq0\">demos<\/a>, screensavers, and commercial animations. What has kept you interested in this domain?<\/h2>\n<p>The actual use of these tools changed a great deal over time. 
From multimedia hobby, to presentations, kiosks, browser helper, and as a server side scripting language.<\/p>\n<p>GRASP was initially intended for hobbyists as a very simple language to play around with graphics. It was used for multi-media projects and to a limited degree for early presentations, like an EXE distributed as an animated advertisement for a business. It was also used for information kiosks and this is where the early support for touchscreens and sound in GRASP were used. GLPro was the same thing as GRASP except it also supported Windows, and was intended to support Linux and Mac.<\/p>\n<p>We had hired a Mac specialist to do the MacOS version, but due to conflicts over what the goal was, we had to fire them, which led to some very bad feelings. He wanted to start fresh with a completely new object oriented syntax and then backport that to Windows. My only goal was to get the existing code running on Mac. He was working in the Gmedia offices in England, and I was in California, so that made it all the more difficult to resolve these conflicts. I think he had been misled by those who interviewed and hired him, and was very upset he had left a reliable position to work at Gmedia on a project he thought he&rsquo;d control.<\/p>\n<p>It finally came down to, will you do what I need done or do we have to say goodbye.  He left. I ended up doing the first MacOS version myself (I had to learn MacOS programming as part of that process), and I had something working in a few months that was in early testing and expansion with a new Mac developer when the company closed.<\/p>\n<p>Gmedia GLPro was only around for a few years, but some of the largest short leaps were made then, a rewriting of the command structure, experiments with a completely different expression syntax, and adapting to other platforms. Gmedia had to close mostly because of internal legal fighting, and huge legal bills in spring 2001. 
The assets were sold to a customer of GLPro Alabama who eventually went their own way when we couldn&rsquo;t come to an agreement.<\/p>\n<p>Several commercial customers of GLpro felt abandoned, and I was in talks with some of them to try and help. I wrote AfterGRASP for Digi-products based in England, one of those users of GLPro. I felt partly responsible for this upheaval, and was excited to do my first complete rewrite to help some of the GLPro community. Because of legal issues with GLPro, AfterGRASP had to be started from scratch, and this allowed me to completely redesign how it worked internally while maintaining enough backwards compatibility to allow Digi-products to use it for their existing projects.<\/p>\n<p>Because I started from scratch, and was doing a new internal design it took years until most of GLpro was replicated, some features were never supported. It had a compiler AGCOMP which produced a Forth-like language which was tokenized into AGC binary files. The actual interpreter worked quite differently supporting limited threads, event driven features, and far faster execution. I started AfterGRASP in March of 2002, Windows only, and I had early drawing\/images\/text after a few months.<\/p>\n<p>AfterGRASP was free to use, but we couldn&rsquo;t really handle providing real support. The focus was almost entirely on Digi-products needs, and we never sold any commercial licenses.  It may have started based on the ideas in GLPro, but became an internal platform for all kinds of different projects inside Digi-products. Eventually AfterGRASP was adapted to run as a Browser Helper Object for Internet Explorer so that GL files could be run from a webpage. I used the BHO version of AfterGRASP to create our email and advertising PDF editing system for direct email and distribution. 
A lot of the server side image manipulation, Postscript\/PDF manipulation was done in AfterGRASP.<\/p>\n<p>As Internet Explorer started dying out, and support for BHO waned, I rewrote that entire system in JavaScript. I even adapted AGCOMP to allow a JavaScript style syntax so I could mix AfterGRASP and JavaScript in the same project. The JavaScript based email system grew and I eventually adapted it to a new style of modular email that could be used for mobile email. The last builds of AfterGRASP were from just before COVID19 started, late 2019, and the email editing system slowly declined as direct email became less of a viable business.<\/p>\n<p>Here are the <a href=\"https:\/\/www.aftergrasp.com\/comefrom.html\">notes I wrote back in 2002<\/a> on where AfterGRASP came from:<\/p>\n<blockquote>\n<p>Where did AfterGRASP come from?<\/p>\n<p>Aside from John Bridges (me!), Digi-products Ltd., and the small group\nof Alpha testers who have been in the loop since the end of July 2002,\nthe list of\npeople who even know there is an AfterGRASP project is tiny.<\/p>\n<p>The list of those who knew about AfterGRASP did not include anyone\ninvolved in Gmedia or IMS or Paul Mace Software.  In particular, this\nproject\nhas nothing to do with any of my past associates or Interactive Homes.<\/p>\n<p>The AfterGRASP project started in earnest the last couple days of Feb 2002.  That date does not coincide with any\n&ldquo;event&rdquo;; it just happened to be when\neveryone was comfortable enough with the details to say &ldquo;OK, let&rsquo;s start&rdquo;.\nHowever, it does happen to be the one year anniversary of the closing of Gmedia&rsquo;s doors,\nwhich occurred the last few days in Feb 2001.<\/p>\n<p>Rick Franklin of Interactive Homes knows this project exists.  He was\ncalled a few days before any testers were invited to sign an NDA in July\n2002.  Rick and Interactive Homes know who is writing it, and who is\nfunding it, and why.  
There is no ill will between Interactive Homes and\nDigi-products.<\/p>\n<p>Excluding public code (like ZLIB), AfterGRASP contains no source code\nfrom GLPRO.  AfterGRASP is a completely new project started from\nscratch.  GLPRO was\nlike an organic growth with new features being attached over the years,\nmany now completely obscure and replaced by newer\nfeatures.  This means AG will likely NEVER match the entire GLPRO\nfeature set.   For example, the complex\n&ldquo;DATA&rdquo; commands will likely never be replicated since they are so\nfreeform, making\ncompilation almost impossible.<\/p>\n<\/blockquote>\n<h2 id=\"q-a-differentiator-between-the-grasp-line-of-tools-and-other-presentation-software-is-the-use-of-an-imperative-command-script-to-control-animations-rather-than-manipulating-them-within-a-user-interface-why-did-you-choose-this-route-how-did-you-approach-the-design-and-evolution-of-the-language\">Q: A differentiator between the GRASP line of tools and other presentation software is the use of an imperative command script to control animations rather than manipulating them within a user interface. Why did you choose this route? How did you approach the design and evolution of the language?<\/h2>\n<p>Paul Mace Software was the publisher for GRASP, and pushed hard for a GUI version of GRASP. They wanted some kind of UI like I had done for PCPaint, except to create simple projects. They were right. There was still time to do it, when competitors like Macromedia Director and Adobe Flash were not around or established yet. Due to my own pigheadedness, laziness, and conflict over the direction of marketing\/sales, it languished in the early 1990s. I eventually came out with MMGRASP which was really a bundling of a bunch of tools like a relabeled PCPaint as PICTOR, and other image utilities, but no GUI at all. 
Part of the reinforcement was the GRASP users who I had regular contact with were often doing complex projects that couldn&rsquo;t be done in a simple GUI. They wanted advanced features like addons, and control of peripherals. This reinforced my own lack of will to &ldquo;just do something&rdquo;. I believe if I had started something it would have grown, and the entire thing would have gone in perhaps a different direction.<\/p>\n<p>I believe part of the problem was my lack of a conception of how this would work beyond place some pictures, some text, and move it around. I didn&rsquo;t have a clear vision for how you would integrate this GUI layout with scripting in a way that allowed you to edit the\nresulting script and yet still be able to use the GUI to adjust layout after you had made changes to the script.<\/p>\n<p>Jason Gibbs who later founded Gmedia was 100% responsible for the syntax rewrite from GRASP to GLPro. He was really irritated by the seemingly random command naming, and wanted to make it all consistent. He was a heavy user of GRASP when he was part of IMS Communications Ltd in Twyford UK, and sold some addons for GRASP via IMS for years before wanting to release a successor to GRASP called GLPro. Jason&rsquo;s enthusiasm for GRASP and doing GLPro is what got the whole ball rolling on GLPro. This got GLPro on Windows, and a lot of advances, but Jason had little interest in any kind of simple GUI tool.<\/p>\n<p>Unfortunately Jason&rsquo;s conflicts with the management brought in by investors, with investors, and lawyers is what ultimately led to Gmedia collapsing. Jason hoped to buy up the assets in bankruptcy, but was outbid by customers who were worried about whether Jason would support existing GLPro uses. A legitimate concern. As far as I know Jason left to move to Belize and sadly I&rsquo;ve not had any contact with him since nor been able to locate him. 
IMS Communications run by his parents shut down a few years ago.<\/p>\n<p>With AfterGRASP there was little discussion of making GUI tools to generate scripts. I did produce GUI products for layout\/editing documents, webpages, email and other content, but never anything to produce new AfterGRASP script\/projects. In fact for years we used the GLPro script editor to edit AfterGRASP scripts, and I designed a frontend so it could be used to compile\/execute AfterGRASP projects. The one area I did consider a new GUI was for mathematical image transformations. I designed a whole system of performing transformations on images in realtime to do effects like a fisheye lens, or other distortion effects. It was like a DFF image difference sort of engine where each pixel value was defined by a list of expressions done to nearby pixels in the source image. Would be perfect for the sort of parallel processing you get in modern GPUs! The problem was we had no tools to generate these computation masks. I did some command line tools, but it was conceptually too difficult for our graphic artists or script authors at Digi-products to understand what was going on. I did some design work on a GUI to edit these math mask projects, but it got pushed to the wayside when we didn&rsquo;t have any paying projects that needed any effects it could produce.<\/p>\n<h2 id=\"q-in-1991-dr-dobbs-journal-published-an-articlehttpsjacobfilippcomdrdobbsarticlesddj199191029102c9102chtm-you-wrote-on-differential-image-compression-which-can-be-used-to-compress-animations-or-video-i-understand-some-of-the-techniques-you-explore-in-the-article-came-from-a-research-project-you-did-for-ibm-and-ccm-what-were-the-outcomes-of-that-project\">Q: In 1991, Dr. Dobb&rsquo;s Journal published <a href=\"https:\/\/jacobfilipp.com\/DrDobbs\/articles\/DDJ\/1991\/9102\/9102c\/9102c.htm\">an article<\/a> you wrote on differential image compression which can be used to compress animations or video. 
I understand some of the techniques you explore in the article came from a research project you did for IBM and CCM. What were the outcomes of that project?<\/h2>\n<p>The final outcome was a demo video of children learning using one of the new IBM Model 30 computers which were a little faster than the original IBM PC. They used the 8mhz 8086 instead of the PC&rsquo;s 5mhz 8088.<\/p>\n<p>The Model 30 did not have sampled sound, that took an add on card. In early 1987 when I produced the demo there were no popular PCM sound cards for PC yet. The AdLib card which did FM synthesis for effects and music was first released in 1987, and the popular SoundBlaster didn&rsquo;t ship until 1990.<\/p>\n<p>Simple pixel difference animation&rsquo;s real benefit was very low CPU overhead and encoding of palette-limited images. MPEG wasn&rsquo;t around at all until about 6 years later, and couldn&rsquo;t be played back on such a slow CPU.<\/p>\n<p>The main use for that code was the DFF animation format which was used for GRASP\/MMGRASP.<\/p>\n<p>I was told the National Geographic Mammals CD-ROM from 1990 was produced for\/by IBM using my DFF video code. It was produced with <a href=\"https:\/\/www2.iath.virginia.edu\/elab\/hfl0139.html\">LinkWay<\/a>. I don&rsquo;t know how the video was integrated. Whether LinkWay had support for the video playback directly, or ran an outside EXE for video playback.<\/p>\n<h2 id=\"q-can-you-describe-your-general-approach-to-software-development-and-testing-for-aftergrasp-you-maintained-an-updatetxthttpswwwaftergraspcomupdatetxt-file-that-seems-to-be-a-daily-log-of-your-activities-and-design-changes-how-did-you-intend-yourself-and-others-to-use-that-file\">Q: Can you describe your general approach to software development and testing? For AfterGrasp, you maintained an <a href=\"https:\/\/www.aftergrasp.com\/update.txt\">update.txt<\/a> file that seems to be a daily log of your activities and design changes. 
How did you intend yourself and others to use that file?<\/h2>\n<p>I&rsquo;ve developed several styles of development depending on the team size, and who I was working with.<\/p>\n<p>When I first worked on Graphics Libraries\/Tools at CCM I would keep a master copy of all tools and libraries, and update everyone by floppy to their harddrive at the end of the day when I wouldn&rsquo;t interrupt anyone&rsquo;s work. After a mishap where I broke something, and caused a couple application developers to lose several hours, I started a staged update process. I had a couple developers work in my office, and I would hand them the updates first. They would test it out on their own projects first. Then I handed it to the more advanced developers to test. Finally after a day I would pass it out to everyone. There were a few hard feelings about some developers not getting the &ldquo;New Fixes&rdquo;, but this staged update kept everyone working.<\/p>\n<p>For PCPaint 1.0 I was almost entirely working on my own, and hearing distant feedback. Because PCPaint 1.0 was the simplest version of PCPaint, we got away with shipping it with quite a few small bugs. PCPaint 1.5 got delayed a bit because of the distant testing chain. Mouse Systems insisted in the contract that for PCPaint 2.0 if it wasn&rsquo;t ready to ship by late summer 1985 I would fly out and live at the Mouse Systems campus until it was &ldquo;bug free&rdquo;. This was the only time I&rsquo;ve ever worked with a whole staff of people testing and managing the development process. We had regular meetings on the most critical problems, and testers constantly pounding on it finding new problems, and testing for regression bugs. This ended up dragging on over a month where I lived out of a motel with no kitchen, and ate out all meals.<\/p>\n<p>For GRASP I started a beta testing BBS where regular testers got access to the latest code and could request new features\/fixes. 
We also communicated through CompuServe in the PICS forum where there was a GRASP area, and also where GIF developers not employed by CompuServe would discuss graphics coding. These places were where I got in the habit of documenting every significant change as a running commentary. I&rsquo;d discuss fixes, new features, and provide examples in a running log of messages.<\/p>\n<p>For GLPRO I ended up with five primary means of communications:<\/p>\n<ol>\n<li>Phone calls for critical business\/personal issues with Gmedia management, Investors and Developers. I had gotten in the habit of long walks starting around 3am when it was cool and quiet. Some of the longest calls were made on those walks, such as when I had a cathartic conversation with the investors as Gmedia was falling apart.<\/li>\n<li>I created an email list for beta testers\/users. I couldn&rsquo;t discuss all details like I had with GRASP beta users. So for instance specifics of the MacOS, Linux, PlayStation and other ports were not made public. An archive of those messages from 1996 to 2003 is found here: <a href=\"https:\/\/www.aftergrasp.com\/glprolist\/\">https:\/\/www.aftergrasp.com\/glprolist\/<\/a>.<\/li>\n<li>For GLPRO I got more into this habit of a single text file that kept growing since it was so useful as a record of my thoughts as I added features.<\/li>\n<li>Source control. I set up a Perforce server over an IDSL link so all the developers working on GLPRO could access the most recent versions. Sadly we only were using this a few months before Gmedia fell apart, so we didn&rsquo;t have any kind of reliable backup system working yet.<\/li>\n<li>Constant email communication, largely between Jason Gibbs and myself as well as the other developers.  
At that time I was using a mixture of OS\/2 and Windows 2000, and PMMail as my email client.<\/li>\n<\/ol>\n<p>For AfterGRASP things shrank down; I was developing mostly by myself with just some graphic artists and application developers who used AfterGRASP, but didn&rsquo;t do any programming on the C code. I also switched over to C++ instead of C. Still only using a limited subset of C++, but taking advantage of huge improvements in scope\/types. By then I also stopped coding any ASM files. Any limited assembly was done inline in the C code. I stopped using a source control system, and instead started doing automated source backups with differences so I could trace any significant changes and identify where a bug came from.<\/p>\n<p>The UPDATE.TXT for AfterGRASP became far more important because there were no realistic plans to ever produce a retail product or written manual. In the past all this documenting was intended to be processed into an end-user manual. That&rsquo;s what happened with GRASP, and GLPRO. With AfterGRASP, since we eventually gave up on selling it or producing a manual for a rapidly advancing product, that development log became CRITICAL to track how features worked. Often the best information was the runnable examples which could be copied and modified to figure out how to use a complex feature.<\/p>\n<p>When I started working more on actually using AfterGRASP to produce other products at Digi-products, I ended up documenting my work more in email, writing long descriptions of what I was working on, what was fixed, what was planned, and my thoughts on what we should do next.<\/p>\n<p>To this day, I still rely on repeated full backups with differences. Storage is so cheap that keeping everything costs almost nothing on source files. 
I still use source control occasionally when projects involve other people so we can document each change specifically and track each other&rsquo;s work.<\/p>\n<h2 id=\"q-youve-worked-with-different-companies-and-in-different-roles-do-you-have-any-advice-for-other-developers-on-the-business-side-of-the-job\">Q: You&rsquo;ve worked with different companies and in different roles. Do you have any advice for other developers on the business-side of the job?<\/h2>\n<p>Have the conversations you avoid or even dread.<\/p>\n<p>My largest regrets are when I was avoiding talking to people, using intermediaries. I&rsquo;ve always hated confrontations, or being drawn into conflicts, or talking about subjects I&rsquo;m embarrassingly ignorant of. An example was I avoided talking directly to the investors funding Gmedia\/GLpro. I literally NEVER spoke with them seriously until the company was falling apart. I left all that business to our CEO Jason, and didn&rsquo;t get involved when he was replaced as head with someone from the investors. When I finally had the BIG conversation with the investors I learned important things about what had been going on, and told them critical information that had been hidden from them. I believe if I had been in regular communication with them, just discussing business and plans, then things would have turned out differently.<\/p>\n<p>This raises another issue. You can take modesty too far. I hated &ldquo;tooting my own horn&rdquo;, and allowed others to step up and take credit. This is partly what happened at Gmedia where the investors hadn&rsquo;t realized how important I was, and how much knowledge I had about all aspects of our business.<\/p>\n<p>Only communicating with those you are comfortable with is a disaster. It has happened a couple times where I avoided talking to people I felt were usurping my authority, or were opposed to my plans. 
The worst case was when I was still quite young, around 21, and some new middle managers were hired at CCM to manage development. Both were older than me and had traditional computer science education, and it was an awkward situation where it wasn&rsquo;t clear who was under my supervision. Bob was friendly, and easy to talk to. Tom was fairly stiff and traditional. The conflict with Tom brewed for a while with me avoiding him, and dividing the office. I would only speak with the founders of the company, avoiding Tom. Developers working directly under me followed this habit of avoiding Tom, and bad-mouthing him in an unprofessional way. I even got into at least one embarrassing shouting argument. Once things were falling apart and it was clear most of the company was being let go I finally got to have a real conversation with Tom. It was a shock. I was the jerk. Tom was trying to make it all work. Tom was dealing with programmers who often slacked off, and he was struggling with me breaking the tools. I really wish I had had a few conversations with Tom early on to build up a connection and understanding. None of it mattered from a business standpoint, but from a personal standpoint I regret that adversarial relationship. I hope Tom did okay after all that.<\/p>\n<p>Although things have changed a great deal in 40 years, some advice that I would give to anyone starting out is to make connections. Go to events, go to dinners, stay in touch with classmates, offer to help people with problems, be the free tech support, show people not only that you are smart, but also HELPFUL. That you have a reputation for solving problems and getting things done. In the age where AI reviews resumes, and HR often has non-technical requirements for roles, often the only way into some positions is a personal connection. I only regret not keeping some of those connections going. Sending a Christmas card or doing a Zoom call is very low cost, and you never know where it may lead. 
Virtually every job I&rsquo;ve ever done was through a personal connection. Resumes and interviews got me nowhere.<\/p>\n<p>Bend over backwards to not offend those who can help you! An example was my wife and I didn&rsquo;t want extra kids at our wedding. This was a HUGE mistake. Partly because we had preconceived notions about children that were completely wrong. Now that we have our own children that were dis-invited to weddings\/events we understand what a huge inconvenience and insult this can be. We did allow closely related children, so friends who had children were quite upset there would be children, just not theirs. We now realize this implied that we thought their children would be a problem, and took this quite personally. We did back down, and invite all kids, but the small rift never went away, and likely led to me not being called for a couple interesting jobs.<\/p>\n<p>Some final advice I didn&rsquo;t take, but recommend to others: This is from long ago, from several accomplished people. Do not pursue computer science, but rather pursue some kind of science or engineering you are interested in where you can use your interest in computers. Although I didn&rsquo;t take this path, if I had gotten a college degree, I think that would have been a better path to more career options, and avoided some of the rapid turnover, aging out, and churn in computer work.<\/p>\n<h2 id=\"q-any-other-thoughts\">Q: Any other thoughts?<\/h2>\n<p>I&rsquo;ve always been a hermit, going a week or more without stepping out the door, and was quite lucky to meet my wife on an early version of Match.com where email was the primary communication method. I&rsquo;ve often found myself slipping into random sleeping hours, staying up late, and rolling forward, being up later and later. We had several years where I wasn&rsquo;t on a fixed schedule, and she wasn&rsquo;t working. She calls them &ldquo;The Lost Years&rdquo;. 
We watched a lot of TV, played video games, and time just disappeared. We sort of withdrew from the world, gained weight, and got little exercise.<\/p>\n<p>All that changed when we decided to start a family. We both lost a lot of weight, and had our first child in 2009. We regret that lost time, one of my few serious regrets. We would have had more children, and moved to the countryside sooner if we&rsquo;d had the will to just act! Having children has led to so many social connections that it&rsquo;s sort of crazy when I think back on how isolated we had become. Even during COVID our social net grew. My wife used the remote education options to get associate&rsquo;s and then bachelor&rsquo;s degrees, doing so well that they asked her to remote tutor multiple subjects. We tried homeschooling for the kids, but gave up, instead enrolling our children in a tiny private school that was still open for in-person classes. Through that little school our family joined the associated church, and my wife has become the full-time assistant-teacher\/assistant-principal. I still consider myself a hermit living in the countryside, often going a week without leaving our property, but with a longer list of people we call friends. I&rsquo;m even forced to take part in occasional social events, which I grudgingly admit is good for me.<\/p>\n<h2 id=\"notes\">Notes<\/h2>\n<p>This interview was conducted via email during April and May 2025.<\/p>"},{"title":"The Seymour Cray Era of Supercomputers (Review)","link":"https:\/\/ztoz.blog\/posts\/cray-era-supercomputers\/","pubDate":"Wed, 11 Jun 2025 13:41:24 -0600","guid":"https:\/\/ztoz.blog\/posts\/cray-era-supercomputers\/","description":"<p><em>The Seymour Cray Era of Supercomputers: From Fast Machines to Fast Codes<\/em> is a technical and business history of the roughly three decades when Seymour Cray dominated the development of a class of computer called the &ldquo;supercomputer&rdquo;. 
The book covers the development of the major supercomputer models, the technical decisions and trade-offs involved, and changes to the market. The book ends with SGI&rsquo;s purchase of Cray&rsquo;s assets and the transition to massively parallel processing.<\/p>\n<figure><img src=\"cover.jpg\"\n         alt=\"Book cover of &#39;The Seymour Cray Era of Supercomputers&#39;\" width=\"75%\"\/>\n<\/figure>\n\n<p><strong>The Seymour Cray Era of Supercomputers: From Fast Machines to Fast Codes<\/strong>. By Boelie Elzen and Donald MacKenzie. ACM Books. ISBN 979-8-4007-1369-9.\nDOI <a href=\"https:\/\/doi.org\/10.1145\/3705551\">10.1145\/3705551<\/a>.<\/p>\n<p>Early on, computer designs were bifurcated between the domains of &ldquo;business computing&rdquo; and &ldquo;scientific computing.&rdquo; Business computing was almost exclusively fixed-point, dealt with categorical and string data, was often I\/O bound, and had a broad base of less technical users with often similar problems (e.g. payroll). Scientific computing heavily used floating point, dealt with complex numerical data such as matrices and chains of differential equations, was often CPU bound, and featured a very small base of highly\ntechnical users with specialized and often unique problems (e.g. national laboratories).<\/p>\n<p>In 1964, this bifurcation was exemplified by the release of two products. IBM launched the IBM System\/360, a new computing platform that came to dominate business management. Control Data Corporation (CDC), co-founded by Seymour Cray, released the CDC 6600. While IBM sold more than 1,000 in its first month, the CDC took a decade to sell 100. However, the 6600 claimed the moniker &ldquo;fastest computer in the world&rdquo; and only relinquished the claim to its successor, the CDC 7600. Although IBM could take solace in finding greater monetary success in a much larger market, Thomas J. Watson Jr. 
was mortified that IBM had lost the performance crown to a tiny team with far fewer resources.<\/p>\n<p>Over ten chapters, the authors chronicle the technical developments that kept supercomputers at the performance edge for the next three decades. The book covers the development of the CRAY-1 and the other major models that came out of Cray&rsquo;s companies, as well as efforts by CDC, IBM, Texas Instruments, and Fujitsu to compete in the area. Supercomputers expanded from a small pool of government and university customers into industries such as aerospace, petroleum, and automotive, and the authors discuss each industry&rsquo;s &ldquo;hero problems&rdquo; and how the computer vendors approached the new markets. The book ends as economics shift and massively parallel processing becomes the dominant approach.<\/p>\n<p>Technology development is inherently risky and the book goes into detail on the many false starts and near misses. Although the focus of the book is usually on hardware and physical concerns such as wire propagation delay, the book pays attention to software concerns in parallel to market changes. Early on, supercomputer customers wrote almost all of their own software but, as the market grew, customers expected more software support from vendors. The latter half of the book includes discussion of operating system development, evolution of compilers (mostly Fortran) and their interplay with instruction sets, and specialized tools for specific industries (e.g. computational fluid dynamics). I was unaware of the early resistance to Unix and the concern that it was &ldquo;slow.&rdquo;<\/p>\n<p>As a criticism, the text ignores aesthetics and the personalities of the people involved. We learn very little about Seymour Cray or any other designer&rsquo;s personality or what it was like to work for them. Yet, supercomputers were deliberately designed to be beautiful or at least evocative. 
The sterility of the text and its approach misses an important aspect of why people were drawn to supercomputers and their impact on the culture.<\/p>\n<p>The book is available digitally, in paperback, and hardbound. I read the paperback version, which is printed on demand. The text and photos were crisp and highly readable, although the ink did bleed through the pages. Additionally, the left-hand internal margin was tighter than comfortable, although not so tight as to obscure any of the text.<\/p>\n<p>Recommended for those interested in the history of scientific computing, computer architecture, and high-value\/low-volume business strategy.<\/p>"},{"title":"Improving Product Discovery of Tabletop RPG Maps (Preliminary Investigation)","link":"https:\/\/ztoz.blog\/posts\/map-ml\/","pubDate":"Wed, 04 Jun 2025 09:50:00 -0600","guid":"https:\/\/ztoz.blog\/posts\/map-ml\/","description":"<p>As catalogs for digital marketplaces grow in size, customers have greater difficulty in finding products that meet their needs. Marketplace owners might improve product discovery by adding new categories, tags, or other searchable and filterable metadata to product descriptions. As it is often difficult to get product owners to update old product descriptions, a marketplace owner may use automated methods to &ldquo;backfill&rdquo; product data. We investigate using multi-modal computer models to extract data from digital products, specifically, digital maps for tabletop role playing games. Across five tasks, we find that models perform in line with their ranking on the Open VLM Leaderboard. 
State-of-the-art commercial models perform well with zero-shot feature extraction and image association\/captioning tasks, but all models perform poorly with reasoning and quantitative tasks.<\/p>\n<h2 id=\"introduction\">Introduction<\/h2>\n<p>Tabletop role playing game maps primarily serve two roles: a way for players and the game master to track positions of characters and creatures via miniatures or tokens (see battlemap below), or a depiction of an area, denoting buildings, catacombs, traps, and locations of creatures, for the game master&rsquo;s use as they run an adventure (see dungeon map below). The former is either printed and placed in the center of the table during play or displayed using virtual tabletop (VTT) software.<\/p>\n<figure><img src=\"example_vtt_map.jpg\"\n         alt=\"view of a building with interior garden and rooms with chairs tables and benches\"\/><figcaption>\n            <p>Example Battlemap\/VTT-style Map (source: Owlette Publications LLC)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<figure><img src=\"example_gm_dungeon.png\"\n         alt=\"a dungeon with three rooms and two caverns\"\/><figcaption>\n            <p>Example Game Master Dungeon Map<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Game masters need to marry a map with their intended adventure and campaign setting, so the choice of a map is a multi-factor decision. Although game masters often make their own maps for an adventure, high-quality maps require skill and creativity and can be time-consuming to make, so game masters may instead purchase maps. Multiple marketplaces exist to sell maps created by independent publishers. 
The marketplaces support product discovery via categorization filters, free-form text search, and limited recommendation systems.<\/p>\n<p>Since maps are digital goods and marketplaces have access to the product (marketplaces store the product within their databases), a marketplace may use computer vision technology to extract product data and augment its product discovery tools and database. As a means to maintain accuracy, publishers could approve any additional data being added to the product description or database. A possible activity flow for this process could be:<\/p>\n<ol>\n<li>Publisher creates product description and uploads product to Marketplace<\/li>\n<li>Model (mediated by the Marketplace) processes the product and computes additional metadata<\/li>\n<li>Publisher examines and approves additional metadata<\/li>\n<li>Customers use tools provided by the Marketplace to discover products<\/li>\n<\/ol>\n<p>Computer vision technology has improved greatly in the past few years, particularly in its ability to identify and extract objects and other features from an image. The combination of disparate neural network architectures allows models to interpret image data with semantically laden text, greatly improving performance on tasks such as image captioning and visual question answering.<\/p>\n<h2 id=\"methodology\">Methodology<\/h2>\n<p>As a preliminary study, our objective is to gain a sense of the current quality of existing models for this domain, gain insight on what engineering efforts will be required for a large-scale project, and characterize some of the likely challenges such a project would encounter.<\/p>\n<p>We evaluated zero-shot performance of five existing multi-modal neural networks using five prompts or tasks. The prompts were:<\/p>\n<ol>\n<li><code>List locations or spaces featured within the map. Format as a bullet list.<\/code><\/li>\n<li><code>List items featured within the map. 
Format as a bullet list.<\/code><\/li>\n<li><code>In units of the listed map scale or map grids, if they exist, what are the dimensions of this map?<\/code><\/li>\n<li><code>Is movement within this map mostly linear or mostly branching?<\/code><\/li>\n<li><code>List types of adventures or scenarios this map is suitable for. Format as a bullet list.<\/code><\/li>\n<\/ol>\n<p>The first two prompts focus on identifying and extracting features, the former focusing on locations where characters may be placed, while the latter on items the characters might pick up or interact with. The third prompt involves interpreting, quantitatively, a map grid or scale. The fourth prompt involves semi-subjective interpretation of the map features along with the external concept of movement. The fifth prompt tests themed open-ended captioning.<\/p>\n<p>We selected five multi-modal LLMs for the evaluation (see table below). Since we would like to determine a necessary &ldquo;minimum strength,&rdquo; we selected models across a variety of benchmark levels. 
The Avg Score column in the table acts as a proxy for the model&rsquo;s general strength (source: <a href=\"https:\/\/huggingface.co\/spaces\/opencompass\/open_vlm_leaderboard\">OpenVLM Leaderboard<\/a>).<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Short Name<\/th>\n<th style=\"text-align:left\">Model<\/th>\n<th style=\"text-align:right\">Avg Score<\/th>\n<th style=\"text-align:left\">Type<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Gemini<\/td>\n<td style=\"text-align:left\">Gemini-2.5 Pro Preview 05-06<\/td>\n<td style=\"text-align:right\">80.1<\/td>\n<td style=\"text-align:left\">API<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">GPT<\/td>\n<td style=\"text-align:left\">GPT-4.1 20250414<\/td>\n<td style=\"text-align:right\">75.9<\/td>\n<td style=\"text-align:left\">API<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Pixtral<\/td>\n<td style=\"text-align:left\">Pixtral-12B<\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:left\">OS<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Gemma3<\/td>\n<td style=\"text-align:left\">Gemma3-4B<\/td>\n<td style=\"text-align:right\">55.4<\/td>\n<td style=\"text-align:left\">OS<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LLaVA<\/td>\n<td style=\"text-align:left\">LLaVA-v1.5-7B<\/td>\n<td style=\"text-align:right\">36.9<\/td>\n<td style=\"text-align:left\">OS<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>All models were evaluated using the same core prompts. We adjusted the prompts to fit model requirements (e.g. adding &lsquo;USER&rsquo; and &lsquo;ASSISTANT&rsquo; breaks), but the core prompt was constant.<\/p>\n<p>We did not supply contextual data to any of the models. Each prompt was presented as a stand-alone invocation rather than a conversation to avoid interacting effects. 
Some models required converting images to RGB mode first, but we otherwise did not scale, crop, or convert the images for a specific model.<\/p>\n<p>For our dataset, we randomly selected products from a digital marketplace. All products were created between 2022 and 2024, inclusive. We dropped products from our selection if they were city or world maps or were not maps (e.g. collections of images meant to build new maps). Since most products include multiple maps and map variants (e.g. day and night), we selected a single map from each product, preferring ones meant for VTT use and variants with labeled content.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Processed File<\/th>\n<th style=\"text-align:right\">Size (MB)<\/th>\n<th style=\"text-align:center\">Width x Height (pixels)<\/th>\n<th style=\"text-align:left\">Type<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">M001.jpg<\/td>\n<td style=\"text-align:right\">1.2<\/td>\n<td style=\"text-align:center\">1800 x 3648<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M002.jpg<\/td>\n<td style=\"text-align:right\">10.0<\/td>\n<td style=\"text-align:center\">6000 x 4800<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M003.jpg<\/td>\n<td style=\"text-align:right\">1.5<\/td>\n<td style=\"text-align:center\">2380 x 2380<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M004.jpg<\/td>\n<td style=\"text-align:right\">0.9<\/td>\n<td style=\"text-align:center\">2160 x 2880<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M005.jpg<\/td>\n<td style=\"text-align:right\">8.1<\/td>\n<td style=\"text-align:center\">8400 x 8400<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M006.jpg<\/td>\n<td style=\"text-align:right\">0.9<\/td>\n<td style=\"text-align:center\">2523 x 
3220<\/td>\n<td style=\"text-align:left\">GM<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M007.jpg<\/td>\n<td style=\"text-align:right\">2.1<\/td>\n<td style=\"text-align:center\">2523 x 3220<\/td>\n<td style=\"text-align:left\">GM<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M008.jpg<\/td>\n<td style=\"text-align:right\">0.6<\/td>\n<td style=\"text-align:center\">1400 x 1400<\/td>\n<td style=\"text-align:left\">VTT<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We ran the three open-source models on a 16-core, 64 GB virtual machine without an attached GPU. Our code used the Hugging Face Transformers library. We called the two API models using their commercial endpoints.<\/p>\n<h2 id=\"results-and-evaluation\">Results and Evaluation<\/h2>\n<p>For each of the five prompts, we evaluated the quality of output with prompt-specific measures and human judgement. Although we selected the sample images randomly, this study does not include sufficient data points for a significance test. Instead, as a preliminary study, we want a rough measure of baseline capability in order to identify potential research directions and likely necessary data engineering efforts.<\/p>\n<p>For transparency, we report the output of the models for each task within a table. However, the output has been edited for uniformity and space. The models often output filler sentences, repeat themselves, and change formatting. Any large-scale application of these models will require an engineering effort to standardize the outputs.<\/p>\n<h3 id=\"inference-time-and-money\">Inference Time and Money<\/h3>\n<p>Inference times in seconds, not including any file upload times, were:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Model<\/th>\n<th style=\"text-align:right\">Min. Inf Time<\/th>\n<th style=\"text-align:right\">Med. Inf Time<\/th>\n<th style=\"text-align:right\">Max. 
Inf Time<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Gemma3<\/td>\n<td style=\"text-align:right\">118<\/td>\n<td style=\"text-align:right\">122<\/td>\n<td style=\"text-align:right\">189<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LLaVA<\/td>\n<td style=\"text-align:right\">235<\/td>\n<td style=\"text-align:right\">261<\/td>\n<td style=\"text-align:right\">326<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Pixtral<\/td>\n<td style=\"text-align:right\">905<\/td>\n<td style=\"text-align:right\">1193<\/td>\n<td style=\"text-align:right\">1425<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Gemini<\/td>\n<td style=\"text-align:right\">31<\/td>\n<td style=\"text-align:right\">49<\/td>\n<td style=\"text-align:right\">290<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">GPT<\/td>\n<td style=\"text-align:right\">33<\/td>\n<td style=\"text-align:right\">39<\/td>\n<td style=\"text-align:right\">50<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Our intent here is not to rigorously benchmark inference performance but to give a sense of the scale of latencies. The APIs, which are presumably tuned and run on high-end machines, typically needed thirty to sixty seconds to evaluate five prompts on an image. Run locally on a high-end machine that lacks a GPU, inferences took multiple minutes to complete.<\/p>\n<p>The marginal cost of calling the API services was between $0.002 and $0.003 per image. The APIs use a variety of factors to calculate a cost in terms of tokens, but this calculation is dominated by the size of the image in pixels.<\/p>\n<p>To compute the marginal cost of local inference, we use the cost of running a virtual machine per hour and processing throughput. The virtual machine used for this study costs $0.54\/hour. 
At the median inference times above, this translates to costs from $0.018 to $0.18 per image processed, depending on the model.<\/p>\n<p>At the API rates above, we expect an operator could process 10,000 images for less than $100.<\/p>\n<h3 id=\"prompt-1-locations\">Prompt 1: Locations<\/h3>\n<blockquote>\n<p>List locations or spaces featured within the map. Format as a bullet list.<\/p>\n<\/blockquote>\n<p>This task tests the model&rsquo;s performance at finding map features that are conceptually a &ldquo;room&rdquo; or defined space. Characters should be able to occupy a location. For each response, we apply a subjective criterion: the list must describe rooms that are accurate and meaningful in a product query.<\/p>\n<p>We&rsquo;ve summarized the results in the table below. Both Gemini and GPT extracted useful details consistently from the images, although tuning or other filters may be required to make the outputs useful. The three open-source models typically had poor quality. When labels were present, all models preferred to extract those labels rather than interpret details in the image.<\/p>\n<table>\n<thead>\n<tr>\n<th>Processed 
File<\/th>\n<th>Gemma3<\/th>\n<th>LLaVA<\/th>\n<th>Pixtral<\/th>\n<th>Gemini<\/th>\n<th>GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M001.jpg<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<\/tr>\n<tr>\n<td>M002.jpg<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<\/tr>\n<tr>\n<td>M003.jpg<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Fail<\/td>\n<\/tr>\n<tr>\n<td>M004.jpg<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<\/tr>\n<tr>\n<td>M005.jpg<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<\/tr>\n<tr>\n<td>M006.jpg<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<\/tr>\n<tr>\n<td>M007.jpg<\/td>\n<td>Fail<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<td>Fail<\/td>\n<\/tr>\n<tr>\n<td>M008.jpg<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Fail<\/td>\n<td>Pass<\/td>\n<td>Pass<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4 id=\"details\">Details<\/h4>\n<p>M001 shows the deck of a ship. There is one room to the stern with a hammock, table, barrels and a crate. There is also a small room to the stern holding an anchor. The middle area hosts four cannon, two to a side, along with more barrels, crates, a ladder, and a grate. The aft is separated into two rooms, each with a hammock and a table with chair. All the models pick up on the naval theme. Pixtral picks up on the contents of rooms, but none of the rooms themselves. Gemma3 reports a &ldquo;Captain&rsquo;s Quarters&rdquo; and a &ldquo;Navigation Station,&rdquo; but it is unclear which room corresponds to either description. Similarly, Gemini uses general terms to refer to sections of a boat, without reference to specific rooms on the map, and reports &ldquo;lifeboats\/rafts&rdquo;. 
GPT fares better by referring to &ldquo;Crew quarters&rdquo; and &ldquo;Private cabins&rdquo; without suggesting specific roles.<\/p>\n<p>M002 is a cliff face, dominated by a large skull where the mouth opens into a cave. The cliff face is embedded with single-room buildings connected by stairs. The ground features a small camp and some streams crossed with bridges. None of the models pick up on the skull imagery, although all models except Pixtral report an entrance to a cave. Gemma3 fails to identify the buildings and the camp, while LLaVA combines these features into a generic &ldquo;small village&rdquo; label and inflates the small rocks in the streams into islands. GPT provides the most complete and detailed list, followed by Gemini.<\/p>\n<p>M003 is a large tree with a small opening that leads to an interior, hollowed-out room that is likely a lair for a creature. LLaVA directly extracts the hollowed-out nature of the room (although it enters a loop and keeps repeating the fact). Gemini reports the fact somewhat indirectly, referring to it as a &ldquo;central clearing\/nest&rdquo;. The other three models report the existence of the tree and note other elements of the image, but not the interesting part of the image: the hollow.<\/p>\n<p>M004 is a complex map, featuring a bridge leading to a fortress of rooms, with internal paths and stairways connecting detailed rooms. LLaVA fails and simply repeats &ldquo;A dungeon with a staircase&rdquo;. Apart from Gemini, the models associate rooms with functions in a non-apparent manner. Is a room with a table and three seating places the &ldquo;dining hall&rdquo;? Which room is a &ldquo;jail&rdquo; or &ldquo;prison&rdquo;?<\/p>\n<p>M005 is a difficult map for a human to interpret since the color range is low contrast and the buildings are largely collapsed with few distinguishing marks. Of the open-source models, Pixtral does best by listing &ldquo;structures&rdquo;. 
None of the open source models identify the stream or river in the picture, but both API models do. Gemini&rsquo;s descriptions are matter-of-fact, while GPT&rsquo;s are more detailed and add color (not necessarily supported by the image).<\/p>\n<p>Both M006 and M007 use traditional RPG cartography as found in TSR publications in the 80s and 90s (similar to the GM map above). Rooms are labeled and, while doors, traps, and secret passages are marked, rooms are not populated with images of their contents. Gemini and GPT extracted all the room labels and other markers from the map. However, GPT also attached descriptions to each room number that fail to follow the actual room layouts. Of the three open source models, Pixtral performed the best by extracting about one-third of the rooms in M006&rsquo;s case and 80% in M007&rsquo;s case. From a product discovery perspective, extracting room labels is not useful by itself, as customers are not searching for a map that includes &ldquo;room 33&rdquo;. However, customers may be interested in maps with a certain number or range of rooms, so the output may be useful as part of an analysis pipeline.<\/p>\n<p>Map M008 is an octagonal bar or tavern. Gemma3 and Pixtral extract elements of the image, but fail to extract the meaning of the elements. 
LLaVA, Gemini, and GPT identify the bar and the other dining\/drinking elements.<\/p>\n<h4 id=\"responses\">Responses<\/h4>\n<table>\n<thead>\n<tr>\n<th>Processed File<\/th>\n<th>Ground Truth<\/th>\n<th>Gemma3<\/th>\n<th>LLaVA<\/th>\n<th>Pixtral<\/th>\n<th>Gemini<\/th>\n<th>GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M001.jpg<\/td>\n<td>(5 locations)<br>Three living areas\/berths<br>Midship\/Cannon firing<br>Anchor storage<\/td>\n<td>Deck<br>Captain&rsquo;s Quarters<br>Navigation Station<\/td>\n<td>Cabin<br>Storage area<br>Wooden planks<br>Sail<br>Anchor<br>Rope<br>Sword<br>Ship<br>Cannon<br>Barrel<br>Chest<\/td>\n<td>Sail<br>Barrel<br>Anchor<br>Rope<br>Cannon<br>Ladder<br>Grating<br>Wheel<br>Oar<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<br>Barrel<\/td>\n<td>Deck<br>Cannon emplacements<br>Central grate\/hatch<br>Bow (forecastle area with anchor, ropes, and supplies)<br>Stern (aftcastle area with cargo)<br>Lifeboats\/rafts<\/td>\n<td>Forecastle deck<br>Storage areas<br>Weapon\/armory area<br>Cannon stations<br>Main deck<br>Stairwell\/ladder access<br>Crew quarters<br>Private cabins or enclosed spaces<br>Miscellaneous deck equipment<\/td>\n<\/tr>\n<tr>\n<td>M002.jpg<\/td>\n<td>House\/Bedroom (5)<br>Bridge (3)<br>Entry to cave<br>Camp<br>Rivers<\/td>\n<td>Large Mountain<br>Cave Entrance<br>River (Two connected streams)<\/td>\n<td>A river flowing through the middle of the map<br>A waterfall in the middle of the map<br>A cave with a waterfall inside<br>A bridge over the river<br>A small village with buildings and a water source<br>A large mountain with a waterfall on it<br>A small island in the middle of the river<br>A large island in the middle of the river<br>A small island in the middle<\/td>\n<td>House with multiple windows<br>Open area with a well<br>House with a large door<br>House with a small entrance<br>House with a large entrance<br>House with a small window<br>House with a large entrance and a 
small window<br>House with a small entrance and a small window<br>House with a large entrance and a small window<br>House with a small entrance and a large window<br>House with a large<\/td>\n<td>Cave entrance<br>River<br>Stone bridges<br>Buildings\/Huts<br>Staircases<br>Tents\/Makeshift shelters<br>Grassy areas\/Vegetation<br>Rocky cliffs\/Cave walls<br>Small pond\/Pool of water<br>Central plaza\/Open area<\/td>\n<td>Cave entrance in the cliff face<br>Rocky cliffside and surrounding rock formations<br>Multiple interior rooms built into the cliffside<br>Room with beds and tables<br>Room with chests and barrels<br>Room with beds<br>Room with beds and a table<br>Room with crates and supplies<br>Water streams and river systems running through the map<br>Multiple stone bridges crossing the water<br>A central tent camp area with several tents<br>Grassland and forested areas surrounding the water<br>Scattered trees, shrubs, and stones throughout the landscape<\/td>\n<\/tr>\n<tr>\n<td>M003.jpg<\/td>\n<td>Room inside tree hollow<\/td>\n<td>Central massive tree with a large root system<br>Small patches of grass<\/td>\n<td>A large tree with a hole in the center<br>A small tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the center<br>A tree with a hole in the<\/td>\n<td>Central Tree Trunk<br>Upper Branches<br>Lower Branches<br>Surrounding Bushes<br>Ground Cover<br>Small Rocks<br>Fallen Leaves<br>Small Plants<br>Grass Patches<\/td>\n<td>The Great Tree<br>Central Clearing\/Nest<br>Surrounding Forest Floor<br>Tree Stumps<br>Bone Pile<br>Debris\/Wreckage<br>Thickets\/Bushes<br>Smaller Trees<br>Fallen Logs\/Branches<\/td>\n<td>Large central tree stump<br>Exposed and sprawling tree roots area<br>Central pool or depression with muddy water<br>Broken wooden debris in the central depression<br>Small 
clusters of bushes and shrubs around the roots<br>Scattered tree stumps throughout the map<br>Various fallen logs and branches<br>Surrounding dense forest\/wooded edge<br>Patches of colorful foliage<br>Open grassy clearings between roots and trees<\/td>\n<\/tr>\n<tr>\n<td>M004.jpg<\/td>\n<td>Bridge<br>Gate<br>Magic defense rings<br>10 Rooms<\/td>\n<td>Main Entrance<br>Grand Hall<br>Library<br>Armory<\/td>\n<td>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<br>A dungeon with a staircase<\/td>\n<td>Entrance\/Exit<br>Hallway<br>Room with Red Light<br>Control Room<br>Storage Room<br>Laboratory<br>Prison Cells<br>Torture Chamber<br>Dungeon<br>Bridge<br>Chasm<\/td>\n<td>Chasm\/Abyss<br>Fortified walkways\/ledges along the chasm<br>Balconies with glowing teal orbs and spikes<br>Rooms (possibly barracks, studies, or chambers)<br>Stairways<br>Areas with barrels and crates (storage)<br>Circular platforms with magical symbols<br>Wooden bridges or walkways connecting upper levels<br>Corridors\/Passages<br>Small alcoves or guard posts along the walls<\/td>\n<td>Entry hallway\/corridor<br>Main hall or crossroads<br>Stairs<br>Small bedrooms or chambers<br>Armory or storage rooms<br>Dining area or mess hall<br>Office or study<br>Ritual or summoning room<br>Jail or holding cells<br>Guard posts or sentry rooms<br>Bridge<br>Balcony\/ledge overlooking a chasm<br>Arcane or crystal chambers<br>Tentacle-filled chasm\/abyss<br>Storage alcoves with barrels and sacks<\/td>\n<\/tr>\n<tr>\n<td>M005.jpg<\/td>\n<td>Rooms<br>Stream<\/td>\n<td>Central Chamber<br>Collapsed Tunnel 1<br>Collapsed Tunnel 2<\/td>\n<td>A large, dark room<br>A small, dark room<br>A large, dark room with a clock<br>A small, dark room with a clock<br>A large, dark room with a clock and a small, dark room with a clock<br>A large, dark room with a clock and a small, 
dark room with a clock<br>A large, dark room with a clock and a small, dark room with a clock<\/td>\n<td>Top-left circular structure<br>Top-right circular structure<br>Top-center rectangular structure<br>Middle-left square structure<br>Middle-right square structure<br>Bottom-left square structure<br>Bottom-right square structure<br>Central irregularly shaped area<\/td>\n<td>Glowing Purple River<br>Star-Patterned Circular Chamber<br>Staired Circular Dais\/Platform<br>Northern Ruins<br>Southern Ruins<br>Rocky Island<\/td>\n<td>Circular ritual or summoning chamber<br>Large ruined rooms or halls<br>Central underground river or stream<br>Rocky outcroppings and cave formations along the river<br>Isolated rocky island or promontory (center of river)<br>Small ruined structure or shrine<br>Passageways and corridors connecting various rooms and spaces<br>Arched or circular entrance\/exit<br>Crumbled and partially-collapsed areas throughout map<\/td>\n<\/tr>\n<tr>\n<td>M006.jpg<\/td>\n<td>Rooms (60)<\/td>\n<td>11<br>12<br>13<br>14<\/td>\n<td>10..27<\/td>\n<td>1..19a<\/td>\n<td>1..58 (all room labels)<\/td>\n<td>Room 1  .. Room 60<br>Multiple staircases<br>Multiple well\/circle rooms<br>Several small antechambers and corridors<br>Spaces marked &quot;b&quot; and &quot;c&quot; (possibly doors or barriers)<br>Decorative\/statue symbols in some rooms<\/td>\n<\/tr>\n<tr>\n<td>M007.jpg<\/td>\n<td>Rooms (24)<br>Forest area<\/td>\n<td>24<br>23<br>22<br>21<\/td>\n<td>23<br>24<br>Lists 25..40 (none exist)<\/td>\n<td>1..20<\/td>\n<td>1..24 (all room labels)<\/td>\n<td>24 entries described like so. Room descriptions do not match the map<br>1. Cave entrance (northernmost entry)<br>2. Small cavern<br>3. Room with barrier\/door<br>4. Chamber with central structure<br>5. 
Large winding corridor<\/td>\n<\/tr>\n<tr>\n<td>M008.jpg<\/td>\n<td>Outside table (4)<br>Inside bar<\/td>\n<td>Octagonal Table<br>Stone Floor<br>Walls<br>Chairs<\/td>\n<td>A large wooden structure<br>A dining area with tables and chairs<br>A bar with a variety of drinks<br>A large wooden table<br>A large wooden table<br>A large wooden table<br>A large wooden table<br>A large wooden table<\/td>\n<td>Central Table<br>Barrels<br>Benches<br>Tables<br>Floor<\/td>\n<td>Central Hexagonal Building\/Bar<br>Bar Counter<br>Service Area (behind the bar)<br>Spiral Staircase<br>Seating Area (inside)<br>Awnings<br>Outdoor Paved Area<br>Outdoor Tables<\/td>\n<td>Central main building\/structure<br>Central bar or serving counter<br>Multiple colored seating areas (tables and chairs) around the main room<br>Four exterior market stalls or vendor stands<br>Four exterior round tables with benches<br>Surrounding open plaza or paved area<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"prompt-2-items\">Prompt 2: Items<\/h3>\n<blockquote>\n<p>List items featured within the map. Format as a bullet list.<\/p>\n<\/blockquote>\n<p>This task is intended to evaluate sub-image identification within a larger image, focusing on images that are &ldquo;items&rdquo; &mdash; things characters could interact with, versus rooms the characters could inhabit. This is a feature associated with &ldquo;battle maps&rdquo; or maps that will be seen by players, versus maps only seen by the game master. Thus, we remove M005, M006, and M007 from the evaluation of this task. M005 is excluded because the map lacks item details, and M006 and M007 are excluded because they are not battle maps.<\/p>\n<p>This task led the LLaVA model to repeat itself. For clarity, we have removed duplicate entries.<\/p>\n<p>For this task, we use Jaccard similarity, also called intersection over union, to quantify how effectively the models are finding all the items. 
The score ranges from zero to one, where one would indicate the model replied with all of the items in the ground truth and without any extras. Mathematically:<\/p>\n<p>$$\nJ(A, B) = \\frac{|A \\cap B|}{|A \\cup B|}\n$$<\/p>\n<p>For our evaluation, we ignore plurality and counts. For example, &ldquo;cannon&rdquo; is treated as a correct answer for M001, as well as &ldquo;cannons&rdquo; and &ldquo;cannon (6)&rdquo;, even though there are only four.<\/p>\n<p>The scores for the models on M001, M002, M003, M004, and M008:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Processed File<\/th>\n<th style=\"text-align:right\">Gemma3<\/th>\n<th style=\"text-align:right\">LLaVA<\/th>\n<th style=\"text-align:right\">Pixtral<\/th>\n<th style=\"text-align:right\">Gemini<\/th>\n<th style=\"text-align:right\">GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">M001.jpg<\/td>\n<td style=\"text-align:right\">2 \/ 14 (0.14)<\/td>\n<td style=\"text-align:right\">5 \/ 15 (0.33)<\/td>\n<td style=\"text-align:right\">6 \/ 16 (0.38)<\/td>\n<td style=\"text-align:right\">10 \/ 14 (0.71)<\/td>\n<td style=\"text-align:right\">12 \/ 14 (0.86)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M002.jpg<\/td>\n<td style=\"text-align:right\">0 \/ 7 (0)<\/td>\n<td style=\"text-align:right\">0 \/ 11 (0)<\/td>\n<td style=\"text-align:right\">1 \/ 10 (0.10)<\/td>\n<td style=\"text-align:right\">4 \/ 16 (0.25)<\/td>\n<td style=\"text-align:right\">4 \/ 13 (0.31)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M003.jpg<\/td>\n<td style=\"text-align:right\">0 \/ 4 (0)<\/td>\n<td style=\"text-align:right\">0 \/ 4 (0)<\/td>\n<td style=\"text-align:right\">1 \/ 11 (0.09)<\/td>\n<td style=\"text-align:right\">3 \/ 11 (0.27)<\/td>\n<td style=\"text-align:right\">2 \/ 11 (0.18)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M004.jpg<\/td>\n<td style=\"text-align:right\">0 \/ 20 (0)<\/td>\n<td style=\"text-align:right\">0 \/ 20 (0)<\/td>\n<td 
style=\"text-align:right\">2 \/ 20 (0.10)<\/td>\n<td style=\"text-align:right\">20 \/ 20 (1.00)<\/td>\n<td style=\"text-align:right\">20 \/ 20 (1.00)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M008.jpg<\/td>\n<td style=\"text-align:right\">2 \/ 7 (0.29)<\/td>\n<td style=\"text-align:right\">2 \/ 5 (0.4)<\/td>\n<td style=\"text-align:right\">3 \/ 7 (0.43)<\/td>\n<td style=\"text-align:right\">5 \/ 9 (0.56)<\/td>\n<td style=\"text-align:right\">4 \/ 11 (0.36)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For the open source models, their errors were often ones of omission &ndash; not reporting elements of an image. In contrast, the API model scores were damaged by reporting non-existent elements or reporting elements that did not fit within the definition of an item. This is a case where further prompt engineering might significantly improve the quality of the answers.<\/p>\n<h4 id=\"responses-1\">Responses<\/h4>\n<table>\n<thead>\n<tr>\n<th>Processed File<\/th>\n<th>Ground Truth<\/th>\n<th>Gemma3<\/th>\n<th>LLaVA<\/th>\n<th>Pixtral<\/th>\n<th>Gemini<\/th>\n<th>GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M001.jpg<\/td>\n<td>Hammock<br>Barrels<br>Crates<br>Sword<br>Table<br>Bottles<br>Anchor<br>Rope coil<br>Cannon (4)<br>Ladder<br>Grate<br>Mast bases<\/td>\n<td>Wooden Deck<br>Anchor<br>Two Cannons<br>Cannon Balls<\/td>\n<td>Wooden barrels<br>Sword<br>Anchor<br>Book<br>Rope<br>Clock<br>Chest<br>Barrels<\/td>\n<td>Barrels<br>Anchor<br>Ladder<br>Grating<br>Ropes<br>Cannon<br>Oars<br>Chests<br>Wooden Planks<br>Wheel<\/td>\n<td>Cannons<br>Wooden barrels<br>Wooden crates<br>Lifeboats (or folded sails)<br>Ladder<br>Grate<br>Mast bases (circular objects)<br>Anchor<br>Rope coil<br>Bottles<br>Cutlass\/Sword<br>Planks\/Decking (as part of the ship structure)<\/td>\n<td>Cannons (6)<br>Barrels (several)<br>Wooden crates (several)<br>Hammocks (4)<br>Anchor<br>Rope coil<br>Table with a sword and bottles<br>Ladders (2)<br>Grate\/hatch<br>Round hatches\/openings (3)<br>Wooden 
planks (flooring)<br>Small chest\/box<br>Small round table or stool<\/td>\n<\/tr>\n<tr>\n<td>M002.jpg<\/td>\n<td>Tent<br>Bed<br>Table<br>Crates<\/td>\n<td>A large cave<br>A river<br>Stone buildings<\/td>\n<td>Waterfall<br>River<br>Water source<br>Cave<br>Mountain<br>Buildings<br>Trees<\/td>\n<td>Several buildings, some with windows and doors<br>Multiple bridges crossing over waterways<br>A winding river or stream<br>Various patches of green vegetation<br>Rocky terrain with large boulders and outcrops<br>A few small structures resembling huts or shelters<\/td>\n<td>Cave entrance<br>River\/Stream<br>Waterfalls<br>Stone bridges (two)<br>Houses\/Buildings<br>Stairs<br>Trees\/Bushes<br>Grassy areas<br>Rocks\/Boulders<br>Tents\/Tarpaulins<br>Pond\/Small body of water<br>Crates and Barrels<br>Beds<br>Tables and Chairs<br>Fireplaces<br>Pathways\/Dirt tracks<\/td>\n<td>Mountain\/cliff face with a large cave entrance<br>Flowing river\/stream starting from the cave and branching out<br>Stone bridges crossing the river\/stream<br>Rocky terrain with patches of grass and greenery<br>Multiple tents set up near the river<br>Various stone buildings\/rooms carved into the mountain\/cliff sides<br>Furnishings inside the buildings such as beds, tables, barrels, crates, and sacks<br>Stairs leading up to the stone buildings\/rooms<br>Ruined or broken bridge<br>Small pond or pool<br>Freestanding boulders and rocks scattered throughout<br>Trees and bushes dotting the landscape<br>Pathways and trails winding between features<\/td>\n<\/tr>\n<tr>\n<td>M003.jpg<\/td>\n<td>Shrubs<br>Tree stumps<br>Pile of bones<\/td>\n<td>A massive, ancient tree with a wide, gnarled root system<\/td>\n<td>A large tree with a hole in the center<\/td>\n<td>Large tree with extensive roots<br>Small plants<br>Bushes<br>Rocks<br>Fallen leaves<br>Branches<br>Moss<br>Flowers<br>Grass patches<\/td>\n<td>Giant tree roots<br>Campfire<br>Pile of bones<br>Trees<br>Bushes<br>Tree stumps<br>Fallen logs<br>Mushrooms<br>Red 
berries or fruit<br>Grassy ground<br>Dirt\/muddy patch<\/td>\n<td>Giant tree stump with extensive roots<br>Central hollow area within the stump, partially filled with water or mud<br>Broken wooden cart or debris in the central hollow<br>Various bushes and shrubs scattered around the roots<br>Tree stumps from felled trees<br>Fallen logs and branches<br>Rocks and boulders<br>Diverse trees with autumn-colored foliage surrounding the area<br>Small patches of flowers or undergrowth<br>Leaf litter and scattered debris<\/td>\n<\/tr>\n<tr>\n<td>M004.jpg<\/td>\n<td>Special<\/td>\n<td>A large, damp, stone room, reminiscent of a cavern<\/td>\n<td>A dungeon with a staircase<\/td>\n<td>Multiple rooms with various symbols and colors<br>Red symbol in a room at the top left<br>Yellow symbol in a room at the top center<br>Blue symbol in a room at the top right<br>Central corridor with two prominent rooms<br>Left room with a yellow symbol<br>Right room with a red symbol<br>Mechanical elements<br>Tentacle-like structures with green gems<br>Chains and<\/td>\n<td>Stone walls and floors<br>Wooden bridges<br>Chains<br>Large, tentacle-like appendages<br>Barrels<br>Crates<br>Tables<br>Chairs<br>Rugs\/Carpets<br>Beds or cots<br>Stairs<br>Torches or lanterns providing light<br>Books, scrolls, or papers<br>A pentagram or similar magical symbol on the floor<br>An abyss or chasm below the structures<br>Wooden planking or scaffolding<br>Doors or archways<br>Chests or coffers<br>Tapestries or cloth hangings<br>Pillars or columns<br>Balconies or ledges<br>Large, spiked, metallic objects<br>A circular stone platform or area<\/td>\n<td>Wooden stairs and ladders<br>Stone staircases<br>Various tables and chairs<br>Rugs and carpets<br>Chests and crates<br>Beds<br>Bookcases<br>Barrels<br>Wall torches<br>Stone archways<br>Mounted shields and wall decorations<br>Desk with books and papers<br>Magical circles or glowing symbols<br>Large chains<br>Stone bridge over a chasm<br>Giant, glowing crystals embedded 
in fixtures<br>Massive tentacles emerging from the darkness below<br>Closed chests and cabinets<br>Armories or weapon racks<br>Sectioned and barred rooms\/cells or vaults<\/td>\n<\/tr>\n<tr>\n<td>M008.jpg<\/td>\n<td>Bar table<br>Chairs<br>Barrels<br>Bottle rack<br>Tables (round)<\/td>\n<td>Octagonal Table<br>Wooden Chairs<br>Stone Flooring<br>Lantern<\/td>\n<td>Tables<br>Chairs<\/td>\n<td>Tables<br>Barrels<br>Mugs<br>Bench<br>Wooden Bar<\/td>\n<td>Central hexagonal bar\/kiosk structure<br>Wooden flooring<br>Cobblestone paving<br>Awnings<br>Barrels<br>Spiral staircase<br>Shelves\/racks<br>Round wooden tables<br>Chairs\/stools<\/td>\n<td>Large central octagonal structure (building\/tavern layout)<br>Hexagonal bar counter in the center<br>Barrels and mugs on the bar counter<br>Bar stools in red, blue, yellow, and green arranged around the bar counter<br>Booths or low couches\/benches around the interior edge of the building<br>Semi-circular stairwell behind the bar<br>Four large market stalls\/tents surrounding the octagonal building<br>Four round wooden tables with four chairs each located outside the building<br>Wooden plank floor inside the building<br>Cobblestone pavement outside the building and under the stalls\/tables<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"prompt-3-grid-dimensions\">Prompt 3: Grid Dimensions<\/h3>\n<blockquote>\n<p>In units of the listed map scale or map grids, if they exist, what are the dimensions of this map?<\/p>\n<\/blockquote>\n<p>This task involves identifying a map feature (a grid) and then counting squares. 
If a model performs well at this task, a marketplace might be able to offer geographical information system-like filters.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Processed File<\/th>\n<th style=\"text-align:left\">Ground Truth<\/th>\n<th style=\"text-align:left\">Gemma3<\/th>\n<th style=\"text-align:left\">LLaVA<\/th>\n<th style=\"text-align:left\">Pixtral<\/th>\n<th style=\"text-align:left\">Gemini<\/th>\n<th style=\"text-align:left\">GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">M001.jpg<\/td>\n<td style=\"text-align:left\">4x11<\/td>\n<td style=\"text-align:left\">20x20<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">10x20<\/td>\n<td style=\"text-align:left\">5x10<\/td>\n<td style=\"text-align:left\">6x16<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M002.jpg<\/td>\n<td style=\"text-align:left\">40x30<\/td>\n<td style=\"text-align:left\">800 squares<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">40x30<\/td>\n<td style=\"text-align:left\">30x20<\/td>\n<td style=\"text-align:left\">40x22<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M003.jpg<\/td>\n<td style=\"text-align:left\">34x34<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">10x10<\/td>\n<td style=\"text-align:left\">10x10<\/td>\n<td style=\"text-align:left\">20x20<\/td>\n<td style=\"text-align:left\">28x28<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M004.jpg<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">1000x1000<\/td>\n<td style=\"text-align:left\">10x15<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M005.jpg<\/td>\n<td style=\"text-align:left\">60x60<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">1000x1000<\/td>\n<td 
style=\"text-align:left\">40x40<\/td>\n<td style=\"text-align:left\">30x30<\/td>\n<td style=\"text-align:left\">28x20<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M006.jpg<\/td>\n<td style=\"text-align:left\">34x44<\/td>\n<td style=\"text-align:left\">60x60<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">40x50<\/td>\n<td style=\"text-align:left\">32x30<\/td>\n<td style=\"text-align:left\">34x43<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M007.jpg<\/td>\n<td style=\"text-align:left\">34x44<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">24x32<\/td>\n<td style=\"text-align:left\">20x25<\/td>\n<td style=\"text-align:left\">30x25<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M008.jpg<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">non answer<\/td>\n<td style=\"text-align:left\">10x10<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<td style=\"text-align:left\">no grid<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>All of the models failed this task. The two most advanced models, Gemini and GPT, successfully gave a &ldquo;cannot be determined&rdquo; answer when the maps lacked a grid or any scale marker. (Contrast this with &ldquo;non answer&rdquo;, which denotes a response made up of filler phrases.) However, they were otherwise highly unreliable with their grid counts, even if they were numerically close in some cases.<\/p>\n<p>This task will be difficult for LLMs. First, since LLMs work on a tiling of the source images, the image will usually be distorted in order to be standardized. Second, this task involves counting, for which neural networks are not ideal. If this were an important element for product discovery, we expect a specialized model using a convolutional neural network (CNN) to extract the grids, or perhaps a more direct technique (e.g. 
Hough transform), would be efficient and reliable.<\/p>\n<h3 id=\"prompt-4-linear-or-branching-movement\">Prompt 4: Linear or Branching Movement<\/h3>\n<blockquote>\n<p>Is movement within this map mostly linear or mostly branching?<\/p>\n<\/blockquote>\n<p>This task involves a higher-order understanding of the graph or connected nature of rooms, along with an understanding of how characters move along the network. Practically, this distinction is an important element of adventure design.<\/p>\n<table>\n<thead>\n<tr>\n<th>Processed File<\/th>\n<th>Ground Truth<\/th>\n<th>Gemma3<\/th>\n<th>LLaVA<\/th>\n<th>Pixtral<\/th>\n<th>Gemini<\/th>\n<th>GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M001.jpg<\/td>\n<td>Linear<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly linear<\/td>\n<\/tr>\n<tr>\n<td>M002.jpg<\/td>\n<td>Branching<\/td>\n<td>Non answer<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<\/tr>\n<tr>\n<td>M003.jpg<\/td>\n<td>Linear<\/td>\n<td>Non answer<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<\/tr>\n<tr>\n<td>M004.jpg<\/td>\n<td>Branching<\/td>\n<td>Mix<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<\/tr>\n<tr>\n<td>M005.jpg<\/td>\n<td>Branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly linear<\/td>\n<\/tr>\n<tr>\n<td>M006.jpg<\/td>\n<td>Branching<\/td>\n<td>Non answer<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<\/tr>\n<tr>\n<td>M007.jpg<\/td>\n<td>Linear (Complex)<\/td>\n<td>Non answer<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly 
branching<\/td>\n<\/tr>\n<tr>\n<td>M008.jpg<\/td>\n<td>Linear<\/td>\n<td>Primarily linear<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly linear<\/td>\n<td>Mostly branching<\/td>\n<td>Mostly branching<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we treat this task as a boolean classifier, Pixtral and Gemini both scored the best with 5 correct out of 8. GPT scored next with 4 correct out of 8, and Gemma3 and LLaVA each scored 3 out of 8. (We graded Gemma3&rsquo;s non answers as incorrect answers, but they could be graded more severely.) Overall, the results suggest performance no better than a coin flip. However, this is a subjective task, and we expect that if more human raters were involved in building a labeled dataset, inter-rater reliability would be low.<\/p>\n<p>The reasoning text provided by Gemini and GPT suggests the models are recognizing the graph-like nature of the maps and how movement flows through passages and doors. With further experimentation and a more developed dataset, higher performance may be unlocked.<\/p>\n<h3 id=\"prompt-5-scenarios\">Prompt 5: Scenarios<\/h3>\n<blockquote>\n<p>List types of adventures or scenarios this map is suitable for. Format as a bullet list.<\/p>\n<\/blockquote>\n<p>This task is an open-ended generation of adventure ideas or classifications. This is similar to image caption generation. Alternatively, a marketplace owner could turn this into a classification problem by restricting the list of scenarios to a defined set. Publishers, of course, could use this technology to augment their own product descriptions with lists of creative ways the maps could be used in games.<\/p>\n<p>Because this is an open-ended generation task, we use a metric modeled after signal-to-noise. (A method like <a href=\"https:\/\/arxiv.org\/abs\/1411.5726\">CIDEr<\/a> would be more rigorous, but more expensive to implement.) Each adventure or scenario line will be assigned to either the &ldquo;signal&rdquo; or &ldquo;noise&rdquo; category. 
The score is the ratio of the count of &ldquo;signal&rdquo; responses to the count of &ldquo;noise&rdquo; responses plus one (the added one avoids division by zero). Mathematically:<\/p>\n<p>$$\nh = \\frac{|s|}{|n| + 1}\n$$<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Processed File<\/th>\n<th style=\"text-align:right\">Gemma3<\/th>\n<th style=\"text-align:right\">LLaVA<\/th>\n<th style=\"text-align:right\">Pixtral<\/th>\n<th style=\"text-align:right\">Gemini<\/th>\n<th style=\"text-align:right\">GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">M001.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">2\/1 (2)<\/td>\n<td style=\"text-align:right\">4\/1 (4)<\/td>\n<td style=\"text-align:right\">11\/1 (11)<\/td>\n<td style=\"text-align:right\">15\/1 (15)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M002.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">1\/14 (0.07)<\/td>\n<td style=\"text-align:right\">2\/3 (0.7)<\/td>\n<td style=\"text-align:right\">12\/2 (6)<\/td>\n<td style=\"text-align:right\">12\/1 (12)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M003.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">6\/5 (1.2)<\/td>\n<td style=\"text-align:right\">3\/2 (1.5)<\/td>\n<td style=\"text-align:right\">8\/1 (8)<\/td>\n<td style=\"text-align:right\">15\/1 (15)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M004.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">10\/1 (10)<\/td>\n<td style=\"text-align:right\">5\/1 (5)<\/td>\n<td style=\"text-align:right\">8\/1 (8)<\/td>\n<td style=\"text-align:right\">15\/1 (15)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M005.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">1\/14 (0.07)<\/td>\n<td style=\"text-align:right\">4\/1 (4)<\/td>\n<td 
style=\"text-align:right\">11\/1 (11)<\/td>\n<td style=\"text-align:right\">13\/1 (13)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M006.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">1\/14 (0.07)<\/td>\n<td style=\"text-align:right\">3\/1 (3)<\/td>\n<td style=\"text-align:right\">12\/1 (12)<\/td>\n<td style=\"text-align:right\">20\/1 (20)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M007.jpg<\/td>\n<td style=\"text-align:right\">0\/1 (0)<\/td>\n<td style=\"text-align:right\">6\/1 (6)<\/td>\n<td style=\"text-align:right\">4\/1 (4)<\/td>\n<td style=\"text-align:right\">11\/1 (11)<\/td>\n<td style=\"text-align:right\">20\/1 (20)<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">M008.jpg<\/td>\n<td style=\"text-align:right\">1\/1 (1)<\/td>\n<td style=\"text-align:right\">4\/9 (0.4)<\/td>\n<td style=\"text-align:right\">0\/12 (0)<\/td>\n<td style=\"text-align:right\">10\/1 (10)<\/td>\n<td style=\"text-align:right\">12\/1 (12)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Gemini and GPT both excel at this task. Gemini&rsquo;s responses tend to be more &ldquo;classification&rdquo;-oriented, while GPT&rsquo;s tend to read more like story prompts, although neither is fully consistent. Gemma3 scores low because the model outputs few responses. 
Pixtral delivers higher signal more consistently than LLaVA.<\/p>\n<table>\n<thead>\n<tr>\n<th>Processed File<\/th>\n<th>Portrayed Area<\/th>\n<th>Gemma3<\/th>\n<th>LLaVA<\/th>\n<th>Pixtral<\/th>\n<th>Gemini<\/th>\n<th>GPT<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M001.jpg<\/td>\n<td>Ship<\/td>\n<td>Pirate Ship Heist<\/td>\n<td>Pirate Adventure<br>Treasure hunt<\/td>\n<td>Pirate Adventures<br>Historical Seafaring Missions<br>Underwater Exploration<br>Mutiny and Intrigue<\/td>\n<td>Pirate Attack\/Boarding Action<br>Naval Combat Encounter<br>Smuggling Operation<br>Mutiny<br>Escaping a Sinking Ship<br>Treasure Hunt\/Retrieval<br>Assassination\/Kidnapping Target on Board<br>Investigation\/Mystery<br>Defending Against a Sea Monster Attack<br>Transport Mission<br>Stealth\/Infiltration Mission<\/td>\n<td>Pirate ship battles at sea<br>Naval skirmishes between rival crews<br>Boarding and raiding encounters<br>Ship-to-ship chases and cannon duels<br>Exploring a derelict or haunted vessel<br>Smuggling or contraband hiding missions<br>Mutiny or shipboard betrayal scenarios<br>Prisoner escape from the brig<br>Rescue mission for captured allies<br>Treasure transport and theft<br>Monster or sea creature attacks on the ship<br>Diplomatic or parley meetings on neutral waters<br>Undercover infiltration of an enemy ship<br>Investigating sabotage or sabotage attempts<br>Storm survival or shipwreck scenarios<\/td>\n<\/tr>\n<tr>\n<td>M002.jpg<\/td>\n<td>Rooms on cliff face<\/td>\n<td>Bandit Ambush<\/td>\n<td>Dungeon crawl<br>Fantasy adventure<br>Exploration game<br>Role-playing game<br>Stealth game<br>Survival game<br>Puzzle-solving game<br>Strategy game<br>Horror game<br>Science fiction game<br>Post-apocalyptic game<br>Historical reenactment game<br>Cultural exploration game<br>Educational game<\/td>\n<td>Fantasy Exploration<br>River Trade<br>Dungeon Crawl<br>Survival<\/td>\n<td>Bandit Hideout\/Raider Camp<br>Cultist Stronghold<br>Lost Tribe\/Hidden Settlement<br>Monster 
Lair<br>Siege\/Defense Scenario<br>Ambush Site<br>Exploration of Ancient Ruins<br>Rescue Mission<br>Source of a Problem<br>Negotiation\/Diplomacy Hub<br>Secret Passage\/Escape Route<br>Guardian of a Pass<br>Resource Scramble<\/td>\n<td>Exploring a bandit or outlaw hideout nestled in the mountains<br>Assaulting a fortified goblin or orc camp in search of stolen goods or hostages<br>Investigating ancient ruins or a dwarven outpost hidden within a cliff face<br>Defending a mountain pass against an invading force or monster attack<br>Navigating a river and its cave system to reach a lost treasure or artifact<br>Mounting a rescue mission for prisoners held in one of the outbuildings or caves<br>Tracing the source of a corrupted river affecting nearby villages<br>Delving into a cave system rumored to house a dangerous creature or valuable resource<br>Delivering supplies or messages between isolated outposts during a time of war<br>Negotiating peace or trade with a faction that controls the mountain stronghold<br>Escorting a group of refugees or travelers through treacherous mountainous terrain<br>Uncovering a secret cult or forbidden ritual taking place within the caves<\/td>\n<\/tr>\n<tr>\n<td>M003.jpg<\/td>\n<td>Room inside huge tree<\/td>\n<td>Monster Lair<\/td>\n<td>Fantasy adventure<br>Exploration of a magical forest<br>Quest for hidden treasure<br>Encounter with mythical creatures<br>Survival in a dangerous environment<br>Escape from a cursed tree<br>Battle between good and evil forces<br>Puzzle-solving in a mysterious setting<br>Cooperation between characters to overcome challenges<br>Role-playing game<\/td>\n<td>Fantasy Exploration<br>Ambush Scenario<br>Treasure Hunt<br>Nature-based Quests<\/td>\n<td>Monster Lair<br>Ritual Site<br>Sacred Grove\/Ancient Tree Guardian<br>Ambush Point<br>Hidden Entrance<br>Clue to a Mystery<br>Survival Challenge\/Rest Stop Gone Wrong<br>Negotiation with Fey or Nature Spirits<\/td>\n<td>Ancient tree ritual site: Place for druidic 
ceremonies or summoning rituals.<br>Feywild crossing: Portal or gateway to the Feywild or another mystical plane.<br>Forest guardian&rsquo;s lair: Home of a treant, dryad, or nature spirit the party must parley with or defeat.<br>Bandit or cultist camp: Group using the tree&rsquo;s shelter for nefarious purposes.<br>Quest objective: Recover a magical item hidden among the roots or in the tree&rsquo;s hollow.<br>Ambush encounter: Enemies attack as the party navigates the tangled roots.<br>Puzzle or trap location: Roots conceal clues or dangerous mechanisms.<br>Gathering place: Neutral meeting ground for woodland creatures, druids, or tribes.<br>Corrupted grove: The tree is dying or cursed, and restoring it is the mission.<br>Nature&rsquo;s trial: Test for rangers, druids, or nature-themed characters to prove themselves.<br>Fey or nature festival: The site hosts celebrations or strange happenings.<br>Hidden entrance: Secret trapdoor or passage beneath the roots leading to an underground dungeon.<br>Monster lair: Home of giant spiders, ankhegs, blights, or similar forest-dwelling threats.<br>Vision or dream sequence: The tree appears in a prophetic dream or mystical vision.<br>Healing or wish-granting site: The tree bestows boons, heals, or answers deep questions.<\/td>\n<\/tr>\n<tr>\n<td>M004.jpg<\/td>\n<td>Fortress and bridge<\/td>\n<td>Dungeon Crawl<\/td>\n<td>Dungeon crawl<br>Exploration of a haunted castle<br>Stealthy infiltration of a fortress<br>Escaping a prison or dungeon<br>A quest for hidden treasure<br>A battle between rival factions<br>A race against time to save a kingdom<br>A mission to rescue a captured princess<br>A journey through a magical portal<br>A test of courage<\/td>\n<td>Dungeon Crawl<br>Escape Room<br>Stealth Mission<br>Investigation<br>Trap Disarmament<\/td>\n<td>Infiltrating an Evil Cult&rsquo;s Lair<br>Rescuing a Prisoner from a Fortified Dungeon<br>Assaulting a Cliffside Fortress<br>Stopping a Ritual to Summon\/Control a 
Monster<br>Exploring an Abandoned, Monster-Infested Stronghold<br>Sabotaging an Enemy War Machine or Operation<br>Escaping from a Perilous Prison<br>Investigating the Source of a Strange Magical Phenomenon<\/td>\n<td>Infiltrating a cultist stronghold hidden deep underground<br>Rescuing prisoners from a dungeon surrounded by monstrous tentacles<br>Investigating an ancient, forgotten fortress sitting atop a slumbering creature<br>Defending the fortress from a monstrous attack breaking through the floor<br>Escaping from a collapsing stronghold as tentacles breach the walls<br>Searching for magical artifacts hidden within a forbidden underground sanctum<br>Negotiating or allying with a cult that worships an eldritch being<br>Conducting a heist on a fortress protected by unnatural guardians<br>A ritual gone wrong unleashing tentacled horrors from beneath<br>Exploring a fortress built on the remains of a titanic, ancient creature<br>A final boss encounter with an eldritch, tentacled entity<br>Reclaiming a corrupted outpost for a noble house or order<br>Uncovering the secrets behind the fortress&rsquo;s construction and its hidden dangers<br>Escorting an NPC through a treacherous, monster-infested stronghold<br>Sealing a rift to another dimension opening beneath the fortress<\/td>\n<\/tr>\n<tr>\n<td>M005.jpg<\/td>\n<td>Ruins with stream<\/td>\n<td>Dungeon Crawl<\/td>\n<td>Dungeon crawl<br>Fantasy adventure<br>Horror-themed game<br>Steampunk setting<br>Science fiction adventure<br>Mystery solving<br>Puzzle-solving game<br>Exploration game<br>Survival game<br>Role-playing game<br>Board game<br>Strategy game<br>Tactical game<br>Turn-based game<\/td>\n<td>Dungeon Crawl<br>Escape Room<br>Underground Exploration<br>Investigative Missions<\/td>\n<td>Ancient Ruin Exploration<br>Cultist Lair Assault<br>Underdark\/Subterranean Journey<br>Monster Hunt<br>Lost Civilization Discovery<br>Necromancer&rsquo;s Stronghold<br>Shadowfell Excursion<br>Retrieval Mission<br>Guardian of a 
Forbidden Place<br>Escape from a Collapsing Dungeon<br>Ritual Disruption<\/td>\n<td>Exploration of Ancient Ruins<br>Dungeon Delve<br>Underdark Adventure<br>Magical River Crossing<br>Rescue Mission<br>Cultist Lair<br>Haunted Expedition<br>Artifact Retrieval<br>Fugitive Chase<br>Environmental Challenge<br>Secret Society Base<br>Waterborne Encounter<br>Interdimensional Portal<\/td>\n<\/tr>\n<tr>\n<td>M006.jpg<\/td>\n<td>Dungeon complex<\/td>\n<td>Escape Room<\/td>\n<td>Dungeon crawl<br>Fantasy adventure<br>Science fiction adventure<br>Steampunk adventure<br>Horror adventure<br>Mystery adventure<br>Puzzle-solving adventure<br>Exploration game<br>Role-playing game<br>Board game<br>Strategy game<br>Tactical game<br>City-building game<br>Survival game<\/td>\n<td>Dungeon Crawl<br>Escape Room<br>Mystery Investigation<\/td>\n<td>Classic Dungeon Crawl<br>Lost Temple or Sanctuary<br>Ancient Vault or Treasury<br>Wizard&rsquo;s Labyrinth or Magical Research Facility<br>Cultist Stronghold\/Lair<br>Prison Complex<br>Exploration of an Abandoned Fortress\/City Core<br>Infiltration and Espionage Mission<br>Monster Lair Complex<br>Quest to Activate\/Deactivate an Ancient Device<br>Gauntlet\/Trial Site<br>Underground Bunker\/Shelter<\/td>\n<td>Classic dungeon crawl with traps, monsters, and treasure<br>Exploring an ancient, long-forgotten underground city<br>Infiltrating the lair of a villain or cult<br>Recovering a lost artifact from a crypt or tomb<br>Navigating a labyrinth filled with puzzles and challenges<br>Mapping out an old, abandoned wizard&rsquo;s stronghold<br>Escaping from imprisonment in an underground complex<br>Investigating rumors of hauntings or strange occurrences<br>Conducting a rescue mission for captured allies or townsfolk<br>Searching for the hidden vault of a powerful sorcerer<br>Battling rival adventurers or treasure hunters<br>Encountering and negotiating with a secret underground society<br>Uncovering the mystery behind a sealed magical portal<br>Fighting 
through the defenses of an undead king&rsquo;s throne room<br>Delving into the ruins to learn secrets of an ancient civilization<br>Clearing out a den of monsters threatening the surface<br>Preventing the summoning of a dangerous extraplanar entity<br>Discovering and dealing with a living dungeon (sentient architecture)<br>Retrieving lost knowledge from a hidden library or archive<br>Surviving a supernatural cataclysm that has engulfed the dungeon<\/td>\n<\/tr>\n<tr>\n<td>M007.jpg<\/td>\n<td>Cave network and forest<\/td>\n<td>D&amp;D 5e<\/td>\n<td>Exploring a mysterious underwater cave system<br>Navigating through a maze of underwater tunnels and chambers<br>Searching for hidden treasures or artifacts within the cave system<br>Encountering dangerous sea creatures and overcoming challenges<br>Solving puzzles and riddles to progress through the cave system<br>Escaping from a submerged temple or ruins<\/td>\n<td>Exploration Quests<br>Navigation Challenges<br>Pirate Adventures<br>Maritime Commerce<\/td>\n<td>Dungeon Crawl<br>Coastal Cave Exploration<br>Smuggler&rsquo;s Den\/Pirate Hideout<br>Monster Lair<br>Search and Rescue<br>Infiltration Mission<br>Treasure Hunt<br>Escape from Captivity<br>Geological Survey\/Exploration<br>Ritual Site<br>Amphibious Assault<\/td>\n<td>Hidden pirate or smuggler cave base<br>Lost treasure hunt on a remote island<br>Monster lair or nest exploration<br>Shipwreck survivors seeking shelter<br>Secret cult or ritual site<br>Rescue mission for captured NPCs<br>Forgotten tomb or ancient ruins<br>Lair of aquatic or amphibious creatures<br>Bandit stronghold or hideout<br>Gathering rare herbs or resources for a quest<br>Underwater or coastal infiltration scenario<br>Battle between rival pirate factions<br>Investigation of mysterious disappearances<br>Protecting island inhabitants from an invading force<br>Discovery of a natural wonder or magical anomaly<br>Navigating a maze of tunnels to escape pursuers<br>First-contact adventure with isolated 
island dwellers<br>A haunted or cursed cave system<br>Race to claim a newly discovered cave&rsquo;s riches<br>Expedition to study unique island flora or fauna<\/td>\n<\/tr>\n<tr>\n<td>M008.jpg<\/td>\n<td>Circular bar<\/td>\n<td>Diplomacy\/Negotiation<\/td>\n<td>Pirate adventure<br>Treasure hunt<br>Medieval fantasy<br>Dungeon crawl<br>Board game<br>Role-playing game<br>Strategy game<br>Party game<br>Themed event<br>Restaurant or bar setting<br>Historical reenactment<br>Cultural celebration<\/td>\n<td>Role-Playing Campaigns<br>Dungeons &amp; Dragons adventures<br>Pathfinder scenarios<br>Fantasy and medieval role-playing games<br>Board Games<br>Strategy games with tactical combat<br>Miniature wargames<br>Fantasy-themed board games<br>Tabletop Simulations<br>Virtual tabletop role-playing sessions<br>Online gaming scenarios<br>Digital strategy games<\/td>\n<td>Tavern Brawl<br>Information Gathering<br>Ambush\/Assassination Attempt<br>Stealth\/Heist<br>Secret Meeting<br>Festival Stall<br>Defending the Establishment<br>Investigation Hub<br>Recruitment Drive<br>Celebration\/Gathering Spot<\/td>\n<td>Meeting an important contact or informant in a lively tavern or public house<br>Gathering information or rumors from locals and travelers<br>Holding a negotiation, peace talk, or clandestine meeting<br>Participating in or breaking up a heated tavern brawl<br>Investigating suspicious activity, theft, or a murder at a busy establishment<br>Entertaining a crowd or taking part in a storytelling or bardic competition<br>Setting up an ambush for a villainous NPC<br>Surviving an unexpected attack or raid unfolding inside the tavern<br>Hosting a festival, market day, or other community celebration<br>Recruiting adventurers or mercenaries for a quest<br>Solving a puzzle or riddle as part of a secretive society&rsquo;s initiation<br>Escaping from guards or pursuing enemies by blending into the crowd<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 
id=\"conclusion\">Conclusion<\/h2>\n<p>Multi-modal large language models exhibit a range of quality. Product discovery applications will likely require, at a minimum, models with late 2024 or 2025 state-of-the-art performance. For VTT-style maps, models can yield lists of rooms and items found within a map, although individual entries may be irrelevant to the purpose and the lists often contain hallucinated or misinterpreted items. For GM-style maps, models extract labels with high recall and precision, but other details may be imprecise. Our tests showed all models were poor at understanding scale or grid systems, although some could detect the existence of a grid. Similarly, reasoning about movement through the map may be beyond current LLMs, although we suspect this area needs more rigorous study. LLMs scored highly on generating scenario or story ideas tied to a map, suggesting this might be a straightforward product discovery mechanism.<\/p>\n<h2 id=\"notes\">Notes<\/h2>\n<p>The dataset was not incorporated as training material into any model.<\/p>"},{"title":"Connect Four Ply Dataset","link":"https:\/\/ztoz.blog\/posts\/connect4-ply-dataset\/","pubDate":"Tue, 22 Apr 2025 09:29:08 -0600","guid":"https:\/\/ztoz.blog\/posts\/connect4-ply-dataset\/","description":"<p><em>Connect 4<\/em> is a solved game in the <em>m,n,k<\/em> family. On a 7-column, 6-row board, players alternately drop a token into a column, attempting to line up four tokens of the same color along any row, column, or diagonal. John Tromp created a labeled dataset of all valid eight-ply positions with their perfect-play win results. 
We present a dataset that lists every sequence of plies leading to each position, along with the associated win-result class.<\/p>\n<h2 id=\"download-link\">Download Link<\/h2>\n<p><a href=\"connect4_plies.csv.gz\">csv.gz (~8m)<\/a> sha1sum: <code>1e683b7af9496f171ad7f1bed67491708f7de55b<\/code><\/p>\n<h2 id=\"background\">Background<\/h2>\n<p><em>Connect Four<\/em> is a two-player, perfect-information game invented by Howard Wexler and sold by the Milton Bradley firm starting in 1974. As of 2025, the game is <a href=\"https:\/\/instructions.hasbro.com\/en-us\/instruction\/connect-4-game-instructions\">sold by Hasbro<\/a>. The game has been solved both weakly (James D. Allen and Victor Allis, independently, in 1988) and strongly (John Tromp in 1995). Tromp donated his <a href=\"https:\/\/doi.org\/10.24432\/C59P43\">dataset<\/a> of &ldquo;all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced.&rdquo;<\/p>\n<p>A given board position may have multiple sequences of plies (individual moves) that result in that configuration. For example, the board:<\/p>\n<pre tabindex=\"0\"><code>6       o       \n5       x       \n4       o       \n3       x       \n2     o o       \n1     x x       \n  a b c d e f g\n<\/code><\/pre><p>may be created with any of the four sequences:<\/p>\n<pre tabindex=\"0\"><code>x:c1 o:c2 x:d1 o:d2 x:d3 o:d4 x:d5 o:d6 \nx:d1 o:d2 x:c1 o:c2 x:d3 o:d4 x:d5 o:d6 \nx:d1 o:d2 x:d3 o:d4 x:c1 o:c2 x:d5 o:d6 \nx:d1 o:d2 x:d3 o:d4 x:d5 o:d6 x:c1 o:c2 \n<\/code><\/pre><p>The constraint is that, since pieces fall to the lowest open row of a column, a piece underneath another piece must have been dropped earlier in the sequence. Thus, <code>c1<\/code> must come before <code>c2<\/code>, but <code>c1<\/code> may come before or after <code>d1<\/code>. 
Within this dataset, <code>x<\/code> always indicates the first player and <code>o<\/code> the second player.<\/p>\n<p>As another example, this board configuration features an <code>o<\/code> move that is not dependent on an <code>x<\/code> move:<\/p>\n<pre tabindex=\"0\"><code>6       o       \n5       x       \n4       o       \n3       x       \n2       o       \n1   x   x   o   \n  a b c d e f g\n<\/code><\/pre><p>Within the list of sequences, the <code>f1<\/code> move can fill any of the <code>o<\/code> slots (2nd, 4th, 6th, or 8th):<\/p>\n<pre tabindex=\"0\"><code>x:b1 o:f1 x:d1 o:d2 x:d3 o:d4 x:d5 o:d6 \nx:d1 o:f1 x:b1 o:d2 x:d3 o:d4 x:d5 o:d6 \nx:d1 o:d2 x:b1 o:f1 x:d3 o:d4 x:d5 o:d6 \nx:d1 o:d2 x:d3 o:f1 x:b1 o:d4 x:d5 o:d6 \nx:d1 o:d2 x:d3 o:d4 x:b1 o:f1 x:d5 o:d6 \nx:d1 o:d2 x:d3 o:d4 x:d5 o:f1 x:b1 o:d6 \nx:d1 o:d2 x:d3 o:d4 x:d5 o:d6 x:b1 o:f1 \n<\/code><\/pre><h2 id=\"dataset-information\">Dataset Information<\/h2>\n<p>The dataset consists of 2,031,148 data rows and one header row. Columns are separated by commas (<code>,<\/code>) and consist of:<\/p>\n<ul>\n<li>row [1, 67557] one-indexed source row identifier for the <a href=\"https:\/\/doi.org\/10.24432\/C59P43\">Connect-4<\/a> dataset. Numbers do not include any separators.<\/li>\n<li>ply1 [a-g][1-6] two characters identifying the column and row of the ply<\/li>\n<li>ply2 <em>same<\/em><\/li>\n<li>ply3<\/li>\n<li>ply4<\/li>\n<li>ply5<\/li>\n<li>ply6<\/li>\n<li>ply7<\/li>\n<li>ply8 <em>same<\/em><\/li>\n<li>x_result one of: <code>win<\/code>, <code>loss<\/code>, or <code>draw<\/code>, indicating the first player&rsquo;s perfect play result<\/li>\n<\/ul>\n<p>The board is configured with the lower-left position being a1 and the upper-right position being g6. 
Pictorially:<\/p>\n<pre tabindex=\"0\"><code>6\n5\n4\n3\n2\n1\n  a b c d e f g\n<\/code><\/pre><h2 id=\"license\">License<\/h2>\n<p>This dataset is licensed under a <a href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/legalcode\">Creative Commons Attribution 4.0 International<\/a> (CC BY 4.0) license.<\/p>\n<p>This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.<\/p>\n<h2 id=\"methodology\">Methodology<\/h2>\n<p>To construct this dataset, we transform each row of <a href=\"https:\/\/doi.org\/10.24432\/C59P43\">Tromp&rsquo;s eight ply position dataset<\/a> into a series of ply sequences. We use a breadth-first search to build all possible sequences using the position board as a constraint. Pieces from the first player along the bottom are used as the initial plies. As pieces are played, we add additional possible plies that were dependent on previous plies.<\/p>\n<p>We wrote the program in Rust. The key data structures for modeling the game are:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#75715e\">#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]<\/span>\n<span style=\"color:#66d9ef\">pub<\/span> <span style=\"color:#66d9ef\">enum<\/span> <span style=\"color:#a6e22e\">Player<\/span> {\n    X,\n    O\n}\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#75715e\">#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]<\/span>\n<span style=\"color:#66d9ef\">pub<\/span> <span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">Placement<\/span> {\n    <span style=\"color:#75715e\">\/\/ zero-indexed; printed as 0 =&gt; row 1 is at the bottom, 
row 6 is the top row\n<\/span><span style=\"color:#75715e\"><\/span>    row: <span style=\"color:#66d9ef\">u8<\/span>,\n    <span style=\"color:#75715e\">\/\/ piece is placed on top of this column, [0-6] corresponding to [a-g]\n<\/span><span style=\"color:#75715e\"><\/span>    column: <span style=\"color:#66d9ef\">u8<\/span>\n}\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#75715e\">#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]<\/span>\n<span style=\"color:#66d9ef\">pub<\/span> <span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">Ply<\/span> {\n    player: <span style=\"color:#a6e22e\">Player<\/span>,\n    placement: <span style=\"color:#a6e22e\">Placement<\/span>\n}\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#75715e\">#[derive(Debug, Clone, PartialEq, Eq)]<\/span>\n<span style=\"color:#66d9ef\">pub<\/span> <span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">Board<\/span> {\n    <span style=\"color:#75715e\">\/\/ columns[0] is column &#39;a&#39; while columns[6] is &#39;g&#39;, with plies extending from row 1..6\n<\/span><span style=\"color:#75715e\"><\/span>    columns: [Vec<span style=\"color:#f92672\">&lt;<\/span>Player<span style=\"color:#f92672\">&gt;<\/span>; <span style=\"color:#ae81ff\">7<\/span>]\n}\n<\/code><\/pre><\/div><p>Although memory consumption for each of these key structures could be reduced further (e.g. a <code>Placement<\/code> and <code>Ply<\/code> could be represented within a single <code>u8<\/code>), memory is not a constraint and convenience wins out. 
The <code>Copy<\/code> traits tell the compiler that these structures can be cheaply copied.<\/p>\n<p>After building a <code>Board<\/code> from the Tromp dataset, we create a <code>Constraints<\/code> instance to record the potential first plies for each player that have no predecessor requirements and a map of the predecessor requirements for other plies.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">Constraints<\/span> {\n    starts_x: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span>,\n    starts_o: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span>,\n    precedes: <span style=\"color:#a6e22e\">HashMap<\/span><span style=\"color:#f92672\">&lt;<\/span>Ply, Ply<span style=\"color:#f92672\">&gt;<\/span>\n}\n<\/code><\/pre><\/div><p>Idiomatically, the initial potential plies for the two players would be better represented as a slice than as a vector.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">impl<\/span> Constraints {\n    <span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">from_board<\/span>(board: <span style=\"color:#a6e22e\">Board<\/span>) -&gt; <span style=\"color:#a6e22e\">Constraints<\/span> {\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> starts_x: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> vec![];\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> starts_o: Vec<span 
style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> vec![];\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> precedes: <span style=\"color:#a6e22e\">HashMap<\/span><span style=\"color:#f92672\">&lt;<\/span>Ply, Ply<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> HashMap::new();\n\n        <span style=\"color:#66d9ef\">for<\/span> (col, column) <span style=\"color:#66d9ef\">in<\/span> board.columns.iter().enumerate() {\n            <span style=\"color:#66d9ef\">for<\/span> (row, player) <span style=\"color:#66d9ef\">in<\/span> column.iter().enumerate() {\n                <span style=\"color:#66d9ef\">if<\/span> row <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span> {\n                    <span style=\"color:#66d9ef\">if<\/span> <span style=\"color:#f92672\">*<\/span>player <span style=\"color:#f92672\">==<\/span> Player::X {\n                        starts_x.push(Ply::new(Player::X, row, col))\n                    } <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#66d9ef\">if<\/span> <span style=\"color:#f92672\">*<\/span>player <span style=\"color:#f92672\">==<\/span> Player::O {\n                        starts_o.push(Ply::new(Player::O, row, col))\n                    }\n                } <span style=\"color:#66d9ef\">else<\/span> {\n                    <span style=\"color:#66d9ef\">let<\/span> prev <span style=\"color:#f92672\">=<\/span> board.at(row<span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>, col).unwrap();\n                    precedes.insert(Ply::new(<span style=\"color:#f92672\">*<\/span>prev, row<span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>, col), Ply::new(<span style=\"color:#f92672\">*<\/span>player, row, col));\n                }\n            }\n        }\n\n        Constraints { starts_x, starts_o, 
precedes }\n    }\n}\n<\/code><\/pre><\/div><p>We iteratively build up solutions one layer at a time. The <code>PartialSolution<\/code> struct stores the plies chosen so far, while <code>x_moves<\/code> and <code>o_moves<\/code> track the potential plies for each player.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">PartialSolution<\/span> {\n    chosen: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span>,\n    x_moves: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span>,\n    o_moves: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span>\n}\n<\/code><\/pre><\/div><p><code>PartialSolution<\/code> features two methods to simplify state management. The <code>moves<\/code> method returns a slice of potential plies, determining the current player from the number of plies already taken. The <code>next<\/code> method creates a new <code>PartialSolution<\/code> instance by applying a ply to the current state. 
The ply must be removed from the lists of available plies and plies freed up must be added to the potential list.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">impl<\/span> PartialSolution {\n\n    <span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">moves<\/span>(self: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">PartialSolution<\/span>) -&gt; <span style=\"color:#66d9ef\">&amp;<\/span>[Ply] {\n        <span style=\"color:#66d9ef\">if<\/span> self.chosen.len() <span style=\"color:#f92672\">%<\/span> <span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span> {\n            self.x_moves.as_slice()\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            self.o_moves.as_slice()\n        }\n    }\n\n    <span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">next<\/span>(self: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">PartialSolution<\/span>, ply: <span style=\"color:#a6e22e\">Ply<\/span>, precedes: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">HashMap<\/span><span style=\"color:#f92672\">&lt;<\/span>Ply, Ply<span style=\"color:#f92672\">&gt;<\/span>) -&gt; <span style=\"color:#a6e22e\">PartialSolution<\/span> {\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> new_chosen <span style=\"color:#f92672\">=<\/span> Vec::from_iter(self.chosen.iter().cloned());\n        new_chosen.push(ply);\n\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> new_x_moves: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> 
Vec::from_iter(self.x_moves.iter().cloned());\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> new_o_moves: Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> Vec::from_iter(self.o_moves.iter().cloned());\n        <span style=\"color:#66d9ef\">match<\/span> precedes.get(<span style=\"color:#f92672\">&amp;<\/span>ply) {\n            Some(p) <span style=\"color:#66d9ef\">if<\/span> p.player <span style=\"color:#f92672\">==<\/span> Player::X <span style=\"color:#f92672\">=&gt;<\/span> new_x_moves.push(<span style=\"color:#f92672\">*<\/span>p),\n            Some(p) <span style=\"color:#f92672\">=&gt;<\/span> new_o_moves.push(<span style=\"color:#f92672\">*<\/span>p),\n            None <span style=\"color:#f92672\">=&gt;<\/span> ()\n        }\n        <span style=\"color:#66d9ef\">if<\/span> ply.player <span style=\"color:#f92672\">==<\/span> Player::X {\n            <span style=\"color:#66d9ef\">let<\/span> pos <span style=\"color:#f92672\">=<\/span> new_x_moves.iter().position(<span style=\"color:#f92672\">|<\/span>p<span style=\"color:#f92672\">|<\/span> <span style=\"color:#f92672\">*<\/span>p <span style=\"color:#f92672\">==<\/span> ply).unwrap();\n            new_x_moves.remove(pos);\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            <span style=\"color:#66d9ef\">let<\/span> pos <span style=\"color:#f92672\">=<\/span> new_o_moves.iter().position(<span style=\"color:#f92672\">|<\/span>p<span style=\"color:#f92672\">|<\/span> <span style=\"color:#f92672\">*<\/span>p <span style=\"color:#f92672\">==<\/span> ply).unwrap();\n            new_o_moves.remove(pos);\n        }\n\n        PartialSolution { chosen: <span style=\"color:#a6e22e\">new_chosen<\/span>, x_moves: <span style=\"color:#a6e22e\">new_x_moves<\/span>, o_moves: <span style=\"color:#a6e22e\">new_o_moves<\/span> }\n    }\n}\n<\/code><\/pre><\/div><p>The function 
<code>all_solutions<\/code> implements the breadth-first search and returns a vector of ply sequences (each a <code>Vec&lt;Ply&gt;<\/code>). For a given board, the number of possible sequences is fairly small (an average of 30 solutions and a maximum of 360 per board), so we have no need to conserve memory by switching to an iterator or a depth-first search to reduce the storage of intermediate data.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">pub<\/span> <span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">all_solutions<\/span>(board: <span style=\"color:#a6e22e\">Board<\/span>) -&gt; Vec<span style=\"color:#f92672\">&lt;<\/span>Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;&gt;<\/span> {\n    <span style=\"color:#66d9ef\">let<\/span> constraints <span style=\"color:#f92672\">=<\/span> Box::new(Constraints::from_board(board));\n    <span style=\"color:#66d9ef\">let<\/span> base <span style=\"color:#f92672\">=<\/span> Box::new(PartialSolution::new(constraints.starts_x.as_slice(), constraints.starts_o.as_slice()));\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> prev_partials: Vec<span style=\"color:#f92672\">&lt;<\/span>PartialSolution<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> vec![];\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> curr_partials: Vec<span style=\"color:#f92672\">&lt;<\/span>PartialSolution<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> vec![];\n\n    <span style=\"color:#66d9ef\">for<\/span> initial <span style=\"color:#66d9ef\">in<\/span> constraints.starts_x.iter() {\n        <span style=\"color:#66d9ef\">let<\/span> partial <span style=\"color:#f92672\">=<\/span> 
base.next(<span style=\"color:#f92672\">*<\/span>initial, <span style=\"color:#f92672\">&amp;<\/span>constraints.precedes);\n        curr_partials.push(partial);\n    }\n\n    <span style=\"color:#66d9ef\">for<\/span> _ <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#ae81ff\">1<\/span><span style=\"color:#f92672\">..<\/span><span style=\"color:#ae81ff\">8<\/span> {\n        swap(<span style=\"color:#f92672\">&amp;<\/span><span style=\"color:#66d9ef\">mut<\/span> prev_partials, <span style=\"color:#f92672\">&amp;<\/span><span style=\"color:#66d9ef\">mut<\/span> curr_partials);\n        curr_partials.clear();\n\n        <span style=\"color:#66d9ef\">for<\/span> partial <span style=\"color:#66d9ef\">in<\/span> prev_partials.iter() {\n            <span style=\"color:#66d9ef\">for<\/span> ply <span style=\"color:#66d9ef\">in<\/span> partial.moves().iter() {\n                <span style=\"color:#66d9ef\">let<\/span> new_partial <span style=\"color:#f92672\">=<\/span> partial.next(<span style=\"color:#f92672\">*<\/span>ply, <span style=\"color:#f92672\">&amp;<\/span>constraints.precedes);\n                curr_partials.push(new_partial);\n            }\n        }\n    }\n\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> solutions: Vec<span style=\"color:#f92672\">&lt;<\/span>Vec<span style=\"color:#f92672\">&lt;<\/span>Ply<span style=\"color:#f92672\">&gt;&gt;<\/span> <span style=\"color:#f92672\">=<\/span> vec![];\n    <span style=\"color:#66d9ef\">for<\/span> partial <span style=\"color:#66d9ef\">in<\/span> curr_partials.iter() {\n        solutions.push(partial.chosen.clone());\n    }\n    solutions\n}\n<\/code><\/pre><\/div>"},{"title":"Routines of Substitution (Review)","link":"https:\/\/ztoz.blog\/posts\/routines-of-substitution\/","pubDate":"Thu, 03 Apr 2025 16:01:00 -0600","guid":"https:\/\/ztoz.blog\/posts\/routines-of-substitution\/","description":"<p>Early post-war computing can often seem alien as 
terminology mixed metaphors and the pioneers brought their own distinct toolsets to the field. Thus, it seems fitting that a powerful influence on computing design would come from a &ldquo;Martian&rdquo;: John von Neumann. During the war, von Neumann surveyed computing capabilities within the United States, which introduced him to the ENIAC and the Harvard Mark I. Von Neumann joined as a consultant to the EDVAC project, a successor to the ENIAC, where he and the team worked out the concept of a stored-program computer.<\/p>\n<p>Mark Priestley&rsquo;s <em>Routines of Substitution: John von Neumann&rsquo;s Work on Software Development, 1945-1948<\/em> is a technical history of von Neumann&rsquo;s programming work, with special focus on the &ldquo;meshing&rdquo; algorithm he wrote as part of a merge sort, his diagrammatic programming language, and the integration and execution of subroutines within a program. While the &ldquo;Von Neumann Architecture,&rdquo; as documented in <em>First Draft of a Report on the EDVAC<\/em>, is the most famous outcome of his work at this time, this study illuminates the invention process and the practical aspects of implementation.<\/p>\n<figure><a href=\"https:\/\/doi.org\/10.1007\/978-3-319-91671-2\"><img src=\"978-3-319-91671-2.webp\"\n         alt=\"Routines of Substitution book cover\"\/><\/a><figcaption>\n            <p>Routines of Substitution book cover<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Priestley, Mark. 2018. Routines of Substitution: John von Neumann&rsquo;s Work on Software Development, 1945-1948. SpringerBriefs in History of Computing. <a href=\"https:\/\/doi.org\/10.1007\/978-3-319-91671-2\">https:\/\/doi.org\/10.1007\/978-3-319-91671-2<\/a>.<\/p>\n<p>The book begins with von Neumann&rsquo;s first program for the EDVAC, a program for sorting records. (Records being a modern term; metaphors at this point included the key field called the star and the other fields planets.) 
For 1945, sorting was an unexpected use case for a digital computer. The expectation, as Knuth jokes in (Knuth 1970), was a partial differential equation. Priestley tracks the impetus to an unspecified statistical problem brought forward by Samuel Wilks (pg 14) in April 1945. Solving this statistical problem required sorting. Although delegating sorting to IBM card sorting machines was possible, the round-trip times would kill efficiency. Instead, sorting would be done within the EDVAC&rsquo;s memory. Although we do not have documents describing a complete sorting implementation, we do have documentation for the &ldquo;meshing&rdquo; or merging subroutine for a merge sort as worked out by von Neumann.<\/p>\n<blockquote>\n<p>Meshing: Given two ordered sequences, produce a third ordered sequence containing all elements of the first two sequences.<\/p>\n<\/blockquote>\n<p>Priestley describes the interplay between the EDVAC orders (machine instructions) and von Neumann&rsquo;s algorithm. In contrast to Knuth&rsquo;s write-up (Knuth 1970), Priestley uses von Neumann&rsquo;s notation, rather than transcribing it to an assembly language form. Because of this, Priestley is able to show the greater use of self-modifying code and the principal role of substitution within programs. The von Neumann architecture is principally noted as mixing instructions and data. The EDVAC orders can leave out conditional jump instructions because they re-write the destinations of unconditional jumps. Although more difficult for a modern reader to follow, the lack of translation yields greater historical insight.<\/p>\n<p>The second half of the book focuses on von Neumann&rsquo;s proposed software development process and the role of subroutines. One of the principal technical tasks in crafting a program was setting or computing all the necessary address locations within the code, since there were no relative addressing modes nor was there a stack. 
In von Neumann&rsquo;s process, the source code is translated into machine instructions very early, which forces address specification to be made every time a program is run. In contrast, the EDSAC team, which was also building a &ldquo;von Neumann&rdquo;-style computer but translated symbolic instructions to machine instructions later in the process, developed an automatic and less error-prone approach. The book ends with a discussion of the limited role of subroutines in von Neumann&rsquo;s program design and how it may reflect how the production of mathematical tables was organized.<\/p>\n<p><em>Routines of Substitution<\/em> is a specialized work and best understood with a background in computer design. It sheds light on some of the earliest formal thinking about programming and how some of that thinking did not pan out. While challenging, it is also rewarding to study. Recommended.<\/p>\n<p><strong>Further Reading<\/strong><\/p>\n<p>Knuth, Donald. 1970. \u201cVon Neumann\u2019s First Computer Program.\u201d ACM Computing Surveys 2 (4): 247\u201360. <a href=\"https:\/\/doi.org\/10.1145\/356580.356581\">https:\/\/doi.org\/10.1145\/356580.356581<\/a>.<\/p>\n<p>Wilkes, M. V. 1951. The Preparation of Programs for an Electronic Digital Computer, with Special Reference to the EDSAC and the Use of a Library of Subroutines. Cambridge, Mass.: Addison-Wesley Press.<\/p>"},{"title":"Ludii Wumpus World","link":"https:\/\/ztoz.blog\/posts\/ludii-wumpus\/","pubDate":"Wed, 26 Mar 2025 09:55:20 -0600","guid":"https:\/\/ztoz.blog\/posts\/ludii-wumpus\/","description":"<p><a href=\"https:\/\/ludii.games\">Ludii<\/a> is a general game system for modeling games and puzzles, although it focuses on historical and traditional games. Ludii was recently used to provide the training data for a <a href=\"https:\/\/www.kaggle.com\/competitions\/um-game-playing-strength-of-mcts-variants\/overview\">Kaggle competition<\/a>. 
I noticed their <a href=\"https:\/\/ludii.games\/wishlist.php\">wishlist<\/a> contained <em>Wumpus World<\/em>, a common puzzle used for AI training and education. Intrigued, I implemented and submitted a version of <em>Wumpus World<\/em>.<\/p>\n<figure><img src=\"Wumpus_World_Screenshot.png\"\n         alt=\"Game player interface with human in a 4x4 cave\" width=\"100%\"\/><figcaption>\n            <h4>Wumpus World in the Ludii Player<\/h4>\n        <\/figcaption>\n<\/figure>\n\n<h2 id=\"background-hunt-the-wumpus--wumpus-world\">Background: Hunt the Wumpus \/ Wumpus World<\/h2>\n<p><em>Hunt the Wumpus<\/em> was published by Creative Computing magazine in the <a href=\"https:\/\/archive.org\/details\/CreativeComputingv01n05SeptemberOctober1975\/page\/n51\/mode\/2up\">September-October 1975 issue<\/a>. The creator, Gregory Yob, describes his desire for a game not set within a Cartesian grid and how the hunter, the Wumpus, and the world&rsquo;s hazards came together. The <a href=\"https:\/\/github.com\/RayKV423\/original-hunt-the-wumpus\/tree\/main\">game listing in BASIC<\/a> fits onto a single page and takes 226 lines, including a smattering of comments.<\/p>\n<figure><img src=\"Creative_Computing_Wumpus.png\"\n         alt=\"Banner illustrating numbered rooms, bats, pits, crooked arrow, and a wumpus\" width=\"100%\"\/><figcaption>\n            <h4>Banner for the Hunt the Wumpus Article<\/h4>\n        <\/figcaption>\n<\/figure>\n\n<p>The game was a big success (relative to the time) and the article was reprinted in a &ldquo;Best Of&rdquo; issue. The game was ported and recreated for a number of systems.<\/p>\n<p>Two decades later, Russell and Norvig&rsquo;s textbook <em>Artificial Intelligence: A Modern Approach<\/em> uses &ldquo;Wumpus World&rdquo; within a series of motivating examples of AI techniques. <em>Wumpus World<\/em> simplifies the game of <em>Hunt the Wumpus<\/em>: The cave network is a simple rectangular grid. 
There are no bats to randomly move the player, and the Wumpus and the pits are all static. The interface is normalized with tank-like controls: you can move forward or rotate 90 degrees to the right or left, you may pick up the gold, and you may fire an arrow in the direction you are facing. These simplifications reduce the game to something manageable in first-order logic and easy to model using probabilistic reasoning.<\/p>\n<h2 id=\"rules\">Rules<\/h2>\n<p><em>Wumpus World<\/em> is a hidden knowledge, single player game set on a 4x4 grid laid out as below. (Variants may randomize the board.) The player&rsquo;s piece starts at A1. The objective of the game is to pick up the gold and kill the wumpus, in any order.<\/p>\n<blockquote>\n<p>Divergence: The published rules for <em>Wumpus World<\/em> also require the player piece to return to the starting position in order to climb out of the cave. I dropped this requirement because backtracking is not fun.<\/p>\n<\/blockquote>\n<p>During their turn, a player may move the piece to an orthogonal cell or they may fire their arrow (they have only one) in any orthogonal direction.<\/p>\n<blockquote>\n<p>Divergence: The published rules only allow forward movement or rotation. Although Ludii can model pieces facing in a single direction, requiring players to rotate and move rather than just move to their desired cell would be annoying friction.<\/p>\n<\/blockquote>\n<p>If the player&rsquo;s piece is orthogonally adjacent to a pit, they will receive a percept stating they feel a breeze. If they are orthogonally adjacent to the wumpus, they will receive a percept of smelling a stench. If they are in the same cell as the gold, they will detect the gold. Player pieces will immediately pick up the gold.<\/p>\n<blockquote>\n<p>Divergence: The published rules require an explicit action to pick up the gold. 
This probably has an educational value when teaching students to program agents, but I feel this is an annoying friction within a game. Holding gold carries no penalty (e.g. encumbrance isn&rsquo;t modeled) so why wouldn&rsquo;t the gold be picked up at the first opportunity?<\/p>\n<\/blockquote>\n<p>If the player&rsquo;s piece moves into either the wumpus cell or a pit cell, they lose the game.<\/p>\n<p>If the player fires the arrow, it will follow that direction until it either hits a Wumpus, killing it, or hits the wall. Arrows cannot be retrieved once fired. If the arrow hits the Wumpus, the Wumpus will call out in pain informing the player of the Wumpus&rsquo;s demise.<\/p>\n<pre>\n    +---------+---------+---------+---------+\n 4\t| stench  |         | breeze  |  pit    |\n    |         |         |\t      |         |\n    |         |         |\t      |         |\n    +---------+---------+---------+---------+\n 3\t| wumpus  | stench  | pit     | breeze  |\n    |         | breeze  |\t      |         |\n    |         | gold    |\t      |         |\n    +---------+---------+---------+---------+\n 2\t| stench  |         | breeze  |         |\n    |         |         |\t      |         |\n    |         |         |\t      |         |\n    +---------+---------+---------+---------+\n 1\t| start   | breeze  | pit     | breeze  |\n    |         |         |\t      |         |\n    |   ==>   |         |\t      |         |\n    +---------+---------+---------+---------+\n         A         B         C         D\n<\/pre>\n<h2 id=\"game-rules-in-ludii\">Game Rules in Ludii<\/h2>\n<p>Ludii models games using a declarative language with an S-expression-like syntax. Although the syntax uses S-expressions, Ludii is not a Lisp-like language.<\/p>\n<p>The first &ldquo;atom&rdquo; within an S-expression defines the ludeme. Under the hood, each ludeme is implemented by a Java class within the Ludii engine. 
The top-level <code>game<\/code> ludeme requires three sub-ludemes: <code>players<\/code>, <code>equipment<\/code>, and <code>rules<\/code>. The complete source code is listed in the appendix and I describe each ludeme block in the sections below. For clarity, I describe the rules prior to the equipment rather than following the order within the file.<\/p>\n<h3 id=\"players\">Players<\/h3>\n<p>The <code>players<\/code> ludeme defines the number of players and information about each player, if relevant. As a single-player game, the declaration is very simple for this game.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">players<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n<\/code><\/pre><\/div><p>Within the code, <code>P1<\/code> denotes the first player. Ludii&rsquo;s model of ownership includes neutral pieces owned by nobody and shared pieces owned by all players.<\/p>\n<p>A player does not imply human; players may be AI agents or controlled over a network link.<\/p>\n<h3 id=\"start-rules-setup\">Start Rules (Setup)<\/h3>\n<p>The start ludeme is concerned with placing the initial pieces and initializing the board, sites, and players. For <em>Wumpus<\/em>, the human or player piece is placed in the bottom left. Ludii supports multiple coordinate systems since it supports multiple geometries, but this game seems to be best modeled using spreadsheet-like coordinates, so I used <code>coord:&quot;A1&quot;<\/code> to denote the location. (The original <em>Hunt the Wumpus<\/em> was played on a dodecahedron with rooms labeled 1 to 20.) 
I tried modeling the arrow in various ways but placing the arrow into the &ldquo;hand&rdquo; of the player turned out to be the most natural option.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">\t(<span style=\"color:#a6e22e\">start<\/span>\n\t    <span style=\"color:#960050;background-color:#1e0010\">{<\/span>\n\t        (<span style=\"color:#a6e22e\">place<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> coord:<span style=\"color:#e6db74\">&#34;A1&#34;<\/span>)\n\t        (<span style=\"color:#a6e22e\">place<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> (<span style=\"color:#a6e22e\">handSite<\/span> P1))\n\t    <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t)\n<\/code><\/pre><\/div><p>I also tried variants by placing pieces representing the Wumpus, the gold, and the pits on the board. By default, pieces are visible (since most games have perfect knowledge, at least for the board) and only one piece per cell is rendered. Rather than fighting convention, I changed those to sites or markings of a grid location.<\/p>\n<p>Although many aspects of the game can be initialized in the <code>start<\/code> ludeme, general variables (<code>Var<\/code>) cannot be initialized within this ludeme.<\/p>\n<h3 id=\"end-rules-winning-and-losing\">End Rules (Winning and Losing)<\/h3>\n<p>There are a number of variants for how to win the game. In the original <em>Hunt<\/em>, you simply needed to kill the Wumpus. In the CIS587 readings, you need to kill the wumpus, collect the gold, and return to the starting position to climb out of the cave. In the Russell and Norvig text, the highest score is achieved by finding the gold and climbing out of the cave without firing an arrow. 
I decided to require killing the Wumpus and collecting the gold, but not returning to the original location.<\/p>\n<p>The end rules are modeled as a partial function. If the game state matches the first criterion, the player (Mover) immediately loses. If the game state matches the second criterion, the player immediately wins. Otherwise, play continues.<\/p>\n<p>The <code>(&quot;AtWumpus&quot;)<\/code> and <code>(&quot;AtPit&quot;)<\/code> are &lsquo;defines&rsquo; and described later.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">end<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span>\n        (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">or <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>)) (<span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span>))\n            (<span style=\"color:#a6e22e\">result<\/span> Mover Loss)\n        )\n        (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">and <\/span>(= (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>) (= (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>))\n            (<span style=\"color:#a6e22e\">result<\/span> Mover Win)\n        )\n    <span style=\"color:#960050;background-color:#1e0010\">}<\/span>)\n<\/code><\/pre><\/div><h3 id=\"play-rules\">Play Rules<\/h3>\n<p>During development, I left the play section as <code>(play (forEach 
piece))<\/code> which allowed me to concentrate on the player movement and interactions with sites. The <code>forEach piece<\/code> means that each piece&rsquo;s movement logic (as defined as part of the equipment) may be &ldquo;played&rdquo;.<\/p>\n<p>I explored various options for modeling arrow firing. In a general programming system, I&rsquo;d create four buttons where a button would fire an arrow in one of the cardinal directions. However, Ludii does not have a concept of a generalizable button action. I tried modeling it as a card action, where your hand is populated by N, S, E, and W cards, but that felt unnatural, especially since you would lose access to all cards if you played one. Ludii has a &lsquo;move Shoot&rsquo; ludeme which I could have used, but the user interface was clunky between choosing to move or choosing to shoot.<\/p>\n<p>I found that modeling the firing of an arrow as taking an arrow from the hand has a number of advantages. One, in the user interface, it is a very explicit choice and does not collide with piece movement. 
Two, pieces in the hand can have a count attached to them, if a game variant that allows more arrows is desired.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">play<\/span>\n        (<span style=\"color:#a6e22e\">or<\/span>\n            (<span style=\"color:#a6e22e\">forEach<\/span> piece)\n            (<span style=\"color:#a6e22e\">move<\/span>\n                (<span style=\"color:#a6e22e\">from<\/span> (<span style=\"color:#a6e22e\">handSite<\/span> P1))\n                (<span style=\"color:#a6e22e\">to<\/span> (<span style=\"color:#a6e22e\">sites<\/span> LineOfSight Farthest at:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) Orthogonal))\n                (<span style=\"color:#a6e22e\">then<\/span> (<span style=\"color:#e6db74\">&#34;ArrowLogic&#34;<\/span>))\n            )\n        )\n    )\n<\/code><\/pre><\/div><p>Rather than including the arrow firing logic in the <code>play<\/code> ludeme, it could be attached to the piece description in the equipment section, which is where the logic for the human piece is located. Ultimately, I decided to place the logic in the <code>play<\/code> ludeme because of its &ldquo;exceptional&rdquo; behavior; the arrow is not a regular piece as it is a consumable and does not stay after being placed. The human piece, in contrast, is permanent and something that will be moved multiple times.<\/p>\n<h2 id=\"defines\">Defines<\/h2>\n<p>Defines are an optional feature within the Ludii language to extract ludemes from their regular position. Defines are used to reduce repeated code and attach semantic labels to logic.<\/p>\n<h3 id=\"predicates\">Predicates<\/h3>\n<p>I created five predicates to simplify expressions. 
All of them ask if the current location of the human (<code>(where &quot;Human&quot; P1)<\/code>) lies within a certain set of locations. The labeled lists of sites or regions are created in the equipment ludeme.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;IsNextToPit&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Breeze&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;IsNextToWumpus&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Stench&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtGold&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Gold&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span 
style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>)))\n<\/code><\/pre><\/div><p>Alternatively, parameterized defines could be used to remove the repetition:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtSite&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#f92672\">#<\/span><span style=\"color:#ae81ff\">1<\/span>)))\n<\/code><\/pre><\/div><p>and called via:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#e6db74\">&#34;AtSite&#34;<\/span> <span style=\"color:#e6db74\">&#34;Gold&#34;<\/span>)\n<\/code><\/pre><\/div><h3 id=\"arrow-path\">Arrow Path<\/h3>\n<p>When the player fires an arrow, they select one of the orthogonal positions farthest from the human. 
This is achieved via the <code>sites LineOfSight<\/code> ludeme and the appropriate filters:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">move<\/span>\n        (<span style=\"color:#a6e22e\">from<\/span> (<span style=\"color:#a6e22e\">handSite<\/span> P1))\n        (<span style=\"color:#a6e22e\">to<\/span> (<span style=\"color:#a6e22e\">sites<\/span> LineOfSight Farthest at:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) Orthogonal))\n        (<span style=\"color:#a6e22e\">then<\/span> (<span style=\"color:#e6db74\">&#34;ArrowLogic&#34;<\/span>))\n    )\n<\/code><\/pre><\/div><p>Once the arrow is fired, we need to check all the sites between the human and the farthest point. The <code>sites Between<\/code> filter is useful for selecting these sites. By default, the filter excludes the ends, so we include the end with the arrow. 
Since the human is killed if they enter a room with the wumpus, there is no need to check the human&rsquo;s current position.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;ArrowPath&#34;<\/span> \n\t(<span style=\"color:#a6e22e\">sites<\/span> Between Orthogonal \n\t\tfrom:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) fromIncluded:True \n\t\tto:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1)\n\t)\n)\n<\/code><\/pre><\/div><h3 id=\"arrow-firing-arrowlogic\">Arrow Firing (ArrowLogic)<\/h3>\n<p>Using the path of the arrow, we need to check if the arrow enters any room with the Wumpus. I couldn&rsquo;t use <code>is In<\/code> because it requires a single site and <code>(sites &quot;Wumpus&quot;)<\/code> returns a set, so I converted the sets to arrays of integers via <code>array<\/code> and used the array <code>intersection<\/code> ludeme as an &ldquo;exists&rdquo; operator by checking the size.<\/p>\n<p>I wish Ludii had a <code>progn<\/code> for processing multiple side-effects in sequence. Lacking that, I combined the multiple effects using logical operators per the documentation. If the Wumpus&rsquo;s location is in the set of locations of the arrow path, we set the <code>WumpusKilled<\/code> variable. In both cases, we provide a <code>note<\/code> to tell the player what happened and then remove the arrow from the board. 
If we did not remove the arrow, it would remain in the farthest location, potentially blocking the human&rsquo;s movement.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;ArrowLogic&#34;<\/span> (<span style=\"color:#66d9ef\">if <\/span>(= <span style=\"color:#ae81ff\">0<\/span> (<span style=\"color:#a6e22e\">size<\/span> Array (<span style=\"color:#a6e22e\">intersection<\/span> (<span style=\"color:#a6e22e\">array<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>)) (<span style=\"color:#a6e22e\">array<\/span> (<span style=\"color:#e6db74\">&#34;ArrowPath&#34;<\/span>)))))\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;Thunk! The arrow hits a wall. 
You have failed to kill the wumpus.&#34;<\/span>)\n                            (<span style=\"color:#a6e22e\">remove<\/span> (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) at:EndOfTurn)\n                        )\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#a6e22e\">or<\/span>\n                                (<span style=\"color:#a6e22e\">set<\/span> Var <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n                                (<span style=\"color:#a6e22e\">remove<\/span> (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) at:EndOfTurn)\n                            )\n                            (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;The wumpus yells in pain and dies&#34;<\/span>)\n                        )\n                     )\n)\n<\/code><\/pre><\/div><h3 id=\"note-and-percepts\"><code>note<\/code> and Percepts<\/h3>\n<p>A key aspect of the game is that the player is given clues to the positions of hazards. Ideally, this information could be annotated on the board, or lighting an indicator, but I didn&rsquo;t find a way to achieve that. Ultimately, I used the <code>note<\/code> ludeme which is a way to write a message to the player. Although the game is traditionally text driven, this aspect of the game should be modeled more explicitly especially for the benefit of generic game playing agents.<\/p>\n<h2 id=\"equipment\">Equipment<\/h2>\n<p>The <code>equipment<\/code> ludeme defines the board, pieces, and other semi-physical aspects of the game. 
Logic for how pieces are used in the game can be attached to their definition or placed within the <code>play<\/code> ludeme within the rules.<\/p>\n<h3 id=\"non-human-pieces-and-sites\">Non-Human Pieces and Sites<\/h3>\n<p><em>Wumpus World<\/em> uses a square grid of cave rooms, which can be easily declared as a <code>(board (square 4))<\/code>. Within the <code>equipment<\/code> ludeme we also define starting locations for all elements using spreadsheet coordinates.<\/p>\n<p>I decided to &ldquo;pre-compute&rdquo; the locations where the human will feel a breeze or smell a stench by setting those locations based on the existing regions. Using <code>sites Around<\/code> I can have the computer compute the set of sites rather than manually set them like Gold, Pits, and Wumpus.<\/p>\n<p>The pieces are declared in the <code>equipment<\/code> ludeme but placed in the <code>start<\/code> ludeme (above).<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">equipment<\/span>\n        <span style=\"color:#960050;background-color:#1e0010\">{<\/span>\n            (<span style=\"color:#a6e22e\">board<\/span> (<span style=\"color:#a6e22e\">square<\/span> <span style=\"color:#ae81ff\">4<\/span>))\n\n            \/\/ <span style=\"color:#f92672\">...<\/span> human piece defined here <span style=\"color:#f92672\">...<\/span>\n\n            (<span style=\"color:#a6e22e\">hand<\/span> P1 size:1)\n            (<span style=\"color:#a6e22e\">piece<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1)\n\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Gold&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;B3&#34;<\/span><span 
style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;C1&#34;<\/span> <span style=\"color:#e6db74\">&#34;C3&#34;<\/span> <span style=\"color:#e6db74\">&#34;D4&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;A3&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Stench&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> Around (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>) Orthogonal))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Breeze&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> Around (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span>) Orthogonal))\n        <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    )\n<\/code><\/pre><\/div><h3 id=\"human-piece\">Human Piece<\/h3>\n<p>The human piece (human was chosen as the name because it fit an existing Ludii graphic) is owned by the first player, P1, and can move one step orthogonally each turn. The syntax requires an explicit <code>to<\/code> ludeme. We restrict moves to locations that are empty. 
Since there is only one piece on the board, this restriction is unnecessary.<\/p>\n<p>Ludii lacks a <code>switch<\/code> or <code>match<\/code> structure, so we need to chain the effects of various sites using logical operators. The logic is complicated because the conditions are not exclusive to each other. I use <code>defines<\/code> heavily here to reduce the clutter.<\/p>\n<p>The end game messages are handled here because the syntax does not allow them within the end rules.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">    (<span style=\"color:#a6e22e\">piece<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1\n        (<span style=\"color:#a6e22e\">move<\/span> Step Orthogonal\n            (<span style=\"color:#a6e22e\">to<\/span> if:(<span style=\"color:#a6e22e\">is<\/span> Empty (<span style=\"color:#a6e22e\">to<\/span>) ))\n            (<span style=\"color:#a6e22e\">then<\/span>\n                (<span style=\"color:#a6e22e\">or<\/span>\n                    (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtGold&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>))\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#a6e22e\">set<\/span> Var <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n                            (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You found the gold&#34;<\/span>)\n                        )\n                    )\n                    (<span style=\"color:#a6e22e\">and<\/span>\n                        (<span 
style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;IsNextToPit&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You feel a breeze. A pit is nearby!&#34;<\/span>))\n                            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;IsNextToWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;Your nose wrinkles from the strong stench. A wumpus is nearby!&#34;<\/span>))\n                        )\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You fall into a pit and die.&#34;<\/span>))\n                            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>)) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You are eaten by the wumpus.&#34;<\/span>))\n                        )\n                    )\n                )\n            )\n        )\n    )\n<\/code><\/pre><\/div><p>As an alternative to the variables, I could use <code>trigger<\/code> and <code>is Triggered<\/code> to track boolean states. Although they are referenced as events within the documentation, there does not appear to be a way to explicitly process some side effect whenever the trigger occurs. 
However, they could be checked explicitly within the <code>play<\/code> ludeme during turn execution.<\/p>\n<h2 id=\"developer-experience\">Developer Experience<\/h2>\n<p>I often found the development process to be tedious. The Ludii language is defined and enforced through a grammar, but the parser seemed unable to give the developer any explanation of what was incorrect. Rather, if something is wrong, an exception stack trace is printed to the console, typically with no indication of which line is wrong. For instance, a missing parenthesis would show a StringIndexOutOfBoundsException. By following a very incremental approach, I could at least restrict problems to the lines that had changed, which allowed progress.<\/p>\n<p>That said, the succinct nature of the language is very appealing. Because the language is declarative, the developer is able to concentrate on the core rules and mechanics and gets a user interface for &ldquo;free.&rdquo;<\/p>\n<h2 id=\"applicability-to-modern-or-euro-style-games\">Applicability to &ldquo;Modern&rdquo; or Euro-Style Games<\/h2>\n<p>I recently played <a href=\"https:\/\/boardgamegeek.com\/boardgame\/385331\/creature-caravan\">Creature Caravan<\/a>, a worker-placement game published in 2024. Each player in the game controls a caravan that moves across a shared board gathering points. Players draw cards and activate powers from creatures they add to their tableau. The game provides players with many options each turn and, since cards are unique, there are many interacting systems. If someone attempted to model this game in Ludii, I think:<\/p>\n<ul>\n<li>Tracking the caravans across the main board would be straightforward, along with marking locations for the mountains, lakes, towers, and such<\/li>\n<li>Restricting actions to the &ldquo;core&rdquo; should be feasible, with workers from the hand being played to specific locations<\/li>\n<li>Modeling cards that are recruited to the tableau would likely be infeasible or very complex. 
While some creatures are relatively simple (place a worker to gain 2 bread), many have meta abilities like reducing costs or feeding other cards.<\/li>\n<\/ul>\n<p>With the addition of new ludemes, additional styles of games may be supported more easily. The Ludii language does not support adding new ludemes within itself (defines are more of a macro system), so this would require development within the Java source code. This is not a focus area for the Ludii team, who are more interested in historical and abstract games.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>My submission of <em>Wumpus World<\/em> was accepted in February 2025 and is part of the 1.3.14 release. The Ludii team has <a href=\"https:\/\/www.cost.eu\/actions\/CA22145\/\">secured additional funding<\/a> to &ldquo;inspire methodologies and applications on how to use game AI to study, reconstruct, and preserve the intangible cultural heritage of games&rdquo;.<\/p>\n<h2 id=\"appendix-source-code-for-version-100\">Appendix: Source code for version 1.0.0<\/h2>\n<p>This listing does not include the &ldquo;info&rdquo; block, which contains metadata about the game.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scheme\" data-lang=\"scheme\">(<span style=\"color:#66d9ef\">define <\/span><span 
style=\"color:#e6db74\">&#34;IsNextToPit&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Breeze&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;IsNextToWumpus&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Stench&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtGold&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Gold&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span> (<span style=\"color:#a6e22e\">is<\/span> In (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;ArrowPath&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> Between Orthogonal from:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) 
fromIncluded:True to:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1)))\n(<span style=\"color:#66d9ef\">define <\/span><span style=\"color:#e6db74\">&#34;ArrowLogic&#34;<\/span> (<span style=\"color:#66d9ef\">if <\/span>(= <span style=\"color:#ae81ff\">0<\/span> (<span style=\"color:#a6e22e\">size<\/span> Array (<span style=\"color:#a6e22e\">intersection<\/span> (<span style=\"color:#a6e22e\">array<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>)) (<span style=\"color:#a6e22e\">array<\/span> (<span style=\"color:#e6db74\">&#34;ArrowPath&#34;<\/span>)))))\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;Thunk! The arrow hits a wall. You have failed to kill the wumpus.&#34;<\/span>)\n                            (<span style=\"color:#a6e22e\">remove<\/span> (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) at:EndOfTurn)\n                        )\n                        (<span style=\"color:#a6e22e\">and<\/span>\n                            (<span style=\"color:#a6e22e\">or<\/span>\n                                (<span style=\"color:#a6e22e\">set<\/span> Var <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n                                (<span style=\"color:#a6e22e\">remove<\/span> (<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1) at:EndOfTurn)\n                            )\n                            (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;The wumpus yells in pain and dies&#34;<\/span>)\n                        )\n                     )\n)\n\n\/\/ <span style=\"color:#e6db74\">&#34;Binary&#34;<\/span> Variables: GoldFound, 
WumpusKilled (<span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#66d9ef\">if <\/span>true, Constants<span style=\"color:#f92672\">.<\/span>UNDEFINED <span style=\"color:#66d9ef\">or <\/span><span style=\"color:#ae81ff\">-1<\/span> <span style=\"color:#66d9ef\">if <\/span>not set \/ false)\n\n(<span style=\"color:#a6e22e\">game<\/span> <span style=\"color:#e6db74\">&#34;Wumpus World&#34;<\/span>\n    (<span style=\"color:#a6e22e\">players<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n    (<span style=\"color:#a6e22e\">equipment<\/span>\n        <span style=\"color:#960050;background-color:#1e0010\">{<\/span>\n            (<span style=\"color:#a6e22e\">board<\/span> (<span style=\"color:#a6e22e\">square<\/span> <span style=\"color:#ae81ff\">4<\/span>))\n\n            (<span style=\"color:#a6e22e\">piece<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1\n                (<span style=\"color:#a6e22e\">move<\/span> Step Orthogonal\n                    (<span style=\"color:#a6e22e\">to<\/span> if:(<span style=\"color:#a6e22e\">is<\/span> Empty (<span style=\"color:#a6e22e\">to<\/span>) ))\n                    (<span style=\"color:#a6e22e\">then<\/span>\n                        (<span style=\"color:#a6e22e\">or<\/span>\n                            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtGold&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>))\n                                (<span style=\"color:#a6e22e\">and<\/span>\n                                    (<span style=\"color:#a6e22e\">set<\/span> Var <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span> <span style=\"color:#ae81ff\">1<\/span>)\n                                    (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You 
found the gold&#34;<\/span>)\n                                )\n                            )\n                            (<span style=\"color:#a6e22e\">and<\/span>\n                                (<span style=\"color:#a6e22e\">and<\/span>\n                                    (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;IsNextToPit&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You feel a breeze. A pit is nearby!&#34;<\/span>))\n                                    (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;IsNextToWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;Your nose wrinkles from the strong stench. A wumpus is nearby!&#34;<\/span>))\n                                )\n                                (<span style=\"color:#a6e22e\">and<\/span>\n                                    (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span>) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You fall into a pit and die.&#34;<\/span>))\n                                    (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>)) (<span style=\"color:#a6e22e\">note<\/span> <span style=\"color:#e6db74\">&#34;You are eaten by the wumpus.&#34;<\/span>))\n                                )\n                            )\n                        )\n                    )\n                )\n            )\n\n            (<span style=\"color:#a6e22e\">hand<\/span> P1 size:1)\n            (<span style=\"color:#a6e22e\">piece<\/span> <span 
style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> P1)\n\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Gold&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;B3&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;C1&#34;<\/span> <span style=\"color:#e6db74\">&#34;C3&#34;<\/span> <span style=\"color:#e6db74\">&#34;D4&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span><span style=\"color:#e6db74\">&#34;A3&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">}<\/span>))\n\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Stench&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> Around (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Wumpus&#34;<\/span>) Orthogonal))\n            (<span style=\"color:#a6e22e\">regions<\/span> <span style=\"color:#e6db74\">&#34;Breeze&#34;<\/span> (<span style=\"color:#a6e22e\">sites<\/span> Around (<span style=\"color:#a6e22e\">sites<\/span> <span style=\"color:#e6db74\">&#34;Pits&#34;<\/span>) Orthogonal))\n        <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    )\n    (<span style=\"color:#a6e22e\">rules<\/span>\n        (<span style=\"color:#a6e22e\">start<\/span>\n            <span 
style=\"color:#960050;background-color:#1e0010\">{<\/span>\n                (<span style=\"color:#a6e22e\">place<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> coord:<span style=\"color:#e6db74\">&#34;A1&#34;<\/span>)\n                (<span style=\"color:#a6e22e\">place<\/span> <span style=\"color:#e6db74\">&#34;Arrow&#34;<\/span> (<span style=\"color:#a6e22e\">handSite<\/span> P1))\n            <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n        )\n\n        (<span style=\"color:#a6e22e\">play<\/span>\n            (<span style=\"color:#a6e22e\">or<\/span>\n                (<span style=\"color:#a6e22e\">forEach<\/span> piece)\n                (<span style=\"color:#a6e22e\">move<\/span>\n                    (<span style=\"color:#a6e22e\">from<\/span> (<span style=\"color:#a6e22e\">handSite<\/span> P1))\n                    (<span style=\"color:#a6e22e\">to<\/span> (<span style=\"color:#a6e22e\">sites<\/span> LineOfSight Farthest at:(<span style=\"color:#a6e22e\">where<\/span> <span style=\"color:#e6db74\">&#34;Human&#34;<\/span> P1) Orthogonal))\n                    (<span style=\"color:#a6e22e\">then<\/span> (<span style=\"color:#e6db74\">&#34;ArrowLogic&#34;<\/span>))\n                )\n            )\n        )\n\n        (<span style=\"color:#a6e22e\">end<\/span> <span style=\"color:#960050;background-color:#1e0010\">{<\/span>\n            (<span style=\"color:#66d9ef\">if <\/span>(<span style=\"color:#66d9ef\">or <\/span>(<span style=\"color:#66d9ef\">and <\/span>(<span style=\"color:#e6db74\">&#34;AtWumpus&#34;<\/span>) (<span style=\"color:#a6e22e\">!=<\/span> (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>)) (<span style=\"color:#e6db74\">&#34;AtPit&#34;<\/span>))\n                (<span style=\"color:#a6e22e\">result<\/span> Mover Loss)\n            )\n            (<span style=\"color:#66d9ef\">if <\/span>(<span 
style=\"color:#66d9ef\">and <\/span>(= (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;GoldFound&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>) (= (<span style=\"color:#a6e22e\">var<\/span> <span style=\"color:#e6db74\">&#34;WumpusKilled&#34;<\/span>) <span style=\"color:#ae81ff\">1<\/span>))\n                (<span style=\"color:#a6e22e\">result<\/span> Mover Win)\n            )\n        <span style=\"color:#960050;background-color:#1e0010\">}<\/span>)\n    )\n)\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><h2 id=\"links\">Links<\/h2>\n<p><a href=\"https:\/\/archive.org\/details\/CreativeComputingv01n05SeptemberOctober1975\/page\/n51\/mode\/2up\">Hunt the Wumpus in Creative Computing<\/a><\/p>\n<p><a href=\"https:\/\/cis.temple.edu\/~ingargio\/cis587\/readings\/wumpus.shtml\">Wumpus World<\/a><\/p>\n<p><a href=\"https:\/\/ecai2020.eu\/papers\/1248_paper.pdf\">Ludii - The Ludemic General Game System<\/a><\/p>\n<p><a href=\"https:\/\/ludii.games\/downloads\/LudiiGameLogicGuide.pdf\">Ludii Game Logic Guide<\/a><\/p>\n<p><a href=\"https:\/\/ludii.games\/downloads\/LudiiLanguageReference.pdf\">Ludii Language Reference<\/a><\/p>"},{"title":"Investigating MacPaint's Source Code","link":"https:\/\/ztoz.blog\/posts\/macpaint-source-code\/","pubDate":"Tue, 25 Mar 2025 11:27:36 -0600","guid":"https:\/\/ztoz.blog\/posts\/macpaint-source-code\/","description":"<p>MacPaint is a monochromatic raster image painting program that introduced many people to mouse-driven controls, tool palettes, and copy and paste integration with other applications. One of two launch applications for the Apple Macintosh in 1984, MacPaint is emblematic of the Macintosh&rsquo;s early quirky revolutionary branding, focus on ease of use, and appeal to artistic customers. Using the source code, we examine the design and implementation of the application. 
We find that the buffer management and bucket filling algorithms demonstrate mechanical empathy with the 68k platform and leverage the limitations of the domain as a means to improve performance. We also find positive and negative aspects in the code style and architecture and its pliability for change. Finally, we dispute some claimed novel aspects of the program while also arguing for its significance and impact on the development of digital graphic systems.<\/p>\n<figure><img src=\"MacPaint-1.5-WithContent.png\"\n         alt=\"Screenshot of MacPaint 1.5 with a drawing\" width=\"100%\"\/><figcaption>\n            <p>MacPaint 1.5 (1985)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p><strong>Correction 04-06<\/strong>: Corrected details on the origin of PCPaint.<\/p>\n<p><strong>Table of Contents<\/strong><\/p>\n<ol>\n<li><a href=\"#background\">Background<\/a><\/li>\n<li><a href=\"#timeline\">Timeline<\/a><\/li>\n<li><a href=\"#developer-bill-atkinson\">Developer: Bill Atkinson<\/a><\/li>\n<li><a href=\"#development-and-testing\">Development and Testing<\/a><\/li>\n<li><a href=\"#design-and-source-code\">Design and Source Code<\/a><\/li>\n<li><a href=\"#interesting-algorithms-and-designs\">Interesting Algorithms and Designs<\/a><\/li>\n<li><a href=\"#alternative-paths-and-competitors\">Alternative Paths and Competitors<\/a><\/li>\n<li><a href=\"#post-release\">Post Release<\/a><\/li>\n<li><a href=\"#conclusion\">Conclusion<\/a><\/li>\n<li><a href=\"#special-recognition\">Special Recognition<\/a><\/li>\n<li><a href=\"#references\">References<\/a><\/li>\n<\/ol>\n<h2 id=\"background\">Background<\/h2>\n<p>At the Boston Computer Society&rsquo;s general meeting on January 30th 1984, Steve Jobs laid out his rationale for why Apple&rsquo;s newest product, the Macintosh computer, was the third milestone product of the computer industry after Apple&rsquo;s own Apple II and the IBM PC. 
The Macintosh would be the &ldquo;computer for the rest of us.&rdquo; The Macintosh used the same software as the Lisa, bringing the same ease of use from a point-and-click interface and pull-down menus, as well as sharing the same fast Motorola 68000 processor.<\/p>\n<p>After highlighting hardware features such as portability, the 3.5&quot; diskette drive, and the AppleBus support, Jobs let the computer do a demo on its own. Booted from a diskette, the Mac displayed its name and the &ldquo;Insanely Great&rdquo; logo, and then showed the first real application image: a MacPaint screenshot showing a woodcut of a Japanese lady.<\/p>\n<figure><img src=\"04_applegeisha.gif\"\n         alt=\"Screenshot of MacPaint with Japanese lady\" width=\"100%\"\/><figcaption>\n            <p>MacPaint screenshot with Japanese Lady (1984)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Apple used the screenshot heavily within their advertisements and most of the audience would have seen static screenshots of MacPaint and MacWrite in the months beforehand, but the audience was still delighted.<\/p>\n<p>Twenty-nine minutes in, Jobs introduces a panel of the Macintosh development team and Bill Atkinson starts the first demo, a demo of MacPaint, which he developed.<\/p>\n<iframe width=\"560\" height=\"315\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/1tQ5XwvjPmA?si=Lq4yE_8IbbeYdNhk&amp;start=1770\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<p>Over the next seven minutes, Atkinson demonstrates how to make art with the program. The audience&rsquo;s first applause comes when he uses an eraser to erase little chunks of previously drawn lines and rectangles. 
The applause is due both to how easy and quick the tool is to use and to its demonstration that the program is working with actual pixels; he isn&rsquo;t creating new clipping regions or restricted to deleting entire shapes. The audience applauds again when Bill draws lines filled with patterns and then again when, as he holds down the spray can, the paint grows denser. Heralding the feature&rsquo;s value, the fourth applause comes when Bill zooms in and manipulates individual pixels using the Fat Bits mode. The audience similarly loves the many ways areas of the image can be selected, moved, and copied.<\/p>\n<p>Atkinson finishes the demo by copying an image of a fish that will soon be pasted into a MacWrite document. In an unspoken nod to the limits of the Macintosh&rsquo;s hardware, Atkinson closes MacPaint so the memory can be used by MacWrite.<\/p>\n<p>Randy Wigginton, who is about to demo MacWrite, pauses to praise Bill&rsquo;s work on QuickDraw, the foundational graphical library for the Lisa and the Macintosh. &ldquo;Without Bill, none of us would be up here on stage.&rdquo;<\/p>\n<p>Bill Atkinson&rsquo;s contributions at Apple, including foundational user interface contributions for the Lisa and Macintosh computers, and one of the two application programs that shipped with the &ldquo;Insanely Great&rdquo; Macintosh as the advertisements claimed, have been well-recorded. With the <a href=\"https:\/\/computerhistory.org\/blog\/macpaint-and-quickdraw-source-code\/\">release of the QuickDraw and MacPaint source code<\/a> in 2010, we have an opportunity to examine the technical design and implementation of his work. 
This article examines the MacPaint application, how it was built, some of its interesting algorithms and engineering trade-offs, and how we might measure its impact against the larger industry trends around image painting and rastering technology.<\/p>\n<h2 id=\"timeline\">Timeline<\/h2>\n<p>The development of MacPaint is intertwined with the development of the mouse and the graphical user interface, the Lisa and Macintosh computers, and QuickDraw, the foundational graphics library used by both the Lisa and the Macintosh. This timeline focuses on MacPaint and contemporary competitor painting programs, not on the overall history of raster drawing programs. See (Smith 2001) for a history of early raster drawing programs and their commercial applications and development.<\/p>\n<p><strong>1982<\/strong><\/p>\n<p>Over a period of six weeks, Atkinson develops a prototype painting program that &ldquo;sort-of worked&rdquo; (Young 1985, pg 315). The source code file <code>MyTools.text<\/code>, later renamed to <code>MyTools.a<\/code>, states it was created October 31st.<\/p>\n<p><strong>1983<\/strong><\/p>\n<p>In January, Apple announces the Lisa Computer, although no units are shipped until June. The Lisa includes Atkinson&rsquo;s QuickDraw library.<\/p>\n<p>Microsoft broadens the development of mouse-based applications by releasing their first <em>Microsoft Mouse<\/em>. The package includes a color raster drawing program called &lsquo;Doodle.&rsquo; Doug Wolfgram releases perhaps the first third-party drawing program with &lsquo;Mouse Draw,&rsquo; which uses the Microsoft Mouse.<\/p>\n<p>Atkinson resumes work on MacPaint, at this point called MacSketch. The set of palettes and tools is already very close to the eventual MacPaint UI. 
The image in the screenshot (below) is celebrating ROM 2.0; based on MyTools.a, this would date the image between February 13th and March 16th, when the file was regenerated for ROM 2.0 but before ROM 2.4.<\/p>\n<figure><a href=\"https:\/\/www.folklore.org\/Busy_Being_Born,_Part_2.html\"><img src=\"early_macpaint_t.jpg\"\n         alt=\"Screenshot of MacSketch with UI similar to MacPaint\"\/><\/a><figcaption>\n            <p>MacSketch (MacPaint c. 1983; Source folklore.org)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>MacSketch is renamed MacPaint in April. Between then and October, Atkinson iterates on the program adding features and improving performance. The last entry in MyTools.a is dated September 1983.<\/p>\n<p>In December, Apple advertises the Macintosh with a <a href=\"https:\/\/www.digibarn.com\/collections\/ads\/apple-mac\/index.htm\">full-color brochure<\/a>. MacPaint is featured prominently and is used to educate the public on how tool palettes, menus, and copy-paste work. The ad also mentions that the content area can be scrolled for more work space.<\/p>\n<p><strong>1984<\/strong><\/p>\n<p>With great fanfare, the Apple Macintosh is shown at the Boston Computer Society January 30th General Meeting. Bill Atkinson demos MacPaint (and implicitly, QuickDraw) to the crowd. The original Macintosh 128k comes with two applications: MacWrite and MacPaint.<\/p>\n<p>In May, MacPaint 1.3 is released as part of a free software update to customers. This version adds the ability to lasso an object and repeatedly fill it (via the Fill item in the Edit menu) with a pattern.<\/p>\n<p>In September, MacPaint 1.4 is released along with the Macintosh 512k.<\/p>\n<p>Competitors quickly adopt the MacPaint interface. In June, Mouse Systems ships <em>PCPaint 1.0<\/em> bundled with a mouse in competition with Microsoft. 
<em>PCPaint<\/em>, co-developed by John Bridges and Doug Wolfgram, is based on Apple&rsquo;s MousePaint program (<a href=\"https:\/\/youtu.be\/WVCPY-BqYgo?si=-i97Cfz8SLLEojup&amp;t=58\">demoed in November 1983 at Comdex<\/a>), which itself is based on an early, unreleased version of MacPaint. (MacPaint originally featured an &ldquo;Aids&rdquo; menu in its menu bar but renamed it &ldquo;Goodies&rdquo; before release.) Similarly, ZSoft Corporation ships their <em>PC Paintbrush<\/em>, also DOS-based but with an interface derived from MacPaint.<\/p>\n<p><strong>1985<\/strong><\/p>\n<p>Microsoft releases a new version of their mouse. They drop their &lsquo;Doodle&rsquo; program and replace it with a rebranded version of <em>PC Paintbrush<\/em>, licensed from ZSoft Corporation. In April, Apple releases System Software 2.0, which includes MacPaint 1.5 (<a href=\"https:\/\/archive.org\/details\/mac_Paint_2\">https:\/\/archive.org\/details\/mac_Paint_2<\/a>). This is the last version until 1988, when MacPaint 2.0 is released by a <a href=\"https:\/\/www.folklore.org\/Evolution_of_a_classic.html\">new developer<\/a>. There are no further official releases of MacPaint after that.<\/p>\n<h2 id=\"developer-bill-atkinson\">Developer: Bill Atkinson<\/h2>\n<p>When you start MacPaint, the name Bill Atkinson briefly flashes in the credits. You can also find Atkinson&rsquo;s name and a small portrait in the About menu.<\/p>\n<figure><img src=\"MacPaint-1.5-AboutWindowCropped.png\"\n         alt=\"Screenshot of About window from MacPaint showing Bill Atkinson as author with portrait\" width=\"100%\"\/><figcaption>\n            <p>MacPaint 1.5 About Window (1985)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Ironically for someone who would rise to be an Apple Fellow, Bill Atkinson did not receive classical training in computing. His undergraduate education was in chemistry and biochemistry with graduate training in neuroscience. However, he was not disconnected from the computing scene. 
He built both an IMSAI and an Altair computer (Atkinson 2004, pg 4). Further, his college work emphasized using computers and he made contacts that would prove highly influential to his career.<\/p>\n<p>While at UC San Diego, Atkinson met Jef Raskin and was introduced to Raskin&rsquo;s unconventional computer lab, which emphasized direct and real-time connections with computers. At UCSD he also met Guy &ldquo;Bud&rdquo; Tribble, who would later live with Atkinson and help develop the Macintosh. Atkinson&rsquo;s mentor at the University of Washington, Kent Wilson, introduced him to computer graphics and innovations in the field, such as Ivan Sutherland&rsquo;s work on Sketchpad.<\/p>\n<p>Lured to Apple in 1978 by Jef Raskin, Atkinson became the company&rsquo;s first application software developer. His first project was a stock portfolio evaluator because, while Apple featured one in an advertisement, they did not have any in the catalog (Atkinson 2004, pg 7). His second major project was helping port the UCSD Pascal system to the Apple II. With no other structured programming options, Lisa development adopted this version of Pascal (Atkinson 2004, pg 9).<\/p>\n<p>In 1979, Steve Jobs made visits to Xerox PARC along with a small group of Apple employees, including Atkinson. At PARC they were shown the Alto computer, the Smalltalk programming language, and (likely) the Bravo text editor (Atkinson 2004, pg 18). 
Transferred to the Lisa project, Atkinson was responsible for LisaGraf, the foundational graphics library (later named QuickDraw), as well as the original window manager, menu manager, and event manager (Atkinson 2004, pg 20).<\/p>\n<p>Since the Macintosh re-used software from the Lisa, in particular Atkinson&rsquo;s user interface and graphics code, he moved to the Macintosh team, where he started to work on MacPaint in earnest in 1983.<\/p>\n<p>Unlike the founders of Adobe, Atkinson did not come from the computer graphics research world, but was familiar with research advances through his academic mentors and had met luminaries such as Douglas Engelbart. Like the Adobe founders, who worked on the same problems multiple times, he had time to iterate, with several years of experiments as the Lisa developed from a research effort into a product. Intentionally straddling both foundational and application development, Atkinson had the viewpoint of a &ldquo;vertical integrator,&rdquo; able to control where functionality should go and how the interfaces should work.<\/p>\n<h2 id=\"development-and-testing\">Development and Testing<\/h2>\n<p>Finding life in the Apple office too &ldquo;busy&rdquo; (Atkinson 2004, pg 20), Atkinson worked out of a home laboratory using a prototype Apple Lisa. The Lisa had a &ldquo;Workshop&rdquo; mode which featured a <a href=\"https:\/\/winworldpc.com\/product\/apple-lisa-pascal-workshop\/20\">graphical editor and a command-line environment<\/a> for compiling and other development activities.<\/p>\n<p>Atkinson took Polaroids of the user interface as it evolved and drove into work to share them with the team. Fortunately for posterity, Atkinson saved the Polaroids and we have a detailed visual history of the evolution of the Lisa interface and QuickDraw capabilities as shown in the 2022 CHM interview below. 
(SketchPad is briefly shown and discussed, starting at 9:45.)<\/p>\n<iframe width=\"560\" height=\"315\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/Qg0mHFcB510?si=o2WL679_4p7r2gbl\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<p>The Macintosh team used the &lsquo;Monkey&rsquo; as a durability and robustness test mechanism. Developed by Steve Capps, the Monkey would randomly type keys, move objects, and interact with menus (Atkinson 2010, pg 14-15). The team used a computer running in Monkey mode to effectively stress-test an application. MacPaint was able to survive two weeks without crashing. Monkey mode can be seen in the source code:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">5\n<\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">6\n<\/span><\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">9\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-asm\" data-lang=\"asm\">         <span style=\"color:#a6e22e\">.FUNC<\/span> <span style=\"color:#66d9ef\">Monkey<\/span>\n<span style=\"color:#75715e\">;---------------------------------------------------------------------\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  FUNCTION Monkey: BOOLEAN;\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"color:#75715e\"><\/span>         <span style=\"color:#a6e22e\">TST<\/span>      <span style=\"color:#66d9ef\">MonkeyLives<\/span>                <span style=\"color:#75715e\">;IS THE MONKEY ACTIVE ?\n<\/span><\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">SGE<\/span>      <span style=\"color:#ae81ff\">4<\/span>(<span style=\"color:#66d9ef\">SP<\/span>)                      <span style=\"color:#75715e\">;YES IF &gt;= ZERO\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">NEG.B<\/span>    <span style=\"color:#ae81ff\">4<\/span>(<span style=\"color:#66d9ef\">SP<\/span>)                      <span style=\"color:#75715e\">;CONVERT TO PASCAL BOOLEAN\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">RTS<\/span>\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>To prevent Monkey mode from quitting the program and thus ending the test prematurely, the Pascal code calls this function to selectively disable the Apple menu, File menu, and the Quit Program command.<\/p>\n<p>Susan Kare, who served as the graphics designer for the Macintosh, was the main customer of 
MacPaint. Atkinson watched her use MacPaint and &ldquo;see what she stumbled on or wished she had&rdquo; (Atkinson 2004, pg 47). As the only true artist on the team, and someone who used MacPaint as a tool for their job, Kare&rsquo;s feedback was invaluable. Andy Hertzfeld characterized her impact as &ldquo;I think a lot of the refinement of MacPaint came from watching an actual user, an actual artist use the program on a day-to-day basis.&rdquo; (Atkinson 2010, pg 9). Similarly, Atkinson states &ldquo;I would credit Susan Kare as a co-designer of MacPaint because she used it as I was trying to write it.&rdquo; (ibid).<\/p>\n<h2 id=\"design-and-source-code\">Design and Source Code<\/h2>\n<h3 id=\"physical-description\">Physical Description<\/h3>\n<p>The MacPaint 1.3 distribution consists of five files:<\/p>\n<ol>\n<li><code>MacPaint.p<\/code>, 4,688 lines of Pascal (<a href=\"http:\/\/pascal.hansotten.com\/ucsd-p-system\/apple-pascal\/\">Lisa Pascal variant<\/a>)<\/li>\n<li><code>MacPaint.rsrc<\/code>, resource description for the program containing icons, strings, and other localizable attributes. The version string identifies itself as version 1.3.<\/li>\n<li><code>MyHeapAsm.a<\/code>, 67 lines of assembly for calls into system memory management routines<\/li>\n<li><code>MyTools.a<\/code>, 300 lines of assembly defining traps or external calls into QuickDraw. A comment states this file was mostly generated via <code>MakeTTraps<\/code>. This is the only file with a change log and content attributed to someone other than Bill Atkinson.<\/li>\n<li><code>PaintAsm.a<\/code>, 1,809 lines of assembly containing application code called from the Pascal code<\/li>\n<\/ol>\n<p>We counted physical lines of code using the <code>pascal_count<\/code> and <code>asm_count<\/code> programs, both part of David A. 
Wheeler&rsquo;s SLOCCount suite.<\/p>\n<h3 id=\"data-types-and-structures\">Data Types and Structures<\/h3>\n<p>MacPaint defines very few datatypes for its own use, leveraging instead types from QuickDraw such as Point, Rect (Rectangle), Pattern, and BitMap.<\/p>\n<p>The Pascal code heavily uses global variables; the list extends from lines 212 through 370, with white-space used to group them by commonality. Most of the global variables are used to store various flags or interface state, such as the current font specification. This list is distinct from global constants (lines 27 through 190). Most of the constants are used to specify menu items, buttons, or other interface elements.<\/p>\n<p>The application is a set of tools that, ultimately, modify the document which is a fixed size 1-bit bitmap stored as an array of integers. The document lacks dynamic metadata. As pixels are pretty simple, few abstractions or data types are required.<\/p>\n<h3 id=\"pascal-vs-assembly\">Pascal vs. Assembly<\/h3>\n<p>Roughly one-third of MacPaint&rsquo;s lines of code are in Motorola 68000 assembly, while two-thirds are in Pascal. In (Young 1985, pg 316), Atkinson explains the rationale and benefit of both languages:<\/p>\n<blockquote>\n<p>By frequency of working on it, I would bring up the Pascal file 20 or 30 times for every one time I brought up the assembly language file. Basically, the information in assembly language doesn\u2019t really need a lot of maintenance. The assembly-language portion contains things that are there for speed, or that were small and I knew wouldn\u2019t need a lot of maintenance. I put them in assembly language just to reduce the code size. By keeping the main control, flow, and logic in Pascal, the program was more pliable.<\/p>\n<\/blockquote>\n<p>Atkinson&rsquo;s rationale is supported by the list of procedures in each language. Performance-critical code, such as that which manipulates the buffers directly, is in assembly. 
User interface control logic is in Pascal, as well as code that handles initial setup or rare operations. Operating system calls, such as checking the amount of spare space in the disk drive or invoking the system beep, are in assembly. Since MacPaint was developed concurrently with the operating system, some functionality may have existed in the ROM but had not yet been exposed by Pascal system libraries.<\/p>\n<p>As an example of Atkinson&rsquo;s assembly style and quality, we present the <code>NearPt<\/code> function:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span 
style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-asm\" data-lang=\"asm\">         <span style=\"color:#a6e22e\">.FUNC<\/span> <span style=\"color:#66d9ef\">NearPt<\/span>\n<span style=\"color:#75715e\">;-----------------------------------------------------------\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  FUNCTION NearPt(pt1,pt2: Point; tol: INTEGER): BOOLEAN;\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  NearPt:=((ABS(pt1.h-pt2.h) &lt; tol) AND (ABS(pt1.v-pt2.v) &lt; tol));\n<\/span><span style=\"color:#75715e\">;\n<\/span><span 
style=\"color:#75715e\"><\/span>         <span style=\"color:#a6e22e\">MOVE.L<\/span>   (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">A0<\/span>                   <span style=\"color:#75715e\">;pop return addr\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">MOVE<\/span>     (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D0<\/span>                   <span style=\"color:#75715e\">;pop tolerance\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">MOVE.L<\/span>   (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D1<\/span>                   <span style=\"color:#75715e\">;pop pt2\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">MOVE.L<\/span>   (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                   <span style=\"color:#75715e\">;pop pt1\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">CLR.B<\/span>    (<span style=\"color:#66d9ef\">SP<\/span>)                       <span style=\"color:#75715e\">;assume result FALSE\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">SUB.W<\/span>    <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                      <span style=\"color:#75715e\">;calc delta horiz\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">BGE.S<\/span>    <span style=\"color:#66d9ef\">DHPOS<\/span>                      <span style=\"color:#75715e\">;continue if dh positive\n<\/span><span style=\"color:#75715e\"><\/span>         <span 
style=\"color:#66d9ef\">NEG.W<\/span>    <span style=\"color:#66d9ef\">D2<\/span>                         <span style=\"color:#75715e\">;else negate for abs value\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">DHPOS<\/span>    <span style=\"color:#66d9ef\">CMP.W<\/span>    <span style=\"color:#66d9ef\">D0<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                      <span style=\"color:#75715e\">;is ABS(dh) &lt; tol ?\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">BGE.S<\/span>    <span style=\"color:#66d9ef\">FALSE<\/span>                      <span style=\"color:#75715e\">;no, return false\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">SWAP<\/span>     <span style=\"color:#66d9ef\">D1<\/span>                         <span style=\"color:#75715e\">;get pt2.v\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">SWAP<\/span>     <span style=\"color:#66d9ef\">D2<\/span>                         <span style=\"color:#75715e\">;get pt1.v\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">SUB.W<\/span>    <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                      <span style=\"color:#75715e\">;calc delta vert\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">BGE.S<\/span>    <span style=\"color:#66d9ef\">DVPOS<\/span>                      <span style=\"color:#75715e\">;continue if dv positive\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">NEG.W<\/span>    <span style=\"color:#66d9ef\">D2<\/span>                         <span style=\"color:#75715e\">;else negate for abs value\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">DVPOS<\/span>    <span style=\"color:#66d9ef\">CMP.W<\/span>    <span style=\"color:#66d9ef\">D0<\/span>,<span 
style=\"color:#66d9ef\">D2<\/span>                      <span style=\"color:#75715e\">;is ABS(dv) &lt; tol ?\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">BGE.S<\/span>    <span style=\"color:#66d9ef\">FALSE<\/span>                      <span style=\"color:#75715e\">;no, return FALSE\n<\/span><span style=\"color:#75715e\"><\/span>         <span style=\"color:#66d9ef\">MOVE.B<\/span>   <span style=\"color:#75715e\">#1,(SP)                    ;result := TRUE\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">FALSE<\/span>    <span style=\"color:#66d9ef\">JMP<\/span>      (<span style=\"color:#66d9ef\">A0<\/span>)                       <span style=\"color:#75715e\">;and return\n<\/span><\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p><code>NearPt<\/code> returns true if two points are &ldquo;close enough,&rdquo; even if they are not truly equal. Since points use integral values, this code is not for handling floating point error, but for human imprecision. For instance, if a user is trying to close a polygon by clicking on a previous point, the code allows the polygon to be closed if the user clicks within a few pixels of an earlier point. Similarly, the mouse may slip a small distance during a double click. While this is not particularly performance-critical code, as it is called a relatively small number of times, it is code that is unlikely to need to change.<\/p>\n<p>As is typical for his assembly code, the function is documented with its Pascal declaration. This function is further documented with a mathematical definition; few functions or procedures warranted descriptive comments. Each line is commented semantically. Almost every assembly line in <code>PaintAsm.a<\/code> is similarly commented.<\/p>\n<p>In contrast, <code>HVConstrain<\/code> is a Pascal procedure. We chose it as an example because it is relatively short and self-contained. 
This procedure is used to constrain or trap an anchor while drawing to a particular direction or a 45 degree angle. For example, the user can hold down shift while drawing a rectangle to force the drawing of a square. Or, while drawing a line, force it to be parallel with the edges of the screen.<\/p>\n<div class=\"highlight\" linesno=\"table\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\">\t<span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#a6e22e\">HVConstrain<\/span>(VAR newPt: Point);\n\tVAR dh,dv: <span style=\"color:#66d9ef\">INTEGER<\/span>;\n\t<span style=\"color:#66d9ef\">BEGIN<\/span>\n\t  <span style=\"color:#66d9ef\">IF<\/span> shiftFlag <span style=\"color:#66d9ef\">THEN<\/span>  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> constrain to horiz <span style=\"color:#66d9ef\">or<\/span> vert <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t    <span style=\"color:#66d9ef\">BEGIN<\/span>\n\t      <span style=\"color:#66d9ef\">IF<\/span> hConstrain <span style=\"color:#f92672\">AND<\/span> vConstrain <span style=\"color:#66d9ef\">THEN<\/span>  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> still chosing direction <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t        <span style=\"color:#66d9ef\">BEGIN<\/span>\n\t          dh := <span style=\"color:#66d9ef\">ABS<\/span>(newPt.h<span style=\"color:#f92672\">-<\/span>ptConstrain.h);\n\t          dv := <span style=\"color:#66d9ef\">ABS<\/span>(newPt.v<span style=\"color:#f92672\">-<\/span>ptConstrain.v);\n\t          <span style=\"color:#66d9ef\">IF<\/span> (dh <span style=\"color:#f92672\">&gt;<\/span> dv) <span style=\"color:#f92672\">AND<\/span> (dh <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#66d9ef\">THEN<\/span> vConstrain := <span 
style=\"color:#66d9ef\">FALSE<\/span>;\n\t          <span style=\"color:#66d9ef\">IF<\/span> (dv <span style=\"color:#f92672\">&gt;<\/span> dh) <span style=\"color:#f92672\">AND<\/span> (dv <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#66d9ef\">THEN<\/span> hConstrain := <span style=\"color:#66d9ef\">FALSE<\/span>;\n\t        <span style=\"color:#66d9ef\">END<\/span>;\n\t      <span style=\"color:#66d9ef\">IF<\/span> hConstrain <span style=\"color:#66d9ef\">THEN<\/span> newPt.v := ptConstrain.v;  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> horiz <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t      <span style=\"color:#66d9ef\">IF<\/span> vConstrain <span style=\"color:#66d9ef\">THEN<\/span> newPt.h := ptConstrain.h;  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> vert  <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t    <span style=\"color:#66d9ef\">END<\/span>;\n\t<span style=\"color:#66d9ef\">END<\/span>;\n<\/code><\/pre><\/div><p>Atkinson has a very consistent naming style for his variables and the code is usually highly readable. In contrast to his assembly code, comments are rare and terse in the Pascal code, although still focused on explaining the semantic purpose of the line.<\/p>\n<p><code>HVConstrain<\/code> uses one global variable, <code>shiftFlag<\/code>, to track the user mode, and three global variables to store state: <code>hConstrain<\/code>, <code>vConstrain<\/code>, and <code>ptConstrain<\/code>. Although these are global variables, only one other procedure accesses the latter three directly: <code>InitConstrain<\/code>. The three variables are commented as belonging to <code>HVConstrain<\/code>. The design allows a developer to use the variables without proper initialization or to modify them accidentally. 
Although this is a violation of the design principle of encapsulation, we believe this was an effective trade-off between code complexity and memory resources.<\/p>\n<h3 id=\"message-loop\">Message Loop<\/h3>\n<p>MacPaint is an early event-driven program. The core of the program is:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\">\tREPEAT\n\t\t\t<span style=\"color:#960050;background-color:#1e0010\">[<\/span>...<span style=\"color:#960050;background-color:#1e0010\">]<\/span>\n\t\t    <span style=\"color:#66d9ef\">IF<\/span> GetNextEvent(everyEvent,theEvent) <span style=\"color:#66d9ef\">THEN<\/span> ProcessTheEvent;\n\t\t\t<span style=\"color:#960050;background-color:#1e0010\">[<\/span>...<span style=\"color:#960050;background-color:#1e0010\">]<\/span>\n\t<span style=\"color:#66d9ef\">UNTIL<\/span> quitFlag;\n<\/code><\/pre><\/div><p><code>GetNextEvent<\/code> is a Toolbox Event Manager function that fetches the next event, if it exists, from the event queue. <code>ProcessTheEvent<\/code> is an application procedure built around a long CASE statement that maps event locations to buttons or, more accurately, rectangular areas of the screen. <code>ProcessTheEvent<\/code> calls other application procedures that control tool-specific modes. <code>ProcessTheEvent<\/code> is sufficiently low-level that it has to measure the time since the last click to differentiate between single and double clicks. Since the user interface is fixed (windows cannot be moved), the code is tedious but easy to follow.<\/p>\n<h3 id=\"example-straight-line-tool\">Example: Straight Line Tool<\/h3>\n<p>The Straight Line tool is an exemplar of the software design and the mix of Pascal, assembly, and QuickDraw functionality. 
For the <code>StraightLine<\/code> procedure to be invoked, the user will have previously selected the straight line tool from the tool palette and then clicked (and held) the mouse within the content area. While in this procedure (the straight line &ldquo;mode&rdquo;), one end of the line will be anchored at the initial point (<code>startPt<\/code>) while the other end will follow the cursor until the user releases the mouse button. The mode-driven interface gives the user continuous feedback on what the eventual line will look like within the painting.<\/p>\n<div class=\"highlight\" linesnos=\"table\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\">\t<span style=\"color:#960050;background-color:#1e0010\">{$<\/span>S         <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t<span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#a6e22e\">StraightLine<\/span>;\n\tVAR newPt,oldPt,startPt: Point;\n\t    lineTop,lineBot: <span style=\"color:#66d9ef\">INTEGER<\/span>;\n\t<span style=\"color:#66d9ef\">BEGIN<\/span>\n\t  JamLine;\n\t  PinGridMouse(startPt);\n\t  oldPt.h := <span style=\"color:#ae81ff\">1000<\/span>;      <span style=\"color:#960050;background-color:#1e0010\">{<\/span> force first time <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t  REPEAT\n\t    PinGridMouse(newPt);\n\t    <span style=\"color:#66d9ef\">IF<\/span> shiftFlag <span style=\"color:#66d9ef\">THEN<\/span> Constrain(startPt,newPt,<span style=\"color:#66d9ef\">TRUE<\/span>);\n\t    <span style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#f92672\">NOT<\/span> EqualPt(newPt,oldPt) <span style=\"color:#66d9ef\">THEN<\/span>\n\t      <span style=\"color:#66d9ef\">BEGIN<\/span>\n\t        MainToAlt;        <span style=\"color:#960050;background-color:#1e0010\">{<\/span> erase old <span 
style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\t        AltBufLine(startPt,newPt,oldPt);\n\t        oldPt := newPt;\n\t      <span style=\"color:#66d9ef\">END<\/span>;\n\t  <span style=\"color:#66d9ef\">UNTIL<\/span> <span style=\"color:#f92672\">NOT<\/span> StillDown;\n\t<span style=\"color:#66d9ef\">END<\/span>;\n<\/code><\/pre><\/div><p>The first line <code>{$S      }<\/code> is a compiler directive stating that this procedure should live in the default main segment (for more about segments, see the section below <em>Allocation Failure and Segment Anti-Fragmentation<\/em>).<\/p>\n<p><code>JamLine<\/code> resets some QuickDraw state using <code>PenNormal<\/code> and then sets the pen&rsquo;s size, pattern, and mode (depending on which keys are pressed) to the current palette settings.<\/p>\n<p><code>PinGridMouse<\/code> sets the passed-in variable to the current position of the mouse as modified by various modes (e.g. snap to grid, fat bits). Similarly, <code>Constrain<\/code> sets <code>newPt<\/code> to a 45-degree constrained point value if the user is holding down the shift key.<\/p>\n<p>The <code>REPEAT<\/code> loop runs until <code>StillDown<\/code> returns false. <code>StillDown<\/code> is a Toolbox Event Manager function that will return false even if the user released and then quickly pressed the mouse button again between invocations.<\/p>\n<p>If the end point has moved since the previous pass (<code>IF NOT EqualPt<\/code>), the main buffer is copied into the alt(ernate) buffer. (QuickDraw uses integral values for a <code>Point<\/code>&rsquo;s coordinates, so <code>EqualPt<\/code> does not need a tolerance parameter.) <code>MainToAlt<\/code> calls <code>BufToBuf<\/code>, which is an assembly routine that includes the same <code>MOVEM<\/code> optimization as <code>BufToScrn<\/code> (see section <em>Fast Buffer to Screen Copy<\/em>). 
<code>AltBufLine<\/code> writes, to the alt buffer, the line from <code>startPt<\/code> to <code>newPt<\/code> using the <code>MoveTo<\/code> and <code>LineTo<\/code> QuickDraw routines. The contents of the alt buffer are then sent to the screen buffer using a bounding box that includes <code>oldPt<\/code>, thus eliminating (by redrawing) any previous line sent to the screen, but also reducing the amount of written data. (<code>BandToScrn<\/code> also takes care to hide and show the cursor.)<\/p>\n<p>The <code>lineTop<\/code> and <code>lineBot<\/code> variables are declared but unused. Our assumption is that part of the functionality of <code>AltBufLine<\/code> was originally part of <code>StraightLine<\/code>, as <code>AltBufLine<\/code> is the only other block to use variables with the same names. The program also uses the constant <code>lineTop<\/code> to describe the top of the line size palette window, so the compiler must not have considered duplicate <code>lineTop<\/code> declarations an error.<\/p>\n<h2 id=\"interesting-algorithms-and-designs\">Interesting Algorithms and Designs<\/h2>\n<h3 id=\"fast-buffer-to-screen-copy\">Fast Buffer to Screen Copy<\/h3>\n<p>In an interview with the Macintosh team (Lemmon 1984), the team explains their process for optimizing code size and processing time. After arguing that code must first be made correct before being made fast, Atkinson brings up register allocation. He states:<\/p>\n<blockquote>\n<p>This little baby, the 68000, has sixteen 32-bit registers sitting there, and the way you get performance out of that is to keep them full. Keep the registers full of important stuff all the time. That&rsquo;s the way you make this processor sing.<\/p>\n<\/blockquote>\n<p>The buffer copying code, which is performance critical, illustrates this technique.<\/p>\n<p>MacPaint uses two off-screen buffers for rendering which are then copied to the screen buffer for display. 
Within the Pascal code, the two buffers&rsquo; storage is declared as:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\">   mainBuf:            <span style=\"color:#66d9ef\">ARRAY<\/span><span style=\"color:#960050;background-color:#1e0010\">[<\/span><span style=\"color:#ae81ff\">0.<\/span>.<span style=\"color:#ae81ff\">239<\/span>,<span style=\"color:#ae81ff\">0.<\/span>.<span style=\"color:#ae81ff\">12<\/span><span style=\"color:#960050;background-color:#1e0010\">]<\/span> <span style=\"color:#66d9ef\">OF<\/span> LongInt;     \n   altBuf:             <span style=\"color:#66d9ef\">ARRAY<\/span><span style=\"color:#960050;background-color:#1e0010\">[<\/span><span style=\"color:#ae81ff\">0.<\/span>.<span style=\"color:#ae81ff\">239<\/span>,<span style=\"color:#ae81ff\">0.<\/span>.<span style=\"color:#ae81ff\">12<\/span><span style=\"color:#960050;background-color:#1e0010\">]<\/span> <span style=\"color:#66d9ef\">OF<\/span> LongInt;  \n<\/code><\/pre><\/div><p>In Pascal, range definitions are inclusive, so each buffer contains 240 rows of 13 LongInts each. The content area is fixed at 416 pixels by 240 pixels. Since a LongInt contains 32 bits, each row stores 416 single-bit pixels.<\/p>\n<p>While MacPaint&rsquo;s user interface looks well-proportioned to the Macintosh&rsquo;s display, Atkinson could have designed the content area to be larger or adjusted the design for a portrait arrangement. However, the need to keep registers full suggests the technical reason for the given layout. 
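<\/p>\n<p>The same declaration also fixes the byte counts. As a worked check (these figures are derived here from the declaration and are not quoted from the source code), a LongInt occupies 4 bytes, so:<\/p>\n<p>\\[ 13 \\times 4 = 52 \\text{ bytes per row}, \\qquad 240 \\times 52 = 12{,}480 \\text{ bytes per buffer} \\]<\/p>\n<p>Thirteen is also the number of registers left over for pixel data once two address registers hold the source and destination pointers and A7 serves as the stack pointer: \\(16 - 2 - 1 = 13\\).<\/p>\n<p>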
If we look at the <code>BufToScrn<\/code> assembly code:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">33\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-asm\" data-lang=\"asm\">        <span style=\"color:#a6e22e\">.PROC<\/span> <span style=\"color:#66d9ef\">BufToScrn<\/span>,<span style=\"color:#ae81ff\">2<\/span>\n<span style=\"color:#75715e\">;--------------------------------------------------------\n<\/span><span style=\"color:#75715e\">;\n<\/span><span 
style=\"color:#75715e\">;  PROCEDURE BufToScrn(bufPtr,scrnPtr: Ptr; top,bottom: INTEGER);\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  top and bottom coords are relative to start of buffer\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  cursor has already been hidden.\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#a6e22e\">MOVE.L<\/span>    (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D0<\/span>                 <span style=\"color:#75715e\">;POP RETURN ADDR\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE<\/span>      (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D1<\/span>                 <span style=\"color:#75715e\">;POP BOTTOM\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE<\/span>      (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                 <span style=\"color:#75715e\">;POP TOP\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE.L<\/span>    (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">A1<\/span>                 <span style=\"color:#75715e\">;POP SCRNPTR\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE.L<\/span>    (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">A0<\/span>                 <span style=\"color:#75715e\">;POP BUFPTR\n<\/span><span style=\"color:#75715e\"><\/span>          
<span style=\"color:#66d9ef\">MOVE.L<\/span>    <span style=\"color:#66d9ef\">D0<\/span>,-(<span style=\"color:#66d9ef\">SP<\/span>)                 <span style=\"color:#75715e\">;PUSH RETURN ADDR\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVEM.L<\/span>   <span style=\"color:#66d9ef\">D3-D7<\/span><span style=\"color:#960050;background-color:#1e0010\">\/<\/span><span style=\"color:#66d9ef\">A2-A6<\/span>,-(<span style=\"color:#66d9ef\">SP<\/span>)        <span style=\"color:#75715e\">;SAVE REGS\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">SUB<\/span>       <span style=\"color:#66d9ef\">D2<\/span>,<span style=\"color:#66d9ef\">D1<\/span>                    <span style=\"color:#75715e\">;CALC HEIGHT\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">BLE.S<\/span>     <span style=\"color:#66d9ef\">GOHOME<\/span>                   <span style=\"color:#75715e\">;QUIT IF COUNT &lt;= 0\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE<\/span>      <span style=\"color:#66d9ef\">D1<\/span>,-(<span style=\"color:#66d9ef\">SP<\/span>)                 <span style=\"color:#75715e\">;INIT ROW COUNT\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVE<\/span>      <span style=\"color:#66d9ef\">D2<\/span>,<span style=\"color:#66d9ef\">D1<\/span>                    <span style=\"color:#75715e\">;COPY TOP COORD\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MULU<\/span>      <span style=\"color:#75715e\">#52,D1                   ;CALC SRC OFFSET\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">ADD.L<\/span>     <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">A0<\/span>                    <span style=\"color:#75715e\">;OFFSET SRCPTR\n<\/span><span style=\"color:#75715e\"><\/span>     
     <span style=\"color:#66d9ef\">MULU<\/span>      <span style=\"color:#66d9ef\">screenRow<\/span>,<span style=\"color:#66d9ef\">D2<\/span>             <span style=\"color:#75715e\">;CALC SCRN OFFSET\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">ADD.L<\/span>     <span style=\"color:#66d9ef\">D2<\/span>,<span style=\"color:#66d9ef\">A1<\/span>                    <span style=\"color:#75715e\">;OFFSET SCRNPTR\n<\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">NXTROW<\/span>    <span style=\"color:#66d9ef\">MOVEM.L<\/span>   (<span style=\"color:#66d9ef\">A0<\/span>),<span style=\"color:#66d9ef\">D0-D7<\/span><span style=\"color:#960050;background-color:#1e0010\">\/<\/span><span style=\"color:#66d9ef\">A2-A6<\/span>         <span style=\"color:#75715e\">;SUCK UP 13 LONGS FROM BUF\n<\/span><\/span><span style=\"display:block;width:100%;background-color:#3c3d38\"><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">MOVEM.L<\/span>   <span style=\"color:#66d9ef\">D0-D7<\/span><span style=\"color:#960050;background-color:#1e0010\">\/<\/span><span style=\"color:#66d9ef\">A2-A6<\/span>,(<span style=\"color:#66d9ef\">A1<\/span>)         <span style=\"color:#75715e\">;SPIT THEM OUT TO SCREEN\n<\/span><\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">ADD<\/span>       <span style=\"color:#75715e\">#52,A0                   ;BUMP SRCPTR\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">ADD<\/span>       <span style=\"color:#66d9ef\">screenRow<\/span>,<span style=\"color:#66d9ef\">A1<\/span>             <span style=\"color:#75715e\">;BUMP SCREENPTR\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">SUB<\/span>       <span style=\"color:#75715e\">#1,(SP)                  ;DECREMENT ROWCOUNT\n<\/span><span 
style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">BNE<\/span>       <span style=\"color:#66d9ef\">NXTROW<\/span>                   <span style=\"color:#75715e\">;LOOP 240 ROWS\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">TST<\/span>       (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>                    <span style=\"color:#75715e\">;POP ROW COUNT\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">GOHOME<\/span>    <span style=\"color:#66d9ef\">MOVEM.L<\/span>   (<span style=\"color:#66d9ef\">SP<\/span>)<span style=\"color:#960050;background-color:#1e0010\">+<\/span>,<span style=\"color:#66d9ef\">D3-D7<\/span><span style=\"color:#960050;background-color:#1e0010\">\/<\/span><span style=\"color:#66d9ef\">A2-A6<\/span>        <span style=\"color:#75715e\">;RESTORE REGS\n<\/span><span style=\"color:#75715e\"><\/span>          <span style=\"color:#66d9ef\">RTS<\/span>                                <span style=\"color:#75715e\">;AND RETURN\n<\/span><\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>The <code>NXTROW<\/code> loop is executed once per row (240 times for the full content area), with each execution copying 13 LongInts from the buffer. The Motorola 68k supports a <a href=\"https:\/\/www.looksgoodworkswell.com\/elegance-of-macpaint-code\/\">&ldquo;move multiple&rdquo; (MOVEM) instruction<\/a> that takes a list of registers as the source or destination; here the list names the 13 registers that are not already holding the two pointers or the stack pointer. (The <code>.L<\/code> informs the assembler we are copying long values.) According to (Motorola 1993), a MOVEM.L instruction requires \\(12 + 8n\\) clock periods to load \\(n\\) long values into registers from an address stored in an A register and \\(8 + 8n\\) clock periods to store \\(n\\) registers to a memory location (Table 9-16); the \\(12 + 4n\\) figure in the same table applies to word-sized loads. Thus, each row requires \\(20 + 16n\\) clock periods (where \\(n\\) is the number of registers) or, with \\(n = 13\\), \\(116 + 112 = 228\\) clock periods. 
In contrast, if the transfers were performed via individual MOVE.L instructions, each of the 13 loads and 13 stores requires 12 clock periods, or \\(26 \\times 12 = 312\\) clock periods per row (Table 9-18). By populating all (relevant) registers and so maximizing the throughput of MOVEM, each row copied saves 84 clock periods, or 20,160 clock periods across all 240 rows.<\/p>\n<h3 id=\"bucket-fill-seed-fill\">Bucket Fill (Seed Fill)<\/h3>\n<p>Bucket Fill is an algorithm that, starting from a given pixel, travels along all adjacent pixels sharing a base color and transforms those pixels to a target color. MacPaint implements a variant of this function by filling the space with a pattern, rather than just a single color. As the boundaries can be arbitrarily complex, this can be an expensive computation. In the worst case, filling an empty screen, this requires 416 by 240 pixel checks or 99,840 checks in total. Through various tricks, Atkinson&rsquo;s implementation reduces the amount of work and makes the operation feel fast.<\/p>\n<p>(This algorithm is not used when the boundaries are known, such as painting a filled-in rectangle. QuickDraw supported drawing filled-in polygons directly.)<\/p>\n<p>By 1983, several researchers had investigated and published algorithms for bucket filling. <em>SuperPaint<\/em>, in the mid-1970s, was likely the first implementation of the idea (Glassner 2001); (Lieberman 1978) and (Smith 1979) represent early research papers. The MacPaint algorithm is similar to Lieberman&rsquo;s algorithm as the algorithm travels along vertical and horizontal pixel paths and supports filling in with patterns as well as colors. MacPaint&rsquo;s code has been optimized for the restrictions in the domain, i.e. one-bit bitmaps.<\/p>\n<p>Within the source code, the Bucket Fill tool is called <code>SeedFill<\/code>. 
The <code>SeedFill<\/code> procedure contains the top-level business logic for the tool:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\"><span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#a6e22e\">SeedFill<\/span>(startPt: Point);\nVAR firstBlack: <span style=\"color:#66d9ef\">BOOLEAN<\/span>;\n<span style=\"color:#66d9ef\">BEGIN<\/span>\n  firstBlack := PixelTrue(startPt.h,startPt.v,mainBits);\n  CalcMask(mainBits,altBits,altBits.bounds,startPt,firstBlack,<span style=\"color:#66d9ef\">FALSE<\/span>);\n\n  SetPortBits(altBits);\n  PenPat(thePat);\n  <span style=\"color:#66d9ef\">IF<\/span> firstBlack <span style=\"color:#66d9ef\">THEN<\/span>\n    <span style=\"color:#66d9ef\">BEGIN<\/span>\n      PenMode(patBic);\n      PaintRect(altBits.bounds);\n      InvertBuf(<span style=\"color:#960050;background-color:#1e0010\">@<\/span>altBuf);\n      BufAndBuf(<span 
style=\"color:#960050;background-color:#1e0010\">@<\/span>mainBuf,<span style=\"color:#960050;background-color:#1e0010\">@<\/span>altBuf);\n    <span style=\"color:#66d9ef\">END<\/span>\n  <span style=\"color:#a6e22e\">ELSE<\/span>\n    <span style=\"color:#a6e22e\">BEGIN<\/span>\n      <span style=\"color:#a6e22e\">PenMode<\/span><span style=\"color:#960050;background-color:#1e0010\">(<\/span><span style=\"color:#a6e22e\">notPatBic<\/span><span style=\"color:#960050;background-color:#1e0010\">)<\/span>;\n      PaintRect(altBits.bounds);\n      BufOrBuf(<span style=\"color:#960050;background-color:#1e0010\">@<\/span>mainBuf,<span style=\"color:#960050;background-color:#1e0010\">@<\/span>altBuf);\n    <span style=\"color:#66d9ef\">END<\/span>;\n\n  PenNormal;\n  AltToScrn;\n  clickTime := TickCount;\n  GetMouse(clickLoc);\n  LocalToGlobal(clickLoc);\n  killDouble := <span style=\"color:#66d9ef\">TRUE<\/span>;\n<span style=\"color:#66d9ef\">END<\/span>;\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>Based on the position of the mouse click (<code>startPt<\/code>) and the current state of the image (<code>mainBits<\/code>), the procedure determines if the clicked pixel is black (<code>firstBlack<\/code> is true) or white. The assembly <code>CalcMask<\/code> procedure traverses the image in <code>mainBits<\/code>, starting from <code>startPt<\/code>, and computes a mask for the eventual pattern, storing this in <code>altBits<\/code>.<\/p>\n<p>After the boundaries for the fill are determined by <code>CalcMask<\/code>, the graphics port is set to the alternate buffer and the pattern is set to the currently selected pattern. The pattern transfer mode (<code>patBic<\/code> or <code>notPatBic<\/code>) is set to effectively erase the destination&rsquo;s content. The pattern is then filled within the bounding box, inverting the color depending on whether the object&rsquo;s boundary is a black or white pixel. 
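<\/p>\n<p>The two branches can be read as Boolean algebra over whole buffers. The following is a sketch under assumptions that the listing suggests but does not state: \\(M\\) is the main buffer, \\(K\\) is the mask that <code>CalcMask<\/code> leaves in the alternate buffer (taken to be 1 inside the region to fill), \\(P\\) is the selected pattern tiled across the buffer, a 1 bit is a black pixel, and <code>BufAndBuf<\/code> and <code>BufOrBuf<\/code> combine the main buffer into the alternate buffer. In QuickDraw, <code>patBic<\/code> clears destination bits where the pattern is black and <code>notPatBic<\/code> clears them where the pattern is white, so the alternate buffer \\(A\\) would end up as:<\/p>\n<p>\\[ \\text{black seed:}\\quad A = M \\land \\lnot (K \\land \\lnot P) = M \\land (\\lnot K \\lor P) \\]<\/p>\n<p>\\[ \\text{white seed:}\\quad A = M \\lor (K \\land P) \\]<\/p>\n<p>Inside the region (\\(K = 1\\)), \\(M\\) is 1 for a black seed and 0 for a white seed, so both expressions reduce to \\(A = P\\). Outside the region (\\(K = 0\\)), both reduce to \\(A = M\\). Under these assumptions, either branch leaves the pattern inside the mask and the original image everywhere else.<\/p>\n<p>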
Afterwards, the alternate buffer is sent to the screen and the procedure performs some cleanup. If the result is not what the user wishes, they can undo the action, whereupon the screen switches to the other buffer which contains the content prior to the fill.<\/p>\n<p><code>CalcMask<\/code> contains the complicated parts of the program. <code>CalcMask<\/code> is also shared with the Lasso tool, which similarly needs to match arbitrary shapes.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span 
style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">33\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">34\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">35\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">36\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">37\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">38\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">39\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">40\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">41\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">42\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">43\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">44\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">45\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">46\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">47\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">48\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">49\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">50\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">51\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">52\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">53\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">54\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">55\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">56\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">57\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">58\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">59\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">60\n<\/span><span 
style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">61\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">62\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">63\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">64\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">65\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">66\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">67\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">68\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">69\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">70\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">71\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">72\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">73\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">74\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">75\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">76\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">77\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">78\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">79\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">80\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">81\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">82\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">83\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">84\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">85\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">86\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">87\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">88\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">89\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">90\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">91\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">92\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">93\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">94\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\"><span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#a6e22e\">CalcMask<\/span>(<span style=\"color:#f92672\">*<\/span> srcBits,dstBits: BitMap;\n                     limitRect: Rect;\n                     firstPt: Point;\n                     firstBlack: BOOLEAN;\n                     invertDst: BOOLEAN <span style=\"color:#f92672\">*<\/span>);\n\n<span style=\"color:#960050;background-color:#1e0010\">{<\/span> given data <span style=\"color:#f92672\">in<\/span> srcBits, compute mask into dstBits <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n<span style=\"color:#960050;background-color:#1e0010\">{<\/span> WARNING: srcBits <span style=\"color:#f92672\">and<\/span> dstBits must be the wired<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">in<\/span> size <span 
style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\nLABEL <span style=\"color:#ae81ff\">9<\/span>;\n\nCONST row  <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">52<\/span>;         <span style=\"color:#960050;background-color:#1e0010\">{<\/span> rowbytes <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n      queueSize <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">400<\/span>;   <span style=\"color:#960050;background-color:#1e0010\">{<\/span> how many entries big <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\nVAR aQueue,bQueue: <span style=\"color:#66d9ef\">ARRAY<\/span><span style=\"color:#960050;background-color:#1e0010\">[<\/span><span style=\"color:#ae81ff\">0.<\/span>.queueSize<span style=\"color:#960050;background-color:#1e0010\">]<\/span> <span style=\"color:#66d9ef\">OF<\/span> QueueEntry;\n    readPtr:   <span style=\"color:#66d9ef\">Entry<\/span><span style=\"color:#a6e22e\">Ptr<\/span>;\n    writePtr:  <span style=\"color:#66d9ef\">Entry<\/span><span style=\"color:#a6e22e\">Ptr<\/span>;\n    donePtr:   <span style=\"color:#66d9ef\">Entry<\/span><span style=\"color:#a6e22e\">Ptr<\/span>;\n    dstStart,dstLimit,srcOffset: LongInt;\n    firstByte,firstMask: <span style=\"color:#66d9ef\">INTEGER<\/span>;\n    firstPtr: <span style=\"color:#960050;background-color:#1e0010\">^<\/span><span style=\"color:#66d9ef\">INTEGER<\/span>;\n    dh,dv:    <span style=\"color:#66d9ef\">INTEGER<\/span>;\n    leftByte,rightByte:     <span style=\"color:#66d9ef\">INTEGER<\/span>;\n    wordsWide,height: <span style=\"color:#66d9ef\">INTEGER<\/span>;\n\n<span style=\"color:#66d9ef\">BEGIN<\/span>\n  ZeroBuf(dstBits.baseAddr);\n\n  <span style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#f92672\">NOT<\/span> firstBlack <span style=\"color:#66d9ef\">THEN<\/span> InvertBuf(srcBits.baseAddr);\n\n  dh := <span style=\"color:#f92672\">-<\/span>dstBits.bounds.left;\n  dv := <span 
style=\"color:#f92672\">-<\/span>dstBits.bounds.top;\n  OffsetRect(limitRect,dh,dv);  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> make global <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n  firstPt.h := firstPt.h <span style=\"color:#f92672\">+<\/span> dh;\n  firstPt.v := firstPt.v <span style=\"color:#f92672\">+<\/span> dv;\n\n  limitRect.left := BitAnd(limitRect.left,<span style=\"color:#960050;background-color:#1e0010\">$<\/span>FFF0);       <span style=\"color:#960050;background-color:#1e0010\">{<\/span> round down to word <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n  limitRect.right := BitAnd(limitRect.right<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">15<\/span>,<span style=\"color:#960050;background-color:#1e0010\">$<\/span>FFF0);  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> round up to word <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\n  leftByte  := limitRect.left DIV <span style=\"color:#ae81ff\">8<\/span>;\n  rightByte := limitRect.right DIV <span style=\"color:#ae81ff\">8<\/span>;\n\n  dstStart := ORD(dstBits.baseAddr) <span style=\"color:#f92672\">+<\/span> limitRect.top <span style=\"color:#f92672\">*<\/span> row <span style=\"color:#f92672\">+<\/span> leftByte;\n  dstLimit := ORD(dstBits.baseAddr) <span style=\"color:#f92672\">+<\/span> limitRect.bottom <span style=\"color:#f92672\">*<\/span> row <span style=\"color:#f92672\">+<\/span> leftByte;\n  srcOffset := ORD(srcBits.BaseAddr) <span style=\"color:#f92672\">-<\/span> ORD(dstBits.baseAddr);\n\n  firstByte := <span style=\"color:#ae81ff\">2<\/span><span style=\"color:#f92672\">*<\/span>(firstPt.h DIV <span style=\"color:#ae81ff\">16<\/span>);\n  firstPtr := Pointer(ORD(srcBits.baseAddr) <span style=\"color:#f92672\">+<\/span> firstPt.v <span style=\"color:#f92672\">*<\/span> row <span style=\"color:#f92672\">+<\/span> firstByte);\n\n  firstMask := <span 
style=\"color:#ae81ff\">0<\/span>;\n  BitSet(<span style=\"color:#960050;background-color:#1e0010\">@<\/span>firstMask,BitAnd(firstPt.h,<span style=\"color:#ae81ff\">15<\/span>));\n  firstMask := VertSeed(firstMask,firstPtr<span style=\"color:#960050;background-color:#1e0010\">^<\/span>);\n  <span style=\"color:#66d9ef\">IF<\/span> firstMask <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#66d9ef\">THEN<\/span> <span style=\"color:#66d9ef\">GOTO<\/span> <span style=\"color:#ae81ff\">9<\/span>;\n\n  <span style=\"color:#960050;background-color:#1e0010\">{<\/span> Prime <span style=\"color:#e6db74\">&#34;aQueue&#34;<\/span> <span style=\"color:#f92672\">with<\/span> seed <span style=\"color:#66d9ef\">at<\/span> first point <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\n  writePtr := <span style=\"color:#960050;background-color:#1e0010\">@<\/span>aQueue;\n  <span style=\"color:#f92672\">WITH<\/span> writePtr<span style=\"color:#960050;background-color:#1e0010\">^<\/span> <span style=\"color:#66d9ef\">DO<\/span>\n    <span style=\"color:#66d9ef\">BEGIN<\/span>\n      addr := ORD(firstPtr)<span style=\"color:#f92672\">-<\/span>srcOffset;\n      bump := <span style=\"color:#ae81ff\">2<\/span>;\n      twoH := firstByte <span style=\"color:#f92672\">-<\/span> leftByte;\n      mask := firstMask;\n    <span style=\"color:#66d9ef\">END<\/span>;\n  writePtr := Pointer(ORD(writePtr) <span style=\"color:#f92672\">+<\/span> SizeOf(queueEntry));\n\n <span style=\"color:#960050;background-color:#1e0010\">{<\/span> Ping pong between the two Queues.  
Read each <span style=\"color:#66d9ef\">entry<\/span> <span style=\"color:#a6e22e\">from<\/span> <span style=\"color:#a6e22e\">one<\/span> <span style=\"color:#a6e22e\">queue<\/span> <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n <span style=\"color:#960050;background-color:#1e0010\">{<\/span> <span style=\"color:#a6e22e\">and<\/span> <span style=\"color:#a6e22e\">push<\/span> <span style=\"color:#a6e22e\">all<\/span> <span style=\"color:#a6e22e\">the<\/span> <span style=\"color:#a6e22e\">untried<\/span> <span style=\"color:#a6e22e\">ones<\/span> <span style=\"color:#a6e22e\">it<\/span> <span style=\"color:#a6e22e\">spawns<\/span> <span style=\"color:#a6e22e\">onto<\/span> <span style=\"color:#a6e22e\">the<\/span> <span style=\"color:#a6e22e\">other<\/span> <span style=\"color:#a6e22e\">queue<\/span>.     <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\n  <span style=\"color:#a6e22e\">REPEAT<\/span>\n\n    <span style=\"color:#a6e22e\">donePtr<\/span> := <span style=\"color:#a6e22e\">writePtr<\/span>;\n    readPtr := <span style=\"color:#960050;background-color:#1e0010\">@<\/span>aQueue;                         <span style=\"color:#960050;background-color:#1e0010\">{<\/span> read from queue A <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    writePtr := <span style=\"color:#960050;background-color:#1e0010\">@<\/span>bQueue;                        <span style=\"color:#960050;background-color:#1e0010\">{<\/span> push into queue B <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    MaskIt(dstStart,dstLimit,srcOffset,\n           rightByte<span style=\"color:#f92672\">-<\/span>leftByte,row,readPtr,donePtr,writePtr);\n\n    donePtr := writePtr;\n    readPtr := <span style=\"color:#960050;background-color:#1e0010\">@<\/span>bQueue;                         <span style=\"color:#960050;background-color:#1e0010\">{<\/span> read from queue B <span 
style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    writePtr := <span style=\"color:#960050;background-color:#1e0010\">@<\/span>aQueue;                        <span style=\"color:#960050;background-color:#1e0010\">{<\/span> push into queue A <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n    MaskIt(dstStart,dstLimit,srcOffset,\n           rightByte<span style=\"color:#f92672\">-<\/span>leftByte,row,readPtr,donePtr,writePtr);\n\n  <span style=\"color:#66d9ef\">UNTIL<\/span> writePtr <span style=\"color:#f92672\">=<\/span> <span style=\"color:#960050;background-color:#1e0010\">@<\/span>aQueue;                     <span style=\"color:#960050;background-color:#1e0010\">{<\/span> <span style=\"color:#66d9ef\">until<\/span> aQueue <span style=\"color:#66d9ef\">is<\/span> empty <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\n<span style=\"color:#ae81ff\">9<\/span>: <span style=\"color:#66d9ef\">IF<\/span> <span style=\"color:#f92672\">NOT<\/span> firstBlack <span style=\"color:#66d9ef\">THEN<\/span> InvertBuf(srcBits.baseAddr);   <span style=\"color:#960050;background-color:#1e0010\">{<\/span> restore src <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n\n  <span style=\"color:#66d9ef\">IF<\/span> invertDst <span style=\"color:#66d9ef\">THEN<\/span>\n    <span style=\"color:#66d9ef\">BEGIN<\/span>\n      height := limitRect.bottom <span style=\"color:#f92672\">-<\/span> limitRect.top;\n      wordsWide := (limitRect.right <span style=\"color:#f92672\">-<\/span> limitRect.left) DIV <span style=\"color:#ae81ff\">16<\/span>;\n      InvertChunk(dstStart,wordsWide,height);\n    <span style=\"color:#66d9ef\">END<\/span>;\n<span style=\"color:#66d9ef\">END<\/span>;\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>At a high-level, <code>CalcMask<\/code> is very similar to (Lieberman 1978)&rsquo;s fill algorithm. 
The initial point, the seed, is expanded left and right until a boundary is found, and then the program checks up and down. Lieberman&rsquo;s version uses a single queue, while Atkinson&rsquo;s uses two arrays as queues. Because one queue can be completely consumed before &lsquo;ping-ponging&rsquo; to the other, iterating over and appending items is simpler than with a single queue.<\/p>\n<p>The <code>MaskIt<\/code> procedure scans right and left from an initial position, and then seeks to expand up and down the image. Based on the movement direction, the algorithm looks for boundary conditions. Two lower-level algorithms implement much of the bit-level logic: <code>HSeed<\/code> and <code>VSeed<\/code>. Both are &ldquo;local functions&rdquo;: they are effectively inlined within the assembly rather than being callable (and requiring a stack frame) from Pascal. We will focus on <code>VSeed<\/code> because it is simpler.<\/p>\n<p>Conceptually, <code>VSeed<\/code> is trying to extrude the seed up (or down, the directions are equivalent) through the &ldquo;grill&rdquo; of the data. If a slice of the seed can flow into an open space of the grill, it will then also flow right and left within the grill to the edge or to a boundary. For clarity of visualization, in the table below we use a 3-bit word size. The one bits in the mask represent the set bits of the target color being &ldquo;pushed&rdquo; into the data. The data may be full of the target color, in which case we push into the entire &ldquo;word&rdquo;. However, we might encounter a wall, such as &ldquo;data empty&rdquo;, in which case we are unable to push to any location, resulting in zero. 
In the case of a hook, we flow both up and sideways into the open side (left or right).<\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>Open<\/th>\n<th>Blocked<\/th>\n<th>Same<\/th>\n<th>Offset<\/th>\n<th>Hook Left<\/th>\n<th>Hook Right<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Data<\/td>\n<td>111<\/td>\n<td>000<\/td>\n<td>010<\/td>\n<td>100<\/td>\n<td>110<\/td>\n<td>011<\/td>\n<\/tr>\n<tr>\n<td>Mask<\/td>\n<td>010<\/td>\n<td>010<\/td>\n<td>010<\/td>\n<td>010<\/td>\n<td>010<\/td>\n<td>010<\/td>\n<\/tr>\n<tr>\n<td>Result<\/td>\n<td>111<\/td>\n<td>000<\/td>\n<td>010<\/td>\n<td>000<\/td>\n<td>110<\/td>\n<td>011<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The assembly code implementation:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-asm\" data-lang=\"asm\"><span style=\"color:#75715e\">;---------------------------------------------------\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  LOCAL FUNCTION VSeed(mask,data: INTEGER ): INTEGER;\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  ENTER WITH:  D4 MASK\n<\/span><span style=\"color:#75715e\">;               D1 DATA\n<\/span><span style=\"color:#75715e\">;               D3 ALL ONES\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  RETURNS      D6 SEED\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\">;  CLOBBERS     D0,D1,D2\n<\/span><span style=\"color:#75715e\">;\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#a6e22e\">VSEED<\/span>   <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D4<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;COPY MASK\n<\/span><span style=\"color:#75715e\"><\/span>        <span 
style=\"color:#66d9ef\">AND<\/span>     <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;SEED := MASK AND DATA\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">BNE.S<\/span>   <span style=\"color:#66d9ef\">NOTZERO<\/span>                         <span style=\"color:#75715e\">;CONTINUE IF RESULT NONZERO\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">RTS<\/span>                                     <span style=\"color:#75715e\">;ELSE RETURN ZERO\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#a6e22e\">NOTZERO<\/span> <span style=\"color:#66d9ef\">CMP<\/span>     <span style=\"color:#66d9ef\">D3<\/span>,<span style=\"color:#66d9ef\">D1<\/span>                           <span style=\"color:#75715e\">;IS DATA ALL ONES ?\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">BNE.S<\/span>   <span style=\"color:#66d9ef\">NOTONES<\/span>                         <span style=\"color:#75715e\">;NO, CONTINUE\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;YES, RESULT = ONES\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">RTS<\/span>                                     <span style=\"color:#75715e\">;AND QUIT\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#a6e22e\">NOTONES<\/span> <span style=\"color:#66d9ef\">CMP<\/span>     <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;IS SEED = DATA ?\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">BNE.S<\/span>   <span 
style=\"color:#66d9ef\">RIGHTLP<\/span>                         <span style=\"color:#75715e\">;NO, CONTINUE\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">RTS<\/span>                                     <span style=\"color:#75715e\">;YES WE&#39;RE DONE\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#a6e22e\">RIGHTLP<\/span> <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                           <span style=\"color:#75715e\">;REMEMBER OLDSEED\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D0<\/span>                           <span style=\"color:#75715e\">;COPY SEED\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">LSR<\/span>     <span style=\"color:#75715e\">#1,D0                           ;SHIFT SEED RIGHT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">OR<\/span>      <span style=\"color:#66d9ef\">D0<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;LEAK SEED TO RIGHT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">AND<\/span>     <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;AND WITH DATA TO LIMIT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">CMP<\/span>     <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                           <span style=\"color:#75715e\">;IS SEED SAME AS OLD SEED ?\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">BNE<\/span>     <span style=\"color:#66d9ef\">RIGHTLP<\/span>                         <span 
style=\"color:#75715e\">;NO TRY SOME MORE\n<\/span><span style=\"color:#75715e\"><\/span>\n<span style=\"color:#a6e22e\">LEFTLP<\/span>  <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                           <span style=\"color:#75715e\">;REMEMBER OLDSEED\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">MOVE<\/span>    <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D0<\/span>                           <span style=\"color:#75715e\">;COPY SEED\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">ADD<\/span>     <span style=\"color:#66d9ef\">D0<\/span>,<span style=\"color:#66d9ef\">D0<\/span>                           <span style=\"color:#75715e\">;SHIFT SEED LEFT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">OR<\/span>      <span style=\"color:#66d9ef\">D0<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;LEAK SEED TO LEFT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">AND<\/span>     <span style=\"color:#66d9ef\">D1<\/span>,<span style=\"color:#66d9ef\">D6<\/span>                           <span style=\"color:#75715e\">;AND WITH DATA TO LIMIT\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">CMP<\/span>     <span style=\"color:#66d9ef\">D6<\/span>,<span style=\"color:#66d9ef\">D2<\/span>                           <span style=\"color:#75715e\">;IS SEED SAME AS OLD SEED ?\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">BNE<\/span>     <span style=\"color:#66d9ef\">LEFTLP<\/span>                          <span style=\"color:#75715e\">;NO TRY SOME MORE\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">RTS<\/span>                                     <span 
style=\"color:#75715e\">;AND RETURN\n<\/span><\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>The first half of <code>VSeed<\/code> is equivalent to the pseudo-code:<\/p>\n<pre tabindex=\"0\"><code>seed = mask bitand data\nif seed = 0: return 0\nif data = -1 \/* data is all ones *\/: return -1\nif seed = data: return seed\n<\/code><\/pre><p>The last half performs iterative bit-twiddling to flow the seed into adjoining spaces of data.<\/p>\n<p>The test of the seed against the earlier data also demonstrates the value of &lsquo;playing the odds.&rsquo; Although boundaries may be arbitrary, they are often regular in shape &ndash; e.g. a boundary being a vertical line. Expensive computations can be avoided if a relatively simple check is made first. Atkinson relates this technique to a similar domain in (Lemmons 1984, pg 76):<\/p>\n<blockquote>\n<p>So play your odds. People draw characters in OR mode a whole lot, and OR mode is about twice as fast as the other modes, so 95 percent of all characters are drawn in OR mode. Statistical measuring of the use of the thing allows you to get much more performance on your average throughput than you can if you don&rsquo;t go back and measure.<\/p>\n<\/blockquote>\n<p>In cases where the fillable area is open, or edges are regular, the algorithm can effectively process 32 pixels at a time, with much higher throughput than checking each individual pixel.<\/p>\n<h3 id=\"fatbits\">FatBits<\/h3>\n<p>One of the &ldquo;Goodies&rdquo; &ndash; additional MacPaint features selectable via the menubar &ndash; FatBits is a mode that magnifies the image. When in this mode, a small picture-in-picture gives the artist the context of the full image while the rest of the content area expands to make it easier to manipulate individual pixels (below). 
Originally MacPaint only allowed the Pencil tool to be used in FatBits mode, but Atkinson extended support to all the tools (Young 1985, pg 315).<\/p>\n<figure><img src=\"MacPaint-1.5-WithContentFatBits.png\"\n         alt=\"Screenshot of MacPaint 1.5 with magnified content\" width=\"100%\"\/><figcaption>\n            <p>MacPaint 1.5 in FatBits Mode (1985)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>As suggested by the title of an early manual <em>Inside MacPaint: Sailing through the Sea of FatBits on a Single-Pixel Raft<\/em>, FatBits was a killer feature for MacPaint. For Susan Kare, Apple&rsquo;s in-house graphic designer, the mode accelerated the development of Macintosh&rsquo;s small bitmap icons where every pixel mattered.<\/p>\n<p>Marketing pushed to rename &ldquo;FatBits&rdquo; to &ldquo;Magnify&rdquo;, but Atkinson won the argument by contending that the name gave the program some personality (Young 1985).<\/p>\n<p>FatBits is restricted to a single zoom level and, within the code, is handled via many special-cased changes to the input and output. The single zoom level restriction was due to the limited hardware; an arbitrary scaling factor would have required too much CPU (Atkinson 2010, pg 10). The contemporary PostScript program supported arbitrary scaling via a transformation matrix but operated with looser performance requirements and greater hardware capabilities.<\/p>\n<p>While FatBits is an important and impactful feature, the test <code>IF fatFlag<\/code> appears 36 times in the Pascal code. A cross-cutting feature, it requires special handling for input coordinate mapping, screen rendering, and buffer manipulation. Thus, we consider it an expensive feature because it violates the open-closed design principle &mdash; existing code needed to be modified to support it and future tools need special-cased code to support the mode. Functional composition is used to reduce the impact in certain code paths (e.g.
<code>GetFatMouse<\/code> calls <code>GetMouse<\/code> and internally performs any necessary conversion), but there are too few abstractions overall to keep the cross-cutting nature of the feature constrained.<\/p>\n<h3 id=\"allocation-failure-and-segment-anti-fragmentation\">Allocation Failure and Segment Anti-Fragmentation<\/h3>\n<p>As a way to work within the limited memory of the Macintosh, developers could divide their application code into multiple segments. A segment, such as one holding printing routines, could be unloaded when not needed (Inside-Vol2 1985, pg II-55). In the Pascal code, the compiler directive <code>{$S SegPrint }<\/code> places the procedure or function into the <code>SegPrint<\/code> segment. An empty name (e.g., <code>{$S       }<\/code>) denotes the default, main segment. In the assembly code, <code>.SEG<\/code> is used.<\/p>\n<p>Other than the main segment, MacPaint has the following eleven segments: SegBrush, SegFlip, SegHelp, SegInit, SegPage, SetPaste, SegPatEdit, SegPrint, SegScrap, SegSym, and SegUpdate.<\/p>\n<p>Within the main event loop, each of these segments is unloaded after each iteration. Per (Inside-Vol2 1985, II-57), this is recommended practice as unloading a segment that is not loaded is a no-op and presumably cheap.<\/p>\n<p>Although a program may have sufficient free space to work, the layout of allocated memory might not leave a block large enough for an allocation to succeed. MacPaint&rsquo;s design was robust to a certain level of memory fragmentation.<\/p>\n<p>Within the main event loop, the code tests whether <code>heapJam<\/code> is set and, if it is, calls the <code>MaxMem<\/code> system routine. This routine compacts and frees unused memory, acting as a kind of garbage collector and compactor.
Once fragmentation is reduced, the next memory allocation is likely to succeed.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">5\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ada\" data-lang=\"ada\">    <span style=\"color:#66d9ef\">IF<\/span> heapJam <span style=\"color:#66d9ef\">THEN<\/span>\n      <span style=\"color:#66d9ef\">BEGIN<\/span>\n        tempLong := MaxMem(tempLong);   <span style=\"color:#960050;background-color:#1e0010\">{<\/span> purge heap <span style=\"color:#960050;background-color:#1e0010\">}<\/span>\n        heapJam := <span style=\"color:#66d9ef\">FALSE<\/span>;\n      <span style=\"color:#66d9ef\">END<\/span>;\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>MacPaint&rsquo;s implementation largely avoids dynamic memory allocation, but there are eight calls to <code>NewHandle<\/code> (which allocates memory). Of these, only <code>AllocMask<\/code> tests the return value of <code>NewHandle<\/code> to verify it is non-Nil, setting the <code>heapJam<\/code> global variable when the allocation has failed.
<code>AllocMask<\/code> may be the only path that checks for a memory allocation failure because of its unusually large allocation request. <code>AllocMask<\/code> asks for 12,480 bytes to store a mask, which is nearly 10% of the Macintosh&rsquo;s 128K of total memory. Other than some calls that are used for initialization, the next largest call to <code>NewHandle<\/code> is for 3,024 bytes and the rest are below 100 bytes. As these other calls are much less likely to fail, it is perhaps acceptable that they lacked similar error detection.<\/p>\n<p>All callers of <code>AllocMask<\/code> check the return status and return early if the call failed. Although the user would need to retry their operation, the program would not suffer a crash due to the failed memory allocation.<\/p>\n<p>Atkinson recounted this general technique in (Atkinson 2010, pg 15):<\/p>\n<blockquote>\n<p>When code segments were loaded, you needed some code to do this job and that job. They would be loaded sort of at the first available place. But if you needed another code segment and this one would go out, it might leave a hole there that wasn\u2019t quite big enough for the next one that you needed but now you had sort of what we called memory fragmentation. That even though you had enough memory total, you couldn\u2019t load the pieces of code that you needed. And I developed a little technique for this which is setting a flag at the top of the event loop, saying that we failed and as I went to load code segments, if I failed to load one then I would beep and let it go back to the top of the event loop without doing anything. And it would say uh-oh, there\u2019s a failure here. Whereas if I succeeded and got to the right part of the code and got everything in, then I would set the flag to say we succeeded.
The net result was the user would go to draw something and it would beep and they would try it again and it would work, and they\u2019d shrug and they\u2019d never know that they just avoided crashing the program.<\/p>\n<\/blockquote>\n<h2 id=\"alternative-paths-and-competitors\">Alternative Paths and Competitors<\/h2>\n<p>As a counter-factual, if MacPaint had not shipped with the Macintosh, Apple might instead have shipped a vector drawing program. The Lisa shipped with LisaDraw, a vector drawing program which was the basis for MacDraw. However, MacDraw was likely not available in time for the January launch, as the September\/October issue of MacWorld states MacDraw is &ldquo;soon-to-be-released&rdquo;. Since MacPaint was finished several months prior to the Macintosh&rsquo;s release date, we are unaware of any historical &ldquo;Plan B&rdquo;.<\/p>\n<p>Competitively, there were several digital painting programs released in parallel. The year prior to the release of MacPaint, 1983, saw the release of the first Microsoft Mouse and two drawing programs that used it: Microsoft&rsquo;s <em>Doodle<\/em> program (below) and Doug Wolfgram&rsquo;s <em>Mouse Draw<\/em>. Notably, both programs supported color, but their user interfaces can charitably be described as primitive.<\/p>\n<figure><a href=\"https:\/\/winworldpc.com\/product\/microsoft-mouse\/1x\"><img src=\"Microsoft-Doodle.png\"\n         alt=\"Screenshot of Microsoft Doodle showing lack of tool and pattern palettes\"\/><\/a><figcaption>\n            <p>Microsoft Doodle (1983)<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>After the release of MacPaint and the Macintosh at the beginning of 1984, competitors quickly adopted the MacPaint user interface.<\/p>\n<p>Thus, the demand for painting programs, tied to sales of mice, was recognized independently of the Macintosh and multiple vendors were attempting to fill that need.
That said, within the consumer market, MacPaint seems to be the first &ldquo;great&rdquo; painting program and highly influential.<\/p>\n<p>Technologically, was MacPaint novel? The site <a href=\"http:\/\/macpaint.org\/history.html\">macpaint.org claims<\/a> the following aspects of MacPaint were novel:<\/p>\n<blockquote>\n<p>The marching ants around a selection; the palette of drawing tools; the (rudimentary) ability to zoom in; the spray can; the paint bucket; copying and pasting images between programs; just moving the mouse and drawing: we take these for granted in the 21st century, but Macpaint did them first, with only 128k of RAM available.<\/p>\n<\/blockquote>\n<p>While we concede the marching ants and maybe the spray can were novel to MacPaint, MacPaint has historical antecedents in research and commercial systems. As a contrary example, <em>SuperPaint<\/em>, which displayed its first picture in 1973, supported color, and featured a tool palette and the following functions: Paint, Shrink 2x, Expand 2x, Move, Copy, Store and Load, Text, Video In, Make Brush, Draw Lines, Gridding, Area Fill, and various color table animation tools (Shoup 2001). Originally aimed at television graphics use cases, this research system was rapidly supplanted by commercial offerings. For example, the Quantel <em>Paintbox<\/em>, a dedicated computer for compositing broadcast video and graphics, was first released in 1981 and was quickly adopted by the major TV networks.<\/p>\n<p>Although the Macintosh was opening up new markets for consumer and small-scale digital production, broadcasting and commercial computer animation firms had been operating for several years. In Alvy Ray Smith&rsquo;s article &ldquo;Digital paint systems: an anecdotal and historical overview&rdquo; (Smith 2001), he focuses on &ldquo;systems&rdquo; rather than &ldquo;programs&rdquo; and eliminates 1-bit and 3-bit systems from his article&rsquo;s scope because of their lack of influence. 
MacPaint is thus eliminated due to both its limited functionality and its 1-bit nature. Is this restriction fair?<\/p>\n<p>Certainly, the professional market was developing on its own, without regard for the hobbyist market. Not until the late 80s was the Macintosh capable of supporting the monied use cases, which required color and higher performance. However, consumer technology eventually caught up and the existence of consumer-level tools provided a path for artists to &ldquo;graduate&rdquo; to professional-grade tools. Photoshop is an example of this progression.<\/p>\n<p>The Knoll brothers, who developed Photoshop in 1988, were both exposed to the original Macintosh when their <a href=\"https:\/\/web.archive.org\/web\/20070626182822\/http:\/\/www.storyphoto.com\/multimedia\/multimedia_photoshop.html\">father bought one<\/a> in mid-1984. In 1987, Thomas Knoll used a Macintosh Plus, which also came with MacPaint, to assist in his PhD work. The Plus did not support gray scale, so Thomas started working on image manipulation routines. John, who was working at Industrial Light &amp; Magic at the time, saw connections between what Thomas was developing and the features in the Pixar Image Computer. Combining the use case insights from professional animators with a graphical user interface led to Photoshop.<\/p>\n<p>Photoshop was widely adopted not only by professionals but also by aspiring students and hobbyists. It represents a joining of two pathways, one from the personal, consumer world, and the other from the research and commercial world.<\/p>\n<p><strong>Impact of Source Code?<\/strong><\/p>\n<p>In 1984, event-driven programming was a relatively new paradigm in software design. A possible additional influence of MacPaint was its use in training developers to write event-driven programs and to program for the mouse.
Atkinson believes Apple gave the source to some developers (Atkinson 2010, pg 8), although we have not seen evidence for this elsewhere. (At least among external developers; the Macintosh team certainly had access to the code and there are three programmers attested in MyTools.a, including Andy Hertzfeld.) <em>Inside Macintosh<\/em>, the official developer documentation, includes an Example Program that demonstrates basic window management and handling. Comparing the Example Program to MacPaint, we find there are similarities in naming and structure, which suggests MacPaint may have been a reference when writing the documentation.<\/p>\n<h2 id=\"post-release\">Post Release<\/h2>\n<p>MacPaint saw two more releases in 1984, the first as a free update with System Software 1.1 and the second with the release of the Macintosh 512K model. In 1985, System Software was updated to version 2.0 and included version 1.5 of MacPaint.<\/p>\n<p>Under pressure from third-party developers for less competition from system-shipped software, in 1987 Apple spun off their internally built applications, including MacPaint, to the Claris company. Claris released version 2.0 of MacPaint in 1988, which featured tear-off palettes, support for multiple documents open at once, the magic eraser, and <a href=\"https:\/\/macgui.com\/usenet\/?group=14&amp;id=10855\">other features<\/a>.<\/p>\n<p>The release also seemed perfunctory, as evidenced by the developer working on the 2.0 release to <a href=\"https:\/\/www.folklore.org\/Evolution_of_a_classic.html\">fill time and by the lack of any product specs<\/a>. By April 1988, as listed by <a href=\"https:\/\/archive.org\/details\/MacWorld_8804_April_1988\/page\/n53\/mode\/2up\">Macworld<\/a>, MacPaint was in competition with an array of graphics programs, including <em>PixelPaint<\/em>, <em>Modern Artist<\/em>, <em>Aldus Freehand<\/em>, and <em>Adobe Illustrator<\/em>.
Furthermore, the Macintosh II had been announced in 1987 with support for 8-bit color, limiting the appeal of a monochrome-only application. No more versions were officially released and Claris discontinued the product in 1998.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>(Knuth Shustek, 2021) has this to say about MacPaint and QuickDraw&rsquo;s source code:<\/p>\n<blockquote>\n<p>They are brilliant programs, beautifully organized and structured, that are a treat to read and deserve to be annotated and studied.<\/p>\n<\/blockquote>\n<p>This article does not cover QuickDraw, but do we agree with these claims as they relate to MacPaint?<\/p>\n<p><strong>Is it brilliant?<\/strong><\/p>\n<p>It is uncontroversial that MacPaint strongly influenced the user interfaces of painting and drawing programs. Contemporary competitors changed their UIs to be similar to MacPaint&rsquo;s, and much of its interface design lives on in Photoshop, Illustrator, and other programs. Many of the tool modifiers, such as the shift key to constrain boxes and lines, became informally standardized. Although MacPaint was not the earliest painting program and did not invent all that it is often credited with, it was the first &ldquo;great&rdquo; implementation, was widely available, and helped cement Macintosh&rsquo;s reputation as a computer for creative people.<\/p>\n<p><strong>Is it beautifully organized and structured?
A treat to read?<\/strong><\/p>\n<p>This is subjective, so we&rsquo;ll examine certain aspects of the code.<\/p>\n<p>If we look at the physical sequencing of functions and procedures within the code, we find that the first five callables within the <code>PaintAsm.a<\/code> file are:<\/p>\n<ol>\n<li>EjectReset (eject disks)<\/li>\n<li>PixelTrue (return state of specific pixel within bitmap)<\/li>\n<li>Monkey (test for reliability testing mechanism)<\/li>\n<li>Stretch2x (re-samples buffer for higher resolution printing)<\/li>\n<li>MySetItemStyle (type coercion utility)<\/li>\n<\/ol>\n<p>Within the Pascal <code>MacPaint.p<\/code> file, after a long list of EXTERNAL declarations, we find:<\/p>\n<ol>\n<li>KeyIsDown (function to test if a specific key is pressed)<\/li>\n<li>GetFatMouse (get mouse position in fat bit coordinates)<\/li>\n<li>GridPoint (modify point to a grid or truncated position)<\/li>\n<li>GetGridMouse (get mouse position in grid coordinates)<\/li>\n<li>PinGridMouse (clamp grid mouse position within rectangle)<\/li>\n<\/ol>\n<p>We reject the idea that the physical organization of the code is &ldquo;beautifully organized&rdquo;; if someone wanted to understand the core functionality and design of the program, they should not read it linearly. However, we suspect this was not the foundation of Knuth and Shustek&rsquo;s claim, as outside of very short programs and a few literate ones, programs are not read linearly.<\/p>\n<p>Readers will often start with the entry point to the program. In this case, the reader will find a significant number of lines dedicated to <code>UnloadSeg<\/code> and <code>MoreMaster<\/code>, both efforts to mitigate the constrained memory. Outside of that, the main block contains a sequence of <code>Init...<\/code> calls, followed by the main loop. The naming and structure facilitate distinguishing between the business logic and the system interfaces.
So, we find that the main block is not necessarily beautiful, since it interleaves both kinds of logic, but the organization and structure are praise-worthy.<\/p>\n<p>Within a procedure or function, Pascal mandates some aspects of the ordering, but we find that the code does not fight language idioms. In general, in terms of code style, we find it is easy to read and follow. Anecdotally, while we were writing this article, another programmer saw the Pascal code on our laptop screen (main method) and noted that it appeared clean. (They also thought it looked &ldquo;old&rdquo; and certainly procedural code has a different visual appearance than functional or OOP code.)<\/p>\n<p><strong>Does it deserve to be annotated and studied?<\/strong><\/p>\n<p>There are three primary audiences for a historical source code artifact: 1) historians, interested in the context and impact of an artifact, 2) software practitioners, who can see it as a way to improve their craft, and 3) students, who may be introduced to techniques, practices, and paradigms and use it as training material.<\/p>\n<p>For historians, MacPaint exhibits the engineering trade-offs necessary to bring forth an application within the limitations of the Macintosh hardware, as well as an example of how a graphical program was expected to be designed by someone who also wrote one of the largest, most foundational libraries. Historians of algorithms can place the use of multiple buffers and the seed fill algorithm into the published history of graphics algorithms and how that informs the interrelation of commercial and academic research.<\/p>\n<p>Unless a practitioner is working within extreme resource limits or is working on legacy devices, we suspect the MacPaint source code has less to offer them. 
User interfaces are now asynchronous and callback based, languages are far richer than Pascal in their support of abstractions and encapsulation, and few use cases are restricted to 1-bit displays.<\/p>\n<p>However, we see many advantages as an artifact of study for students of computer science. First, MacPaint is not a toy, but a full-featured program, yet is quite small so there is little to distract the student. MacPaint&rsquo;s domain &mdash; painting &mdash; is readily understood and does not require students to understand unfamiliar domains such as finance or physics. Students are expected to understand multiple languages. Pascal and M68k assembly test a student&rsquo;s ability to learn a new language while neither is particularly difficult nor obscure.<\/p>\n<p>Although MacPaint runs in emulation and students can easily play with it, a disadvantage is that students would have difficulty modifying the source code and running their own versions. MacPaint doesn&rsquo;t even run on late series Macintoshes as it was too tightly bound to the hardware. That said, there are many potential assignments that do not require modifying and running the source code:<\/p>\n<ul>\n<li>Code Explanation. A student prepares a presentation or document explaining how a certain tool function is implemented or how a data structure is used. More advanced students might calculate the computational complexity of an algorithm.<\/li>\n<li>Design Alternative. A student shows how a certain feature might be re-designed with features from a different programming language or within a certain framework. Compare and contrast qualitatively and quantitatively.<\/li>\n<li>Develop Tooling. Pascal is relatively easy to parse and Apple&rsquo;s documentation on <em>Lisa Pascal<\/em> features detailed railroad diagrams for the syntax. Students could write tooling, similar to <code>cflow<\/code>, to analyze the program. 
(This tooling could be reused to analyze the <em>Lisa<\/em> source code.)<\/li>\n<\/ul>\n<p>So, yes, MacPaint deserves to be studied.<\/p>\n<h2 id=\"special-recognition\">Special Recognition<\/h2>\n<p>Releasing the source code was surprisingly <a href=\"https:\/\/web.archive.org\/web\/20100721233205\/http:\/\/www.businessweek.com\/technology\/ByteOfTheApple\/blog\/archives\/2010\/07\/apple_donates_macpaint_source_code_to_computer_history_museum.html\">long and difficult<\/a>. I thank those involved for their dogged persistence that gave me the opportunity to study this source code.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Atkinson 2004) Atkinson, Bill, and Andy Hertzfeld. 2004. MacPaint oral history with Bill Atkinson and Andy Hertzfeld Interview by Grady Booch. Oral history collection. <a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102658007\">https:\/\/www.computerhistory.org\/collections\/catalog\/102658007<\/a>.<\/p>\n<p>(Atkinson 2010) Atkinson, Bill, and Andy Hertzfeld. 2010. MacPaint Interview and Demonstration with Bill Atkinson and Andy Herzfeld. <a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102743021\">https:\/\/www.computerhistory.org\/collections\/catalog\/102743021<\/a>.<\/p>\n<p>(Glassner 2001) Glassner, Andrew. 2001. \u201cFill \u2019er up! [Graphics Filling Algorithms].\u201d IEEE Computer Graphics and Applications 21 (1): 78\u201385.<\/p>\n<p>(Inside-Vol2 1985) Apple Computer, Inc. 1985. Inside Macintosh. Vol. 2. Addison-Wesley Publishing Company, Inc. <a href=\"https:\/\/vintageapple.org\/inside_o\/pdf\/Inside_Macintosh_Volume_I_1985.pdf\">https:\/\/vintageapple.org\/inside_o\/pdf\/Inside_Macintosh_Volume_I_1985.pdf<\/a>.<\/p>\n<p>(Knuth Shustek, 2021) Knuth, Donald, and Len Shustek. 2021. \u201cLet\u2019s Not Dumb down the History of Computer Science.\u201d Communications of the ACM 64 (2): 33\u201335.
<a href=\"https:\/\/doi.org\/10.1145\/3442377\">https:\/\/doi.org\/10.1145\/3442377<\/a>.<\/p>\n<p>(Lemmons 1984) Lemmons, Phil. 1984. \u201cAn Interview: The Macintosh Design Team.\u201d BYTE, February 1984.<\/p>\n<p>(Lieberman 1978) Lieberman, Henry. 1978. How to color in a coloring book. In Proceedings of the 5th annual conference on Computer graphics and interactive techniques (SIGGRAPH &lsquo;78). Association for Computing Machinery, New York, NY, USA, 111\u2013116. <a href=\"https:\/\/doi.org\/10.1145\/800248.807380\">https:\/\/doi.org\/10.1145\/800248.807380<\/a><\/p>\n<p>(Shoup 2001) Shoup, Richard. 2001. \u201cSuperPaint: An Early Frame Buffer Graphics System.\u201d IEEE Annals of the History of Computing 23 (2): 32\u201337.<\/p>\n<p>(Smith 1979) Smith, Alvy Ray. 1979. Tint fill. SIGGRAPH Comput. Graph. 13, 2 (August 1979), 276\u2013283. <a href=\"https:\/\/doi.org\/10.1145\/965103.807456\">https:\/\/doi.org\/10.1145\/965103.807456<\/a><\/p>\n<p>(Smith 2001) Smith, Alvy Ray. 2001. \u201cDigital Paint Systems: An Anecdotal and Historical Overview.\u201d IEEE Annals of the History of Computing 23 (2): 4\u201330.<\/p>\n<p>(Young 1985) Young, Jeffrey S. 1985. Inside MacPaint: Sailing through the Sea of FatBits on a Single-Pixel Raft. Microsoft Press.<\/p>"},{"title":"Programming Lewis Carroll's *Memoria Technica*","link":"https:\/\/ztoz.blog\/posts\/memoria_technica\/","pubDate":"Thu, 26 Dec 2024 13:44:43 -0700","guid":"https:\/\/ztoz.blog\/posts\/memoria_technica\/","description":"<p>Charles Dodgson (pen name Lewis Carroll) had difficulty remembering numbers, such as dates. He developed a cipher to help him remember numbers by embedding them in couplets or phrases. For example, the couplet &ldquo;Brass trumpet and brazen bassoon, will speedily mark you a tune&rdquo; encodes the specific gravity of brass (8.39) in the last four consonants: <code>r k t n<\/code> (y is treated as a vowel). 
In this article, we describe the cipher, present online tools for encoding and decoding, discuss how we implemented the algorithms in TypeScript, and consider the cipher&rsquo;s relevance to steganography.<\/p>\n<h2 id=\"memoria-technica\"><em>Memoria Technica<\/em><\/h2>\n<p>The cipher encodes plaintexts that are sequences of digits. Using the cipher, consonants (except &lsquo;y&rsquo;) are mapped to digits. Vowels, punctuation, and &lsquo;y&rsquo; are ignored and thus can be freely added to the ciphertext. The mapping follows the figure:<\/p>\n<!-- | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |\n|---|---|---|---|---|---|---|---|---|---|\n| b | d | t | f | l | s | p | h | n | z |\n| c | w | j | q | v | x | m | k | g | r | -->\n<figure><img src=\"digit_mapping.png\"\/><figcaption>\n            <h4>Mapping of consonants and digits<\/h4>\n        <\/figcaption>\n<\/figure>\n\n<p>In a three-page pamphlet published by Carroll in 1878, he describes how the digits and their associated letters came to be:<\/p>\n<ol>\n <li value=1>\"b\" and \"c\", the first two consonants in the Alphabet.<\/li>\n <li>\"d\" from \"duo\"; \"w\" from \"two\".<\/li>\n <li>\"t\" from \"tres\"; the other may wait awhile.<\/li>\n <li>\"f\" from four; \"q\" from \"quatuor\".<\/li>\n <li>\"l\" and \"v\", because \"L\" and \"V\" are the Roman symbols for \"fifty\" and \"five\".<\/li>\n <li>\"s\" and \"x\", from \"six\".<\/li>\n <li>\"p\" and \"m\", from \"septem\".<\/li>\n <li>\"h\" from \"huit\"; and \"k\" from the Greek \"okto\".<\/li>\n <li>\"n\" from \"nine\"; and \"g\" because it is so like \"9\".<\/li>\n <li value=0>\"z\" and \"r\", from \"zero\".<\/li>\n<\/ol>\n<p>The letter &ldquo;j&rdquo; is associated with 3 because it is left-over.<\/p>\n<p>Decoding ciphertext involves extracting the consonants from a message and finding the mapped integer.
Vowels, y, and punctuation are ignored.<\/p>\n<p>Encoding an integer involves choosing one of two consonants for each digit and inserting vowels, y, and punctuation to create valid words.<\/p>\n<p>Decoding may also involve &ldquo;meta&rdquo; information. In the pamphlet, Carroll provides example couplets for encoding the specific gravities of various metals. The decoder will need to know that only the last four consonants are relevant.<\/p>\n<h3 id=\"example\">Example<\/h3>\n<p>Given the ciphertext <code>found<\/code>, we can first filter to the consonants: <code>fnd<\/code>. f is 4, n is 9, and d is 2, so the plaintext is <code>492<\/code>. (What was found in 492? Christopher Columbus discovered America in 1492 and Dodgson believed the leading 1 could be assumed.)<\/p>\n<p>Given the plaintext <code>492<\/code>, we are interested in words that include the consonants f or q, n or g, and d or w, in that order, with vowels, &lsquo;y&rsquo;, or punctuation in-between. Using the second edition Scrabble dictionary, we find 15 single words that match these constraints:<\/p>\n<table>\n\t<tbody>\n\t\t<tr>\n\t\t\t<td>fend<\/td><td>fiend<\/td><td>find<\/td><td>fined<\/td><td>foined<\/td>\n\t\t<\/tr>\n\t\t<tr>\n\t\t\t<td>fond<\/td><td>fondu<\/td><td>fondue<\/td><td>found<\/td><td>fugued<\/td>\n\t\t<\/tr>\n\t\t<tr>\n\t\t\t<td>fund<\/td><td>fundi<\/td><td>queened<\/td><td>quinoid<\/td><td>quoined<\/td>\n\t\t<\/tr>\n\t<\/tbody>\n<\/table>\n<p>Using our encoding tool, we found 74,151 possibilities from <code>fend<\/code> to <code>quey yuga yowie<\/code>.<\/p>\n<h2 id=\"decoding-and-encoding-tools\">Decoding and Encoding Tools<\/h2>\n<script src=\"index.js\" type=\"module\" async><\/script>\n<form>\n\t<fieldset>\n\t\t<legend>Decode a string to digits<\/legend>\n\t\t<label for=\"ciphertext\" style=\"display: inline-block; width: 15ch\">Ciphertext: <\/label>\n\t\t<input type=\"text\" id=\"ciphertext\" value=\"rk you a tune\" size=50>\n\t\t<br \/>\n\t\t<button 
id=\"decode-btn\">Decode<\/button>\n\t\t<br \/>\n\t\t<label for=\"plaintext\" style=\"display: inline-block; width: 15ch\">Plaintext (digits): <\/label>\n\t\t<output id=\"plaintext\" name=\"plaintext\" for=\"ciphertext\"><\/output>\n\t<\/fieldset>\n<\/form>\n<script async type=\"module\">\nimport { decode } from \".\/index.js\";\nfunction decodeCipherText(evt) {\n\tevt.preventDefault();\n\tconst ciphertext = document.getElementById(\"ciphertext\").value;\n\tconst plaintext = decode(ciphertext);\n\tdocument.getElementById(\"plaintext\").value = plaintext;\n}\ndocument.getElementById(\"decode-btn\").addEventListener('click', decodeCipherText);\n<\/script>\n<br \/>\n<form>\n\t<fieldset>\n\t\t<legend>Encode digits to potential memorable strings<\/legend>\n\t\t<label for=\"plaintext-encode\" style=\"display: inline-block; width: 15ch\">Plaintext (digits): <\/label>\n\t\t<input type=\"text\" id=\"plaintext-encode\" value=\"123\" pattern=\"[0-9]+\" size=5>\n\t\t<br \/>\n\t\t<label for=\"plaintext-limit\" style=\"display: inline-block; width: 15ch\">Limit (n values): <\/label>\n\t\t<input type=\"text\" id=\"plaintext-limit\" value=\"200\" pattern=\"[0-9]+\" size=5>\n\t\t<br \/>\n\t\t<label for=\"plaintext-offset\" style=\"display: inline-block; width: 15ch\">Offset (n values): <\/label>\n\t\t<input type=\"text\" id=\"plaintext-offset\" value=\"0\" pattern=\"[0-9]+\" size=5>\n\t\t<br \/>\n\t\t<button id=\"encode-btn\">Encode<\/button>\n\t\t<br \/>\n\t\t<label for=\"ciphertext-encode\">Ciphertext: <\/label>\n\t\t<output id=\"ciphertext-encode\" for=\"plaintext-encode\"><ol id=\"ciphers\"><\/ol><\/output>\n\t\t<br \/>\n\t<\/fieldset>\n<\/form>\n<script async type=\"module\">\nimport { drop, encode_all, limit } from \".\/index.js\";\nimport dictJson from \".\/cipher-words.json\" with { type: 'json' };\nconst dictionary = new Map(Object.entries(dictJson));\nfunction encodePlainText(evt) {\n\tevt.preventDefault();\n\tconst plaintext = 
document.getElementById(\"plaintext-encode\").value;\n\tconst lim = Number.parseInt(document.getElementById(\"plaintext-limit\").value);\n\tconst d = Number.parseInt(document.getElementById(\"plaintext-offset\").value);\n\tlet out = document.getElementById(\"ciphers\")\n\tout.replaceChildren();  \/\/ clear the list\n\tfor (let plain of limit(lim, drop(d, encode_all(plaintext, dictionary)))) {\n\t\tlet child = document.createElement('li');\n\t\tchild.textContent = plain;\n\t\tout.appendChild(child);\n\t}\n}\ndocument.getElementById(\"encode-btn\").addEventListener('click', encodePlainText);\n<\/script>\n<blockquote>\n<p>Note 1: These computations run on your computer.<\/p>\n<\/blockquote>\n<blockquote>\n<p>Note 2: The JavaScript code for <code>Encode<\/code> uses <a href=\"https:\/\/caniuse.com\/mdn-javascript_statements_import_import_attributes\">import attributes<\/a> which, as of December 2024, are not supported on Firefox, but are expected to be available soon.<\/p>\n<\/blockquote>\n<blockquote>\n<p>Note 3: Negative limits are treated as no limits (i.e. Infinity). Negative offsets are treated as no offset (i.e. 
zero).<\/p>\n<\/blockquote>\n<h2 id=\"implementation\">Implementation<\/h2>\n<h3 id=\"mapping-consonants-to-digits\">Mapping Consonants to Digits<\/h3>\n<p>Our implementation of the mapping is a fairly straight-forward switch case with fall-through:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">33\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\"><span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\"> * For a given letter, return the cipher digit based on Carroll&#39;s Memoria Technica scheme, or null if there\n<\/span><span style=\"color:#75715e\"> * is no associated digit. 
This function will return null for vowels (including y), punctuation marks, and\n<\/span><span style=\"color:#75715e\"> * any other non-consonants. Only single letter English consonants have values; case is ignored.\n<\/span><span style=\"color:#75715e\"> *\n<\/span><span style=\"color:#75715e\"> * @param chr a single character\n<\/span><span style=\"color:#75715e\"> *\/<\/span>\n<span style=\"color:#66d9ef\">export<\/span> <span style=\"color:#66d9ef\">function<\/span> <span style=\"color:#a6e22e\">consonant_value<\/span>(<span style=\"color:#a6e22e\">chr<\/span>: <span style=\"color:#66d9ef\">string<\/span>)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">number<\/span> <span style=\"color:#f92672\">|<\/span> <span style=\"color:#66d9ef\">null<\/span> {\n    <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">code16<\/span>: <span style=\"color:#66d9ef\">number<\/span> <span style=\"color:#f92672\">|<\/span> <span style=\"color:#66d9ef\">undefined<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">chr<\/span>.<span style=\"color:#a6e22e\">toLowerCase<\/span>().<span style=\"color:#a6e22e\">codePointAt<\/span>(<span style=\"color:#ae81ff\">0<\/span>);\n    <span style=\"color:#66d9ef\">switch<\/span> (<span style=\"color:#a6e22e\">code16<\/span>) {\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">98<\/span> <span style=\"color:#75715e\">\/*&#39;b&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">99<\/span> <span style=\"color:#75715e\">\/*&#39;c&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">100<\/span> <span style=\"color:#75715e\">\/*&#39;d&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        
<span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">119<\/span> <span style=\"color:#75715e\">\/*&#39;w&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">116<\/span> <span style=\"color:#75715e\">\/*&#39;t&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">106<\/span> <span style=\"color:#75715e\">\/*&#39;j&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">3<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">102<\/span> <span style=\"color:#75715e\">\/*&#39;f&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">113<\/span> <span style=\"color:#75715e\">\/*&#39;q&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">4<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">108<\/span> <span style=\"color:#75715e\">\/*&#39;l&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">118<\/span> <span style=\"color:#75715e\">\/*&#39;v&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">5<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">115<\/span> <span style=\"color:#75715e\">\/*&#39;s&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">120<\/span> <span 
style=\"color:#75715e\">\/*&#39;x&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">6<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">112<\/span> <span style=\"color:#75715e\">\/*&#39;p&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">109<\/span> <span style=\"color:#75715e\">\/*&#39;m&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">7<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">104<\/span> <span style=\"color:#75715e\">\/*&#39;h&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">107<\/span> <span style=\"color:#75715e\">\/*&#39;k&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">8<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">110<\/span> <span style=\"color:#75715e\">\/*&#39;n&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">103<\/span> <span style=\"color:#75715e\">\/*&#39;g&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">9<\/span>;\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">122<\/span> <span style=\"color:#75715e\">\/*&#39;z&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">114<\/span> <span style=\"color:#75715e\">\/*&#39;r&#39;*\/<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span 
style=\"color:#ae81ff\">0<\/span>;\n        <span style=\"color:#66d9ef\">default<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">null<\/span>;\n    }\n}\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>The original version (as suggested by the commented code) compared single-character strings (JavaScript lacks a character type). Profiling revealed that more than half the time was being spent in StringEqual calls. By switching the comparisons from string equality to integer equality, the new code takes approximately 15% of the time of the older version, a speed-up greater than 6x.<\/p>\n<h2 id=\"decoding-ciphertext\">Decoding Ciphertext<\/h2>\n<p>The decoding process is essentially a functional <code>collect<\/code>: map across all elements and keep those that are non-null.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span 
style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\"><span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\"> * Return an array of numbers based on decoding s. 
If there are no (valid) consonants within s, returns\n<\/span><span style=\"color:#75715e\"> * an empty array.\n<\/span><span style=\"color:#75715e\"> *\n<\/span><span style=\"color:#75715e\"> * @param s word or phrase\n<\/span><span style=\"color:#75715e\"> *\/<\/span>\n<span style=\"color:#66d9ef\">export<\/span> <span style=\"color:#66d9ef\">function<\/span> <span style=\"color:#a6e22e\">decode<\/span>(<span style=\"color:#a6e22e\">s<\/span>: <span style=\"color:#66d9ef\">string<\/span>)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">number<\/span>[] {\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">result<\/span>: <span style=\"color:#66d9ef\">number<\/span>[] <span style=\"color:#f92672\">=<\/span> [];\n\n    <span style=\"color:#66d9ef\">for<\/span>(<span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">chr<\/span> <span style=\"color:#66d9ef\">of<\/span> <span style=\"color:#a6e22e\">s<\/span>) {\n        <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">v<\/span>: <span style=\"color:#66d9ef\">number<\/span> <span style=\"color:#f92672\">|<\/span> <span style=\"color:#66d9ef\">null<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">consonant_value<\/span>(<span style=\"color:#a6e22e\">chr<\/span>);\n        <span style=\"color:#66d9ef\">if<\/span>(<span style=\"color:#a6e22e\">v<\/span> <span style=\"color:#f92672\">!==<\/span> <span style=\"color:#66d9ef\">null<\/span>) {\n            <span style=\"color:#a6e22e\">result<\/span>.<span style=\"color:#a6e22e\">push<\/span>(<span style=\"color:#a6e22e\">v<\/span>);\n        }\n    }\n\n    <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#a6e22e\">result<\/span>;\n}\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>Alternatively, <code>TextEncoder<\/code> could be used to build a <code>Uint8Array<\/code> of byte values and <code>consonant_value<\/code> could be given 
a byte value as an argument. However, since a character in UTF-8 may span multiple bytes, we would still need to capture the complexity of Unicode encoding somewhere.<\/p>\n<h2 id=\"encoding-a-specific-plaintext-value\">Encoding a (Specific) Plaintext Value<\/h2>\n<p>Using (Moby 2002), we previously developed a dictionary by mapping each word into a cipher value and then grouping words by a shared cipher value. Thus, generating a list of words that match a given plaintext digit string is a look-up:<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">13\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\"><span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\"> * Return from the dictionary all words that have the plaintext value of n.\n<\/span><span style=\"color:#75715e\"> *\n<\/span><span style=\"color:#75715e\"> * @param n plaintext value\n<\/span><span style=\"color:#75715e\"> * @param dictionary plaintext value to array of words\n<\/span><span style=\"color:#75715e\"> *\/<\/span>\n<span style=\"color:#66d9ef\">export<\/span> <span style=\"color:#66d9ef\">function<\/span> <span style=\"color:#a6e22e\">encode<\/span>(<span style=\"color:#a6e22e\">n<\/span>: <span style=\"color:#66d9ef\">string<\/span>, <span style=\"color:#a6e22e\">dictionary<\/span>: <span style=\"color:#66d9ef\">Map<\/span>&lt;<span style=\"color:#f92672\">string<\/span><span style=\"color:#960050;background-color:#1e0010\">,<\/span> <span style=\"color:#a6e22e\">string<\/span><span style=\"color:#960050;background-color:#1e0010\">[]<\/span>&gt;)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">string<\/span>[] {\n    <span style=\"color:#66d9ef\">if<\/span>(<span style=\"color:#a6e22e\">dictionary<\/span>.<span style=\"color:#a6e22e\">has<\/span>(<span style=\"color:#a6e22e\">n<\/span>)) {\n        <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#a6e22e\">dictionary<\/span>.<span style=\"color:#66d9ef\">get<\/span>(<span style=\"color:#a6e22e\">n<\/span>)<span style=\"color:#f92672\">!<\/span>;\n    } <span style=\"color:#66d9ef\">else<\/span> {\n        <span style=\"color:#66d9ef\">return<\/span> [];\n    }\n}\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>Using a JavaScript <code>Map<\/code> for the dictionary rather than a 
JavaScript <code>Object<\/code> is a bit of a personal quirk. JavaScript&rsquo;s <code>Object<\/code>s are restricted to string (and symbol) keys, whereas <code>Map<\/code>s accept any value as a key; since our keys are digit strings, that restriction would not matter in our domain. We appreciate <code>Map<\/code>&rsquo;s clear separation between keys and properties and its likely better performance, given that it is less flexible than <code>Object<\/code>.<\/p>\n<p>The exclamation point after the <code>get(n)<\/code> call is a <a href=\"https:\/\/www.typescriptlang.org\/docs\/handbook\/2\/everyday-types.html#non-null-assertion-operator-postfix-\">TypeScript non-null assertion<\/a> informing the compiler that the return value is neither null nor undefined. The compiler is not (yet) smart enough to recognize that since <code>has<\/code> returned true, <code>get<\/code> cannot return undefined. That holds, at least, unless other code modifies the dictionary between the two calls.<\/p>\n<h2 id=\"encoding-a-general-plaintext-value\">Encoding a (General) Plaintext Value<\/h2>\n<p>While <code>encode<\/code> returns single-word ciphertexts that fully match a given plaintext, we will often need groups of partial matches for a given plaintext. <code>encode_all<\/code> is a generator function that will yield one ciphertext at a time, where each ciphertext contains one or more words. As described in the &lsquo;English Language Coverage\/Branching Factor&rsquo; section below, most plaintexts will yield large numbers of potential ciphertexts (e.g. thousands for three digits). Since the consumer of <code>encode_all<\/code> is unlikely to need all the outputs at the same time, we use a generator or iterable design to avoid loading all the outputs into memory at once.<\/p>\n<p>(If <code>encode_all<\/code> is a generator, why not <code>encode<\/code>? 
In <code>encode<\/code>&rsquo;s case, the dictionary provides all outputs at once so we do not save any resources by yielding only a single entry from that dictionary at a time.)<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 
0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">33\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-typescript\" data-lang=\"typescript\"><span style=\"color:#75715e\">\/**\n<\/span><span style=\"color:#75715e\"> * Return all phrases (space delimited words from the dictionary) that encode the sequence of\n<\/span><span style=\"color:#75715e\"> * digits in n.\n<\/span><span style=\"color:#75715e\"> *\n<\/span><span style=\"color:#75715e\"> * @param n sequence of 
digits (plaintext)\n<\/span><span style=\"color:#75715e\"> * @param dictionary plaintext value to array of words\n<\/span><span style=\"color:#75715e\"> *\/<\/span>\n<span style=\"color:#66d9ef\">export<\/span> <span style=\"color:#66d9ef\">function<\/span><span style=\"color:#f92672\">*<\/span> <span style=\"color:#a6e22e\">encode_all<\/span>(<span style=\"color:#a6e22e\">n<\/span>: <span style=\"color:#66d9ef\">string<\/span>, <span style=\"color:#a6e22e\">dictionary<\/span>: <span style=\"color:#66d9ef\">Map<\/span>&lt;<span style=\"color:#f92672\">string<\/span><span style=\"color:#960050;background-color:#1e0010\">,<\/span> <span style=\"color:#a6e22e\">string<\/span><span style=\"color:#960050;background-color:#1e0010\">[]<\/span>&gt;)<span style=\"color:#f92672\">:<\/span> <span style=\"color:#a6e22e\">Generator<\/span>&lt;<span style=\"color:#f92672\">string<\/span>&gt; {\n    <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">digits<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">n<\/span>.<span style=\"color:#a6e22e\">split<\/span>(<span style=\"color:#e6db74\">&#39;&#39;<\/span>);\n    <span style=\"color:#66d9ef\">const<\/span> <span style=\"color:#a6e22e\">nonEmpty<\/span> <span style=\"color:#f92672\">=<\/span> (<span style=\"color:#a6e22e\">seq<\/span>: <span style=\"color:#66d9ef\">any<\/span>[]) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#a6e22e\">seq<\/span>.<span style=\"color:#a6e22e\">length<\/span> <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">divisors<\/span>: <span style=\"color:#66d9ef\">number<\/span>[] <span style=\"color:#f92672\">=<\/span> [];\n    <span style=\"color:#66d9ef\">for<\/span>(<span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">i<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span>; 
<span style=\"color:#a6e22e\">i<\/span> <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#a6e22e\">digits<\/span>.<span style=\"color:#a6e22e\">length<\/span>; <span style=\"color:#a6e22e\">i<\/span><span style=\"color:#f92672\">++<\/span>) {\n        <span style=\"color:#a6e22e\">divisors<\/span>.<span style=\"color:#a6e22e\">push<\/span>(<span style=\"color:#a6e22e\">i<\/span>);\n    }\n\n    <span style=\"color:#66d9ef\">for<\/span>(<span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">divisor<\/span> <span style=\"color:#66d9ef\">of<\/span> <span style=\"color:#a6e22e\">subsets<\/span>(<span style=\"color:#a6e22e\">divisors<\/span>)) {\n        <span style=\"color:#a6e22e\">divisor<\/span>.<span style=\"color:#a6e22e\">unshift<\/span>(<span style=\"color:#ae81ff\">0<\/span>);  <span style=\"color:#75715e\">\/\/ include all digits by starting the slice at index 0\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">slices<\/span>: <span style=\"color:#66d9ef\">string<\/span>[][] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">sliceIndices<\/span>(<span style=\"color:#a6e22e\">digits<\/span>, <span style=\"color:#a6e22e\">divisor<\/span>);\n\n        <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">phrases<\/span>: <span style=\"color:#66d9ef\">string<\/span>[][] <span style=\"color:#f92672\">=<\/span> [];\n        <span style=\"color:#66d9ef\">for<\/span>(<span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">slice<\/span> <span style=\"color:#66d9ef\">of<\/span> <span style=\"color:#a6e22e\">slices<\/span>) {\n            <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">words<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#a6e22e\">encode<\/span>(<span style=\"color:#a6e22e\">slice<\/span>.<span style=\"color:#a6e22e\">join<\/span>(<span 
style=\"color:#e6db74\">&#39;&#39;<\/span>), <span style=\"color:#a6e22e\">dictionary<\/span>)\n            <span style=\"color:#a6e22e\">phrases<\/span>.<span style=\"color:#a6e22e\">push<\/span>(<span style=\"color:#a6e22e\">words<\/span>);\n        }\n\n        <span style=\"color:#66d9ef\">if<\/span>(<span style=\"color:#a6e22e\">phrases<\/span>.<span style=\"color:#a6e22e\">every<\/span>(<span style=\"color:#a6e22e\">nonEmpty<\/span>)) {\n            <span style=\"color:#66d9ef\">for<\/span>(<span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#a6e22e\">wrappedPhrase<\/span> <span style=\"color:#66d9ef\">of<\/span> <span style=\"color:#a6e22e\">cartesian<\/span>(<span style=\"color:#a6e22e\">phrases<\/span>)) {\n                <span style=\"color:#66d9ef\">yield<\/span> <span style=\"color:#a6e22e\">wrappedPhrase<\/span>.<span style=\"color:#a6e22e\">underlying<\/span>.<span style=\"color:#a6e22e\">join<\/span>(<span style=\"color:#e6db74\">&#39; &#39;<\/span>);\n            }\n        }\n    }\n}\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p>Line 9 separates a string of digits into an array of digit characters. This function assumes digits will only contain the characters &lsquo;0&rsquo; to &lsquo;9&rsquo;.<\/p>\n<p>This function needs to check all ordered sub-sequences of the sequence, <code>digits<\/code>. For example, if digits is <code>['1', '2', '3']<\/code>, then we will attempt to <code>encode<\/code>: <code>['1', '2', '3'], ['1'] + ['2', '3], ['1', '2'] + ['3'],<\/code> and <code>['1'] + ['2'] + ['3']<\/code>. To do this, we incrementally add zero or more divisors that separate the array. A divisor splits an array from a given index (inclusive) to the right. 
Each split is referred to as a &lsquo;slice&rsquo;.<\/p>\n<p>In lines 21 through 25, we <code>encode<\/code> each slice (the <code>slice.join('')<\/code> transforms an array of digits into a string of digits) and collect all possible words in an array of arrays termed <code>phrases<\/code>.<\/p>\n<p>If every position within <code>phrases<\/code> contains at least one word (lines 27 through 31), then we perform a Cartesian join across the positions. A Cartesian join or product is similar to a nested for-loop and generates all possible tuples. The <code>yield<\/code> gives the caller a single value (a space-separated list of words) while retaining state so the next value can be retrieved without re-computing earlier elements.<\/p>\n<p>The code for <code>subsets<\/code>, <code>sliceIndices<\/code>, and <code>cartesian<\/code> (among other functions) is in <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/memoria_technica\/-\/blob\/master\/util.ts?ref_type=heads\"><code>util.ts<\/code><\/a> in this project&rsquo;s GitLab repository. Knuth&rsquo;s <em>The Art of Computer Programming, Volume 4A<\/em> is a useful resource for these kinds of algorithms.<\/p>\n<h2 id=\"suitability-for-steganography\">Suitability for Steganography<\/h2>\n<p>Steganography is the art of hiding information in plain sight. While cryptography obscures the message so that attackers have difficulty reading it, attackers will know that the message is not meant to be read. In contrast, steganography <em>may<\/em> trick the attackers into believing the message is benign or into not noticing the message at all. For example, invisible ink and microdots are forms of steganography. Similarly, a message may be hidden by distorting specific pixels of an image. 
Steganography and cryptography may be combined, with the message first encrypted and then hidden.<\/p>\n<p>Although Dodgson presented this technique only as an aid to memory, not as a cipher or a tool for steganography, we believe it has potential utility.<\/p>\n<p>Let&rsquo;s try to hide phi, 1.61803, as a classified ad. The ad will start with &lsquo;For sale: &rsquo; and the item for sale will be an encoding of the number. We will assume the reader can place the decimal point appropriately (or that we transmit that data by some other means). Using the encode tool above, <code>161803<\/code> yields some promising outputs within the first few hundred cases:<\/p>\n<ul>\n<li>&ldquo;box chart&rdquo;<\/li>\n<li>&ldquo;boys chariot&rdquo;<\/li>\n<li>&ldquo;cosy chariot&rdquo;<\/li>\n<\/ul>\n<p>Would an attacker suspect ads like these?<\/p>\n<ol>\n<li>For sale: Box Chart 2016. Software library for R. $1 + postage.<\/li>\n<li>For sale: Boy&rsquo;s Chariot. Clean, blue, ages 3-6. $100 OBO.<\/li>\n<li>For sale: Cosy Chariot. Padded seats, yellow. One seater.<\/li>\n<\/ol>\n<p>We chose these three examples after scanning about 3,000 outputs. The dictionary used by the tool contains almost 118,000 words (including plurals and tense forms).<\/p>\n<h3 id=\"english-language-coveragebranching-factor\">English Language Coverage\/Branching Factor<\/h3>\n<p>The branching factor (number of words in the dictionary that match a given sequence) is large for single- and double-digit sequences. (Note that leading zeros are allowed.) 
The first table below represents the number of words that perfectly match the single digit sequence while the second table represents two digit sequences.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Digit<\/th>\n<th style=\"text-align:right\">Count<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">0<\/td>\n<td style=\"text-align:right\">63<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2<\/td>\n<td style=\"text-align:right\">53<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">3<\/td>\n<td style=\"text-align:right\">45<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">4<\/td>\n<td style=\"text-align:right\">21<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">5<\/td>\n<td style=\"text-align:right\">54<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">6<\/td>\n<td style=\"text-align:right\">46<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">7<\/td>\n<td style=\"text-align:right\">57<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">8<\/td>\n<td style=\"text-align:right\">51<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">9<\/td>\n<td style=\"text-align:right\">64<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\"><\/th>\n<th style=\"text-align:right\">0<\/th>\n<th style=\"text-align:right\">1<\/th>\n<th style=\"text-align:right\">2<\/th>\n<th style=\"text-align:right\">3<\/th>\n<th style=\"text-align:right\">4<\/th>\n<th style=\"text-align:right\">5<\/th>\n<th style=\"text-align:right\">6<\/th>\n<th style=\"text-align:right\">7<\/th>\n<th style=\"text-align:right\">8<\/th>\n<th style=\"text-align:right\">9<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\"><strong>0x<\/strong><\/td>\n<td style=\"text-align:right\">31<\/td>\n<td style=\"text-align:right\">36<\/td>\n<td style=\"text-align:right\">59<\/td>\n<td style=\"text-align:right\">44<\/td>\n<td 
style=\"text-align:right\">12<\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:right\">82<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">23<\/td>\n<td style=\"text-align:right\">58<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>1x<\/strong><\/td>\n<td style=\"text-align:right\">65<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">53<\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:right\">9<\/td>\n<td style=\"text-align:right\">69<\/td>\n<td style=\"text-align:right\">71<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">46<\/td>\n<td style=\"text-align:right\">61<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>2x<\/strong><\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:right\">31<\/td>\n<td style=\"text-align:right\">54<\/td>\n<td style=\"text-align:right\">37<\/td>\n<td style=\"text-align:right\">9<\/td>\n<td style=\"text-align:right\">70<\/td>\n<td style=\"text-align:right\">65<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">27<\/td>\n<td style=\"text-align:right\">59<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>3x<\/strong><\/td>\n<td style=\"text-align:right\">44<\/td>\n<td style=\"text-align:right\">25<\/td>\n<td style=\"text-align:right\">37<\/td>\n<td style=\"text-align:right\">29<\/td>\n<td style=\"text-align:right\">6<\/td>\n<td style=\"text-align:right\">38<\/td>\n<td style=\"text-align:right\">48<\/td>\n<td style=\"text-align:right\">45<\/td>\n<td style=\"text-align:right\">30<\/td>\n<td style=\"text-align:right\">52<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>4x<\/strong><\/td>\n<td style=\"text-align:right\">53<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">21<\/td>\n<td style=\"text-align:right\">27<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td 
style=\"text-align:right\">35<\/td>\n<td style=\"text-align:right\">26<\/td>\n<td style=\"text-align:right\">11<\/td>\n<td style=\"text-align:right\">7<\/td>\n<td style=\"text-align:right\">33<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>5x<\/strong><\/td>\n<td style=\"text-align:right\">50<\/td>\n<td style=\"text-align:right\">34<\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">12<\/td>\n<td style=\"text-align:right\">65<\/td>\n<td style=\"text-align:right\">75<\/td>\n<td style=\"text-align:right\">44<\/td>\n<td style=\"text-align:right\">24<\/td>\n<td style=\"text-align:right\">95<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>6x<\/strong><\/td>\n<td style=\"text-align:right\">37<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:right\">27<\/td>\n<td style=\"text-align:right\">37<\/td>\n<td style=\"text-align:right\">6<\/td>\n<td style=\"text-align:right\">46<\/td>\n<td style=\"text-align:right\">42<\/td>\n<td style=\"text-align:right\">40<\/td>\n<td style=\"text-align:right\">29<\/td>\n<td style=\"text-align:right\">37<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>7x<\/strong><\/td>\n<td style=\"text-align:right\">78<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">64<\/td>\n<td style=\"text-align:right\">49<\/td>\n<td style=\"text-align:right\">9<\/td>\n<td style=\"text-align:right\">71<\/td>\n<td style=\"text-align:right\">94<\/td>\n<td style=\"text-align:right\">67<\/td>\n<td style=\"text-align:right\">36<\/td>\n<td style=\"text-align:right\">107<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>8x<\/strong><\/td>\n<td style=\"text-align:right\">34<\/td>\n<td style=\"text-align:right\">13<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">23<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td 
style=\"text-align:right\">50<\/td>\n<td style=\"text-align:right\">38<\/td>\n<td style=\"text-align:right\">28<\/td>\n<td style=\"text-align:right\">37<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\"><strong>9x<\/strong><\/td>\n<td style=\"text-align:right\">56<\/td>\n<td style=\"text-align:right\">35<\/td>\n<td style=\"text-align:right\">55<\/td>\n<td style=\"text-align:right\">48<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">68<\/td>\n<td style=\"text-align:right\">83<\/td>\n<td style=\"text-align:right\">56<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:right\">84<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>There are 78 three-digit sequences for which the dictionary contains no single matching word (although each of these could still be separated into one- and two-digit subsequences): 008, 014, 048, 084, 088, 128, 147, 148, 204, 224, 234, 248, 264, 294, 314, 324, 334, 342, 347, 348, 349, 354, 364, 394, 404, 411, 414, 421, 423, 424, 427, 428, 434, 437, 447, 464, 467, 471, 474, 481, 483, 484, 487, 494, 534, 541, 547, 564, 574, 581, 584, 624, 634, 644, 661, 747, 748, 774, 784, 814, 817, 821, 827, 834, 837, 841, 845, 847, 848, 849, 864, 874, 881, 914, 948, 974, 981, 984.<\/p>\n<p>If the specific words do not matter, then any given plaintext can be encoded into a reasonable selection of words. If specific words do matter (i.e. the ciphertext is being hidden), then this branching factor may not be sufficient to cover all contexts. For instance, it may not be possible to consistently hide messages within recipes since the number of ingredient-related words is restrictive.<\/p>\n<h3 id=\"weakness-benfords-law-versus-letter-frequencies\">Weakness? Benford&rsquo;s Law versus Letter Frequencies<\/h3>\n<p>A potential weakness of this approach as a steganographic technique is that the frequency of digits does not match the frequency of consonants. For instance, the digit 4 maps to &lsquo;f&rsquo; and &lsquo;q&rsquo;. 
In studies of Benford&rsquo;s Law, which measures the frequency of digits in real-world contexts, a 4 is predicted as the leading digit in 9.7% of all numbers (Berger 2015). However, &lsquo;f&rsquo; is seen as the leading letter <a href=\"https:\/\/www3.nd.edu\/~busiforc\/handouts\/cryptography\/letterfrequencies.html\">in 2% of words and &lsquo;q&rsquo; is very rare<\/a>. Thus, a study of letter frequencies in a passage could detect an anomaly based on the imbalance. However, unless the volume of messages is high or the plaintext is very long, we suspect this would fall into the noise.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Abeles 2005) Abeles, Francine F. 2005. <em>Lewis Carroll&rsquo;s ciphers: The literary connections.<\/em> Advances in Applied Mathematics,\nVolume 34, Issue 4. Pages 697-708. ISSN 0196-8858, <a href=\"https:\/\/doi.org\/10.1016\/j.aam.2004.06.006\">https:\/\/doi.org\/10.1016\/j.aam.2004.06.006<\/a>.<\/p>\n<p>(Berger 2015) Berger, Arno, and Theodore Preston Hill. 2015. <em>An Introduction to Benford\u2019s Law.<\/em> Princeton, New Jersey: Princeton University Press. <a href=\"https:\/\/assets.press.princeton.edu\/chapters\/s10526.pdf\">First Chapter<\/a><\/p>\n<p>(Gardner 1996) Gardner, Martin, and Lewis Carroll. 1996. <em>The Universe in a Handkerchief : Lewis Carroll\u2019s Mathematical Recreations,\nGames, Puzzles, and Word Plays.<\/em> New York: Copernicus. Pages 31-36.<\/p>\n<p>(Moby 2002) Ward, Grady. 2002. <em>Moby Word II<\/em>. <a href=\"https:\/\/www.gutenberg.org\/ebooks\/3201\">https:\/\/www.gutenberg.org\/ebooks\/3201<\/a><\/p>\n<p><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/memoria_technica\">Gitlab Repository<\/a>. 
<a href=\"https:\/\/gitlab.com\/jeffrey_starr\/memoria_technica\">https:\/\/gitlab.com\/jeffrey_starr\/memoria_technica<\/a><\/p>"},{"title":"Using LLMs to Explain Historical Code: FLOW-MATIC Investigation","link":"https:\/\/ztoz.blog\/posts\/flow-matic-llm\/","pubDate":"Sat, 14 Dec 2024 13:18:37 -0700","guid":"https:\/\/ztoz.blog\/posts\/flow-matic-llm\/","description":"<p>The new generation of code assistance tools powered by Large Language Models (LLMs) may be useful in efficiently categorizing and translating historical software corpora. In this qualitative study, we examine LLM performance in explaining and translating FLOW-MATIC programs. We also test their reliability by injecting OCR-like errors into the text. Released in 1958, FLOW-MATIC targeted business applications and used a heavily English-inspired syntax that later influenced COBOL. We find that the models perform inconsistently for describing the key aspects of a program but fare better when translating business logic into SQL. LLMs appear robust against OCR-like errors in the source code, but we caution against relying on the current technology.<\/p>\n<figure><img src=\"Univac.Flowmatic.1957.102646140.fc.lg.jpg\"\/><figcaption>\n            <h4>FLOW-MATIC Brochure<\/h4>\n        <\/figcaption>\n<\/figure>\n\n<h2 id=\"motivation\">Motivation<\/h2>\n<p>Historical source code may contain insights into the development of algorithms and programming practices, and may shed light on the computational needs and pursuits of organizations. However, this same source code is often poorly cataloged, unavailable in digital form (e.g. exists in the form of paper tape or punch cards), and will usually be written in archaic programming languages, thus frustrating efforts to understand and interpret it. 
To effectively &ldquo;triage&rdquo; piles of artifacts and obtain a limited understanding of their contents, researchers might use an LLM to help explain and translate the code.<\/p>\n<h3 id=\"why-flow-matic\">Why FLOW-MATIC?<\/h3>\n<p>Developed by Dr. Grace Hopper and her team at Remington Rand from around 1955 to 1958, FLOW-MATIC was an early language targeted at business applications. Internally, the language was called B-0 and experimented with an English-like syntax. Using approximately 33 verbs, coupled with an external data description language, programmers could implement sequential data processing logic (Sammet 1969).<\/p>\n<p>We chose FLOW-MATIC for this study because:<\/p>\n<ol>\n<li>The language is sufficiently obscure that there is unlikely to be any special-case logic for the language, and<\/li>\n<li>The manual (Rand 1958) contains three highly documented programs that perform typical business logic, and<\/li>\n<li>The English syntax may be generalizable to text mining of programming or process specifications.<\/li>\n<\/ol>\n<p>Although FLOW-MATIC is an obscure and dead programming language (even Rosetta Code doesn&rsquo;t include any examples), these programs and associated commentary may have been included in the AI&rsquo;s background corpus. The manual is available online as a PDF and (Sammet 1969) reprints Problem 1. We have also seen snippets of the code incorporated into various online articles on the language and Grace Hopper&rsquo;s history. However, we think it is safe to assume no engineering effort has been made to improve FLOW-MATIC related answers.<\/p>\n<h2 id=\"methodology\">Methodology<\/h2>\n<p>We extracted three FLOW-MATIC programs from a PDF of (Rand 1958) and manually corrected translation errors to the best of our ability. 
Then, using the AI Assistant tool as part of IntelliJ, we sent select prompts to different AI models, attaching the FLOW-MATIC programs as part of the request.<\/p>\n<p>For the corruption studies, we ran the original source code through our <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/corrupt.py?ref_type=heads\">corrupt.py<\/a> script. This tool randomly selects a corruption from a list and applies it to the text. If the corruption does not apply (i.e. produces no edits), then a different corruption is selected, and the process repeats until the number of edits matches the desired number.<\/p>\n<p>Source code for the programs, the corrupted sources, the AI model output, and our corruption script are available within the <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\">Gitlab repository<\/a>.<\/p>\n<p>Testing LLMs is <a href=\"https:\/\/cacm.acm.org\/news\/is-it-possible-to-truly-understand-performance-in-llms\/\">tricky<\/a>. LLMs are stochastic and highly sensitive to variations in prompts. Since this is an apologetically qualitative study, our task specification is vague and we lack clear, rigorous performance metrics.<\/p>\n<h2 id=\"explaining-and-translating-historical-code\">Explaining and Translating Historical Code<\/h2>\n<p>For this task, we asked the LLM to produce an English summary of a program. We are less interested in a breakdown of individual lines than in succinct statements of the key aspects of the business logic, the data model, and the pre- and post-conditions of the input and output.<\/p>\n<h3 id=\"problem-1---join-inventory-and-prices\">Problem 1 - Join Inventory and Prices<\/h3>\n<p>The first program described in (Rand 1958) joins an ordered sequence of inventory records with an ordered sequence of price records, yielding a file containing priced inventory and a file containing unpriced inventory. 
The program requires 18 steps and would have been accompanied by a series of packets describing the input and output data formats.<\/p>\n<pre tabindex=\"0\"><code class=\"language-flowmatic\" data-lang=\"flowmatic\">(0) INPUT INVENTORY FILE-A PRICE FILE-B ; OUTPUT PRICED-INV FILE-C UNPRICED-INV FILE-D ; HSP D .\n\n(1) COMPARE PRODUCT-NO (A) WITH PRODUCT-NO (B) ; IF GREATER GO TO OPERATION 10 ; IF EQUAL GO TO OPERATION 5 ; OTHERWISE GO TO OPERATION 2 .\n\n(2) TRANSFER A TO D .\n\n(3) WRITE-ITEM D .\n\n(4) JUMP TO OPERATION 8 .\n\n(5) TRANSFER A TO C .\n\n(6) MOVE UNIT-PRICE (B) TO UNIT-PRICE (C) .\n\n(7) WRITE-ITEM C .\n\n(8) READ-ITEM A ; IF END OF DATA GO TO OPERATION 14 .\n\n(9) JUMP TO OPERATION 1 .\n\n(10) READ-ITEM B ; IF END OF DATA GO TO OPERATION 12 .\n\n(11) JUMP TO OPERATION 1 .\n\n(12) SET OPERATION 9 TO GO TO OPERATION 2 .\n\n(13) JUMP TO OPERATION 2 .\n\n(14) TEST PRODUCT-NO (B) AGAINST ZZZZZZZZZZZZ ; IF EQUAL GO TO OPERATION 16 ; OTHERWISE GO TO OPERATION 15 .\n\n(15) REWIND B .\n\n(16) CLOSE-OUT FILES C , D .\n\n(17) STOP . (END)\n<\/code><\/pre><p>Notable aspects of this program:<\/p>\n<ul>\n<li>There are two input files, inventory and prices. The output consists of a file with joined inventory data and pricing data and a secondary output of inventory lacking a price.<\/li>\n<li>Product Numbers (PRODUCT-NO) and Unit Price (UNIT-PRICE) are the only two fields referenced within the code.<\/li>\n<li>Input files are assumed to be sorted. If this assumption is violated, the unpriced output will be full of false positives.<\/li>\n<li>Each product in this version will have exactly one inventory record. This constraint is relaxed in later programs.<\/li>\n<li>The program uses self-modifying code as step 9 is overwritten in step 12.<\/li>\n<li><code>HSP D<\/code> refers to the high-speed printer present on the UNIVAC. 
The instruction &ldquo;redirects&rdquo; output to the printer rather than a tape.<\/li>\n<\/ul>\n<h4 id=\"llm-summaries\">LLM Summaries<\/h4>\n<p>Using the prompt <code>Explain the code at the current caret<\/code>, all three LLM models produced a detailed explanation covering each line of code. All three models referenced &lsquo;ZZZZZZZZZZZZ&rsquo; as a special marker. The final summaries from each model (below) all mention matching entries between inventory and pricing files or data, and all note that prices may be missing.<\/p>\n<dl>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#openai-gpt-4o\">openai-gpt-4o<\/a><\/dt>\n<dd>&ldquo;This sequence executes a comparison loop to match items between an inventory and a pricing file, writing matched items with prices to one output file and unmatched items to another. It handles end-of-file conditions elegantly, ensuring all items are processed properly.&rdquo;<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-pro-15\">google-chat-gemini-pro-1.5<\/a><\/dt>\n<dd>&ldquo;In essence, this code implements a matching algorithm, similar to a merge join in database systems. It efficiently processes inventory and pricing data, handling scenarios where prices might be missing for certain items.&rdquo;<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#openai-chat-gpt\">openai-chat-gpt<\/a><\/dt>\n<dd>&ldquo;Overall, the code snippet describes a simple matching process between inventory items and their prices, handling cases where prices may be missing in the price file.&rdquo;<\/dd>\n<\/dl>\n<h4 id=\"translation-into-sql\">Translation into SQL<\/h4>\n<p>We used the prompt <code>Translate the code at the current caret into SQL<\/code> to translate Problem 1 into SQL. 
A good translation should capture the business intent, that of a join, as well as features of the data model.<\/p>\n<p>We observed:<\/p>\n<dl>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#openai-gpt-4o-3\">openai-gpt-4o<\/a><\/dt>\n<dd>The output features two inserts based on joins (implicit inner join and left join) into priced and unpriced tables. The output includes a &lsquo;quantity&rsquo; field which is not present in the original code, although an inventory data record is highly likely to include such a field.<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-pro-15-2\">google-chat-gemini-pro-1.5<\/a><\/dt>\n<dd>The output uses procedural SQL features and attempts to match the original code feature-by-feature, which requires more lines than the original FLOW-MATIC version. The only fields referenced are the product number and the unit price. The output includes an &lsquo;operation_9_target&rsquo; variable for emulating the self-modifying code, but the variable is only written, never read. The code also includes a useless block for checking for the &lsquo;ZZZZZZZZZZZZ&rsquo; end of data marker. This version also declares product numbers as <code>VARCHAR(255)<\/code> and prices as <code>DECIMAL(10,2)<\/code>, choices for which the source offers no a priori support.<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-flash-15\">google-chat-gemini-flash-1.5<\/a><\/dt>\n<dd>Similar to the openai-gpt-4o output, except the inner join is explicit and the code does not include a &lsquo;quantity&rsquo; field. Of the three outputs, this is the cleanest version.<\/dd>\n<\/dl>\n<p>Translation may require the creation of entities not present in the original, such as table names, and changes to formatting, such as dashes to underscores, to be more idiomatic. 
Ideally, however, any inserted content that relies on background knowledge rather than the source itself should be annotated.<\/p>\n<h3 id=\"problem-2---treat-duplicate-inventory-as-errors\">Problem 2 - Treat Duplicate Inventory as Errors<\/h3>\n<p>Problem 2 extends Problem 1 by allowing duplicate inventory records. If a duplicate record is found, the record is sent to an error file. We performed some <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#problem-2\">limited testing using Problem 2<\/a>, but found it provided little additional insight beyond Problem 1 and Problem 3, so we omit it in the interests of space.<\/p>\n<h3 id=\"problem-3---join-inventory-at-multiple-sites-with-prices\">Problem 3 - Join Inventory at Multiple Sites with Prices<\/h3>\n<p>Problem 3 extends Problem 1 by allowing duplicate inventory records. Unlike in Problem 2, duplicate inventory records are meaningful and quantities are summed across records. (Conceptually, this represents inventory at multiple sites and we are calculating the global quantity.) Problem 3 also adds an &lsquo;extended price&rsquo; to the output. 
Within the manual, the use of <code>X-1<\/code> is to demonstrate how a program may call out to non-FLOW-MATIC code even though the logic (multiplying the quantity by the price) could have been accomplished within the standard code.<\/p>\n<pre tabindex=\"0\"><code class=\"language-flowmatic\" data-lang=\"flowmatic\">(0) INPUT INVENTORY FILE-A PRICE FILE-B ; OUTPUT PRICED-INV FILE-C UNPRICED-INV FILE-D ; HSP D .\n\n(1) COMPARE PRODUCT-NO (A) WITH PRODUCT-NO (B) ; IF GREATER GO TO OPERATION 21 ; IF EQUAL GO TO OPERATION 5 ; OTHERWISE GO TO OPERATION 2 .\n\n(2) TRANSFER A TO D .\n\n(3) SET OPERATION 13 TO GO TO OPERATION 18 .\n\n(4) JUMP TO OPERATION 8 .\n\n(5) TRANSFER A TO C .\n\n(6) MOVE UNIT-PRICE (B) TO UNIT-PRICE (C) .\n\n(7) SET OPERATION 13 TO GO TO OPERATION 14 .\n\n(8) MOVE PRODUCT-NO (A) TO PRODUCT-NO (W) ; QUANTITY (A) TO QUANTITY (W) .\n\n(9) READ-ITEM A ; IF END OF DATA GO TO OPERATION 23 .\n\n(10) COMPARE PRODUCT-NO (A) WITH PRODUCT-NO (W) ; IF EQUAL GO TO OPERATION 11 ; OTHERWISE GO TO OPERATION 13 .\n\n(11) X-1 ADD QUANTITY (A) TO STORED QUANTITY (W) .\n\n(12) JUMP TO OPERATION 9 .\n\n(13) JUMP TO OPERATION 14 .\n\n(14) MOVE QUANTITY (W) TO QUANTITY (C) .\n\n(15) X-1 COMPUTE EXTENDED PRICE AND INSERT IN C ITEM\n\n(16) WRITE-ITEM C ,\n\n(17) JUMP TO OPERATION 1 .\n\n(18) MOVE QUANTITY (W) TO QUANTITY (D)\n\n(19) WRITE-ITEM D .\n\n(20) JUMP TO OPERATION 17 .\n\n(21) READ-ITEM B ; IF END OF DATA GO TO OPERATION 1 .\n\n(22) JUMP TO OPERATION 1 .\n\n(23) EXECUTE OPERATION 13 THROUGH OPERATION 17 .\n\n(24) TEST PRODUCT-NO (8) AGAINST ZZZZZZZZZZZZ ; IF EQUAL GO TO OPERATION 26 ; OTHERWISE GO TO OPERATION 25 .\n\n(25) REWIND B .\n\n(26) CLOSE-OUT FILES C , D .\n\n(27) STOP . 
(END)\n<\/code><\/pre><p>For reference to the reader, the first X-1 code (label 11) is defined as:<\/p>\n<pre tabindex=\"0\"><code class=\"language-univac-x1\" data-lang=\"univac-x1\">B0W001\nA-A001\nC0W001\n<\/code><\/pre><p>These <a href=\"https:\/\/ed-thelen.org\/comp-hist\/Univac1\/Univac1.html#b13\">instructions<\/a> perform, using relative addresses:<\/p>\n<ol>\n<li>Load the temporary value (W, offset 1) into register<\/li>\n<li>Add the value in file A (offset 1) to the register<\/li>\n<li>Copy the accumulated value to the temporary value (W, offset 1)<\/li>\n<\/ol>\n<p>The second X-1 code (label 15) is defined as:<\/p>\n<pre tabindex=\"0\"><code class=\"language-univac-x1\" data-lang=\"univac-x1\">L0C001\nP0C002\nJ0C003\n<\/code><\/pre><ol>\n<li>Load into the L register from file C at offset 1<\/li>\n<li>Multiply (product) the L register and the contents from file C at offset 2, storing the result in A and X registers<\/li>\n<li>Store the result into file C (offset 3)<\/li>\n<\/ol>\n<p>The layout of the data and the definitions of the extended functions are located in ancillary files, rather than the source code itself. Thus, unless the LLM cheated by reading the manual, &ldquo;intuiting&rdquo; the computation of an extended price is based on background knowledge of the likely implementation.<\/p>\n<h4 id=\"llm-summaries-1\">LLM Summaries<\/h4>\n<p>Similar to Problem 1, each model produced a detailed explanation covering each line. However, unlike Problem 1, their summaries of Problem 3 lack detail. Two of the explanations (openai-gpt-4o and google-chat-gemini-pro-1.5) simply describe the program as doing something with inventory and pricing, lacking any mention of aggregating data between inventory records or how joins are implemented. 
The third mentions the aggregation of inventory but drops references to pricing data.<\/p>\n<dl>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#openai-gpt-4o-4\">openai-gpt-4o<\/a><\/dt>\n<dd>&ldquo;The code provided appears to be using a structured way to process inventory and price files.&rdquo;<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-pro-15-3\">google-chat-gemini-pro-1.5<\/a><\/dt>\n<dd>&ldquo;This code snippet represents a pseudocode or a simplified representation of a program logic, likely for processing inventory and pricing data.&rdquo;<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-flash-15-1\">google-chat-gemini-flash-1.5<\/a><\/dt>\n<dd>&ldquo;Operations 8-12 form a loop that reads inventory items, and aggregates quantities for each unique product number by storing the running total quantity in temporary storage (<code>W<\/code>). When a different product number is encountered, the loop breaks, and the aggregated quantity will be processed and written to the appropriate output file. This technique efficiently combines quantities for the same product before writing to the output file, avoiding redundant entries.&rdquo;<\/dd>\n<\/dl>\n<p>None of the models provide the same quality of summary as they did for Problem 1.<\/p>\n<h4 id=\"translation-into-sql-1\">Translation into SQL<\/h4>\n<p>Similar to the task in Problem 1, each model was asked to translate the code into SQL. Our observations:<\/p>\n<dl>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#openai-gpt-4o-4\">openai-gpt-4o<\/a><\/dt>\n<dd>Accumulates inventory across multiple inventory records. Treats extended price as the product of inventory and unit price. 
Uses inner joins and left joins to differentiate the presence of pricing data. Uses common table expressions.<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-pro-15-3\">google-chat-gemini-pro-1.5<\/a><\/dt>\n<dd>Very similar to openai-gpt-4o.<\/dd>\n<dt><a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#google-chat-gemini-flash-15-1\">google-chat-gemini-flash-1.5<\/a><\/dt>\n<dd>Very similar to both except it does not use common table expressions.<\/dd>\n<\/dl>\n<p>While the different models generate different names for tables and vary slightly in their use of SQL, they all use the same joins and model the data in the same way. We consider the output to be high-quality.<\/p>\n<h2 id=\"robustness-to-source-code-corruption\">Robustness to Source Code Corruption<\/h2>\n<p>An LLM is not a compiler. While a compiler is expected to enforce the rules of a language and enforce consistency within the source code, an LLM matches the source code to a probabilistic model. Since historical source code is unlikely to come in a pristine digital form but rather a decayed backup or print-out, resistance to random corruptions is a beneficial quality. Furthermore, if we desire to triage or quickly check the contents of an archive, we do not need the power of a compiler.<\/p>\n<p>To simulate corruption of the source code, we created copies of the originals with a number of edits applied. The nature of the edit was based on our experiences correcting the programs from the PDF. Thus, the errors simulate optical character recognition errors rather than bit flips. 
The <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/corrupt.py?ref_type=heads\">script<\/a> chose edits randomly from:<\/p>\n<ol>\n<li>Change 0 (zero) to O<\/li>\n<li>Change 1 to I<\/li>\n<li>Change 8 to B<\/li>\n<li>Change ; to :<\/li>\n<li>Remove a space (e.g. <code>A B<\/code> to <code>AB<\/code>)<\/li>\n<li>Add a space (e.g. <code>AB<\/code> to <code>A B<\/code>)<\/li>\n<\/ol>\n<p>As the number of edits increased, eventually all the eligible characters had been swapped and the script was simply adding and removing spaces in random locations. The number of edits was 81 (approximately 10% of the total characters), 243, or 486 (approximately 50%).<\/p>\n<p>For this exercise, we focused on a single AI model, openai-gpt-4o, and a single original source program, Problem 1. We chose to focus on openai-gpt-4o for convenience.<\/p>\n<h3 id=\"llm-summaries-2\">LLM Summaries<\/h3>\n<p>When asked to explain three corruptions of Problem 1, each with 81 edits, the model&rsquo;s <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#explain-corrupted-code-robustness-test\">summaries<\/a> stated the input files were inventory and pricing data in each case. In only one case did the summary state that the output has priced and unpriced categories. In the other two cases, the summary merely described that sequential or batch processing is in play.<\/p>\n<p>In contrast, this model provided the priced and unpriced insight for the original source code, as did the other two AI models. 
As with the move from Problem 1 to Problem 3, a small change to the input (here corruption, there a slight increase in functionality) led to a drop in the quality of output.<\/p>\n<h3 id=\"translation-into-sql-2\">Translation into SQL<\/h3>\n<p>Across six versions of Problem 1, three with 243 corruptions and three with 486 corruptions, the model produced the same <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/flow-matic\/-\/blob\/master\/ai-output.md?ref_type=heads#translate-to-sql-robustness-tests\">SQL output consistently<\/a>:<\/p>\n<pre tabindex=\"0\"><code>-- Step 1: Identify and insert unpriced inventory items\nINSERT INTO unpriced_inventory (product_no, other_columns)\nSELECT i.product_no, i.other_columns\nFROM inventory i\nLEFT JOIN price p ON i.product_no = p.product_no\nWHERE p.product_no IS NULL;\n\n-- Step 2: Identify and insert priced inventory items\nINSERT INTO priced_inventory (product_no, unit_price, other_columns)\nSELECT i.product_no, p.unit_price, i.other_columns\nFROM inventory i\nINNER JOIN price p ON i.product_no = p.product_no;\n<\/code><\/pre><p>Contrast this with the model&rsquo;s output for the original file:<\/p>\n<pre tabindex=\"0\"><code>-- SQL for Priced Inventory (FILE C)\nINSERT INTO PRICED_INV_TABLE_C (product_no, quantity, unit_price)\nSELECT a.product_no, a.quantity, b.unit_price\nFROM INVENTORY_TABLE_A a\nJOIN PRICE_TABLE_B b ON a.product_no = b.product_no;\n\n-- SQL for Unpriced Inventory (FILE D)\nINSERT INTO UNPRICED_INV_TABLE_D (product_no, quantity)\nSELECT a.product_no, a.quantity\nFROM INVENTORY_TABLE_A a\nLEFT JOIN PRICE_TABLE_B b ON a.product_no = b.product_no\nWHERE b.product_no IS NULL;\n<\/code><\/pre><p>Apart from the changes in table names and order of outputs, the output from the corrupted sources includes an <code>other_columns<\/code> column, while the original has a <code>quantity<\/code> column. Neither <code>quantity<\/code> nor <code>other_columns<\/code> exists in the original FLOW-MATIC source code. 
As a clue to the developer that the desired record may include additional fields, <code>other_columns<\/code> is more explicit than <code>quantity<\/code>, although <code>quantity<\/code> is a believable column for an inventory table.<\/p>\n<p>Using Problem 3 as the base, in six cases with the same variations of edits the model also produces consistent output. This output includes an <code>other_columns<\/code> field but otherwise does not add additional columns. This version lacks Problem 3&rsquo;s handling of multiple inventory records, thus marking a regression in functionality.<\/p>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>The LLM models appear to be unreliable as a tool to quickly summarize programs or translate them into other forms while retaining key aspects of the original logic. While their output can be highly consistent in the presence of random errors, their ability to capture and communicate key aspects of the programs can quickly decay to uselessness.<\/p>\n<p>Thus, researchers should be wary about using LLMs as a tool for researching historical source code. Technologically, improvements in models and workflows might render LLMs a reliable tool.<\/p>\n<p>We believe future quantitative studies should include:<\/p>\n<ol>\n<li>Measures of quality for the output, including key aspects included, contextual aspects not referenced but included, and aspects added without utility<\/li>\n<li>Measures of the narrative signal to noise ratio<\/li>\n<li>Rendering to a &ldquo;canonical&rdquo; form to simplify bulk analysis<\/li>\n<\/ol>\n<h2 id=\"references\">References<\/h2>\n<p><a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102646140\">FLOW-MATIC Brochure<\/a> Courtesy of Computer History Museum.<\/p>\n<p>(Rand 1958) Remington Rand Univac. 1958. <em>UNIVAC FLOW-MATIC Programming System.<\/em> <a href=\"https:\/\/archive.org\/details\/bitsavers_univacflowProgrammingSystem1958_9367413\">archive.org<\/a><\/p>\n<p>(Sammet 1969) Sammet, Jean E. 
1969. <em>Programming Languages: History and Fundamentals.<\/em> Englewood Cliffs, N.J.: Prentice-Hall. <a href=\"https:\/\/archive.org\/details\/programminglangu00unse\">archive.org<\/a><\/p>"},{"title":"Chatbots Decoded: Exploring AI (Review)","link":"https:\/\/ztoz.blog\/posts\/chatbots-decoded\/","pubDate":"Fri, 15 Nov 2024 12:20:34 -0700","guid":"https:\/\/ztoz.blog\/posts\/chatbots-decoded\/","description":"<p><em>Chatbots Decoded: Exploring AI<\/em> is a <a href=\"https:\/\/computerhistory.org\/exhibits\/chatbots-decoded\/\">new exhibit<\/a> from the Computer History Museum (CHM) opening November 20th, 2024. In development for more than a year, the exhibit covers both the history of chatbots and conversational interfaces and the social ramifications of the technology. While there is not an explanation for why GPUs rather than CPUs are so useful for learning or what &ldquo;attention is all you need&rdquo; means, both are alluded to, and I think many audiences, including children, will enjoy it.<\/p>\n<p>Conversational interfaces, as typified by chatbots, have long developed in parallel with more traditional interfaces in computing. Science fiction has featured robots and automated creatures controlled by voice or written words as far back as the Golem and R.U.R. <em>Star Trek<\/em>&rsquo;s main computer presents a mixed interface, both tactile graphical displays and voice control. The series differentiated between the computer&rsquo;s intelligence and sentience. Although the computer could flexibly carry out tasks, it lacked any introspection, motivations, or free will. This distinction is one of the subtleties explored in the exhibit.<\/p>\n<p>The exhibit begins with fictional robots that, by obeying voice commands, helped shape both our expectations and our fears. 
The first real grounded example is <a href=\"https:\/\/doi.org\/10.1145\/365153.365168\">ELIZA<\/a>, the conversational system which was typically used to imitate a Rogerian therapist. Visitors have access to a terminal where they can interact with an implementation, hopefully experiencing some of the magic feeling of interactivity but also witnessing the limitations of the technology.<\/p>\n<figure><img src=\"eliza-panel.jpg\"\n         alt=\"museum panel describing ELIZA with computer terminal\" width=\"100%\"\/><figcaption>\n            <p>ELIZA, the earliest chatbot system<\/p>\n        <\/figcaption>\n<\/figure>\n\n<figure><img src=\"eliza-interface.jpg\"\n         alt=\"interactions with ELIZA\" width=\"100%\"\/><figcaption>\n            <p>Interact with an ELIZA acting as a Rogerian therapist<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>Developments in symbolic logic and improvements such as A.L.I.C.E. are recounted. However, the next big revolution is deep learning where a timeline describes the exponential growth in the models, the amount of data, and the amount of funding. This leads into probably the central showcase: the Ameca exhibit.<\/p>\n<figure><img src=\"ameca.jpg\"\n         alt=\"ameca&#39;s robotic human face\" width=\"100%\"\/><figcaption>\n            <p>Ameca as a chatbot with a physical face<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p><a href=\"https:\/\/engineeredarts.co.uk\/robot\/ameca\/\">Ameca<\/a> is a high-fidelity robotic face joined to a ChatGPT model. Visitors can step right up and start talking to Ameca, which tries to spark some conversation on its own. According to a staff member, children feel an immediate emotional connection to the robot, laughing and giggling. For my part, I felt self-conscious, wanting to say something profound amongst all the humans in ear-shot. In order to prompt conversation, Ameca will urge visitors on. 
This can end up interrupting the visitor&rsquo;s attempts at interaction, but that ends up increasing the humanity of the experience.<\/p>\n<p>Because its topic is more current event than history, <em>Chatbots Decoded<\/em> concludes with a large section of open questions. For instance, CheatGPT raises the potential of LLMs as education aids, concerns about students using LLMs as labor-saving devices, the battle to distinguish LLM-generated text from student-authored text, and the relevance of the older debate about student use of calculators to the current fight. With no planned closing date for the exhibit, CHM will need to gauge when new developments warrant an update to the display.<\/p>\n<figure><img src=\"cheatgpt.jpg\"\n         alt=\"museum panel for cheatgpt\" width=\"100%\"\/><figcaption>\n            <p>Should students use ChatGPT?<\/p>\n        <\/figcaption>\n<\/figure>\n\n<p>All photos copyright of the author.<\/p>\n<blockquote>\n<p>Note: As a Supporting Member of the CHM, I gained early access to the exhibit. I was not compensated for the review.<\/p>\n<\/blockquote>\n"},{"title":"Mathematical Analysis of the Royal Game of Ur","link":"https:\/\/ztoz.blog\/posts\/math-analysis-royal-game-ur\/","pubDate":"Tue, 29 Oct 2024 14:39:36 -0600","guid":"https:\/\/ztoz.blog\/posts\/math-analysis-royal-game-ur\/","description":"<p>From the <a href=\"https:\/\/sciendo.com\/journal\/BGS\">Board Games Study Journal<\/a>:<\/p>\n<p><strong><a href=\"https:\/\/doi.org\/10.2478\/bgs-2023-0001\">Mathematical analysis of the Royal Game of Ur<\/a><\/strong><\/p>\n<blockquote>\n<p>Despite many discoveries and proposals for rules for the ancient board game known as the Royal Game of Ur (RGU), no mathematical analysis has yet been performed investigating those rules. In an attempt to fill that gap, this paper presents an initial mathematical analysis of the RGU from an introductory point of view. 
The paper deduces the overall complexity of the RGU using a state-space and game-tree complexity analysis, allowing the RGU to be compared to the popular games Checkers, Backgammon, Ludo, Chess, and Go. The paper builds upon the fundamental laws of combinatorics and probability to improve the understanding of the game: what patterns should you expect, what moves increase your chance to win, and what moves should you avoid. The paper also presents theorems to predict the probability of future dice rolls and piece movements within the game, allowing basic inferences to be made about strategy in the RGU. The game is further examined by analysing three different influences when determining the best move: advancement and attack (beneficial to the player), and captures (detrimental to the player). These influences are used to deduce explicit equations for the advantage gained by playing each possible move from a position, which allows the formalization of a strategic algorithm to play the RGU.<\/p>\n<\/blockquote>"},{"title":"A New Look","link":"https:\/\/ztoz.blog\/posts\/a-new-look\/","pubDate":"Mon, 28 Oct 2024 10:17:21 -0600","guid":"https:\/\/ztoz.blog\/posts\/a-new-look\/","description":"<p>I&rsquo;ve given this blog a new look by switching the theme from <a href=\"https:\/\/github.com\/kimcc\/hugo-theme-noteworthy\">Noteworthy<\/a> to <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/longform-hugo\">Long Form<\/a>, the latter a theme of my own design.<\/p>\n<h2 id=\"why-the-change\">Why the change?<\/h2>\n<p>I adopted the Noteworthy theme at the onset of this blog and maintained a branch with some fixes (primarily KaTeX support). Since it is a cliche that engineering blogs will write more about the blog&rsquo;s infrastructure than actual content, I wanted to first establish a track record of content before spending too much effort on design. 
Now, with a few years of posts, there were a few factors that pushed me to create something new:<\/p>\n<ul>\n<li><em>Printing Support<\/em> The old site created near unprintable pages. I want my works to be useful offline.<\/li>\n<li><em>Dense Tables<\/em> Many of my articles feature tables, often with dense content - many columns and rows. Noteworthy&rsquo;s table CSS did not scale to that level of data which hampered my ability to present information. (I expect I&rsquo;ll continue to tinker with Long Form&rsquo;s table formatting.)<\/li>\n<li><em>Unsupported<\/em> Noteworthy was no longer supported or updated by the author. As open-source, I could have continued to maintain my branch and make other changes, but a clean slate both allows a new vision and reduces the maintenance burden.<\/li>\n<\/ul>\n<h2 id=\"on-dropping-tags\">On dropping tags<\/h2>\n<p>The previous site had a section for tags, which never worked because I hadn&rsquo;t tagged any pages. I explored adopting an existing system like ACM&rsquo;s <a href=\"https:\/\/dl.acm.org\/ccs\">CCS<\/a>, but that system is not supportable within Hugo&rsquo;s functionality. Further, my content often does not align with CCS. There is enough content that adopting some tag or taxonomy makes sense, but I&rsquo;ll wait until I have a system before advertising its presence again.<\/p>"},{"title":"About","link":"https:\/\/ztoz.blog\/about\/","pubDate":"Mon, 28 Oct 2024 10:58:00 +0000","guid":"https:\/\/ztoz.blog\/about\/","description":"<p>The mapping of integers to integers is a fair summary of the work of programming. 
A blog about computer science, history of computing, engineering, game theory, and other things that attract my interest.<\/p>\n<h2 id=\"contact--social-links\">Contact \/ Social Links<\/h2>\n<ul>\n<li><a href=\"mailto:jeffrey.w.starr@protonmail.com\">E-Mail<\/a><\/li>\n<li><a href=\"https:\/\/www.linkedin.com\/in\/jeffrey-starr-bb6753b\/\">LinkedIn<\/a><\/li>\n<li><a href=\"https:\/\/gitlab.com\/jeffrey_starr\">GitLab<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/jeffreystarr\">GitHub<\/a><\/li>\n<li><a href=\"https:\/\/scholar.google.com\/citations?user=rEF2wXoAAAAJ\">Google Scholar<\/a><\/li>\n<\/ul>\n"},{"title":"Exposure to Art: a MTA Case Study","link":"https:\/\/ztoz.blog\/posts\/art-proximity\/","pubDate":"Mon, 14 Oct 2024 12:55:44 -0600","guid":"https:\/\/ztoz.blog\/posts\/art-proximity\/","description":"<p>Many corporations and agencies allocate a certain amount of their budget towards the creation of public artwork. For instance, New York&rsquo;s Metropolitan Transportation Authority (MTA) has funded nearly 300 permanent art installations since 1982 under the city&rsquo;s <a href=\"https:\/\/www.nyc.gov\/site\/dclapercentforart\/about\/about.page\">&ldquo;Percent for Art&rdquo; law<\/a>. If we question the effectiveness of the program, we immediately run into problems of measuring the subjective aesthetic value of the creations. However, we may measure their effectiveness by a necessary quality of input: the extent of MTA travellers exposed to art. We propose a &ldquo;weighted proximity&rdquo; metric as a performance indicator and, using NY Open Data, evaluate their scoring as of 1990, 2000, 2010, and 2020. We find the MTA has been increasing the extent of the traveling population exposed to public art each decade, although the rate of increase has dropped over time and many installations &ldquo;reinforce&rdquo; already existing, nearby art installations. 
We provide a ranked list of 20 sites which, through new art installations, would yield the greatest increase in our weighted proximity metric and find that three of the sites have received art installations since 2021, signaling continued improvement.<\/p>\n<h2 id=\"background\">Background<\/h2>\n<p>Public art is &ldquo;work created by artists for places accessible to and used by the public&rdquo; (Backer 2004). The &ldquo;for places&rdquo; in the definition is key, as public art is contrasted to &ldquo;art in public,&rdquo; where the art was not created specifically for the given site. Further, although an art gallery may be accessible at no cost, the &ldquo;place&rdquo; is normally a broader public space such as a park, thoroughfare, train station, library, or bridge.<\/p>\n<p>As of 2004, the United States had over 350 public art programs supporting thousands of artists&rsquo; projects. Eighty-one percent of these programs are controlled within a public agency (Backer 2004). Funding for these projects typically comes from a &ldquo;percent for art&rdquo; law which allocates a certain percentage of the construction cost to artwork. The funding agency often influences the art&rsquo;s subject matter and the designated site will impose certain restrictions or inspirations. For example, a recent prospectus for public art at the <a href=\"https:\/\/artist.callforentry.org\/festivals_unique_info.php?ID=14233\">Roosevelt Science Center in New Mexico<\/a> specifies the art will be two-dimensional, fit within certain specified areas of the building, and feature a theme that reflects the &ldquo;nature of the building&rsquo;s science-focused focus.&rdquo; Artwork for the public in motion may carry safety requirements, such as not distracting drivers (Wilkerson 2009).<\/p>\n<p>Since public art is usually publicly funded, represents an increase to the cost of construction or rehabilitation, and is sometimes controversial, defenders have argued for the benefits of public art. 
(Backer 2004) summarizes the benefits using four categories: 1) civic dialogue and community engagement 2) increased attention and economic activity 3) connection of artists with communities and 4) enhancement of public appreciation of art. A more recent survey by (Cheung 2022) categorized benefits along eight themes: Placemaking, Society, Culture, Economy, Sustainability, Wellbeing, Wisdom, and Innovation.<\/p>\n<p>Although many of these stated benefits are qualitative, we believe it is uncontroversial that the public is intended to be exposed to public art, so the choice of locations for public art should be influenced by where the public happens to be. The pairing of the public&rsquo;s location and the art location is an input to the &ldquo;attract attention and economic benefit&rdquo; from the first survey and the &ldquo;placemaking&rdquo; and &ldquo;wellbeing&rdquo; benefits of the second survey above.<\/p>\n<h2 id=\"weighted-proximity-score\">Weighted Proximity Score<\/h2>\n<p>We model exposure to art by the proximity, <em>p<\/em>, of travellers to an art installation. <em>p<\/em> is defined as:<\/p>\n<p>$$ p=e^{-d\/100} $$<\/p>\n<p>where <em>d<\/em> is the distance between an anchor point (e.g. a stop or station) and the art installation, in meters. Proximity is modeled as an exponential decay where approximately 2\/3rds of the value is within the first 100 meters or one approximate city block. As travellers move beyond a city block, we model that the probability of their exposure to an art piece rapidly declines. 
Computationally, distances of 500m or more can be treated as zero to avoid numerical issues.<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/art-proximity\/fig-proximity.png\" alt=\"Proximity goes from 1 at 0 distance to 0.37 at 100m and 0.05 at 300m\">\n  \n  <figcaption>Figure: Proximity (p) against distance (m)<\/figcaption>\n<\/figure>\n\n\n<p>For a given anchor point, the aggregate proximity, <em>P<\/em>, is the clamped sum of <em>p<\/em>, or:<\/p>\n<p>$$ P=min(\\sum{p}, 1) $$<\/p>\n<p>By clamping the value of <em>P<\/em>, this model implies that at some point there is &ldquo;enough&rdquo; art and installing more will not increase the value.<\/p>\n<p>Since travellers do not visit anchor points with equal probability, we assign a weight, <em>w<\/em>, to each anchor based on the proportion of visits. For this study, we computed the weight as the sum of the average ridership estimate in 2023 based on the &ldquo;MTA Subway Origin-Destination Ridership Estimate&rdquo; data set divided by the total average ridership estimate. The product of <em>P<\/em> and <em>w<\/em> gives the weighted proximity score:<\/p>\n<p>$$ P_w=Pw $$<\/p>\n<h2 id=\"mta-performance-1990---2020\">MTA Performance, 1990 - 2020<\/h2>\n<p>The &ldquo;MTA Permanent Art Catalog&rdquo; lists art installations within their network. The first artwork is <a href=\"https:\/\/new.mta.info\/agency\/arts-design\/collection\/masstransiscope\">Masstransiscope<\/a>, installed in 1980. Installations were slow to start but by 1990 we see a steady flow. (Note that we are treating each row in the catalog as an installation. 
However, the catalog has repeated instances when installations span multiple stops.)<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/art-proximity\/fig-art-installations-by-year.png\" alt=\"Installations from 1980 to 2024, averaging 5 per year, peak of 38 in 2018\">\n  \n  <figcaption>Figure: MTA Art Installations by Year<\/figcaption>\n<\/figure>\n\n\n<p>When we plot art installations by location, grouped by decade, we see the definite shape of the transit network becoming visible as art is installed.<\/p>\n<figure>\n\t<table>\n\t\t<tbody>\n\t\t\t<tr>\n\t\t\t\t<td style=\"padding: 0; border: 0\"><img src=\"fig-art-locations-1990.png\" width=100%><\/td>\n\t\t\t\t<td style=\"padding: 0; border: 0\"><img src=\"fig-art-locations-2000.png\" width=100%><\/td>\n\t\t\t<\/tr>\n\t\t\t<tr>\n\t\t\t\t<td style=\"padding: 0; border: 0\"><img src=\"fig-art-locations-2010.png\" width=100%><\/td>\n\t\t\t\t<td style=\"padding: 0; border: 0\"><img src=\"fig-art-locations-2020.png\" width=100%><\/td>\n\t\t\t<\/tr>\n\t\t<\/tbody>\n\t<\/table>\n\t<caption>Figure: Art Installation Locations as of Year<\/caption>\n<\/figure> \t\n<p>Although some art is installed along a transit line, the majority of pieces are located within a subway station, near a stop, or near an MTA complex. Calculating the proximity score for each complex (using the list of complexes from the 2023 Origin-Destination dataset), we see that each decade has seen a strong decline in the number of sites with effectively zero proximity to art, from 94% in 1990 to 38% in 2020. 
The mean <em>p<\/em> score has increased from 0.05 in 1990 to 0.64 in 2020.<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/art-proximity\/fig-proximity-dist-by-decade.png\" alt=\"Proximity scores of 0 decline rapidly, by 2020 \">\n  \n  <figcaption>Figure: Distribution of Proximity (p) to MTA Complexes by Decade<\/figcaption>\n<\/figure>\n\n\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">As of Year<\/th>\n<th style=\"text-align:right\">Mean Proximity (p) Score<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">1990<\/td>\n<td style=\"text-align:right\">0.05<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2000<\/td>\n<td style=\"text-align:right\">0.20<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2010<\/td>\n<td style=\"text-align:right\">0.41<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2020<\/td>\n<td style=\"text-align:right\">0.64<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>To weight the locations by population, we use the MTA&rsquo;s Origin Destination dataset and sum each location&rsquo;s weight by each origin destination pair&rsquo;s average ridership data. To validate this approach, we compare <a href=\"https:\/\/new.mta.info\/agency\/new-york-city-transit\/subway-bus-ridership-2023\">MTA&rsquo;s summary of 2023 ridership<\/a> by our weights. The proportion of ridership of each location versus the 10th ranked position has good alignment.<\/p>\n<table>\n<thead>\n<tr>\n<th>Rank<\/th>\n<th>Station\/complex<\/th>\n<th>Ridership<\/th>\n<th>Ridership vs. 10th<\/th>\n<th>Raw Weight<\/th>\n<th>Raw Weight vs. 
10th<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>Times Sq-42 St\/Port Authority Bus Terminal<\/td>\n<td>54,266,441<\/td>\n<td>3.91<\/td>\n<td>19,703,290<\/td>\n<td>3.12<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>Grand Central-42 St<\/td>\n<td>30,517,475<\/td>\n<td>2.20<\/td>\n<td>13,926,937<\/td>\n<td>2.21<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>34 St-Herald Sq<\/td>\n<td>23,680,977<\/td>\n<td>1.71<\/td>\n<td>11,000,955<\/td>\n<td>1.74<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>14 St-Union Sq<\/td>\n<td>21,527,757<\/td>\n<td>1.55<\/td>\n<td>10,301,633<\/td>\n<td>1.63<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>Fulton St<\/td>\n<td>17,887,203<\/td>\n<td>1.29<\/td>\n<td>8,325,765<\/td>\n<td>1.32<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>34 St-Penn Station<\/td>\n<td>16,974,543<\/td>\n<td>1.22<\/td>\n<td>7,652,911<\/td>\n<td>1.21<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>59 St-Columbus Circle<\/td>\n<td>15,842,348<\/td>\n<td>1.14<\/td>\n<td>7,506,844<\/td>\n<td>1.19<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>34 St-Penn Station<\/td>\n<td>15,224,047<\/td>\n<td>1.10<\/td>\n<td>6,914,365<\/td>\n<td>1.10<\/td>\n<\/tr>\n<tr>\n<td>9<\/td>\n<td>Jackson Hts-Roosevelt Av\/74 St-Broadway<\/td>\n<td>14,348,691<\/td>\n<td>1.03<\/td>\n<td>6,555,554<\/td>\n<td>1.04<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>Flushing-Main St<\/td>\n<td>13,876,213<\/td>\n<td>1.00<\/td>\n<td>6,310,435<\/td>\n<td>1.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Weights decline rapidly and the median weight is 732,000. Half of the ridership is accounted for in the top 63 (of 425) sites. When we plot the locations and color them based on their weight, we can see most locations have a dark blue\/purple value. 
Thus, per normal power law relations, a significant percentage of travellers can be exposed to art with art installations at a fraction of stations.<\/p>\n<div>                        <script type=\"text\/javascript\">window.PlotlyConfig = {MathJaxConfig: 'local'};<\/script>\n        <script charset=\"utf-8\" src=\"https:\/\/cdn.plot.ly\/plotly-2.35.2.min.js\"><\/script>                <div id=\"fee6078b-1220-49b9-a22f-4b48a6f3bc46\" class=\"plotly-graph-div\" style=\"height:100%; width:100%;\"><\/div>            <script type=\"text\/javascript\">                                    window.PLOTLYENV=window.PLOTLYENV || {};                                    if (document.getElementById(\"fee6078b-1220-49b9-a22f-4b48a6f3bc46\")) {                    Plotly.newPlot(                        \"fee6078b-1220-49b9-a22f-4b48a6f3bc46\",                        [{\"hovertemplate\":\"\\u003cb\\u003e%{hovertext}\\u003c\\u002fb\\u003e\\u003cbr\\u003e\\u003cbr\\u003elat=%{lat}\\u003cbr\\u003elon=%{lon}\\u003cbr\\u003ewt=%{marker.color}\\u003cextra\\u003e\\u003c\\u002fextra\\u003e\",\"hovertext\":[\"Myrtle-Willoughby Avs (G)\",\"Chauncey St (J,Z)\",\"33 St-Rawson St (7)\",\"Wakefield-241 St (2)\",\"23 St (C,E)\",\"Kingsbridge Rd (4)\",\"Montrose Av (L)\",\"Elder Av (6)\",\"Hunters Point Av (7)\",\"Franklin St (1)\",\"Myrtle-Wyckoff Avs (L,M)\",\"Forest Hills-71 Av (E,F,M,R)\",\"23 St (R,W)\",\"Franklin Av (2,3,4,5)\\u002fBotanic Garden (S)\",\"Grant Av (A)\",\"WTC Cortlandt (1)\",\"88 St (A)\",\"57 St-7 Av (N,Q,R,W)\",\"Lexington Av-53 St (E,M)\\u002f51 St (6)\",\"Nereid Av (2,5)\",\"33 St (6)\",\"18 Av (F)\",\"Bowling Green (4,5)\",\"69 St (7)\",\"79 St (1)\",\"135 St (2,3)\",\"3 Av-138 St (6)\",\"Cypress Hills (J)\",\"Utica Av (A,C)\",\"Broadway (G)\",\"67 Av (M,R)\",\"Classon Av (G)\",\"125 St (1)\",\"Far Rockaway-Mott Av (A)\",\"Avenue X (F)\",\"Grand St (L)\",\"W 8 St-NY Aquarium (F,Q)\",\"191 St (1)\",\"74-Broadway (7)\\u002fJackson Hts-Roosevelt Av (E,F,M,R)\",\"Clark St 
(2,3)\",\"121 St (J,Z)\",\"Intervale Av (2,5)\",\"Morgan Av (L)\",\"75 Av (E,F)\",\"E 143 St-St Mary's St (6)\",\"86 St (N)\",\"E 149 St (6)\",\"Roosevelt Island (F)\",\"79 St (D)\",\"135 St (C,B)\",\"190 St (A)\",\"104 St (J,Z)\",\"Pelham Pkwy (5)\",\"Hoyt-Schermerhorn Sts (A,C,G)\",\"169 St (F)\",\"21 St-Queensbridge (F)\",\"Longwood Av (6)\",\"E 180 St (2,5)\",\"Flatbush Av-Brooklyn College (2,5)\",\"5 Av\\u002f59 St (N,R,W)\",\"Castle Hill Av (6)\",\"Court St (R)\\u002fBorough Hall (2,3,4,5)\",\"High St (A,C)\",\"Jefferson St (L)\",\"55 St (D)\",\"5 Av\\u002f53 St (E,M)\",\"Marcy Av (M,J,Z)\",\"161 St-Yankee Stadium (B,D,4)\",\"61 St-Woodside (7)\",\"Howard Beach-JFK Airport (A)\",\"Nostrand Av (3)\",\"Grand Army Plaza (2,3)\",\"174 St (2,5)\",\"85 St-Forest Pkwy (J)\",\"Kew Gardens-Union Tpke (E,F)\",\"Mosholu Pkwy (4)\",\"Fort Hamilton Pkwy (N)\",\"103 St (1)\",\"Graham Av (L)\",\"Seneca Av (M)\",\"14 St (A,C,E)\\u002f8 Av (L)\",\"39 Av-Dutch Kills (N,W)\",\"Central Park North (110 St) (2,3)\",\"Fresh Pond Rd (M)\",\"Astor Pl (6)\",\"Bergen St (2,3)\",\"South Ferry (1)\\u002fWhitehall St (R,W)\",\"Parkchester (6)\",\"52 St (7)\",\"96 St (Q)\",\"116 St (C,B)\",\"Freeman St (2,5)\",\"18 St (1)\",\"175 St (A)\",\"157 St (1)\",\"Morris Park (5)\",\"50 St (D)\",\"96 St (6)\",\"125 St (4,5,6)\",\"Broadway Junction (A,C,J,L,Z)\",\"Crescent St (J,Z)\",\"Fort Hamilton Pkwy (D)\",\"Sutter Av (L)\",\"Liberty Av (C)\",\"Inwood-207 St (A)\",\"225 St (2,5)\",\"Tompkinsville (SIR)\",\"Canarsie-Rockaway Pkwy (L)\",\"Avenue J (Q)\",\"Smith-9 Sts (F,G)\",\"Winthrop St (2,5)\",\"Greenpoint Av (G)\",\"Gun Hill Rd (2,5)\",\"34 St-Hudson Yards (7)\",\"72 St (Q)\",\"Lexington Av (N,R,W)\\u002f59 St (4,5,6)\",\"Avenue N (F)\",\"81 St-Museum of Natural History (C,B)\",\"28 St (1)\",\"Broadway-Lafayette St (B,D,F,M)\\u002fBleecker St (6)\",\"Sutphin Blvd-Archer Av-JFK Airport (E,J,Z)\",\"West Farms Sq-E Tremont Av (2,5)\",\"Halsey St (L)\",\"DeKalb Av (L)\",\"181 St (1)\",\"Hewes St 
(M,J)\",\"City Hall (R,W)\",\"Hoyt St (2,3)\",\"Mets-Willets Point (7)\",\"181 St (A)\",\"Elmhurst Av (M,R)\",\"New Lots Av (3)\",\"Spring St (C,E)\",\"50 St (1)\",\"20 Av (N)\",\"Wall St (2,3)\",\"25 St (R)\",\"Neck Rd (Q)\",\"155 St (C)\",\"30 Av (N,W)\",\"145 St (A,C,B,D)\",\"Cortelyou Rd (Q)\",\"Bergen St (F,G)\",\"170 St (4)\",\"Myrtle Av (M,J,Z)\",\"Beach 60 St (A)\",\"231 St (1)\",\"Kingston Av (3)\",\"Avenue U (N)\",\"116 St (2,3)\",\"Broad St (J,Z)\",\"Fulton St (G)\",\"Eastchester-Dyre Av (5)\",\"Dyckman St (A)\",\"Rockaway Av (3)\",\"Zerega Av (6)\",\"18 Av (D)\",\"71 St (D)\",\"Middletown Rd (6)\",\"East 105 St (L)\",\"Sutphin Blvd (F)\",\"Beach 98 St (A,S)\",\"20 Av (D)\",\"Aqueduct-N Conduit Av (A)\",\"Kosciuszko St (J)\",\"207 St (1)\",\"Canal St (A,C,E)\",\"Euclid Av (A,C)\",\"Gun Hill Rd (5)\",\"Woodhaven Blvd (J,Z)\",\"Newkirk Plaza (B,Q)\",\"Bedford-Nostrand Avs (G)\",\"7 Av (B,Q)\",\"Jay St-MetroTech (A,C,F,R)\",\"Lorimer St (L)\\u002fMetropolitan Av (G)\",\"86 St (Q)\",\"59 St (N,R)\",\"103 St (6)\",\"28 St (R,W)\",\"Union St (R)\",\"Baychester Av (5)\",\"Central Av (M)\",\"167 St (B,D)\",\"Morrison Av-Soundview (6)\",\"86 St (R)\",\"34 St-Penn Station (A,C,E)\",\"Prince St (R,W)\",\"Avenue H (Q)\",\"116 St (6)\",\"Pelham Pkwy (2,5)\",\"Simpson St (2,5)\",\"168 St (A,C,1)\",\"Wall St (4,5)\",\"Sterling St (2,5)\",\"Bay Ridge-95 St (R)\",\"46 St-Bliss St (7)\",\"18 Av (N)\",\"Avenue I (F)\",\"Nassau Av (G)\",\"Cleveland St (J)\",\"Broadway (N,W)\",\"Christopher St-Sheridan Sq (1)\",\"Grand Av-Newtown (M,R)\",\"Fordham Rd (B,D)\",\"23 St (1)\",\"Livonia Av (L)\",\"Woodhaven Blvd (M,R)\",\"8 Av (N)\",\"182-183 Sts (B,D)\",\"Knickerbocker Av (M)\",\"Park Pl (S)\",\"80 St (A)\",\"36 St (M,R)\",\"Junction Blvd (7)\",\"170 St (B,D)\",\"Bay Pkwy (D)\",\"St Lawrence Av (6)\",\"Prospect Park (B,Q,S)\",\"Kings Hwy (F)\",\"21 St (G)\",\"Bay Pkwy (N)\",\"Hunts Point Av (6)\",\"Van Siclen Av (J,Z)\",\"Lorimer St (M,J)\",\"Beach 44 St (A)\",\"Dyckman St 
(1)\",\"Briarwood (E,F)\",\"Whitlock Av (6)\",\"Clinton-Washington Avs (C)\",\"Kings Hwy (B,Q)\",\"137 St-City College (1)\",\"149 St-Grand Concourse (2,4,5)\",\"86 St (4,5,6)\",\"50 St (C,E)\",\"23 St (6)\",\"Gates Av (J,Z)\",\"Vernon Blvd-Jackson Av (7)\",\"15 St-Prospect Park (F,G)\",\"Rockaway Blvd (A)\",\"Tremont Av (B,D)\",\"49 St (N,R,W)\",\"77 St (6)\",\"14 St (F,M,1,2,3)\\u002f6 Av (L)\",\"Mt Eden Av (4)\",\"Canal St (J,N,Q,R,W,Z,6)\",\"96 St (1,2,3)\",\"Brooklyn Bridge-City Hall (4,5,6)\\u002fChambers St (J,Z)\",\"Nevins St (2,3,4,5)\",\"Bay Ridge Av (R)\",\"Carroll St (F,G)\",\"125 St (2,3)\",\"Queensboro Plaza (7,N,W)\",\"53 St (R)\",\"4 Av (F,G)\\u002f9 St (R)\",\"34 St-Herald Sq (B,D,F,M,N,Q,R,W)\",\"Westchester Sq-E Tremont Av (6)\",\"Allerton Av (2,5)\",\"Houston St (1)\",\"Nostrand Av (A,C)\",\"Saratoga Av (3)\",\"103 St (C,B)\",\"Woodlawn (4)\",\"Shepherd Av (C)\",\"Jamaica-179 St (F)\",\"Times Sq-42 St (N,Q,R,W,S,1,2,3,7)\\u002f42 St (A,C,E)\",\"233 St (2,5)\",\"Van Cortlandt Park-242 St (1)\",\"7 Av (F,G)\",\"183 St (4)\",\"77 St (R)\",\"Jamaica-Van Wyck (E)\",\"3 Av (L)\",\"Aqueduct Racetrack (A)\",\"Junius St (3)\",\"215 St (1)\",\"Bay 50 St (D)\",\"Beach 90 St (A,S)\",\"Canal St (1)\",\"Ralph Av (C)\",\"72 St (1,2,3)\",\"Kingston-Throop Avs (C)\",\"Jamaica Center-Parsons\\u002fArcher (E,J,Z)\",\"Chambers St (A,C)\\u002fWTC (E)\\u002fPark Pl (2,3)\\u002fCortlandt (R,W)\",\"7 Av (E,B,D)\",\"Avenue U (Q)\",\"Neptune Av (F)\",\"Kings Hwy (N)\",\"Avenue M (Q)\",\"68 St-Hunter College (6)\",\"Steinway St (M,R)\",\"New Utrecht Av (N)\\u002f62 St (D)\",\"Beverley Rd (Q)\",\"Lafayette Av (C)\",\"York St (F)\",\"66 St-Lincoln Center (1)\",\"111 St (7)\",\"Court Sq (E,G,M,7)\",\"36 Av (N,W)\",\"Bronx Park East (2,5)\",\"Jackson Av (2,5)\",\"Newkirk Av-Little Haiti (2,5)\",\"145 St (1)\",\"Flushing Av (M,J)\",\"Bowery (J,Z)\",\"Atlantic Av (L)\",\"Rector St (1)\",\"116 St-Columbia University (1)\",\"Beach 36 St (A)\",\"Bedford Park Blvd (B,D)\",\"Beach 
25 St (A)\",\"Prospect Av (2,5)\",\"23 St (F,M)\",\"Fordham Rd (4)\",\"155 St (B,D)\",\"46 St (M,R)\",\"Ozone Park-Lefferts Blvd (A)\",\"Kingsbridge Rd (B,D)\",\"Beverly Rd (2,5)\",\"Astoria-Ditmars Blvd (N,W)\",\"Cathedral Pkwy (110 St) (1)\",\"Rockaway Park-Beach 116 St (A,S)\",\"174-175 Sts (B,D)\",\"Spring St (6)\",\"1 Av (L)\",\"86 St (1)\",\"Delancey St (F)\\u002fEssex St (J,M,Z)\",\"Atlantic Av-Barclays Ctr (B,D,N,Q,R,2,3,4,5)\",\"Coney Island-Stillwell Av (D,F,N,Q)\",\"Grand Central-42 St (S,4,5,6,7)\",\"8 St-NYU (R,W)\",\"138 St-Grand Concourse (4,5)\",\"Bushwick Av-Aberdeen St (L)\",\"Pennsylvania Av (3)\",\"East Broadway (F)\",\"Flushing Av (G)\",\"Grand St (B,D)\",\"Van Siclen Av (3)\",\"Ocean Pkwy (Q)\",\"Van Siclen Av (C)\",\"Franklin Av (C,S)\",\"Flushing-Main St (7)\",\"28 St (6)\",\"110 St (6)\",\"82 St-Jackson Hts (7)\",\"Clinton-Washington Avs (G)\",\"Middle Village-Metropolitan Av (M)\",\"Sheepshead Bay (B,Q)\",\"145 St (3)\",\"Cathedral Pkwy (110 St) (C,B)\",\"167 St (4)\",\"Burke Av (2,5)\",\"72 St (C,B)\",\"Buhre Av (6)\",\"111 St (A)\",\"Chambers St (1,2,3)\",\"57 St (F)\",\"Pelham Bay Park (6)\",\"40 St-Lowery St (7)\",\"President St-Medgar Evers College (2,5)\",\"Sutter Av-Rutland Rd (3)\",\"Marble Hill-225 St (1)\",\"Eastern Pkwy-Brooklyn Museum (2,3)\",\"Burnside Av (4)\",\"Avenue P (F)\",\"Parsons Blvd (F)\",\"176 St (4)\",\"Halsey St (J)\",\"Prospect Av (R)\",\"25 Av (D)\",\"Cypress Av (6)\",\"111 St (J)\",\"Forest Av (M)\",\"104 St (A)\",\"Wilson Av (L)\",\"65 St (M,R)\",\"Beach 105 St (A,S)\",\"45 St (R)\",\"Ditmas Av (F)\",\"103 St-Corona Plaza (7)\",\"Rockaway Av (C)\",\"3 Av-149 St (2,5)\",\"96 St (C,B)\",\"47-50 Sts-Rockefeller Ctr (B,D,F,M)\",\"Fulton St (A,C,J,Z,2,3,4,5)\",\"Church Av (F,G)\",\"Bryant Pk (B,D,F,M)\\u002f5 Av (7)\",\"Rector St (R,W)\",\"34 St-Penn Station (1,2,3)\",\"59 St-Columbus Circle (A,B,C,D,1)\",\"14 St-Union Sq (L,N,Q,R,W,4,5,6)\",\"Crown Hts-Utica Av (3,4)\",\"Avenue U (F)\",\"Church Av 
(2,5)\",\"Astoria Blvd (N,W)\",\"Fort Hamilton Pkwy (F,G)\",\"DeKalb Av (B,Q,R)\",\"Bedford Park Blvd-Lehman College (4)\",\"W 4 St-Wash Sq (A,C,E,B,D,F,M)\",\"New Lots Av (L)\",\"36 St (D,N,R)\",\"2 Av (F)\",\"90 St-Elmhurst Av (7)\",\"Alabama Av (J)\",\"Bay Pkwy (F)\",\"Parkside Av (Q)\",\"86 St (C,B)\",\"163 St-Amsterdam Av (C)\",\"Lexington Av\\u002f63 St (F,Q)\",\"Bedford Av (L)\",\"Beach 67 St (A)\",\"9 Av (D)\",\"Broad Channel (A,S)\",\"Church Av (B,Q)\",\"125 St (A,C,B,D)\",\"Norwood-205 St (D)\",\"Brighton Beach (B,Q)\",\"Northern Blvd (M,R)\",\"75 St-Elderts Ln (J,Z)\",\"Queens Plaza (E,M,R)\",\"Brook Av (6)\",\"63 Dr-Rego Park (M,R)\",\"Harlem-148 St (3)\",\"238 St (1)\",\"Norwood Av (J,Z)\",\"219 St (2,5)\"],\"lat\":[40.694568,40.682893,40.744587,40.903125,40.745906,40.86776,40.707739,40.828584,40.742216,40.719318,40.699622,40.721691,40.741303,40.670513,40.677044,40.711835,40.679843,40.764664,40.75733,40.898379,40.746081,40.629755,40.704817,40.746325,40.783934,40.814229,40.810476,40.689941,40.679364,40.706092,40.726523,40.688873,40.815581,40.603995,40.58962,40.711926,40.576127,40.855225,40.746746,40.697466,40.700492,40.822181,40.706152,40.718331,40.808719,40.592721,40.812118,40.759145,40.613501,40.817894,40.859022,40.695178,40.858985,40.688484,40.71047,40.754203,40.816104,40.841894,40.632836,40.764811,40.834255,40.693241,40.699337,40.706607,40.631435,40.760167,40.708359,40.82795,40.74563,40.660476,40.669847,40.675235,40.837288,40.692435,40.714441,40.87975,40.631386,40.799446,40.714565,40.702762,40.740335,40.752882,40.799075,40.706186,40.730054,40.680829,40.702578,40.833226,40.744149,40.784318,40.805085,40.829993,40.74104,40.847391,40.834041,40.854364,40.63626,40.785672,40.804138,40.678896,40.683194,40.640914,40.669367,40.674542,40.868072,40.888022,40.636949,40.646654,40.625039,40.67358,40.656652,40.731352,40.87785,40.755882,40.768799,40.762593,40.61514,40.781433,40.747215,40.725606,40.700486,40.840295,40.695602,40.703811,40.849505,40.70687,40.713282,40.6
90545,40.754622,40.851695,40.742454,40.666235,40.726227,40.761728,40.61741,40.706821,40.660397,40.595246,40.830518,40.766779,40.824783,40.640927,40.686145,40.840075,40.697207,40.592374,40.878856,40.669399,40.597473,40.802098,40.706476,40.687119,40.8883,40.865491,40.662549,40.836488,40.607954,40.619589,40.843863,40.650573,40.70546,40.585307,40.604556,40.668234,40.693342,40.864621,40.720824,40.675377,40.869526,40.693879,40.635082,40.689627,40.67705,40.692259,40.713428,40.777891,40.641362,40.7906,40.745494,40.677316,40.878663,40.697857,40.833771,40.829521,40.622687,40.752287,40.724329,40.62927,40.798629,40.857192,40.824073,40.840638,40.707557,40.662742,40.616622,40.743132,40.620671,40.625322,40.724635,40.679947,40.76182,40.733422,40.737015,40.861296,40.744081,40.664038,40.733106,40.635064,40.856093,40.698664,40.674772,40.679371,40.752039,40.749145,40.839306,40.601875,40.831509,40.661614,40.603217,40.744065,40.611815,40.820948,40.678024,40.703869,40.592943,40.860531,40.709179,40.826525,40.683263,40.60867,40.822008,40.818393,40.779492,40.762456,40.739864,40.68963,40.742626,40.660365,40.680429,40.85041,40.759901,40.77362,40.737796,40.844434,40.718701,40.793919,40.713154,40.688246,40.634967,40.680303,40.807754,40.750582,40.645069,40.670559,40.749643,40.839892,40.865462,40.728251,40.680438,40.661453,40.796092,40.886037,40.67413,40.712646,40.755746,40.893193,40.889248,40.666271,40.858407,40.629742,40.702566,40.732849,40.672097,40.663515,40.869444,40.588841,40.588034,40.722854,40.678822,40.778453,40.679921,40.702147,40.712603,40.762862,40.5993,40.581011,40.603923,40.617618,40.768141,40.756879,40.625657,40.644031,40.686113,40.701397,40.77344,40.75173,40.747141,40.756804,40.848828,40.81649,40.639967,40.826551,40.70026,40.72028,40.675345,40.707513,40.807722,40.595398,40.873244,40.600066,40.819585,40.742878,40.862803,40.830135,40.756312,40.685951,40.866978,40.645098,40.775036,40.803967,40.580903,40.8459,40.722301,40.730953,40.788644,40.718463,40.684162,40.577422,40.751992,40.7303
28,40.813224,40.682829,40.664635,40.713715,40.700377,40.718267,40.665449,40.576312,40.67271,40.680988,40.7596,40.74307,40.79502,40.747659,40.688089,40.711396,40.586896,40.820421,40.800603,40.835537,40.871356,40.775594,40.84681,40.684331,40.715478,40.763972,40.852462,40.743781,40.667883,40.664717,40.874561,40.671987,40.853453,40.608944,40.707564,40.84848,40.68637,40.665414,40.597704,40.805368,40.697418,40.704423,40.681711,40.688764,40.749669,40.583209,40.648939,40.636119,40.749865,40.67834,40.816109,40.791642,40.758663,40.710089,40.644041,40.754022,40.70722,40.750373,40.768272,40.735066,40.668897,40.596063,40.650843,40.770258,40.650782,40.690635,40.873412,40.732338,40.658733,40.655144,40.723402,40.748408,40.676992,40.620769,40.655292,40.785868,40.836013,40.764629,40.717304,40.590927,40.646292,40.608382,40.650527,40.811109,40.874811,40.577621,40.752885,40.691324,40.748973,40.807566,40.729846,40.82388,40.884667,40.68141,40.883895],\"legendgroup\":\"\",\"lon\":[-73.949046,-73.910456,-73.930997,-73.85062,-73.998041,-73.897174,-73.93985,-73.879159,-73.948916,-74.006886,-73.911986,-73.844521,-73.989344,-73.958688,-73.86505,-74.012188,-73.85147,-73.980658,-73.970487,-73.854376,-73.982076,-73.976971,-74.014065,-73.896403,-73.979917,-73.94077,-73.926138,-73.87255,-73.930729,-73.950308,-73.852719,-73.96007,-73.958372,-73.755405,-73.97425,-73.94067,-73.975939,-73.929412,-73.891366,-73.993086,-73.828294,-73.896736,-73.933147,-73.837324,-73.907657,-73.97823,-73.904098,-73.95326,-74.00061,-73.947649,-73.93418,-73.84433,-73.855359,-73.985001,-73.793604,-73.942836,-73.896435,-73.873488,-73.947642,-73.973347,-73.851222,-73.990642,-73.990531,-73.922913,-73.995476,-73.975224,-73.957757,-73.925741,-73.902984,-73.830301,-73.950466,-73.971046,-73.887734,-73.86001,-73.831008,-73.884655,-74.005351,-73.968379,-73.944053,-73.90774,-74.002134,-73.932755,-73.951822,-73.895877,-73.99107,-73.975098,-74.013329,-73.860816,-73.912549,-73.947152,-73.954882,-73.891865,-73.997871,-73.939704,-73.94489,-
73.860495,-73.994791,-73.95107,-73.937594,-73.904356,-73.873785,-73.994304,-73.901975,-73.896548,-73.919899,-73.860341,-74.074835,-73.90185,-73.960803,-73.995959,-73.9502,-73.954449,-73.866256,-74.00191,-73.958424,-73.967613,-73.974197,-73.972143,-73.993365,-73.995432,-73.807969,-73.880049,-73.904084,-73.918425,-73.933596,-73.953431,-74.006978,-73.985065,-73.845625,-73.937969,-73.882017,-73.884079,-74.003739,-73.983849,-73.985026,-74.0091,-73.998091,-73.955161,-73.941514,-73.921479,-73.944216,-73.963891,-73.990862,-73.917791,-73.935657,-73.788522,-73.904834,-73.942161,-73.979137,-73.949625,-74.011056,-73.975375,-73.830834,-73.927271,-73.908946,-73.847036,-74.001736,-73.998864,-73.836322,-73.899485,-73.810708,-73.820558,-73.998168,-73.834058,-73.928814,-73.918822,-74.005229,-73.872106,-73.846384,-73.851576,-73.962793,-73.953522,-73.972367,-73.986642,-73.950847,-73.951787,-74.017881,-73.947478,-73.988691,-73.98311,-73.838591,-73.927397,-73.91844,-73.874516,-74.028398,-73.993391,-73.997702,-73.961639,-73.941617,-73.867615,-73.893064,-73.939847,-74.011862,-73.95085,-74.030876,-73.918435,-73.990414,-73.976127,-73.951277,-73.884639,-73.925508,-74.002906,-73.877223,-73.897749,-73.995657,-73.900571,-73.869229,-74.011719,-73.900741,-73.919711,-73.957624,-73.858992,-73.928781,-73.869527,-73.9134,-73.993728,-73.867618,-73.962246,-73.972361,-73.949724,-73.981848,-73.890549,-73.891688,-73.947408,-73.776013,-73.925536,-73.820574,-73.886283,-73.965838,-73.957734,-73.953676,-73.927034,-73.955589,-73.985984,-73.986599,-73.92227,-73.953581,-73.979493,-73.843853,-73.905227,-73.984139,-73.959874,-73.997732,-73.914685,-74.00058,-73.972323,-74.003766,-73.980492,-74.023377,-73.995048,-73.945495,-73.940202,-74.014034,-73.989041,-73.987887,-73.842952,-73.867352,-74.005367,-73.950426,-73.916327,-73.961454,-73.878751,-73.88075,-73.783817,-73.987581,-73.857473,-73.898583,-73.980305,-73.903879,-74.02551,-73.816859,-73.986122,-73.835919,-73.902447,-73.915279,-73.983765,-73.813641,-74.006277,-73.
920786,-73.98197,-73.940858,-73.801109,-74.009552,-73.981637,-73.955929,-73.974574,-73.980353,-73.959399,-73.96387,-73.92074,-73.996624,-73.964492,-73.973946,-73.986751,-73.982209,-73.855334,-73.945032,-73.929575,-73.868457,-73.907807,-73.948411,-73.95036,-73.941126,-73.993915,-73.903097,-74.013783,-73.96411,-73.768175,-73.887138,-73.761353,-73.90177,-73.992821,-73.901034,-73.938209,-73.913333,-73.825798,-73.893509,-73.948959,-73.912034,-73.966847,-73.835592,-73.910136,-73.997141,-73.981628,-73.976218,-73.987776,-73.977789,-73.981233,-73.977359,-73.992629,-73.929849,-73.905249,-73.894895,-73.990173,-73.950234,-73.993753,-73.889395,-73.968501,-73.890358,-73.956337,-73.83003,-73.984264,-73.94425,-73.883697,-73.966839,-73.889601,-73.954155,-73.936245,-73.958161,-73.9214,-73.867164,-73.97641,-73.832569,-73.832163,-74.009266,-73.97745,-73.828121,-73.924016,-73.950683,-73.92261,-73.909831,-73.964375,-73.907684,-73.973022,-73.803326,-73.911794,-73.916559,-73.992872,-73.986829,-73.914042,-73.836345,-73.903077,-73.837683,-73.904046,-73.898453,-73.827559,-74.010006,-73.978172,-73.8627,-73.911946,-73.917757,-73.964696,-73.981329,-74.007838,-73.979678,-73.983266,-74.013342,-73.991057,-73.981833,-73.990416,-73.932942,-73.973357,-73.949575,-73.917843,-73.975776,-73.981824,-73.890064,-74.000495,-73.899232,-74.003549,-73.989938,-73.876613,-73.898654,-73.975264,-73.961495,-73.968916,-73.939892,-73.966113,-73.956872,-73.796924,-73.994324,-73.815925,-73.962982,-73.952343,-73.878855,-73.961376,-73.906006,-73.867139,-73.937243,-73.91924,-73.861604,-73.93647,-73.90087,-73.880039,-73.862633],\"marker\":{\"color\":[569658.4826000003,284932.4588,826019.9515000003,378545.7307999999,2050753.3262000005,864264.0972999998,776631.4164999999,556412.3038999998,493764.29030000005,627860.7701,2264307.5242,2596041.874700001,2996196.2327000005,1537253.8901,525021.9095999999,1492594.4247,264005.13060000003,5515004.154099999,5334725.634300001,261585.4472,2612893.100599999,329600.7808000002,2314562.530299
9993,629435.1970999999,1618996.0617000004,1032907.3599999998,783555.8220999999,132807.53290000005,1341171.6770000001,514592.6958,756626.7990000003,608569.8288000001,930463.5484,380131.9230999998,231862.5086999999,702114.1815,294869.6149,762446.3269,6555553.985099999,720221.0327,218323.18080000003,263272.9863000001,825166.9236000001,318937.4725000001,117439.85840000001,253674.48690000002,436413.8873000002,773900.5669999998,590690.2636999997,496154.33869999996,440910.6487,269638.16759999987,255914.87790000002,1319931.7765000002,737552.5105000001,732018.7578000003,314395.5042,777341.2432000001,1703862.2832999995,2446599.7576000006,484420.58730000013,2956502.7034,1050800.3066,968724.5486000001,188180.3239,2250852.108,1323294.4759000004,2448146.4574000007,1935039.1305999998,283564.4961999998,367764.08909999987,772459.7281,509811.2605999997,384606.90390000003,2326019.8055,795310.2291999998,685834.1642999998,1326368.5735000002,993229.8192,276975.7507000001,5201485.477400001,565950.5625,738944.0454000001,559342.5365,1760530.8631000002,460150.9258,2785593.6859999998,1487460.8491000007,648652.4992999998,1740413.4074999997,673193.6039999998,373720.5066000001,863480.6932999999,1013113.6549999997,1060823.9642,234079.45679999996,338707.25120000006,1755444.3024,2241432.6892,780600.0864000003,355401.9530999999,398459.25690000004,220785.72920000003,225244.4884000001,782020.1533000004,265196.84760000004,1.6831,320033.36010000005,541687.3419000001,451375.6248,743974.9372,1146861.4789,534184.3522,2581095.829,2971942.7905,5563377.5216999985,272508.96620000014,1618871.5676000002,1423289.7279999997,4442850.247200001,2551503.4299999997,459622.3291,779801.0376999999,1256075.0057999985,1075036.7409999995,279985.20070000004,800292.6027000006,801150.915,883759.8553999998,812040.7125999999,1250246.0932999998,506083.16659999994,1405739.1256000004,2728913.7233999996,416660.0404000002,1780375.0194999997,568382.0093000002,434308.51600000006,245234.70139999996,2071258.0799000014,2189408.3504999992,5
92925.0075000003,1244131.2146,665941.202,1188572.6909000003,161091.94979999994,875938.5737000002,450107.6574,399108.89159999986,932120.4567999998,590568.1092000001,734312.8932000004,349513.4018,593012.8846999998,376673.0204,227209.77719999998,605509.5332,483670.36979999975,188362.42239999998,267659.32369999995,396939.3916,45917.72440000001,506074.01740000024,76217.87799999997,545911.3684999997,630339.7830999999,1894164.6026999997,977540.6596000004,424024.0505999999,257367.97809999986,992292.6383000003,900671.9961000001,1503656.9225000003,3594262.8299000002,1847014.8671000004,2450793.2497000005,1939438.5873999994,1196660.3075999995,1908055.2147000008,1138304.9725999995,245904.44720000005,367641.4051,646507.8489000001,553155.7385000001,1798508.1442,7652911.154099997,2203569.9888999998,300473.3656,1270696.1505,758626.1823999999,611827.77,2421271.0700000008,1586504.6311000008,557834.373,813894.9576000005,1344113.0713,542206.2127000003,173683.58750000002,1094567.3608000004,204411.81499999992,1903190.7591,1328113.6243000003,1810722.9326999998,834546.5101000001,1481711.2983999995,275125.0155,1952938.6609,1162664.294900001,362504.85379999987,378356.52340000006,155917.07150000008,412118.6160999999,371138.2886000001,2939101.418200002,454705.92949999997,774973.2721999999,363762.4573000002,1528494.8432999998,328983.23,213383.49980000008,727067.3377000003,947966.6050999998,230136.60689999993,418071.18869999994,64665.74719999998,674994.0512999997,481515.3387999999,173794.50930000003,727888.2592000002,1969225.6260999995,1270094.1918000001,975137.3213999998,4882292.2638,2253774.6601000004,2545011.8551000003,656826.5814,1693690.5505000001,641823.9228999999,713808.8345,543552.9179000001,3850456.540600001,2890809.2748,5467117.732999999,440181.8845999999,4983932.259099998,3633088.1235999996,2804250.990199999,1015224.4306000003,1207667.4549,1193562.2985999994,1251955.5047,1682842.3613000005,1363239.8841999997,1335531.3196000005,11000954.943499994,355577.25659999996,397153.9576999999,131
0613.4888,1725406.5340999998,434939.4959000001,472258.75120000006,605254.5103999999,256076.7368000001,1831605.6693,19703289.800399996,378993.9575,651332.2478999998,1246118.3156999997,525025.4524000001,810827.6235999996,461969.02650000015,745020.5550000004,101258.68039999995,233660.50879999995,222511.76520000005,264125.67610000016,81545.64109999998,546709.3855999999,483770.9432999999,4237267.0600999985,656628.3102,2612370.9214999992,5714295.455200001,1841298.1603,726950.5814999999,147247.59650000004,571029.9077999998,558187.4020000004,2443981.509899999,1270494.459999999,545312.8510000003,364085.00529999996,595798.2276000001,1555079.0351000004,2232278.985700001,946399.3729000001,2564491.2618000004,1016805.719,206444.98500000002,420290.4371,651641.8881,974577.2264999996,738140.5135999997,427532.4412000001,162173.8747,747618.3122,1586914.7794000003,76671.15560000001,549699.5426999999,143715.37370000005,562453.6771,2554135.9958000006,1007101.3836000002,263352.1125999999,768862.9341999998,607470.5059000003,513049.91870000015,423921.89040000015,2252152.4928,1326054.3742000002,68172.01919999998,349071.4342,1400527.3549000004,2669971.6453000004,1842443.6897000005,3342624.6989999996,4584694.789800001,1524474.2920000004,13926936.971699994,2669895.063099999,356850.99929999997,249546.18300000002,389642.5987999999,1724192.1264000004,334911.8326,2749768.699799999,224272.04129999992,341583.48369999987,239156.29799999998,559036.3777999999,6310435.016,2028472.1948000002,965750.1717000001,1257151.873,590095.3106000003,370121.0659,1292040.0993999997,269784.4607,769964.6023000004,856886.4777000004,237895.88989999995,1081720.3109000004,342389.2827999999,215176.00090000004,1810661.0370999998,1228592.6106000002,562507.2281000001,1014980.2287999997,339932.6256,541805.5134,491931.89650000003,527722.8232,839498.2748,243238.51649999985,713642.1853999998,466798.6308999999,571321.8816000001,896921.4536000004,433839.8926,232835.9773,205579.32389999984,416878.61770000006,167070.2094999999,438435.0
6240000005,330837.12730000017,30151.43179999999,1261951.6064999998,377520.3491,2795320.2671000008,487178.4025999999,1542680.4053000002,997023.2257,5378794.4229999995,8325765.3465,974486.9110000004,5148990.025,778641.4292000001,6914364.861800001,7506843.661800004,10301632.769000001,2317746.9242999996,176260.26350000003,886998.9271999998,1620501.2033000002,610549.2135999999,3035027.727500003,545419.3122,4770323.611700002,318721.9403000001,2409851.3159999996,1989567.4441999998,2047239.0597,176965.9425000001,128072.88029999993,706399.6635999999,1151558.9115000004,388716.32279999985,1760154.5362,3912164.1766,217217.87760000015,571656.2101,26490.338300000003,1467901.5726,2583512.5006999997,552796.7006000002,1209986.901899999,653112.7602000004,188405.90899999996,1752258.8332999987,498337.2091,1399378.6455,306996.60680000007,441553.88779999997,213943.01559999996,218239.5564],\"coloraxis\":\"coloraxis\"},\"mode\":\"markers\",\"name\":\"\",\"showlegend\":false,\"subplot\":\"map\",\"type\":\"scattermap\"}],                        
{\"template\":{\"data\":{\"histogram2dcontour\":[{\"type\":\"histogram2dcontour\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"choropleth\":[{\"type\":\"choropleth\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"histogram2d\":[{\"type\":\"histogram2d\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"heatmap\":[{\"type\":\"heatmap\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"heatmapgl\":[{\"type\":\"heatmapgl\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"contourcarpet\":[{\"type\":\"contourcarpet\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"contour\":[{\"type\":\"contour\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\
"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"surface\":[{\"type\":\"surface\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]}],\"mesh3d\":[{\"type\":\"mesh3d\",\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}],\"scatter\":[{\"fillpattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2},\"type\":\"scatter\"}],\"parcoords\":[{\"type\":\"parcoords\",\"line\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatterpolargl\":[{\"type\":\"scatterpolargl\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"bar\":[{\"error_x\":{\"color\":\"#2a3f5f\"},\"error_y\":{\"color\":\"#2a3f5f\"},\"marker\":{\"line\":{\"color\":\"#E5ECF6\",\"width\":0.5},\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"bar\"}],\"scattergeo\":[{\"type\":\"scattergeo\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatterpolar\":[{\"type\":\"scatterpolar\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"histogram\":[{\"marker\":{\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"histogram\"}],\"scattergl\":[{\"type\":\"scattergl\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scatter3d\":[{\"type\":\"scatter3d\",\"line\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}},\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scattermapbox\":[{\"type\":\"scattermapbox\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"t
icks\":\"\"}}}],\"scatterternary\":[{\"type\":\"scatterternary\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"scattercarpet\":[{\"type\":\"scattercarpet\",\"marker\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}}}],\"carpet\":[{\"aaxis\":{\"endlinecolor\":\"#2a3f5f\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"minorgridcolor\":\"white\",\"startlinecolor\":\"#2a3f5f\"},\"baxis\":{\"endlinecolor\":\"#2a3f5f\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"minorgridcolor\":\"white\",\"startlinecolor\":\"#2a3f5f\"},\"type\":\"carpet\"}],\"table\":[{\"cells\":{\"fill\":{\"color\":\"#EBF0F8\"},\"line\":{\"color\":\"white\"}},\"header\":{\"fill\":{\"color\":\"#C8D4E3\"},\"line\":{\"color\":\"white\"}},\"type\":\"table\"}],\"barpolar\":[{\"marker\":{\"line\":{\"color\":\"#E5ECF6\",\"width\":0.5},\"pattern\":{\"fillmode\":\"overlay\",\"size\":10,\"solidity\":0.2}},\"type\":\"barpolar\"}],\"pie\":[{\"automargin\":true,\"type\":\"pie\"}]},\"layout\":{\"autotypenumbers\":\"strict\",\"colorway\":[\"#636efa\",\"#EF553B\",\"#00cc96\",\"#ab63fa\",\"#FFA15A\",\"#19d3f3\",\"#FF6692\",\"#B6E880\",\"#FF97FF\",\"#FECB52\"],\"font\":{\"color\":\"#2a3f5f\"},\"hovermode\":\"closest\",\"hoverlabel\":{\"align\":\"left\"},\"paper_bgcolor\":\"white\",\"plot_bgcolor\":\"#E5ECF6\",\"polar\":{\"bgcolor\":\"#E5ECF6\",\"angularaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"radialaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"}},\"ternary\":{\"bgcolor\":\"#E5ECF6\",\"aaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"baxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"},\"caxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\"}},\"coloraxis\":{\"colorbar\":{\"outlinewidth\":0,\"ticks\":\"\"}},\"colorscale\":{\"sequential\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,
\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]],\"sequentialminus\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]],\"diverging\":[[0,\"#8e0152\"],[0.1,\"#c51b7d\"],[0.2,\"#de77ae\"],[0.3,\"#f1b6da\"],[0.4,\"#fde0ef\"],[0.5,\"#f7f7f7\"],[0.6,\"#e6f5d0\"],[0.7,\"#b8e186\"],[0.8,\"#7fbc41\"],[0.9,\"#4d9221\"],[1,\"#276419\"]]},\"xaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\",\"title\":{\"standoff\":15},\"zerolinecolor\":\"white\",\"automargin\":true,\"zerolinewidth\":2},\"yaxis\":{\"gridcolor\":\"white\",\"linecolor\":\"white\",\"ticks\":\"\",\"title\":{\"standoff\":15},\"zerolinecolor\":\"white\",\"automargin\":true,\"zerolinewidth\":2},\"scene\":{\"xaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolinecolor\":\"white\",\"gridwidth\":2},\"yaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolinecolor\":\"white\",\"gridwidth\":2},\"zaxis\":{\"backgroundcolor\":\"#E5ECF6\",\"gridcolor\":\"white\",\"linecolor\":\"white\",\"showbackground\":true,\"ticks\":\"\",\"zerolinecolor\":\"white\",\"gridwidth\":2}},\"shapedefaults\":{\"line\":{\"color\":\"#2a3f5f\"}},\"annotationdefaults\":{\"arrowcolor\":\"#2a3f5f\",\"arrowhead\":0,\"arrowwidth\":1},\"geo\":{\"bgcolor\":\"white\",\"landcolor\":\"#E5ECF6\",\"subunitcolor\":\"white\",\"showland\":true,\"showlakes\":true,\"lakecolor\":\"white\"},\"title\":{\"x\":0.05},\"mapbox\":{\"style\":\"light\"}}},\"map\":{\"domain\":{\"x\":[0.0,1.0],\"y\":[0.0,1.0]},\"center\":{\"lat\":40.72991
396705882,\"lon\":-73.93293752941176},\"zoom\":10,\"style\":\"carto-darkmatter\"},\"coloraxis\":{\"colorbar\":{\"title\":{\"text\":\"wt\"}},\"colorscale\":[[0.0,\"#0d0887\"],[0.1111111111111111,\"#46039f\"],[0.2222222222222222,\"#7201a8\"],[0.3333333333333333,\"#9c179e\"],[0.4444444444444444,\"#bd3786\"],[0.5555555555555556,\"#d8576b\"],[0.6666666666666666,\"#ed7953\"],[0.7777777777777778,\"#fb9f3a\"],[0.8888888888888888,\"#fdca26\"],[1.0,\"#f0f921\"]]},\"legend\":{\"tracegroupgap\":0},\"margin\":{\"t\":1,\"l\":1,\"r\":1,\"b\":1},\"mapbox\":{\"center\":{\"lat\":40.72991396705882,\"lon\":-73.93293752941176},\"zoom\":10,\"style\":\"carto-darkmatter\"}},                        {\"responsive\": true}                    )                };                            <\/script>        <\/div>\n<p>Using these weights and using the proximity scores for each site by decade, we can calculate the weighted proximity, or \\(P_w\\) value, for each site. Grouping them by decade, we calculate:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">As of Year<\/th>\n<th style=\"text-align:right\">Mean Weighted Proximity (\\(P_w\\)) Score<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">1990<\/td>\n<td style=\"text-align:right\">0.0002<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2000<\/td>\n<td style=\"text-align:right\">0.0007<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2010<\/td>\n<td style=\"text-align:right\">0.0012<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2020<\/td>\n<td style=\"text-align:right\">0.0014<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Public exposure to art, as measured by \\(P_w\\), has increased every decade, but the marginal increase has slowed. Using the 2020 data (see table below), about half of the weight of travellers has a <em>p<\/em> score greater than 0.8, indicating close proximity to public art. A quarter of the weight have some public art within 500m, but more than 25m. 
The last quarter are more than 500m from public art.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Criteria<\/th>\n<th style=\"text-align:right\"># of Sites (2020)<\/th>\n<th style=\"text-align:right\">% of Weight<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">p &gt; 1<\/td>\n<td style=\"text-align:right\">94<\/td>\n<td style=\"text-align:right\">31<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">p &gt; 0.8<\/td>\n<td style=\"text-align:right\">220<\/td>\n<td style=\"text-align:right\">53<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">0.8 &gt; p &gt; 0<\/td>\n<td style=\"text-align:right\">74<\/td>\n<td style=\"text-align:right\">25<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">p = 0<\/td>\n<td style=\"text-align:right\">131<\/td>\n<td style=\"text-align:right\">22<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The slowdown may be partly explained by sites that host multiple art pieces: 94 sites have more than one public artwork. The model limits <em>P<\/em> to at most one maximal-proximity piece, so a second piece in close proximity does not double the exposure.<\/p>\n<h2 id=\"highest-impact-locations-by-2030\">Highest Impact Locations by 2030<\/h2>\n<p>If the MTA wishes to increase its \\(P_w\\) score for 2030, it should install art at the sites with the highest remaining impact, namely sites with a high weight but no existing art. (We are assuming no artwork will be removed from an existing site.)<\/p>\n<h3 id=\"comparison-to-mtas-2021-2023-art-installations\">Comparison to MTA&rsquo;s 2021-2023 Art Installations<\/h3>\n<p>The MTA permanent art catalog lists 23 installations between 2021 and 2023 (below). Some of these sites, such as the two Mount Vernon bridges and the Roosevelt Island Ventilation Structure (&ldquo;Double Take&rdquo;), are installations visible along a transit route rather than installed at a stop.
Note that the model does not include any anchor points along the transit routes, so the model does not value installations of this type.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Agency<\/th>\n<th style=\"text-align:left\">2021-2023 Installation Site<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">86 St<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">138 St-Grand Concourse<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Metro-North<\/td>\n<td style=\"text-align:left\">Port Jervis<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Metro-North<\/td>\n<td style=\"text-align:left\">Mount Vernon 3rd Avenue Bridge<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">1 Av<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">Bedford Av<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Metro-North<\/td>\n<td style=\"text-align:left\">Mount Vernon 10th Avenue Bridge<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">New Hyde Park<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Westbury<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">Tremont Av<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Grand Central Madison<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">Times Sq-42 St<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Elmont-UBS Arena<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Grand Central Madison<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">181 St<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">Lorimer St<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">7 Av<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">34 St-Penn Station<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Roosevelt Island Ventilation Structure<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LIRR<\/td>\n<td style=\"text-align:left\">Mineola<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">Grand St<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">NYCT<\/td>\n<td style=\"text-align:left\">E 149 St<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">SIR<\/td>\n<td style=\"text-align:left\">New Dorp<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Based on a stop\/complex&rsquo;s weight and the lack of public art in the proximity, we can rank the top 20 sites where the addition of public art will yield the greatest increase in MTA&rsquo;s \\(P_w\\) score. Of this list, the MTA has installed art since 2020 at three of the twenty locations. 
(As of this writing, the data does not cover 2024 or later.)<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Complex ID<\/th>\n<th style=\"text-align:left\">Recommended Site<\/th>\n<th style=\"text-align:left\">Art Installation Since 2020?<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">120<\/td>\n<td style=\"text-align:left\">Bedford Av (L)<\/td>\n<td style=\"text-align:left\">Yes<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">451<\/td>\n<td style=\"text-align:left\">Junction Blvd (7)<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">450<\/td>\n<td style=\"text-align:left\">103 St-Corona Plaza (7)<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">231<\/td>\n<td style=\"text-align:left\">Grand St (B,D)<\/td>\n<td style=\"text-align:left\">Yes<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">119<\/td>\n<td style=\"text-align:left\">1 Av (L)<\/td>\n<td style=\"text-align:left\">Yes<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">261<\/td>\n<td style=\"text-align:left\">Forest Hills-71 Av (E,F,M,R)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">153<\/td>\n<td style=\"text-align:left\">125 St (A,C,B,D)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">605<\/td>\n<td style=\"text-align:left\">168 St (A,C,1)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">259<\/td>\n<td style=\"text-align:left\">Kew Gardens-Union Tpke (E,F)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">151<\/td>\n<td style=\"text-align:left\">145 St (A,C,B,D)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">165<\/td>\n<td style=\"text-align:left\">23 St (C,E)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">452<\/td>\n<td style=\"text-align:left\">90 St-Elmhurst Av (7)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">232<\/td>\n<td style=\"text-align:left\">2 Av 
(F)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">51<\/td>\n<td style=\"text-align:left\">Kings Hwy (B,Q)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">264<\/td>\n<td style=\"text-align:left\">Woodhaven Blvd (M,R)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">35<\/td>\n<td style=\"text-align:left\">59 St (N,R)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">254<\/td>\n<td style=\"text-align:left\">Jamaica-179 St (F)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">265<\/td>\n<td style=\"text-align:left\">Grand Av-Newtown (M,R)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">234<\/td>\n<td style=\"text-align:left\">East Broadway (F)<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">312<\/td>\n<td style=\"text-align:left\">79 St (1)<\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Note that this list is based solely on proximity to MTA&rsquo;s permanent artworks and the number of passengers; some sites may not be appropriate locations for art due to legal restrictions, lack of percent funding, physical restrictions, or prominent non-MTA artwork.<\/p>\n<h2 id=\"limitations-of-approach-and-data\">Limitations of Approach and Data<\/h2>\n<p>This study&rsquo;s effectiveness is limited by elements of both the approach and the data. First, our identification of art locations is based solely on the MTA&rsquo;s permanent art catalog and our semi-manual mapping to latitudes and longitudes. This catalog does not cover the full range of MTA&rsquo;s artwork; MTA&rsquo;s Art &amp; Design program also includes digital art, photography exhibitions, posters, poetry, and live music. These latter forms may not fulfill the definition of &ldquo;public art,&rdquo; often being &ldquo;art in public&rdquo; instead, but they still expose the public to forms of art. 
It is up to the viewer how much stress should be placed on &ldquo;public art&rdquo; versus &ldquo;art in public.&rdquo; MTA travellers will also be exposed to art along transit lines and directly outside stops that are not controlled by the MTA and thus not catalogued. Ideally, the MTA should not be obliged to install artwork in an area already well-serviced by other parties.<\/p>\n<p>The study&rsquo;s method for weighting locations with passenger volumes is based on a data source and an origin-destination model for 2023. Ideally, we would use historical passenger volumes when computing historical <em>P<\/em> values.<\/p>\n<p>The correlation between proximity and exposure may be poor; it is certainly unmeasured. A traveller may not notice an artwork depending on the layout of the station and the flow of traffic. Over time, artwork may also blend into the background due to changes at the site or public familiarity. Temporary or rotating art may yield greater exposure due to its novelty. We expect a series of traveller surveys would illuminate what artwork is being seen and remembered by travellers.<\/p>\n<h2 id=\"production-notes\">Production Notes<\/h2>\n<p>We used <a href=\"https:\/\/duckdb.org\/\">DuckDB<\/a> for the majority of the data analytics and <a href=\"https:\/\/www.python.org\">Python<\/a> for &ldquo;glue&rdquo; and specialized purposes. 
Visualizations were created using <a href=\"https:\/\/plotly.com\/\">plotly<\/a>.<\/p>\n<p>Primary data sources used for this study:<\/p>\n<ul>\n<li><a href=\"https:\/\/data.ny.gov\/Transportation\/MTA-Permanent-Art-Catalog-Beginning-1980\/4y8j-9pkd\">MTA Permanent Art Catalog: Beginning 1980<\/a><\/li>\n<li><a href=\"https:\/\/data.ny.gov\/Transportation\/MTA-Subway-Origin-Destination-Ridership-Estimate-2\/uhf3-t34z\/about_data\">MTA Subway Origin-Destination Ridership Estimate: 2023<\/a><\/li>\n<\/ul>\n<p>Secondary data sources used for this study:<\/p>\n<ul>\n<li><a href=\"https:\/\/data.ny.gov\/Transportation\/MTA-Subway-Stations-and-Complexes\/5f5g-n3cz\/about_data\">MTA Subway Stations and Complexes<\/a><\/li>\n<li><a href=\"https:\/\/data.ny.gov\/Transportation\/MTA-General-Transit-Feed-Specification-GTFS-Static\/fgm6-ccue\/about_data\">MTA General Transit Feed Specification (GTFS) Static Data<\/a><\/li>\n<\/ul>\n<h3 id=\"art-locations\">Art Locations<\/h3>\n<p>The MTA Permanent Art Catalog identifies the locations of artwork by agency, station name, and line. For the model, we need a latitude and longitude. To derive a latitude and longitude, we attempt to map the agency, station name, and line to a singular location through the secondary data sources. However, since station names and lines change, and are not always represented in a canonical form, this is a messy join process. The <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/mta-2024-challenge\/-\/blob\/master\/namemunging.py\">namemunging.py<\/a> script describes the processing steps we took to create the &ldquo;lookup.csv&rdquo; file. 
The <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/mta-2024-challenge\/-\/blob\/master\/data\/corrections.csv\">corrections.csv<\/a> contains the manual corrections used to associate art with locations when the location metadata did not match any of the current locations.<\/p>\n<h2 id=\"disclaimer\">Disclaimer<\/h2>\n<p>Although this study leveraged open data provided by the MTA and the New York government, the study was conducted independently of those parties.<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>(Backer 2004) Backer, Jack. 2004. \u201cPublic Art: An Essential Component of Creating Communities.\u201d Americans for the Arts Monograph. Americans for the Arts. <a href=\"https:\/\/www.americansforthearts.org\/by-program\/reports-and-data\/legislation-policy\/naappd\/monograph-public-art-an-essential-component-of-creating-communities\">https:\/\/www.americansforthearts.org\/by-program\/reports-and-data\/legislation-policy\/naappd\/monograph-public-art-an-essential-component-of-creating-communities<\/a>.<\/p>\n<p>(Cheung 2022) Cheung, Ming, Natasha Smith, and Owen Craven. 2022. \u201cThe Impacts of Public Art on Cities, Places and People\u2019s Lives.\u201d The Journal of Arts Management, Law, and Society 52 (1): 37\u201350. <a href=\"https:\/\/doi.org\/10.1080\/10632921.2021.1942361\">https:\/\/doi.org\/10.1080\/10632921.2021.1942361<\/a>.<\/p>\n<p>(Wilkerson 2009) Wilkerson, April. 2009. \u201cWhat Becomes Public Art? How Oklahoma Officials Determine What Visually Impacts Locals, Newcomers.\u201d Journal Record, September 8, 2009. Gale General OneFile.<\/p>\n"},{"title":"PostScript\u00ae 1.0 - A Code Study","link":"https:\/\/ztoz.blog\/posts\/postscript-code\/","pubDate":"Thu, 26 Sep 2024 09:55:38 -0600","guid":"https:\/\/ztoz.blog\/posts\/postscript-code\/","description":"<p>In December 2022, Adobe, through the Computer History Museum (CHM), released the source code for PostScript\u00ae, version 1.0. 
PostScript is one of the foundational technologies of the desktop publishing revolution of the early 1980s, along with laser printers, the graphical user interface of the Apple Macintosh, and Aldus PageMaker. PostScript is a programming language and a page description format for translating visual content into printed documents.<\/p>\n<p>Adobe immediately enjoyed business success through licensing PostScript to laser printer manufacturers and it became the <em>de facto<\/em> digital publishing format. While multiple histories have studied this event through a business lens, what historical questions may be answered through the source code? Further, as software practitioners, what can we learn from the source code to apply to present and future designs?<\/p>\n<p>We argue:<\/p>\n<ol>\n<li>PostScript&rsquo;s design and implementation benefited from a long lineage of other software programs (as Adobe has always admitted),<\/li>\n<li>The software architecture aligned with the interests of creatives, printer services, and printer manufacturers,<\/li>\n<li>Design choices, including modularity and semantics, added value to the product, and<\/li>\n<li>Pursuing the &ldquo;print anything&rdquo; objective, rather than page printing throughput, yielded a superior implementation.<\/li>\n<\/ol>\n<blockquote>\n<p>A note on trademarks: PostScript\u00ae is a trademark of Adobe Inc. 
For reading clarity, we will omit the registered trademark symbol for the remainder of the article.<\/p>\n<\/blockquote>\n<h2 id=\"table-of-contents\">Table of Contents<\/h2>\n<ol>\n\t<li><a href=\"#background-and-article-outline\">Background and Article Outline<\/a><\/li>\n\t<li><a href=\"#timeline\">Timeline<\/a><\/li>\n\t<li><a href=\"#developers-and-their-influences\">Developers and their Influences<\/a><\/li>\n\t<li><a href=\"#development-and-test-environment\">Development and Test Environment<\/a><\/li>\n\t<li><a href=\"#design-and-source-code\">Design and Source Code<\/a><\/li>\n\t<li><a href=\"#interesting-algorithms-and-designs\">Interesting Algorithms and Designs<\/a><\/li>\n\t<li><a href=\"#alternative-paths--competitors\">Alternative Paths \/ Competitors<\/a><\/li>\n\t<li><a href=\"#print-anything-philosophy\">\"Print Anything\" Philosophy<\/a><\/li>\n\t<li><a href=\"#deployment-and-history-after-version-10\">Deployment and History after Version 1.0<\/a><\/li>\n\t<li><a href=\"#references\">References<\/a><\/li>\n<\/ol>\n<h2 id=\"background-and-article-outline\">Background and Article Outline<\/h2>\n<p>The origin story of Adobe has been told many times. In short, Charles Geschke and John Warnock left Xerox PARC to co-found Adobe after being frustrated at Xerox&rsquo;s seeming disinterest in marketing their publishing technologies. Steve Jobs needed software for Apple&rsquo;s upcoming laser printer and, after seeing what Adobe was developing, convinced the two founders to license their software. Other laser printer manufacturers similarly licensed PostScript and, with the release of the early Macintosh and Aldus PageMaker, the desktop publishing revolution was born. 
Two extended write-ups of this story are Pamela Pfiffner&rsquo;s <a href=\"https:\/\/archive.org\/details\/insidepublishing00pfif\/\"><em>Inside the Publishing Revolution: The Adobe Story<\/em><\/a>, which is a 2003 book-length business history, and Tekla Perry&rsquo;s 1988 Spectrum article, <a href=\"https:\/\/spectrum.ieee.org\/adobe-postscript\"><em>Inventing Postscript, the Tech That Took the Pain out of Printing<\/em><\/a>, which focuses more on the engineering efforts and prior art.<\/p>\n<p>At its core, the PostScript interpreter reads in a PostScript program that describes the desired printed pages and then renders each page as a raster image to a device, typically a printer. Graphics are composed of paths, which may be straight lines or curves and may be further defined by strokes, fills, or masks. The graphics model is described in the &ldquo;Red&rdquo; book, but PostScript closely follows a model described by (Warnock and Wyatt 1982) shortly before Adobe was founded. Text is rendered as paths, with some clever techniques required to produce high-quality typography. The font rendering algorithms were long treated as an Adobe trade secret until John Warnock described them in an article for the American Philosophical Society (Warnock 2012). The approach is based on modifying the character outlines, subject to constraints (&ldquo;blue&rdquo; and &ldquo;yellow&rdquo; hints) and an erosion process, to better align with the raster grid and aesthetic needs.<\/p>\n<p>Within this article, we will first present two timelines. The first, a contextual timeline, will place the development of the code in the larger history of the desktop publishing revolution. The second timeline will focus on the development of the source code. Next, the article will cover biographical details of the developers and what education and work experiences they brought to the development. 
Then, we will cover the development process and resources available to the development team.<\/p>\n<p>Turning to the source code, we will examine how it is designed and implemented. PostScript pushed forward the state of the art and we will cover some illustrative examples (although Adobe has kept some details as trade secrets and many aspects were developed after the 1.0 release). Finally, we will compare the design to competitors&rsquo; designs and examine the impact of Adobe&rsquo;s &ldquo;print anything&rdquo; philosophy.<\/p>\n<h3 id=\"postscript-language\">PostScript Language<\/h3>\n<p>While our study will focus on the PostScript <em>server<\/em> or <em>implementation<\/em>, it is important to note the <em>language<\/em> that the code supports. In (Reid 1988, pg 2), the author states:<\/p>\n<blockquote>\n<p>The three most important aspects of the PostScript programming language are that it is <em>interpreted<\/em>, that it is <em>stack-based<\/em>, and that it uses a unique data structure called a <em>dictionary<\/em>.<\/p>\n<\/blockquote>\n<p>The interpreted nature of the language made it simpler to produce and understand (since you did not need any decompiler to read the code) and encouraged experimentation via interactive development. The stack-based model reduced the execution requirements, particularly for memory. Although the Apple LaserWriter infamously had more processing power and memory than the connected Macintosh, resources were at a premium. Further, the content of the stack could be easily modified and composed, allowing programmers great flexibility. The dictionaries allowed existing operators to be replaced or augmented (and then reset to normal for the next job), further providing flexibility to the users.<\/p>\n<p>To give a sense of the language, the code below (<a href=\"https:\/\/rosettacode.org\/wiki\/Sierpinski_triangle\/Graphical#PostScript\">source: Rosetta Code<\/a>) produces a Sierpinski Triangle. 
Depending on the output device, this will render either to a printed page or a display (Figure A).<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-postscript\" data-lang=\"postscript\"><span style=\"color:#75715e\">%!PS\n<\/span><span style=\"color:#75715e\"><\/span>\n\/sierp { <span style=\"color:#75715e\">% level ax ay bx by cx cy\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#ae81ff\">6<\/span> <span style=\"color:#a6e22e\">cpy<\/span> <span style=\"color:#a6e22e\">triangle<\/span>\n    <span style=\"color:#a6e22e\">sierpr<\/span>\n} <span style=\"color:#a6e22e\">bind<\/span> <span style=\"color:#a6e22e\">def<\/span>\n\n\/sierpr {\n    <span style=\"color:#ae81ff\">12<\/span> <span style=\"color:#a6e22e\">cpy<\/span>\n    <span style=\"color:#ae81ff\">10<\/span> <span style=\"color:#ae81ff\">-4<\/span> <span style=\"color:#ae81ff\">2<\/span> {\n        <span style=\"color:#ae81ff\">5<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#a6e22e\">exch<\/span> <span style=\"color:#ae81ff\">4<\/span> <span style=\"color:#ae81ff\">-1<\/span> <span style=\"color:#a6e22e\">roll<\/span>\n        <span style=\"color:#a6e22e\">add<\/span> <span style=\"color:#ae81ff\">0.5<\/span> <span style=\"color:#a6e22e\">mul<\/span> <span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#a6e22e\">roll<\/span>\n        <span style=\"color:#a6e22e\">add<\/span> <span style=\"color:#ae81ff\">0.5<\/span> <span style=\"color:#a6e22e\">mul<\/span> <span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#ae81ff\">-1<\/span> <span style=\"color:#a6e22e\">roll<\/span>\n        <span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#a6e22e\">roll<\/span>\n    } <span style=\"color:#a6e22e\">for<\/span>       <span style=\"color:#75715e\">% l a 
b c bc ac ab\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#ae81ff\">13<\/span> <span style=\"color:#ae81ff\">-1<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#a6e22e\">dup<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#a6e22e\">gt<\/span> {\n        <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#a6e22e\">sub<\/span>\n        <span style=\"color:#a6e22e\">dup<\/span> <span style=\"color:#ae81ff\">4<\/span> <span style=\"color:#a6e22e\">cpy<\/span> <span style=\"color:#ae81ff\">18<\/span> <span style=\"color:#ae81ff\">-2<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#a6e22e\">sierpr<\/span>\n        <span style=\"color:#a6e22e\">dup<\/span> <span style=\"color:#ae81ff\">7<\/span> <span style=\"color:#a6e22e\">index<\/span> <span style=\"color:#ae81ff\">7<\/span> <span style=\"color:#a6e22e\">index<\/span> <span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#a6e22e\">cpy<\/span> <span style=\"color:#ae81ff\">16<\/span> <span style=\"color:#ae81ff\">-2<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#a6e22e\">sierpr<\/span>\n        <span style=\"color:#ae81ff\">9<\/span> <span style=\"color:#ae81ff\">3<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#a6e22e\">index<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#a6e22e\">index<\/span> <span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#a6e22e\">cpy<\/span> <span style=\"color:#ae81ff\">13<\/span> <span style=\"color:#ae81ff\">4<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#a6e22e\">sierpr<\/span>\n    } { <span style=\"color:#ae81ff\">13<\/span> <span style=\"color:#ae81ff\">-6<\/span> <span style=\"color:#a6e22e\">roll<\/span> <span style=\"color:#ae81ff\">7<\/span> { <span style=\"color:#a6e22e\">pop<\/span> } <span style=\"color:#a6e22e\">repeat<\/span> } 
<span style=\"color:#a6e22e\">ifelse<\/span>\n    <span style=\"color:#a6e22e\">triangle<\/span>\n} <span style=\"color:#a6e22e\">bind<\/span> <span style=\"color:#a6e22e\">def<\/span>\n\n\/cpy { { <span style=\"color:#ae81ff\">5<\/span> <span style=\"color:#a6e22e\">index<\/span> } <span style=\"color:#a6e22e\">repeat<\/span> } <span style=\"color:#a6e22e\">bind<\/span> <span style=\"color:#a6e22e\">def<\/span>\n\n\/triangle {\n    <span style=\"color:#a6e22e\">newpath<\/span> <span style=\"color:#a6e22e\">moveto<\/span> <span style=\"color:#a6e22e\">lineto<\/span> <span style=\"color:#a6e22e\">lineto<\/span> <span style=\"color:#a6e22e\">closepath<\/span> <span style=\"color:#a6e22e\">stroke<\/span>\n} <span style=\"color:#a6e22e\">bind<\/span> <span style=\"color:#a6e22e\">def<\/span>\n\n<span style=\"color:#ae81ff\">6<\/span> <span style=\"color:#ae81ff\">50<\/span> <span style=\"color:#ae81ff\">100<\/span> <span style=\"color:#ae81ff\">550<\/span> <span style=\"color:#ae81ff\">100<\/span> <span style=\"color:#ae81ff\">300<\/span> <span style=\"color:#ae81ff\">533<\/span> <span style=\"color:#a6e22e\">sierp<\/span>\n<span style=\"color:#a6e22e\">showpage<\/span>\n<\/code><\/pre><\/div>\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/postscript-code\/Sierpinski-PS.png\" alt=\"Triangle composed of many other triangles fractally\">\n  \n  <figcaption>Figure A: Sierpinski Triangle produced via PostScript code<\/figcaption>\n<\/figure>\n\n\n<p>The PostScript code requires less than 700 bytes to describe the image, which can be scaled and translated arbitrarily or, with some tweaking of the parameters, rendered to a greater depth. In comparison, the compressed PNG requires 15,000 bytes of storage (even with space optimization) and a Scalable Vector Graphics representation 2,800 bytes (or 560 bytes gzipped). 
The SVG representation builds the image out of vector-based triangles, which can be arbitrarily scaled and translated, but SVG lacks the ability to increase or decrease the depth of the fractal.<\/p>\n<h3 id=\"study-limitations\">Study Limitations<\/h3>\n<p>Although all historical studies are limited by the unavailability of certain artifacts and the distance of time, there are a few limitations particular to this study. First, Adobe has chosen not to release certain files that are still covered by trade secrets. The distribution is missing the files:<\/p>\n<ul>\n<li>except.h<\/li>\n<li>bezier.c<\/li>\n<li>curvefit.c<\/li>\n<li>gray.c and .h<\/li>\n<li>reducer.c and .h<\/li>\n<\/ul>\n<p>The effect of these omissions is that the software will not build &ldquo;as-is&rdquo;. By examining references in existing code, we counted 19 functions or macros that are referenced in the distributed code but are missing implementations.<\/p>\n<p>Secondly, Adobe and CHM have released the code under a custom <a href=\"https:\/\/22740101.fs1.hubspotusercontent-na1.net\/hubfs\/22740101\/chm-eula-postscript-agreement.pdf\">EULA<\/a> that prohibits various activities. 
In personal correspondence with the CHM, we have received authorization to display sections of the source code with our comments as they consider public scholarship to fall under educational use.<\/p>\n<h2 id=\"timeline\">Timeline<\/h2>\n<h3 id=\"contextual-timeline\">Contextual Timeline<\/h3>\n<p style=\"text-align: center\"><strong>Table 1: Historical Context Timeline<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Date<\/th>\n<th style=\"text-align:left\">Event<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">1965<\/td>\n<td style=\"text-align:left\"><strong>University of Utah establishes graphics center of excellence<\/strong>. Using a grant from ARPA, the university launches a <a href=\"https:\/\/ieee-region6.org\/2023\/utah-computer-graphics-and-visualization-milestone\/\">broad research program<\/a> into computer graphics.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1971?<\/td>\n<td style=\"text-align:left\"><strong>Harbor Pilot Simulator<\/strong>. Evans and Sutherland (E&amp;S) is awarded a contract to build a harbor pilot simulator for the New York Maritime Academy. To populate the 3D database, Warnock&rsquo;s team invented a stack-based language that created the geometries and textures.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1971<\/td>\n<td style=\"text-align:left\"><strong>Xerox invents the laser printer<\/strong>. Laser printer technology enabled high quality, high throughput printing. The first commercial laser printer (at least, the first that supported multiple fonts), IBM&rsquo;s 3800, came in 1976 and was soon followed by offerings from Xerox and Canon.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1974<\/td>\n<td style=\"text-align:left\"><strong>Xerox develops Press<\/strong>. The first successful page description format is developed at PARC, incorporating lessons from two previous systems. 
Press allows text (in multiple fonts), images, and filled objects to be incorporated on the same page. Over 200,000 documents are created in Press by 1983.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1977<\/td>\n<td style=\"text-align:left\"><strong>E&amp;S Design System<\/strong>. E&amp;S evolves the language from the Harbor Pilot simulator into a CAD system.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1977<\/td>\n<td style=\"text-align:left\"><strong>TeX Project Launched<\/strong>. In May, Donald Knuth writes a memo describing the core features of TeX, a new typesetting system. This project would see its first major release in 1982 and the design frozen with release 3.0 in 1990.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1978<\/td>\n<td style=\"text-align:left\"><strong>JaM invented at Xerox PARC<\/strong>. The John and Martin language is used to experiment with graphical models. This language was inspired by the E&amp;S Design System language and the graphical model inspired PostScript&rsquo;s model.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1981<\/td>\n<td style=\"text-align:left\"><strong>Interpress started<\/strong>. Xerox starts work on a successor to Press.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1981-09<\/td>\n<td style=\"text-align:left\"><strong>Command Language Patent filed<\/strong>. John Gaffney files patent for a &ldquo;Command language system for interactive computer&rdquo; based on his work for the Evans and Sutherland Design System.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1982-01<\/td>\n<td style=\"text-align:left\"><strong>Interpress 82<\/strong>. Specifications for &ldquo;version 1&rdquo; of Interpress, Xerox&rsquo;s language and page description system to replace Press, are published for internal consumption.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1982-12<\/td>\n<td style=\"text-align:left\"><strong>Adobe Founded<\/strong>. 
Chuck Geschke and John Warnock found Adobe.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-04<\/td>\n<td style=\"text-align:left\"><strong>DEC interested in license agreement<\/strong>. Gordon Bell of DEC meets with the team and expresses interest in PostScript as a printer protocol. DEC eventually licenses the technology.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-05<\/td>\n<td style=\"text-align:left\"><strong>Steve Jobs visit<\/strong>. Steve Jobs meets with Adobe and expresses interest in PostScript as a printer protocol for the upcoming Apple laser printer.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-12<\/td>\n<td style=\"text-align:left\"><strong>Adobe signs license agreement with Apple<\/strong>. Adobe will provide software for the LaserWriter and ensure PostScript&rsquo;s quality at 300 dpi.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1984<\/td>\n<td style=\"text-align:left\"><strong>PostScript released<\/strong>.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1984<\/td>\n<td style=\"text-align:left\"><strong>Linotronic 300 Imagesetter released<\/strong>. First commercial printer to support PostScript.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1984-04<\/td>\n<td style=\"text-align:left\"><strong>Xerox announces documentation for Interpress available<\/strong>. Version 2.1 of the Interpress Standard is first released externally.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1985<\/td>\n<td style=\"text-align:left\"><strong>PostScript manuals published<\/strong>. The &ldquo;Red&rdquo; book and &ldquo;Green&rdquo; book provide a high-quality reference manual and tutorial for PostScript. The reference manual states that it covers PostScript version 23.0.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1985<\/td>\n<td style=\"text-align:left\"><strong>Apple LaserWriter, Aldus PageMaker 1.0 released<\/strong>. 
Creatives could now generate documents competitive in quality with those of commercial printers, without specialized programming or typesetting skills.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1991<\/td>\n<td style=\"text-align:left\"><strong>PostScript Level 2 released<\/strong>. Expanded the capabilities and reliability of the language, but saw limited commercial impact as Level 1 satisfied most needs.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1993<\/td>\n<td style=\"text-align:left\"><strong>PDF 1.0 released<\/strong>. Increased page throughput and reduced variability by sharply limiting the programming language power.<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1997<\/td>\n<td style=\"text-align:left\"><strong>PostScript Level 3 released<\/strong>. Last release of PostScript.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"development-timeline\">Development Timeline<\/h3>\n<p>Based on the file edit history captured in comments within each file, we can attempt to reconstruct the order of development. Table 2, below, lists the creation date (year and month) for each file (denoted in <em>italics<\/em>) and the dates on which the file was edited (denoted in normal typeface). The comments do not state what was edited, just the author and date. 
A file may be listed multiple times in a month depending on the edit history.<\/p>\n<p style=\"text-align: center\"><strong>Table 2: Development Timeline<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Date<\/th>\n<th style=\"text-align:left\">Files Created or Edited<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">1983-01<\/td>\n<td style=\"text-align:left\"><em>STbuild.c<\/em>, <em>errors.h<\/em>, <em>globals.h<\/em>, <em>postscript.h<\/em>, <em>procs.h<\/em>, <em>scanner.c<\/em>, <em>types.h<\/em><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-02<\/td>\n<td style=\"text-align:left\"><em>array.c<\/em>, <em>control.c<\/em>, <em>debug.c<\/em>, <em>dict.c<\/em>, <em>error.c<\/em>, <em>exec.c<\/em>, <em>math.c<\/em>, <em>name.c<\/em>, <em>stack.c<\/em>, <em>stream.c<\/em>, <em>string.c<\/em>, <em>type.c<\/em><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-03<\/td>\n<td style=\"text-align:left\"><em>matrix.c<\/em>, error.c, errors.h, globals.h, math.c, procs.h, stream.c, types.h<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-04<\/td>\n<td style=\"text-align:left\"><em>banddevice.c<\/em>, <em>user.c<\/em>, <em>xylock.c<\/em>, banddevice.c, matrix.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-05<\/td>\n<td style=\"text-align:left\"><em>fontcache.c<\/em>, <em>graphics.c<\/em>, <em>graphics.h<\/em>, <em>inputdevice.c<\/em>, <em>memdevice.c<\/em>, <em>nulldevice.c<\/em>, <em>path.c<\/em>, <em>stroke.c<\/em>, dict.c, fontcache.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-06<\/td>\n<td style=\"text-align:left\">fontcache.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-07<\/td>\n<td style=\"text-align:left\"><em>fontdisk.c<\/em>, <em>fonts.h<\/em>, <em>fontshow.c<\/em>, math.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-08<\/td>\n<td style=\"text-align:left\"><em>graphicspriv.h<\/em>, <em>sundev.c<\/em><\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:left\">1983-09<\/td>\n<td style=\"text-align:left\"><em>framedevice.c<\/em>, <em>framedevice.h<\/em>, <em>image.c<\/em>, <em>versatecdev.c<\/em>, versatecdev.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-10<\/td>\n<td style=\"text-align:left\">fontdisk.c, fontshow.c, framedevice.h, sundev.c, user.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-11<\/td>\n<td style=\"text-align:left\"><em>device.c<\/em>, <em>font.c<\/em>, <em>fontbuild.c<\/em>, <em>fontcrypt.c<\/em>, <em>graphpak.c<\/em>, <em>unix.c<\/em>,  <em>vm.c<\/em>, error.c, errors.h, framedevice.h, graphics.c, graphpak.c, image.c, inputdevice.c, memdevice.c, nulldevice.c, stroke.c, types.h<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1983-12<\/td>\n<td style=\"text-align:left\">banddevice.c, banddevice.c, device.c, font.c, fontcrypt.c, globals.h, inputdevice.c, matrix.c, matrix.c, memdevice.c, nulldevice.c, postscript.h, sundev.c, unix.c, user.c, versatecdev.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1984-01<\/td>\n<td style=\"text-align:left\">STbuild.c, font.c, graphics.h, graphics.h, math.c<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1984-02<\/td>\n<td style=\"text-align:left\">array.c, control.c, debug.c, device.c, dict.c, exec.c, fontbuild.c, fontbuild.c, fontcache.c, fontdisk.c, fonts.h, fontshow.c, framedevice.c, framedevice.c, graphics.c, graphicspriv.h, graphicspriv.h, image.c, name.c, path.c, path.c, procs.h, scanner.c, stack.c, stream.c, string.c, stroke.c, type.c, types.h, vm.c, xylock.c<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This timeline suggests that the major language components and processing loop for PostScript were completed in the first three months, or January to March 1983. 
The team then began focusing on the graphics engine, including printing to their borrowed LN01 laser printer, as indicated by the work on the <code>banddevice<\/code> and <code>xylock<\/code> files, as well as <code>graphics<\/code>, <code>path<\/code>, and <code>stroke<\/code>. This effort took most of April and May. Work on fonts, at least initial support, started in May and was the predominant effort in June and July. In August, September, and October, the team started rendering images on the Sun-1 framebuffer, along with additional improvements to fonts. November sees the last big fleshing out of the design, with additional work on fonts and graphics, refinement of interfaces to input and output devices, and virtual memory support for running within restricted devices. The last three months, December 1983 to February 1984, see very diffuse efforts, implying integration changes, performance enhancements, and bug fixes.<\/p>\n<h2 id=\"developers-and-their-influences\">Developers and their Influences<\/h2>\n<p>In the foreword to the 1985 <em>PostScript Language Reference Manual<\/em>, John Warnock attributes the beginning of the PostScript design to John Gaffney and the Design System, developed at Evans and Sutherland. Furthermore, he notes the contributions of JaM, a language he developed with Martin Newell at Xerox. PostScript came from a rich intellectual pedigree and benefited from prior art.<\/p>\n<h3 id=\"co-founders\">Co-Founders<\/h3>\n<p>Charles Geschke and John Warnock were the two co-founders of Adobe. Both were &ldquo;technical&rdquo; founders and brought relevant expertise to the design and implementation of PostScript.<\/p>\n<h4 id=\"charles-chuck-geschke\">Charles (Chuck) Geschke<\/h4>\n<p>Charles (Chuck) Geschke was awarded a PhD in Computer Science from Carnegie Mellon University. 
His thesis was titled &ldquo;Global Program Optimizations&rdquo; and his advisor was William Wulf.<\/p>\n<p>Joining Xerox PARC in 1972, he created the Imaging Sciences Laboratory in 1978, and hired John Warnock soon after. While at PARC, Geschke was involved in the design and implementation of a Mesa language compiler. <a href=\"https:\/\/www.softwarepreservation.org\/projects\/mesa\/\">Mesa<\/a> features a module system and a strong type system. Although Mesa was not used to implement PostScript (the language was proprietary to Xerox), Mesa-isms can be seen in the PostScript code as described in the later section, &ldquo;Programming Language and Dialect&rdquo;.<\/p>\n<p>The majority of changes in the version 1.0 source code are attributed to Geschke.<\/p>\n<p><a href=\"https:\/\/www.computerhistory.org\/blog\/remembering-charles-m-geschke\/\">CHM Obituary<\/a><\/p>\n<h4 id=\"john-warnock\">John Warnock<\/h4>\n<p>John Warnock was awarded his PhD from the University of Utah in 1972 for the thesis &ldquo;A hidden surface algorithm for computer generated halftone pictures&rdquo;. His advisor was David Evans who, along with fellow committee member Ivan Sutherland, hired him into Evans and Sutherland. This was a heady period as the University of Utah had been recently established as a national center for computer graphics research by DARPA and Evans and Sutherland were pioneering many graphical and computer simulation techniques.<\/p>\n<p>Warnock stayed at Evans and Sutherland for six years. One of the earliest influences on PostScript&rsquo;s design was the Harbor Pilot Simulator. Warnock&rsquo;s team was responsible for creating the 3D environment for the harbor. A member of Warnock&rsquo;s team, John Gaffney, developed a stack-based language that could populate the database. This language later developed into The Design System (1976).<\/p>\n<p>In 1978, Warnock joined Xerox PARC and Geschke&rsquo;s team. 
Along with Martin Newell, Warnock developed JaM, a programming language based on The Design System. JaM influenced Interpress, a programming language and page description format based on lessons learned from Xerox&rsquo;s Press system. The paper (Warnock and Wyatt 1982) describes a graphical model used in some Xerox efforts that later became the basis for PostScript&rsquo;s graphical model.<\/p>\n<p><a href=\"https:\/\/www.computerhistory.org\/blog\/in-memoriam-john-warnock\/\">CHM Obituary<\/a><\/p>\n<h3 id=\"contributors-to-version-10\">Contributors to Version 1.0<\/h3>\n<p>According to the changelog attributions, the developers of version 1.0 (excluding the two co-founders) were Tom Boynton, Doug Brotz, and Andrew Shore. In the description of the founding of Adobe in (Warnock and Geschke 2019), the team listing includes Tom Boynton and Dan Putnam (both described as electronic engineers), Doug Brotz, and Bill Paxton, but drops Shore. ACM&rsquo;s Software System Award for PostScript includes the two co-founders and Brotz, Paxton, and Edward Taft. Paxton joined Adobe in 1983 and Taft in February 1984, both early enough to contribute to PostScript, but neither contributed to the 1.0 source code.<\/p>\n<h4 id=\"tom-boynton\">Tom Boynton<\/h4>\n<p>Tom Boynton is credited with the original versions of banddevice.c and math.c, and edited other files as well. A band device cannot represent the page as a full array of pixels; instead, it must be fed a sequence of bands, with one band being consumed while the program produces the next. DEC had lent Adobe an early laser printer (LN01, based on the Xerox 2700) which, since it was severely limited in memory, used two band buffers for printing. Boynton worked at Xerox PARC before joining Adobe, but we were unable to find additional biographical details. 
It is likely that when Adobe decided to be a software-only firm, the electronic engineers left for other companies.<\/p>\n<h4 id=\"doug-brotz\">Doug Brotz<\/h4>\n<p><a href=\"https:\/\/computerhistory.org\/profile\/doug-brotz\/\">Doug Brotz<\/a> was awarded a PhD in Computer Science from Stanford University. His thesis was on <a href=\"https:\/\/searchworks.stanford.edu\/view\/810229\">automated theorem proving<\/a>. After a period of time at the University of Arizona, he joined Xerox in 1977. This soon led him to PARC where he worked on the <a href=\"https:\/\/www.youtube.com\/watch?v=tngrLvyiNEI\">Laurel email client<\/a>. Laurel was implemented in the <a href=\"https:\/\/xeroxalto.computerhistory.org\/xerox_alto_file_system_archive.html#Mesa\">Mesa programming language<\/a>. During the development of Laurel, Brotz met Warnock when Warnock recommended using Laurel&rsquo;s plugin capability to integrate with JaM and allow the sending of graphics in email messages.<\/p>\n<p>He joined Adobe in March 1983 and made many contributions to PostScript, particularly in graphics and font-handling. His earliest contributions date to May 1983. Based on the changelog, he is the second most prolific author behind Geschke.<\/p>\n<p>For someone who made many fundamental contributions to PostScript, Brotz surprisingly lacked any background in graphics programming. He recounts his involvement with the reducer algorithm in (Perry 1988):<\/p>\n<blockquote>\n<p>\u201cAbout a week after I had joined Adobe in 1983,\u201d Brotz recalls, \u201cJohn Warnock mentioned this rather important algorithm that had to be written. And I, with no graphics background, volunteered. Several months later, older and wiser, I realized it truly was one of the world\u2019s hardest problems.\u201d<\/p>\n<\/blockquote>\n<blockquote>\n<p>But Brotz did not give up, and he says, \u201cWe have now an exactly correct reducer algorithm. 
It is the heart of the graphics system in PostScript.\u201d And a tally Brotz keeps reveals that no bugs have been discovered in the Reducer in more than two years.<\/p>\n<\/blockquote>\n<h4 id=\"andrew-andy-shore\">Andrew (Andy) Shore<\/h4>\n<p>We were unable to find many details about Andrew Shore. The changelog indicates he wrote the initial version of the unix.c file and made edits to fontbuild.c. He was active on the fa.Laser Lovers USENET channel in 1984 and 1985 using an Adobe email address. Adobe celebrated his presence in a <a href=\"https:\/\/blog.adobe.com\/en\/publish\/2021\/06\/08\/glow-up-show-up-lift-up\">2021 blog post<\/a>, but the posting yields few biographical details.<\/p>\n<p>However, he is anecdotally remembered via &ldquo;Andy&rsquo;s Stupid Input Device.&rdquo; (Perry 1988) recounts how PostScript came to support scanners and other graphic input devices:<\/p>\n<blockquote>\n<p>As originally conceived, PostScript was to have been independent of the output, but not the input, device. Warnock had thought that PostScript, to take in scanned images, would need to contain information about a wide range of optical scanners. But Brotz, after programming the parameters of just two of many scanner types, realized that the task was not only horrendous and repetitive but ate up a lot of memory.<\/p>\n<\/blockquote>\n<blockquote>\n<p>Andy Shore, an Adobe computer scientist, overheard him complaining one day and suggested writing a PostScript procedure that would pretend that it was an input device and spit out the image information in a standard format, regardless of the characteristics of the actual standard. 
Brotz did not think it would work and \u201cWarnock promptly labeled it \u2018Andy\u2019s Stupid Input Device.\u2019\u201d<\/p>\n<\/blockquote>\n<blockquote>\n<p>Still, Brotz thought it might be helpful for generating test patterns, and when he implemented it, \u201cit turned out that Andy\u2019s Stupid Input Device was the lowest common denominator and all the special-case code could disappear.\u201d Problems arise only when the image data has been compressed for transmission or storage; the programmer then has to insert a routine to decompress the data before it is handed to the image algorithm.<\/p>\n<\/blockquote>\n<p>This anecdote demonstrates two aspects of PostScript. First, an interface designed around a &ldquo;lowest common denominator&rdquo; of functionality allowed a wide variety of devices to plug in to the system, greatly increasing its value. Second, supporting compressed images was a matter of composing procedures, which is straightforward using the stacks and dictionaries within the language.<\/p>\n<h3 id=\"the-design-system\">The Design System<\/h3>\n<p>John Gaffney was a member of Warnock&rsquo;s Evans and Sutherland team when they were building the Harbor Pilot Simulator. The project was behind schedule and they needed to rapidly build a 3D model of the harbor. Gaffney built a language that programmatically constructed database elements. With the language coupled to a menu-driven user interface, the team finished the database in time. Later, the language was expanded into the Design System. Gaffney&rsquo;s 1981 patent &ldquo;Command language system for interactive computer&rdquo; describes the language of &ldquo;The Design System&rdquo; as follows:<\/p>\n<blockquote>\n<p>A command language system is disclosed wherein memory stacks register specific definitions for generic names, which definitions are appropriately selected in timely response to a name, on the basis of stack arrangement. 
A structure is included for searching the stack in order and detecting the first definition for a name of current interest. Thus, the stack is used to define the order of the name searching. As a consequence, in the interpretation of command languages, the user is given control over the context in which the names are executed. Specifically, a command program will behave according to the definitions of the commands in a current context. The system further includes structure for deleting definitions from the stack which have been used and for sensing the bottom of the stack as a function of control.<\/p>\n<\/blockquote>\n<p>Put simply, the language defines stacks of dictionaries under the programmer&rsquo;s control. When a name is looked up, the stack is searched in order. Thus, a programmer can replace or augment functionality by adding or deleting entries within a dictionary, creating new dictionaries, and changing the order of dictionaries within the stack. Along with an interactive interface and a defined entity model, the earlier Harbor Pilot Simulator implementation facilitated rapid creation of a 3D database.<\/p>\n<p>Only one copy of The Design System was sold and, when the director died, the project ended (Perry 1988). However, the key aspects of the Design System were identified in (Reid 1988) as the key aspects of the PostScript language.<\/p>\n<h3 id=\"xerox\">Xerox<\/h3>\n<p>In the book <em>Introduction to Interpress<\/em> (Sproull and Reid 1983, 221-223), the authors describe the lessons learned over multiple iterations of publishing tools at Xerox. Although the Adobe developers may have disagreed with some of these lessons, they all, especially the founders, would have been aware of them.<\/p>\n<p>Shortly after PARC was founded in 1970, one of the employees developed the <code>listing<\/code> program which translated ASCII documents into a raster image for a printer. 
Users could insert inline escape sequences to control font selection, insert images, and control simple formatting choices like subscripts and superscripts. However, the program made many stylistic choices for the document which often did not comply with the user&rsquo;s wishes. The developers learned that style and layout decisions should not be coupled in the same layer with rasterization. For instance, PostScript does not include logic for text wrapping; those decisions are handled at a higher level.<\/p>\n<p>PARC engineers also developed systems that were deeply tied to a single printer&rsquo;s capabilities. Users sent page descriptions tailored to that printer. Although a user could maximize the performance of a given printer, this approach required too much from users and was considered to be too low-level.<\/p>\n<p>Based on these lessons, PARC engineers designed Press in 1974 with the following goals:<\/p>\n<ul>\n<li>Device independence<\/li>\n<li>A single file containing embedded text and graphics<\/li>\n<li>Font libraries delegated to the printer<\/li>\n<li>No formatting decisions (un-opinionated)<\/li>\n<\/ul>\n<p>Press was very successful. The authors state it was used in over 200,000 documents, many created by non-technical employees.<\/p>\n<p>In 1981, limitations in Press led to a new design, Interpress. Interpress differs from Press in the following ways:<\/p>\n<ul>\n<li>Press is a data structure while Interpress (and PostScript) is a program<\/li>\n<li>Press has a single coordinate system rather than supporting multiple coordinate systems. 
PostScript and Interpress could translate and scale multiple pages onto a single page programmatically.<\/li>\n<li>Press&rsquo;s graphic model only supports filled objects and pixel arrays<\/li>\n<li>Press lacks a <code>correct<\/code> facility for correcting font approximations<\/li>\n<li>Press files contain both printer data and application information<\/li>\n<\/ul>\n<p>PostScript and Interpress were very similar in the first three bullet points. PostScript approached fonts differently than Interpress and thus did not need a <code>correct<\/code> facility.<\/p>\n<p>With the benefit of hindsight and the success of PDF, we may question why Interpress and PostScript both chose to be a program rather than just a page description. In (Warnock 2012), the author lists the advantages and disadvantages of a program as a representation:<\/p>\n<p>Disadvantages:<\/p>\n<ul>\n<li>Programs might never stop<\/li>\n<li>Indeterminate number of pages can be produced<\/li>\n<li>Program dictates order of pages produced<\/li>\n<\/ul>\n<p>Advantages:<\/p>\n<ul>\n<li>Operators can be redefined, so programs can fix or extend the language<\/li>\n<li>Programs can interpret other printer protocols and emulate them<\/li>\n<li>Character encodings (e.g. Japanese, Farsi) can be implemented programmatically<\/li>\n<li>Subroutines can efficiently implement sub-components on pages<\/li>\n<\/ul>\n<p>The second advantage was exploited for the Apple LaserWriter: a PostScript program could read and translate Apple&rsquo;s QuickDraw graphical model into the PostScript graphical model, saving time and effort versus a separate translation program.<\/p>\n<h2 id=\"development-and-test-environment\">Development and Test Environment<\/h2>\n<p>Adobe engineers developed on a leased VAX 750 running Berkeley Unix. For testing, the team used a Sun-1 workstation that included a primitive frame buffer and an LN01 laser printer borrowed from DEC (Perry 1988). 
The LN01 was a rebranded Xerox 2700.<\/p>\n<p>The printer was capable of 300 dpi and 12 pages\/minute. It used an Intel 8086 processor with either 64 KB or 256 KB of RAM and used two band buffers so that one could be filled with new data while the contents of the other were printed (Xerox 2022).<\/p>\n<blockquote>\n<p>Roket: In the file <code>framebuffer.c<\/code>, there is a function called <code>psFrameToRoket<\/code> which is only enabled for Sun builds. The function outputs a bitmap to <code>\/dev\/roket<\/code> and uses <code>ioctl<\/code> to adjust values such as width and height. We have been unable to determine what hardware or interface is being used here as roket isn&rsquo;t a term in the Sun-1 user manuals.<\/p>\n<\/blockquote>\n<h2 id=\"design-and-source-code\">Design and Source Code<\/h2>\n<p>The following sections analyze the version 1.0 source code artifacts.<\/p>\n<h3 id=\"physical-form\">Physical Form<\/h3>\n<p>Per David A. Wheeler&rsquo;s <code>SLOCCount<\/code>, the distribution contains approximately 12,000 lines of C code (.c and .h files) and a makefile. Per <code>cloc<\/code>, there are 1,171 lines of comments. However, since each file contains a commented copyright notice and edit history, there are fewer than 600 lines of &ldquo;true&rdquo; comments in the code.<\/p>\n<p>The makefile organizes the source files into five groups: kernel, graphics, device, fonts, and user. The kernel files implement the basic language and core operators, as well as the virtual memory subsystem. The graphics files implement paths, strokes, matrices and other graphic primitives, and the reducer. Device files handle input and output devices. Font files handle reading, caching, and rendering fonts. User files handle some hosting concerns on Unix machines.<\/p>\n<p>Although we are missing some files, we can count lines of code per file group (Table 3) to get a sense of the complexity of the groups, at least once they are rendered to code. 
Per comments from the co-founders, font handling and graphics reduction were considered open problems as they started implementation. Although they had experience with both problems from their work on Interpress, the co-founders considered the previous solutions to be insufficient in quality and flexibility. Legally, the co-founders also wanted designs that were free from Xerox intellectual property. Adobe licensed the Design System, so the &ldquo;kernel&rdquo; was legally protected.<\/p>\n<p style=\"text-align: center\"><strong>Table 3: Lines of Code by File Group<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">File Group<\/th>\n<th style=\"text-align:right\">Lines of Code<\/th>\n<th style=\"text-align:right\">LOC Percent<\/th>\n<th style=\"text-align:left\">Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Kernel<\/td>\n<td style=\"text-align:right\">5564<\/td>\n<td style=\"text-align:right\">46<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Graphics<\/td>\n<td style=\"text-align:right\">2410<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:left\">Missing gray and reducer<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Fonts<\/td>\n<td style=\"text-align:right\">2014<\/td>\n<td style=\"text-align:right\">17<\/td>\n<td style=\"text-align:left\">Missing bezier and curvefit<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Device<\/td>\n<td style=\"text-align:right\">1647<\/td>\n<td style=\"text-align:right\">14<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">User<\/td>\n<td style=\"text-align:right\">123<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:left\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"oddball-files\">Oddball Files<\/h3>\n<p>The distribution includes four &ldquo;oddball&rdquo; files, that is, files that do not seem to properly belong to the software. 
These files are <code>begin.c<\/code>, <code>end.c<\/code>, <code>bug.c<\/code>, and <code>padPS.c<\/code>. None include the customary copyright header or comments describing authorship or edit history. None are included in the makefile. Due to their nature, we usually excluded them from code analysis.<\/p>\n<p>The pair <code>begin.c<\/code> and <code>end.c<\/code> each define a single function, <code>BEGIN<\/code> and <code>END<\/code> respectively, that simply sets the local variable <code>i<\/code> to zero. Based on code in <code>control.c<\/code>, these two functions are only included if the release flag is not set and are used as part of some performance monitoring function.<\/p>\n<p>The standalone program <code>bug.c<\/code> tests that the value of a local unsigned short is equal to an extracted short from a <code>struct<\/code> bitfield. The PostScript code uses struct bitfields in various places, so this was likely used either to test a potential compiler bug or to verify some behavior of the C language.<\/p>\n<p>Another standalone program, <code>padPS.c<\/code>, reads a hexadecimal number from the command line arguments and two decimal numbers from <code>stdin<\/code>. The program then performs a short calculation of a padding value and either prints out the padding value or an error message. Since the calculation does not use the second decimal number, this was likely some ad-hoc program used during development.<\/p>\n<p>There are two programs that have their own entry points (<code>main<\/code> functions) and have the standard copyright headers, but are not part of the makefile. These are <code>STbuild.c<\/code> and <code>fontcrypt.c<\/code>. The purpose of <code>STbuild.c<\/code> is to produce scan tables incorporated in <code>scanner.c<\/code>. Because the output tables are incorporated into the code, rather than being generated every build, the code served a development purpose but was not included in the PostScript program itself. 
The <code>fontcrypt.c<\/code> program takes a file, presumably a font file, and encrypts it by <code>xor<\/code>-ing each byte with a pseudo-random byte (via a linear congruential generator). This process can be reversed in <code>scanner.c<\/code> or <code>fontbuild.c<\/code>, each of which can reproduce the pseudo-random byte stream. The encryption method (and constants) were updated by the time the Font 1 Specification was released (Adobe Systems Incorporated 1993).<\/p>\n<h3 id=\"architecture\">Architecture<\/h3>\n<p>John Warnock advocated for a Unix-like approach to the PostScript design. Thus, PostScript&rsquo;s design mirrors a Unix filter in that it reads from stdin and writes to either stdout or stderr. Although PostScript is primarily used to interact with printers, the design does not include features related to queuing multiple jobs or tracking costs. Instead, the assumption is that an intermediary program would handle those jobs. This act of composing functionality through multiple programs is also part of the Unix philosophy. Figure B describes the deployment architecture.<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/postscript-code\/PostScript-deployment.png\" alt=\"PostScript daemon reads via stdin a program (with optional queuing program intermediary) and outputs to a raster image processor which leads to a printer or display\">\n  \n  <figcaption>Figure B: PostScript Deployment Components and Flow<\/figcaption>\n<\/figure>\n\n\n<p>Since there is a large cost to start-up (primarily due to loading fonts), PostScript is designed to initialize once and then cheaply reset to a known state after every job. (Since many implementations printed a test page on start-up, minimizing the number of start-ups also helped save ink and paper.) 
The virtual memory storage is a copy of the memory or global state of the program.<\/p>\n<p>On the right side of Figure B is the raster image processor (RIP) which translates the rasterized pages into the appropriate final representation for the output device. The 1.0 source code includes neither the acronym RIP nor the full term. However, RIPs were part of the architecture by the time the first commercial printers became available. (Industry use of the term RIP is confusing. A RIP could be an embeddable component which includes PostScript, or RIP could be equivalent to a PostScript output device.)<\/p>\n<p>With simple streaming inputs and outputs, PostScript could be embedded within other systems flexibly and remain agnostic about client interfaces. As a streaming system, PostScript could operate in both a batched and interactive manner. This design was attractive to both creatives and printer manufacturers. For the former, they could target a single language for output and then take that file to any printer (or similar device). Manufacturers could add special sauce at the RIP layer and continue to improve quality and throughput on the output side without requiring changes to the input. Printer services could embed PostScript into their printing orchestration services and anyone could print, at whatever quality they could afford.<\/p>\n<h3 id=\"programming-language-and-dialect\">Programming Language and Dialect<\/h3>\n<p>The source code is written in K&amp;R C (the first Standard Draft was released in 1985 and the first standard followed in 1989). However, the developers used the preprocessor and typedefs to make the language more Mesa-like. Mesa was a systems programming language at Xerox and Charles Geschke had been deeply involved in the design of the language and the compiler. 
The <code>PrimeSize<\/code> function from <code>dict.c<\/code> serves as an example of this dialect:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\">private cardinal <span style=\"color:#a6e22e\">PrimeSize<\/span>(s)\n\tcardinal s;\n{\n  cardinal i, size <span style=\"color:#f92672\">=<\/span> (s <span style=\"color:#f92672\">&lt;=<\/span><span style=\"color:#ae81ff\">500<\/span>)<span style=\"color:#f92672\">?<\/span> (s <span style=\"color:#f92672\">+<\/span> s<span style=\"color:#f92672\">\/<\/span><span style=\"color:#ae81ff\">2<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#f92672\">:<\/span> (s <span style=\"color:#f92672\">+<\/span> s<span style=\"color:#f92672\">\/<\/span><span style=\"color:#ae81ff\">5<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>);\n  <span style=\"color:#66d9ef\">if<\/span> (MAXdictLength <span style=\"color:#f92672\">!=<\/span> <span style=\"color:#ae81ff\">2000<\/span>)\n    BUG(<span style=\"color:#e6db74\">&#34;This version not prepared for dictionaries over 2000 elements!&#34;<\/span>);\n  <span style=\"color:#66d9ef\">if<\/span> (s <span style=\"color:#f92672\">&gt;<\/span> MAXdictLength) ERROR(limitcheck);\n  <span style=\"color:#66d9ef\">switch<\/span> (s) {\n    <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">0<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">3<\/span>;\n    <span style=\"color:#75715e\">\/* [...] 
*\/<\/span>\n    <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#ae81ff\">10<\/span><span style=\"color:#f92672\">:<\/span> <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#ae81ff\">13<\/span>;\n    endswitch}\n  i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n  until ((size <span style=\"color:#f92672\">&gt;=<\/span> primes[i])<span style=\"color:#f92672\">&amp;&amp;<\/span>(size <span style=\"color:#f92672\">&lt;<\/span> primes[i<span style=\"color:#f92672\">+<\/span><span style=\"color:#ae81ff\">1<\/span>])) {i<span style=\"color:#f92672\">++<\/span>;}\n  <span style=\"color:#66d9ef\">return<\/span> primes[i];\n}\n<\/code><\/pre><\/div><p>Mesa featured a strong module system and the developers wished to capture the documentary value of modules, even if the compiler lacked the semantics. The developers marked functions as <code>private<\/code> or <code>public<\/code> and these mapped to C&rsquo;s <code>static<\/code> or default scope. If a function did not return a value, they used <code>procedure<\/code> as the return type. The developers used typedefs to give the basic types their Mesa names, such as <code>cardinal<\/code> for <code>unsigned short int<\/code>, <code>real<\/code> for <code>float<\/code>, and <code>character<\/code> for <code>unsigned char<\/code>.<\/p>\n<p>Also similar to Mesa, the <code>BUG<\/code> and <code>ERROR<\/code> procedures raised signals. The <code>endswitch<\/code> macro added an empty default case while the <code>until<\/code> macro replaced the clause with a <code>while<\/code> loop and negated the expression.<\/p>\n<h3 id=\"historical-constraints\">Historical Constraints<\/h3>\n<p>The rasterization process requires encoding the color value (which admittedly fit within a single bit for the initial devices) for each pixel or dot on a page. The storage requirements, if the entire page was rasterized at once, were significant at the time (Table 4). 
Thus, the design of PostScript had to accommodate limited memory capacity. Indeed, their test printer, the LN01 \/ Xerox 2700, could only hold two bands of 32 scan lines each (at 300 dpi) at a time. Although the price of memory collapsed soon afterwards (Figure C), prices were relatively stable from 1982 to 1984, while PostScript was being developed. Without the collapse in prices, laser printers would not have been commercially viable and the desktop publishing revolution would have been postponed.<\/p>\n<p style=\"text-align: center\"><strong>Table 4: Storage Requirements for Rasterization on Extant Devices<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Output Device<\/th>\n<th style=\"text-align:left\">Resolution or Page Size<\/th>\n<th style=\"text-align:right\">DPI<\/th>\n<th style=\"text-align:right\">Bytes to Rasterize One Page<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Macintosh 128k<\/td>\n<td style=\"text-align:left\">512x342<\/td>\n<td style=\"text-align:right\">72<\/td>\n<td style=\"text-align:right\">21,888<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">LaserWriter<\/td>\n<td style=\"text-align:left\">7.66&quot;x10.16&quot;<\/td>\n<td style=\"text-align:right\">300<\/td>\n<td style=\"text-align:right\">875,475<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Linotronic 300<\/td>\n<td style=\"text-align:left\">8.5&quot;x11&quot;<\/td>\n<td style=\"text-align:right\">2540<\/td>\n<td style=\"text-align:right\">75,403,075<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>(The LaserWriter had a non-printable border of 0.42&quot; per the user manual.)<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/postscript-code\/historical-cost-of-computer-memory-and-storage.svg\" alt=\"Memory cost ~$7B\/tb in 1983 but dropped to $900M\/tb in 1985\">\n  \n  <figcaption>Figure C: Historical Cost of Computer Memory and Storage<\/figcaption>\n<\/figure>\n\n\n<h2 id=\"interesting-algorithms-and-designs\">Interesting 
Algorithms and Designs<\/h2>\n<p>Reading through the source code, we found two items that are both interesting and representative of the whole: the design for handling input devices and the mini-reducer algorithm for strokes.<\/p>\n<h3 id=\"input-device\">Input Device<\/h3>\n<p>Although it is unclear if this code represents &ldquo;Andy&rsquo;s stupid input device,&rdquo; the <code>InputDevice<\/code> design demonstrates many of the common patterns seen within the code. Input devices are used to insert graphics into the document.<\/p>\n<p>The definition of an input device (from <code>graphics.h<\/code>):<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\">  <span style=\"color:#66d9ef\">typedef<\/span> <span style=\"color:#66d9ef\">struct<\/span>\n    {\n    procedure (<span style=\"color:#f92672\">*<\/span>SetUpImage)( <span style=\"color:#75715e\">\/* StreamHandle sh; *\/<\/span> );\n      <span style=\"color:#75715e\">\/* function that sets up the input device for subsequent calls to all\n<\/span><span style=\"color:#75715e\">         other input device procedures.  The StreamHandle is assumed to\n<\/span><span style=\"color:#75715e\">         refer to an open stream on an image file (of the type understood by\n<\/span><span style=\"color:#75715e\">         this input device) positioned at the beginning. *\/<\/span>\n    procedure (<span style=\"color:#f92672\">*<\/span>InputMatrix)( <span style=\"color:#75715e\">\/* Matrix matrix;*\/<\/span> );\n      <span style=\"color:#75715e\">\/* function that fills in the supplied matrix with values that map\n<\/span><span style=\"color:#75715e\">         from input DeviceCoord&#39;s to positions in the unit square in user\n<\/span><span style=\"color:#75715e\">         space.  
*\/<\/span>\n    ImageSlice (<span style=\"color:#f92672\">*<\/span>NextSlice)();\n      <span style=\"color:#75715e\">\/* function that returns an ImageSlice that refers to a portion of the\n<\/span><span style=\"color:#75715e\">         current image.  A returned ImageSlice-&gt;width of zero indicates that\n<\/span><span style=\"color:#75715e\">         no more image remains.  The image algorithm requires that a one-pixel\n<\/span><span style=\"color:#75715e\">         buffer zone surround the rectangle actually used for imaging.  Thus,\n<\/span><span style=\"color:#75715e\">         the minimum useful ImageSlice returned must be at least 3 pixels\n<\/span><span style=\"color:#75715e\">         high and wide.  At internal boundaries in the image, the input device\n<\/span><span style=\"color:#75715e\">         must buffer two lines at the boundary of a previous slice and\n<\/span><span style=\"color:#75715e\">         include those lines in a subsequent slice.  I.e., the input device\n<\/span><span style=\"color:#75715e\">         must supply slices consisting of a rectangular tile and a one-pixel\n<\/span><span style=\"color:#75715e\">         wide frame around that tile such that the tiles completely cover the\n<\/span><span style=\"color:#75715e\">         input image.  The ImageSliceRec and pixel storage returned by this\n<\/span><span style=\"color:#75715e\">         function may be reused for subsequent calls.  
*\/<\/span>\n    }\nInputDeviceRec, <span style=\"color:#f92672\">*<\/span>InputDevice;\n<\/code><\/pre><\/div><p>An input device is an interface, implemented via function pointers within a struct, that contains three operations: a <code>SetUpImage<\/code> function that &ldquo;constructs&rdquo; the internal data structures via a handle to a stream, an <code>InputMatrix<\/code> function for accessing the transformation matrix (a <code>Matrix<\/code> is for transformation, not general matrix storage), and a <code>NextSlice<\/code> function that acts as an iterator for the output pixels. Since images could easily exceed the memory capacity of devices of the time, and potentially exceed the disk storage capacity, iterators are commonly used to &ldquo;chunk&rdquo; large amounts of data into manageable increments.<\/p>\n<p>We can see how the input devices are used through the <code>image<\/code> PostScript command. <code>image<\/code> takes five arguments: the width and height of the image, the bits per pixel, the transformation matrix, and a PostScript procedure for the image data. 
For example, this <a href=\"https:\/\/paulbourke.net\/dataformats\/postscript\/\">example from Paul Bourke<\/a> renders a small document icon:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-postscript\" data-lang=\"postscript\"><span style=\"color:#ae81ff\">100<\/span> <span style=\"color:#ae81ff\">200<\/span> <span style=\"color:#a6e22e\">translate<\/span>\n<span style=\"color:#ae81ff\">26<\/span> <span style=\"color:#ae81ff\">34<\/span> <span style=\"color:#a6e22e\">scale<\/span>\n<span style=\"color:#ae81ff\">26<\/span> <span style=\"color:#ae81ff\">34<\/span> <span style=\"color:#ae81ff\">8<\/span> [<span style=\"color:#ae81ff\">26<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#ae81ff\">-34<\/span> <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#ae81ff\">34<\/span>]\n{&lt;\n<span style=\"color:#a6e22e\">ffffffffffffffffffffffffffffffffffffffffffffffffffff<\/span>\n<span style=\"color:#a6e22e\">ff000000000000000000000000000000000000ffffffffffffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef0000ffffffffffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef00ce00ffffffffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef00cece00ffffffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef00cecece00ffffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef00cececece00ffff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefef00000000000000ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span 
style=\"color:#a6e22e\">ff00efef000000ef000000ef000000ef0000ef0000efefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efef000000ef00000000ef00000000ef000000efefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efef0000ef00000000000000ef000000ef0000efefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span style=\"color:#a6e22e\">ff00efefefefefefefefefefefefefefefefefefefefefef00ff<\/span>\n<span 
style=\"color:#a6e22e\">ff000000000000000000000000000000000000000000000000ff<\/span>\n<span style=\"color:#a6e22e\">ffffffffffffffffffffffffffffffffffffffffffffffffffff<\/span>\n&gt;}\n<span style=\"color:#a6e22e\">image<\/span>\n<\/code><\/pre><\/div><p>The PostScript <code>image<\/code> command is implemented by the <code>psImage<\/code> procedure in <code>image.c<\/code> (the developers use the <code>ps<\/code> prefix to identify functions that are called as PostScript commands). When invoked, <code>psImage<\/code> processes its arguments and initializes the input device with the image data and the transformation matrix. Then, for each slice (rectangular section of the input image), the code defines a new trapezoidal path with transformed coordinates, and then calls <code>Reduce<\/code>. Although <code>Reduce<\/code> is not included in the source code distribution, we know it would rasterize the image onto the output by pulling from the input via the inverse transformation. (We have added \/\/-style comments for clarity.)<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\">private procedure <span style=\"color:#a6e22e\">psImage<\/span>()\n{\n<span style=\"color:#75715e\">\/\/ ... setup ...\n<\/span><span style=\"color:#75715e\"><\/span>(<span style=\"color:#f92672\">*<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>inputDevice<span style=\"color:#f92672\">-&gt;<\/span>SetUpImage)(sh);\n(<span style=\"color:#f92672\">*<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>inputDevice<span style=\"color:#f92672\">-&gt;<\/span>InputMatrix)(<span style=\"color:#f92672\">&amp;<\/span>imageMatrix);\n<span style=\"color:#75715e\">\/\/ ... 
math to transform inputs to outputs and for the inverse ...\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#66d9ef\">do<\/span>\n  {\n  curImageSlice <span style=\"color:#f92672\">=<\/span> (<span style=\"color:#f92672\">*<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>inputDevice<span style=\"color:#f92672\">-&gt;<\/span>NextSlice)();\n  <span style=\"color:#66d9ef\">if<\/span> (curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>width <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>) <span style=\"color:#66d9ef\">break<\/span>;\n  NewPathIsClip(true);\n  FeedPathToReducer(<span style=\"color:#f92672\">&amp;<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>clip, CallNewPoint, ReducerClosePath);\n  c.x <span style=\"color:#f92672\">=<\/span> curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>origin.x <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;  c.y <span style=\"color:#f92672\">=<\/span> curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>origin.y <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n  NewPathIsClip(false);\n  CallNewPoint(TransformCoord(c, <span style=\"color:#f92672\">&amp;<\/span>imageTransform));\n  c.x <span style=\"color:#f92672\">+=<\/span> curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>width <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n  CallNewPoint(TransformCoord(c, <span style=\"color:#f92672\">&amp;<\/span>imageTransform));\n  c.y <span style=\"color:#f92672\">+=<\/span> curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>height <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n  CallNewPoint(TransformCoord(c, <span style=\"color:#f92672\">&amp;<\/span>imageTransform));\n  c.x <span style=\"color:#f92672\">-=<\/span> curImageSlice<span style=\"color:#f92672\">-&gt;<\/span>width <span style=\"color:#f92672\">-<\/span> 
<span style=\"color:#ae81ff\">2<\/span>;\n  CallNewPoint(TransformCoord(c, <span style=\"color:#f92672\">&amp;<\/span>imageTransform));\n  ReducerClosePath();\n  Reduce(CallDeviceImageTrap, true, false);\n  } <span style=\"color:#66d9ef\">while<\/span> (true);\n} <span style=\"color:#75715e\">\/* end of psImage *\/<\/span>\n\n<\/code><\/pre><\/div><p>The code is surprisingly generalizable &mdash; once an image is rasterized, it can then be inserted into the document. Later versions of PostScript included support for compressed images, but that was more for convenience than necessity. As long as chunks of the image could fit within memory, they could be translated into the final output, even with additional transformations.<\/p>\n<h3 id=\"mini-reducer-for-simple-quadrilaterals\">Mini-Reducer for Simple Quadrilaterals<\/h3>\n<p>The &ldquo;heart of the graphics engine,&rdquo; the reducer algorithm translates arbitrary shapes into a raster image. Although the reducer algorithm is still considered a trade secret by Adobe forty years later, and thus not part of the source code release, (Warnock and Geschke 2019) provided a high-level description of the algorithm. The algorithm uses a plane-sweep approach to convert arbitrary shapes into non-overlapping trapezoids. The trapezoids are then rasterized to the final output. The original implementation used floating point, but this led to errors where line segments would &ldquo;braid&rdquo; around each other. Apple loaned one of their programmers, Jerome Coonen, who helped convert the algorithm to use fixed point.<\/p>\n<p>While the full reducer is absent, the source code does include a &ldquo;mini-reducer&rdquo;, as it is commented, within <code>stroke.c<\/code>. One of the better commented sections of code, the algorithm reduces simple cases of convex quadrilaterals (i.e. a four-sided polygon with all interior angles less than 180 degrees) that lie completely within the clipping region to a rasterized image. 
In the code below, the \/* comments are original while the \/\/ comments are added by the author.<\/p>\n<div class=\"highlight\"><div style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\">\n<table style=\"border-spacing:0;padding:0;margin:0;border:0;width:auto;overflow:auto;display:block;\"><tr><td style=\"vertical-align:top;padding:0;margin:0;border:0;\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 1\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 2\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 3\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 4\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 5\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 6\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 7\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 8\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\"> 9\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">10\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">11\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">12\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">13\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">14\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">15\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">16\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">17\n<\/span><span style=\"margin-right:0.4em;padding:0 
0.4em 0 0.4em;color:#7f7f7f\">18\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">19\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">20\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">21\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">22\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">23\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">24\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">25\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">26\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">27\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">28\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">29\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">30\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">31\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">32\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">33\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">34\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">35\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">36\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">37\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">38\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">39\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">40\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 
0.4em;color:#7f7f7f\">41\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">42\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">43\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">44\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">45\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">46\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">47\n<\/span><span style=\"margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#7f7f7f\">48\n<\/span><\/code><\/pre><\/td>\n<td style=\"vertical-align:top;padding:0;margin:0;border:0;;width:100%\">\n<pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\">private procedure <span style=\"color:#a6e22e\">FastFillQuad<\/span>(c1, c2, c3, c4)  Coord c1, c2, c3, c4;\n{ <span style=\"color:#75715e\">\/* this mini-reducer is for use only on convex quadrilaterals completely\n<\/span><span style=\"color:#75715e\">     contained in the clipping region *\/<\/span>\nQuadCorner pq[<span style=\"color:#ae81ff\">4<\/span>]; <span style=\"color:#75715e\">\/* to hold pointers to qc[0-3] in bottom-up order *\/<\/span>\nQuadCorner qcptr, t;\n<span style=\"color:#75715e\">\/\/ y top, y bottom, x top left, x top right, x bottom left, x bottom right, temporary\n<\/span><span style=\"color:#75715e\"><\/span>Fixed yt, yb, xtl, xtr, xbl, xbr, tfixed;\n\n<span style=\"color:#75715e\">\/\/ Fix(a.b) returns the fractional portion, b, or the lower 16 bits\n<\/span><span style=\"color:#75715e\"><\/span>qc[<span style=\"color:#ae81ff\">0<\/span>].c.x <span style=\"color:#f92672\">=<\/span> Fix(c1.x); qc[<span style=\"color:#ae81ff\">0<\/span>].c.y <span style=\"color:#f92672\">=<\/span> Fix(c1.y);\nqc[<span style=\"color:#ae81ff\">1<\/span>].c.x <span 
style=\"color:#f92672\">=<\/span> Fix(c2.x); qc[<span style=\"color:#ae81ff\">1<\/span>].c.y <span style=\"color:#f92672\">=<\/span> Fix(c2.y);\nqc[<span style=\"color:#ae81ff\">2<\/span>].c.x <span style=\"color:#f92672\">=<\/span> Fix(c3.x); qc[<span style=\"color:#ae81ff\">2<\/span>].c.y <span style=\"color:#f92672\">=<\/span> Fix(c3.y);\nqc[<span style=\"color:#ae81ff\">3<\/span>].c.x <span style=\"color:#f92672\">=<\/span> Fix(c4.x); qc[<span style=\"color:#ae81ff\">3<\/span>].c.y <span style=\"color:#f92672\">=<\/span> Fix(c4.y);\n\n<span style=\"color:#75715e\">\/* bubble sort pointers to qc[0 - 3] in pq[0 - 3] so that\n<\/span><span style=\"color:#75715e\">   lowest (y,x) comes first. *\/<\/span>\n<span style=\"color:#75715e\">\/\/ sorts only on y; x coordinates are handled in the sweep\n<\/span><span style=\"color:#75715e\"><\/span>pq[<span style=\"color:#ae81ff\">0<\/span>] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>qc[<span style=\"color:#ae81ff\">0<\/span>];  pq[<span style=\"color:#ae81ff\">1<\/span>] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>qc[<span style=\"color:#ae81ff\">1<\/span>];  pq[<span style=\"color:#ae81ff\">2<\/span>] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>qc[<span style=\"color:#ae81ff\">2<\/span>];  pq[<span style=\"color:#ae81ff\">3<\/span>] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>qc[<span style=\"color:#ae81ff\">3<\/span>];\nBubble(pq[<span style=\"color:#ae81ff\">0<\/span>],pq[<span style=\"color:#ae81ff\">1<\/span>])\nBubble(pq[<span style=\"color:#ae81ff\">1<\/span>],pq[<span style=\"color:#ae81ff\">2<\/span>])\nBubble(pq[<span style=\"color:#ae81ff\">2<\/span>],pq[<span style=\"color:#ae81ff\">3<\/span>])\nBubble(pq[<span style=\"color:#ae81ff\">1<\/span>],pq[<span style=\"color:#ae81ff\">2<\/span>])\nBubble(pq[<span style=\"color:#ae81ff\">0<\/span>],pq[<span 
style=\"color:#ae81ff\">1<\/span>])\nBubble(pq[<span style=\"color:#ae81ff\">1<\/span>],pq[<span style=\"color:#ae81ff\">2<\/span>])\n\nyb <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">0<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.y;  xbl <span style=\"color:#f92672\">=<\/span> xbr <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">0<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;  yt <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">1<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.y;\n<span style=\"color:#66d9ef\">if<\/span> (yt <span style=\"color:#f92672\">==<\/span> yb)  <span style=\"color:#75715e\">\/* horizontal bottom -- no initial trapezoid *\/<\/span>\n  {xbr <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">1<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x; <span style=\"color:#66d9ef\">if<\/span> (xbr <span style=\"color:#f92672\">&lt;<\/span> xbl) {tfixed <span style=\"color:#f92672\">=<\/span> xbr; xbr <span style=\"color:#f92672\">=<\/span> xbl; xbl <span style=\"color:#f92672\">=<\/span> tfixed;}}\n<span style=\"color:#66d9ef\">else<\/span>  QuadTrap(pq[<span style=\"color:#ae81ff\">1<\/span>],pq[<span style=\"color:#ae81ff\">0<\/span>])\nyt <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">2<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.y;\n<span style=\"color:#66d9ef\">if<\/span> (yt <span style=\"color:#f92672\">!=<\/span> yb)  QuadTrap(pq[<span style=\"color:#ae81ff\">2<\/span>],pq[<span style=\"color:#ae81ff\">3<\/span>])\nyt <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.y;\n<span style=\"color:#66d9ef\">if<\/span> (yt <span style=\"color:#f92672\">==<\/span> yb)\n  {\n  <span style=\"color:#66d9ef\">if<\/span> (yt <span style=\"color:#f92672\">==<\/span> pq[<span style=\"color:#ae81ff\">0<\/span>]<span 
style=\"color:#f92672\">-&gt;<\/span>c.y)\n    {\n    <span style=\"color:#66d9ef\">if<\/span>      (xbl <span style=\"color:#f92672\">&gt;<\/span> pq[<span style=\"color:#ae81ff\">2<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x) xbl <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">2<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;\n    <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#a6e22e\">if<\/span> (xbr <span style=\"color:#f92672\">&lt;<\/span> pq[<span style=\"color:#ae81ff\">2<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x) xbr <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">2<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;\n    <span style=\"color:#66d9ef\">if<\/span>      (xbl <span style=\"color:#f92672\">&gt;<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x) xbl <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;\n    <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#a6e22e\">if<\/span> (xbr <span style=\"color:#f92672\">&lt;<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x) xbr <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;\n    }\n  <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#66d9ef\">return<\/span>;\n  }\nxtr <span style=\"color:#f92672\">=<\/span> xtl <span style=\"color:#f92672\">=<\/span> pq[<span style=\"color:#ae81ff\">3<\/span>]<span style=\"color:#f92672\">-&gt;<\/span>c.x;\n(<span style=\"color:#f92672\">*<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>outputDevice<span style=\"color:#f92672\">-&gt;<\/span>ColorTrap)\n  (yt, yb, xtl, xtr, xbl, xbr, gs<span style=\"color:#f92672\">-&gt;<\/span>color, gs<span 
style=\"color:#f92672\">-&gt;<\/span>screen);\n}  <span style=\"color:#75715e\">\/* end of FastFillQuad *\/<\/span>\n\n<\/code><\/pre><\/td><\/tr><\/table>\n<\/div>\n<\/div><p><code>QuadTrap<\/code> is a macro defined as:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\"><span style=\"color:#75715e\">#define QuadTrap(qcp1,qcp2) { \\\n<\/span><span style=\"color:#75715e\">  xtr = qcp1-&gt;c.x; \\\n<\/span><span style=\"color:#75715e\">  <\/span><span style=\"color:#75715e\">\/\/ ptr1 is &#34;next&#34; while ptr2 is &#34;previous&#34; in the doubly-linked list of the path\n<\/span><span style=\"color:#75715e\"><\/span>  qcptr <span style=\"color:#f92672\">=<\/span> (qcp2<span style=\"color:#f92672\">-&gt;<\/span>ptr1 <span style=\"color:#f92672\">==<\/span> qcp1) <span style=\"color:#f92672\">?<\/span> qcp2<span style=\"color:#f92672\">-&gt;<\/span>ptr2 : qcp2<span style=\"color:#f92672\">-&gt;<\/span>ptr1; \\\n  xtl <span style=\"color:#f92672\">=<\/span> qcp2<span style=\"color:#f92672\">-&gt;<\/span>c.x \\\n   <span style=\"color:#f92672\">+<\/span> Fix(FReal(yt <span style=\"color:#f92672\">-<\/span> qcp2<span style=\"color:#f92672\">-&gt;<\/span>c.y) <span style=\"color:#f92672\">*<\/span> FReal(qcptr<span style=\"color:#f92672\">-&gt;<\/span>c.x <span style=\"color:#f92672\">-<\/span> qcp2<span style=\"color:#f92672\">-&gt;<\/span>c.x) \\\n         <span style=\"color:#f92672\">\/<\/span> FReal(qcptr<span style=\"color:#f92672\">-&gt;<\/span>c.y <span style=\"color:#f92672\">-<\/span> qcp2<span style=\"color:#f92672\">-&gt;<\/span>c.y)); \\\n  <span style=\"color:#66d9ef\">if<\/span> (xtr <span style=\"color:#f92672\">&lt;<\/span> xtl) {tfixed <span style=\"color:#f92672\">=<\/span> xtr; xtr <span style=\"color:#f92672\">=<\/span> xtl; xtl <span style=\"color:#f92672\">=<\/span> tfixed;} \\\n  (<span 
style=\"color:#f92672\">*<\/span>gs<span style=\"color:#f92672\">-&gt;<\/span>outputDevice<span style=\"color:#f92672\">-&gt;<\/span>ColorTrap) \\\n    (yt, yb, xtl, xtr, xbl, xbr, gs<span style=\"color:#f92672\">-&gt;<\/span>color, gs<span style=\"color:#f92672\">-&gt;<\/span>screen); \\\n  yb <span style=\"color:#f92672\">=<\/span> yt;  xbl <span style=\"color:#f92672\">=<\/span> xtl;  xbr <span style=\"color:#f92672\">=<\/span> xtr; \\\n  }\n<\/code><\/pre><\/div><p>Figure D illustrates how an example quadrilateral is filled with trapezoids. The left-most image traces the algorithm through line 26, the second through line 30, the third through line 43, and the fourth through the final statement of the function on lines 45 and 46. In this algorithm, a trapezoid may be a triangle (one side has length zero) and, in this example, two triangles are produced. Starting from the bottom of the polygon, the algorithm splits the polygon via horizontal scan beams based on the polygon&rsquo;s vertices. The next trapezoid&rsquo;s bottom edge is based on the former trapezoid&rsquo;s top edge. 
Since the algorithm is only given limited, albeit common, inputs (convex quadrilateral completely within the clipping region), the algorithm only needs to handle various rotations of the quadrilateral.<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/postscript-code\/FastFillQuad-example2.png\" alt=\"Three steps to fill quadrilateral with trapezoids, bottom-up\">\n  \n  <figcaption>Figure D: Algorithm fills quadrilateral with trapezoids<\/figcaption>\n<\/figure>\n\n\n<p>A similar algorithm for decomposing convex polygons into trapezoids is briefly described in <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/800224.806789\">Parallel Processing and image synthesis and anti-aliasing<\/a>, while a more general version is Vatti&rsquo;s polygon clipping algorithm (<a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/129902.129906\">A generic solution to polygon clipping<\/a>). We consider it possible that Adobe&rsquo;s reducer is very similar to Vatti&rsquo;s algorithm as both use a plane sweep approach.<\/p>\n<p>Implementation-wise, it is notable that while the algorithm stores positions in fixed point representation, the actual operations are performed using floating point operations. The <code>FReal<\/code> macro casts the argument to a float and then divides by <code>1&lt;&lt;16<\/code>, while <code>Fix<\/code> does the opposite by casting the argument to an integer and multiplying by <code>1&lt;&lt;16<\/code>. In both cases, the compiler could highly optimize these operations as they are modifications by powers of two.<\/p>\n<p><code>FastFillQuad<\/code> avoids making any function calls except for the ColorTrap calls. Macros are used to encapsulate reused code but make free use of referencing and changing variables that are not passed as arguments. 
While this practice is unhygienic, code profiling may have indicated that the increase in performance was worth the loss of clarity.<\/p>\n<h2 id=\"alternative-paths--competitors\">Alternative Paths \/ Competitors<\/h2>\n<p>While there were a variety of proprietary printer languages that supported various raster effects, the only digital competitor to PostScript was Xerox&rsquo;s Interpress. The non-commercial TeX project solved similar issues of rasterization using Device Independent files, but their solution had limited capability.<\/p>\n<h3 id=\"interpress\">Interpress<\/h3>\n<p>An observer in the mid-1980s would have likely considered PostScript the underdog to Xerox&rsquo;s Interpress. Interpress had several advantages: it had been under development for at least twice as long as PostScript, was funded by Xerox, which had pioneered many desktop publishing technologies and had lots of capital, and had a captive PARC community to validate the design. However, Xerox management dithered away their advantages (as they did for the rest of their PARC investment) and Adobe won.<\/p>\n<p>Ignoring the business side, based on their technical merits, how did Interpress and PostScript compare? In a <a href=\"https:\/\/www.tech-insider.org\/unix\/research\/1985\/0301.html\">pair of Usenet postings to the fa.laser-lovers newsgroup<\/a> in 1985, Brian Reid and Jerry Mendelson compared the two systems. Brian Reid based his analysis on the publicly released 1984 Interpress manual (which, he noted to his surprise, was created via Press, not Interpress). Reid had worked for Xerox as a consultant on the Interpress project alongside Geschke, Warnock, Lampson, and Sproull. Sproull was formerly Reid&rsquo;s thesis advisor and Reid later wrote the &ldquo;Green Book,&rdquo; an introduction to the PostScript language. When Mendelson wrote his reply, he was a retired Xerox research fellow, but an active Xerox consultant. 
He had also worked on the transition of Interpress from a research project into a commercial product. Thus, both writers had first-hand experience with the technology and Xerox&rsquo;s culture.<\/p>\n<p>Both writers agreed that Interpress had better support for interacting with and controlling printers, such as controlling page order, informing accounting systems of printing expenses, and selecting optional features such as stapling. Interpress files also imposed a structure on the program that meant the expected page count could be discerned from the structure and individual pages could be printed in any order. (This feature would later be part of the PDF format.) While the lack of this structure could work to PostScript&rsquo;s advantage, for instance if a PostScript program was translating another format such as QuickDraw to itself, commercial printers and companies with shared printers appreciated the determinism of Interpress.<\/p>\n<p>Mendelson considered Interpress' binary encoding format superior to PostScript&rsquo;s character stream due to the binary format&rsquo;s reduced space requirements. Mendelson also considered the binary format a better fit for the Xerox Network Systems environment and, indeed, Interpress supported fetching resources remotely via XNS paths, while PostScript required users to place any external files into the printer&rsquo;s local filesystem (if it existed). Reid was more ambivalent, treating the two formats as expressing different trade-offs.<\/p>\n<p>Language-wise, both PostScript and Interpress were stack-based languages with very similar functions and design. Both writers considered PostScript more flexible and &ldquo;open&rdquo; as users could write operators that replaced system operators and PostScript could generate and execute its own code.<\/p>\n<p>Graphically, although both systems had similar origins, PostScript supported curves and arbitrary rotations of objects in 1985, while Interpress did not. 
Mendelson expected Interpress to close the gap over time, while Reid wondered why the gap existed at all since the underlying algorithms were publicly documented and Interpress had had more development time.<\/p>\n<p>Interpress did not document its approach to fonts in 1985, so Reid was unable to compare the two systems, but Mendelson considered their approach to fonts equivalent. However, if we look at how Interpress handled fonts as described in the 1988 <em>Interpress: the Source Book<\/em>, fonts were restricted to 90 degree rotations in the Commercial and Publication feature sets. Only the most expensive (and limited in availability) Professional feature set supported arbitrary rotations. Further, the book warns that fonts may drift over time and, if a requested font is not available, the Interpress system will choose an approximate font. This means that character spacing and alignment may be broken if a different font than the original is chosen. Interpress contained a <code>correct<\/code> operator that attempted to fix these issues, but PostScript, with the ability to embed fonts, never required a similar function.<\/p>\n<p>The strongest technical advantage for Interpress was the document structure, which PDF incorporated in the early 90s. However, PostScript&rsquo;s adaptability was highly useful for integrating systems at the time, such as Apple&rsquo;s QuickDraw. In terms of the other technical aspects, the fact that PostScript was fully-featured and available was more important than features that were <em>expected<\/em> to be as good or better.<\/p>\n<h3 id=\"device-independent-dvi-files\">Device Independent (DVI) Files<\/h3>\n<p>Designed in 1979 as part of the TeX project, Device Independent (DVI) files act as an intermediary page description language. DVI files are read by a driver, which could render the DVI to the screen, a printer, or another representation. 
DVI shares advantages with the later-developed PDF standard in that, while the format programmatically builds a page, the language is not Turing-complete and thus can be interpreted with predictable performance, capacity requirements, and assurance that processing will terminate. Additionally, the page count is fixed and each page can be rendered in any order.<\/p>\n<p>However, unlike PostScript and PDFs, DVI does not have a means to embed fonts. Fonts can only be referenced from DVIs, meaning they must be installed before being rendered. Further, DVI only supports rectangles as graphics. Graphics are normally embedded within the file as a &lsquo;special,&rsquo; and the driver is responsible for rendering them. DVI was never a real competitor to PostScript, but rather DVI was a bridge from TeX to PostScript, as authors would embed PostScript as the graphics format and convert everything to PostScript (and later PDF) for the printer.<\/p>\n<h2 id=\"print-anything-philosophy\">&ldquo;Print Anything&rdquo; Philosophy<\/h2>\n<p>The founders&rsquo; disagreements with Xerox may be boiled down to a philosophical difference between &ldquo;print anything; throughput be damned&rdquo; and &ldquo;print most things; preserve our business model&rdquo;. If a Press document was too complex for the printer, the job would be rejected. Xerox&rsquo;s primary business billed by pages printed and thus throughput was key. This philosophy was carried forward in Interpress by limiting functionality by market segments and restricting the flexibility of the programs to improve page rendering times. In contrast (Perry 1988):<\/p>\n<blockquote>\n<p>\u201cThroughout the design of PostScript, speed was regularly traded off to ensure that any image would print. 
The group reasoned that if they built in all this functionality, they could eventually improve the performance; but if they left out functions, they might never be able to add them back in.\u201d<\/p>\n<\/blockquote>\n<p>Within the source code, we find the building blocks of documents such as paths and sampled arrays are treated in the most general fashion. This generality imposes more work on the program (and on development and testing effort), but the end user benefits by having fewer arbitrary limits placed on their design and creativity.<\/p>\n<p>For example, PostScript decomposed fonts into paths. PostScript allowed paths to be translated, rotated, skewed, and scaled arbitrarily. In contrast, contemporary systems treated fonts as a special case and <em>might<\/em> allow fonts to be rotated 90 degrees, but heavily restricted scaling and skewing. As part of their market segmentation, Interpress only allowed font rotation by 90 degrees in their lowest-priced offering. (It is unclear if the Interpress code supported arbitrary rotation or not; a full-featured version was never released.) Decomposing fonts into paths required inventing new algorithms and, to maintain quality, entirely new ways of rendering fonts. Creatives immediately embraced the flexibility. This flexibility was a key competitive advantage for Adobe until the Font War of the 1990s led to competitors implementing similar algorithms.<\/p>\n<h3 id=\"limits-of-print-anything\">Limits of &ldquo;Print Anything&rdquo;<\/h3>\n<p>As a Turing-complete language, PostScript programs could exceed the limits of the device, either by exceeding the stack depth, requiring more than the allocated number of dictionary keys, or otherwise exhausting memory or patience. The &ldquo;Red Book&rdquo; documented many of these static limits. 
The same language power, however, allowed programmers the possibility of rewriting the program to fit within the device&rsquo;s capabilities.<\/p>\n<h2 id=\"deployment-and-history-after-version-10\">Deployment and History after Version 1.0<\/h2>\n<p>After 1.0, PostScript&rsquo;s font handling algorithms were overhauled and dramatically improved in quality. PostScript was embedded into Apple&rsquo;s new LaserWriter in 1985 and, three years later, had been adopted by 23 manufacturers (Perry 1988). Typefaces proved to be a fertile marketplace as foundries converted their back catalogs into digital forms and designers reveled in the freedom to easily experiment with new fonts. Adobe released two major updates to PostScript in 1991 and 1997, improving reliability and expanding support for color, page support, and ways to embed images and other media.<\/p>\n<p>Adobe&rsquo;s introduction of PDF in 1993 began to obsolete PostScript. PDF used the same graphics model as PostScript, but dropped the Turing-complete language. PDF could thus be printed at the same or higher throughput as PostScript, but also gained the advantage that individual pages could be referenced (an advantage Interpress had) and rendered separately. As PDFs gained support for forms and interactivity, clients adopted PDFs as a key workflow technology and their use exploded in popularity.<\/p>\n<h2 id=\"references\">References<\/h2>\n<p>(Adobe 1993) Adobe Systems Incorporated. 1993. \u201cAdobe Type 1 Font Format.\u201d Addison-Wesley Publishing Company, Inc. <a href=\"https:\/\/adobe-type-tools.github.io\/font-tech-notes\/pdfs\/T1_SPEC.pdf\">https:\/\/adobe-type-tools.github.io\/font-tech-notes\/pdfs\/T1_SPEC.pdf<\/a>.<\/p>\n<p>(Perry 1988) Perry, Tekla. 1988. \u201cInventing Postscript, the Tech That Took the Pain out of Printing.\u201d IEEE Spectrum, May 1, 1988. 
<a href=\"https:\/\/spectrum.ieee.org\/adobe-postscript\">https:\/\/spectrum.ieee.org\/adobe-postscript<\/a>.<\/p>\n<p>(Reid 1988) Reid, Glenn. 1988. PostScript Language Program Design. Addison-Wesley Publishing Company, Inc.<\/p>\n<p>(Sproull and Reid 1983) Sproull, Robert F., and Brian Reid. 1983. <em>Introduction to Interpress<\/em>. XSIG 038306. El Segundo, CA: Xerox Corporation. <a href=\"http:\/\/www.bitsavers.org\/pdf\/xerox\/interpress\/XSIG_038306_Introduction_to_Interpress_Jun1983.pdf\">http:\/\/www.bitsavers.org\/pdf\/xerox\/interpress\/XSIG_038306_Introduction_to_Interpress_Jun1983.pdf<\/a>.<\/p>\n<p>(Warnock and Wyatt 1982) Warnock, John E, and Douglas K Wyatt. 1982. \u201cA Device Independent Graphics Imaging Model for Use with Raster Devices.\u201d Proceedings of the 9th Annual Conference on Computer Graphics and Interactive Techniques 16 (3): 313\u201319. <a href=\"https:\/\/doi.org\/10.1145\/965145.801297\">https:\/\/doi.org\/10.1145\/965145.801297<\/a>.<\/p>\n<p>(Warnock 2012) Warnock, John E. 2012. \u201cSimple Ideas That Changed Printing and Publishing.\u201d Proceedings of the American Philosophical Society 156 (4): 363\u201378. <a href=\"http:\/\/www.jstor.org\/stable\/23558230\">http:\/\/www.jstor.org\/stable\/23558230<\/a><\/p>\n<p>(Warnock and Geschke 2019) Warnock, John E, and Charles Geschke. 2019. \u201cFounding and Growing Adobe Systems, Inc.\u201d IEEE Annals of the History of Computing 41 (3): 24\u201334. <a href=\"https:\/\/doi.org\/10.1109\/MAHC.2019.2923397\">https:\/\/doi.org\/10.1109\/MAHC.2019.2923397<\/a>.<\/p>\n<p>(Xerox 2022) \u201cThe Xerox 2700 Story.\u201d 2022. Association of Retired Xerox Employees. 
<a href=\"https:\/\/archive.org\/details\/the-xerox-2700-story\">https:\/\/archive.org\/details\/the-xerox-2700-story<\/a>.<\/p>"},{"title":"ZLister Announcement","link":"https:\/\/ztoz.blog\/posts\/zlister-announcement\/","pubDate":"Thu, 22 Aug 2024 20:50:39 -0600","guid":"https:\/\/ztoz.blog\/posts\/zlister-announcement\/","description":"<p>ZLister, a to-do list management application, is now available. <a href=\"https:\/\/zlister.ztoz.blog\">ZLister<\/a> allows you to create lists of entries, and entries can be marked complete or incomplete (&ldquo;to do&rdquo;). ZLister&rsquo;s user interface is designed for mobile devices, but may also be used on the desktop. ZLister does not store your data in the cloud and can run offline.<!-- more --><\/p>\n<p><img src=\"zlister-index-small.png\" alt=\"ZLister Start Page\" title=\"ZLister's start page showing lists\"><\/p>\n<h2 id=\"motivation-and-goals\">Motivation and Goals<\/h2>\n<p>For the last several years, I have been using Amazon&rsquo;s Alexa app to store checklists of grocery items, books to read, and other miscellanea. However, as checklists are a side function within the app, I found the interface often clunky and the app required large amounts of resources to run for functionality I did not use. Furthermore, the app has recently added intrusive recommendations into the interface which emphasized its role as a channel for data exfiltration and ads.<\/p>\n<p>As a programmer, I decided to write my own application with the following goals:<\/p>\n<ul>\n<li>Local-only. Beyond installing the application, there should not be any server or network calls.<\/li>\n<li>Low maintenance. The tech stack was chosen to work on a majority of devices and to rely on libraries and APIs that are unlikely to require any changes.<\/li>\n<li>Data freedom. Users can export and import data from files without any special intermediaries.<\/li>\n<\/ul>\n<h2 id=\"features\">Features<\/h2>\n<ul>\n<li>Create named lists. 
Lists can contain entries and each entry has a short description.<\/li>\n<li>Entries can be toggled between incomplete and complete states.<\/li>\n<li>Lists and entries can be renamed, soft deleted, undeleted, and hard deleted.<\/li>\n<li>List data can be exported or imported.<\/li>\n<\/ul>\n<p><img src=\"zlister-shopping-small.png\" alt=\"Example List Page\" title=\"Shopping list page showing items to pick up at store\"><\/p>\n<h2 id=\"is-zlister-novel\">Is ZLister novel?<\/h2>\n<p>No. To-do apps are often used as tutorial exercises and checklist functionality is widely available in many applications.<\/p>\n<p>Due to the commodification of this functionality, many implementations are distributed for free in return for data on users&rsquo; shopping habits and as a channel for advertisements. In contrast, ZLister was designed not to collect data and to have maintenance costs so low that monetization isn&rsquo;t necessary.<\/p>\n<h2 id=\"does-zlister-cost-anything-what-is-its-license\">Does ZLister cost anything? What is its license?<\/h2>\n<p>There are no installation or subscription costs for using ZLister. My costs are time and running a static site, which, barring unexpected exorbitant bandwidth costs, should be minimal.<\/p>\n<p>The source code is licensed under the GPLv3.<\/p>\n<h2 id=\"what-is-the-tech-stack\">What is the tech stack?<\/h2>\n<p>The app is written in <a href=\"https:\/\/www.typescriptlang.org\/\">TypeScript<\/a> and uses HTML and standard Web APIs (no frameworks). There is one production dependency, a uuid library, which I might vendor as it is fairly small.<\/p>\n<p>For development, tests are in <a href=\"https:\/\/jestjs.io\/\">Jest<\/a>, browser tests use <a href=\"https:\/\/playwright.dev\/\">Playwright<\/a>, and <a href=\"https:\/\/esbuild.github.io\/\">esbuild<\/a> is the bundler\/transpiler.<\/p>\n<h2 id=\"what-browsers-are-supported-system-requirements\">What browsers are supported? 
System requirements?<\/h2>\n<p>Theoretically, all browsers should work. I primarily test against Firefox, although <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1907304\">certain<\/a> <a href=\"https:\/\/bugzilla.mozilla.org\/buglist.cgi?quicksearch=offline+cache\">bugs<\/a> make this difficult. I do not have an easy way to test Safari or iOS. The browser test suite runs against Firefox, Chrome, and WebKit, although certain tests are skipped against WebKit due to apparent test framework limitations.<\/p>\n<p>In my experience, offline mode has been the most error-prone aspect.<\/p>\n<p>Disk storage requirements for the application are about 512 KB.<\/p>\n"},{"title":"Study of Historical Code","link":"https:\/\/ztoz.blog\/posts\/code-study\/","pubDate":"Wed, 01 May 2024 16:02:15 -0600","guid":"https:\/\/ztoz.blog\/posts\/code-study\/","description":"<p>I&rsquo;ve started studying a larger historical code base. Within this post, I want to summarize the sort of historical questions we might ask and notes on how to approach them.<\/p>\n<h1 id=\"objective\">Objective<\/h1>\n<p>My objectives for studying and writing about historical source code are to understand and communicate the:<\/p>\n<ol>\n<li>Intent and purpose of the software<\/li>\n<li>Design, engineering trade offs, and technical decisions<\/li>\n<li>Significance and influence of the code, its other forms, and how it was used<\/li>\n<li>Authorship, the process of development, inspirations, and why it was written<\/li>\n<\/ol>\n<p>In his speech about history writing, [Knuth]&rsquo;s (paraphrased) list was:<\/p>\n<ol>\n<li>Understand the process of discovery<\/li>\n<li>Understand the process of failure<\/li>\n<li>Celebrate the contributions of many cultures<\/li>\n<li>Telling historical stories as the best way to teach<\/li>\n<li>Learn how to cope with life<\/li>\n<li>Become more familiar with the world, and to know how science fits into the overall history of mankind<\/li>\n<\/ol>\n<p>In contrast to 
Knuth&rsquo;s list, my list is less focused on the &ldquo;lives of scientists&rdquo; angle, although I am similarly interested in the process of development and process of failure and recognizing sources of influence and contribution. For these kinds of studies, I am less interested in the development of particular algorithms or discoveries and more interested in larger-scale engineering efforts, which by nature tend to be more impersonal.<\/p>\n<p>There appear to be very few studies of historical code. [Charoenwet] is motivated by historical analysis rather than algorithmic analysis; however, the paper is focused on a methodological experiment using LDA rather than on the source code as text. The field of archaeogaming, which focuses on using archaeological techniques on digital games and worlds, has featured papers focusing on technical methods used in games (e.g. [Aycock] with its analysis of a maze generation algorithm). Thus, as more historical sources come to light, this appears to be a wide-open field for new insights and methods.<\/p>\n<h1 id=\"historical-questions\">Historical Questions<\/h1>\n<p>[Wardhaugh] discusses how to read historical mathematics. Paralleling that list, we can similarly analyze source code.<\/p>\n<h2 id=\"what-does-it-say--do\">What does it say \/ do?<\/h2>\n<ul>\n<li>Programs are (almost always) written to perform some functional purpose. What was that purpose?<\/li>\n<\/ul>\n<p>The source code describes the computation of some business logic, within some constraints. I suspect we will usually have more than just the source code, which can shed additional light on the code.<\/p>\n<ul>\n<li>\n<p>What data types encode the business domain? What algorithms are used to compute the results? What input\/output is used to read in data and communicate the result?<\/p>\n<\/li>\n<li>\n<p>What programming language or languages are used? Does the construction follow modern ideals or does it follow unusual or archaic patterns? 
How would this program interface with the execution environment, including hardware, operating system, and other programs?<\/p>\n<\/li>\n<\/ul>\n<p>Older programs are likely to be batch-oriented, reading a stream of records (likely passed in via cards or tape), with variables and control logic either provided in-band or out-of-band. Later programs may be more file-oriented or interactive. These clues may inform a &ldquo;potsherd&rdquo;-like system enabling dating of programs.<\/p>\n<ul>\n<li>Can the program be translated into a modern language or structure?<\/li>\n<\/ul>\n<p>This question is less about &ldquo;can&rdquo; and more about understandability for modern readers.<\/p>\n<h2 id=\"who-developed-it\">Who developed it?<\/h2>\n<ul>\n<li>Are there parts of the design, implementation, or documentation that indicate authorship and place of development?<\/li>\n<\/ul>\n<p>Differing code styles may point to multiple authors, or development across time, although a consistent code style may just indicate multiple authors were working from the same guide or were similarly educated\/trained.<\/p>\n<p>Names of authors may be hidden as easter eggs or as <a href=\"https:\/\/en.wikipedia.org\/wiki\/DOS_MZ_executable\">magic numbers<\/a>.<\/p>\n<ul>\n<li>Why did the author or authors develop it? Did the author&rsquo;s background or circumstances influence the result?<\/li>\n<\/ul>\n<p>This typically requires research beyond the source code, although [Wardhaugh] includes a letter which obliquely sheds light on the author&rsquo;s circumstances. Source code is rarely narrative, though.<\/p>\n<ul>\n<li>Who were the author&rsquo;s colleagues? mentors? enemies?<\/li>\n<\/ul>\n<p>These questions also require research beyond the source code, although developers have often expressed frustrations with partners \/ hardware \/ customers in source code comments. 
If the source code was originally commercial, however, these comments are likely to have been scrubbed.<\/p>\n<p>Authorship may be afflicted with the &ldquo;most famous person associated&rdquo; curse, so we need to be careful in interpreting the evidence.<\/p>\n<h2 id=\"how-was-it-built\">How was it built?<\/h2>\n<ul>\n<li>\n<p>Was this code meant to serve a short-term use (like a specific study or job), or was it intended for long-term use?<\/p>\n<\/li>\n<li>\n<p>If long-term use, were choices made to make it more maintainable?<\/p>\n<\/li>\n<li>\n<p>How much effort or time was involved?<\/p>\n<\/li>\n<\/ul>\n<p>If we have access to source code control history, we can infer the timeline with considerable accuracy. If not, and we do not have external data on the development, there are also models based on lines of code (e.g. COCOMO) to provide some suggestion.<\/p>\n<ul>\n<li>\n<p>Did it have a clarity of purpose or did the design change over time?<\/p>\n<\/li>\n<li>\n<p>Who did the development team expect to use the program? Who actually did?<\/p>\n<\/li>\n<li>\n<p>Why has this source code survived (and become available for study)?<\/p>\n<\/li>\n<\/ul>\n<p>Although some code survives as printouts stored in a garage, most code is only available if there is a deliberate decision to retain and release it. Filters such as media decay, lack of archiving, and company failures all lead to the loss of code.<\/p>\n<ul>\n<li>Has this code been translated or modified before becoming available for study?<\/li>\n<\/ul>\n<h2 id=\"who-consumed-it\">Who consumed it?<\/h2>\n<ul>\n<li>\n<p>Who read or used this program (which may differ from the intended audience)? In what kind of computing environment was it used? 
Under what licensing terms?<\/p>\n<\/li>\n<li>\n<p>If they had a choice, why did consumers choose this program over alternatives?<\/p>\n<\/li>\n<li>\n<p>Does this program fall into a genre?<\/p>\n<\/li>\n<\/ul>\n<h2 id=\"constraints-of-research\">Constraints of Research<\/h2>\n<ul>\n<li>Disclose constraints and limitations of research, including any licensing or contractual limitations on the research, as well as limitations in terms of the information available to do the study.<\/li>\n<\/ul>\n<p>For example, the Computer History Museum&rsquo;s EULA.<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>[Aycock] Aycock, John, and Tara Copplestone. 2018. \u201cEntombed: An Archaeological Examination of an Atari 2600 Game.\u201d The Art, Science, and Engineering of Programming 3 (2). <a href=\"https:\/\/doi.org\/10.22152\/programming-journal.org\/2019\/3\/4\">https:\/\/doi.org\/10.22152\/programming-journal.org\/2019\/3\/4<\/a>.<\/p>\n<p>[Charoenwet] Charoenwet, Wachiraphan. 2018. \u201cA Digital Collection Study and Framework Exploration &mdash; Applying Textual Analysis on Source Code Collection.\u201d In 2018 3rd Digital Heritage International Congress (DigitalHERITAGE), 1\u20138. <a href=\"https:\/\/doi.org\/10.1109\/DigitalHeritage.2018.8810105\">https:\/\/doi.org\/10.1109\/DigitalHeritage.2018.8810105<\/a>.<\/p>\n<p>[Knuth] Knuth, Donald, and Len Shustek. 2021. \u201cLet\u2019s Not Dumb down the History of Computer Science.\u201d Communications of the ACM 64 (2): 33\u201335. <a href=\"https:\/\/doi.org\/10.1145\/3442377\">https:\/\/doi.org\/10.1145\/3442377<\/a>.<\/p>\n<p>[Wardhaugh] Wardhaugh, Benjamin. 2010. How to Read Historical Mathematics. Princeton and Oxford: Princeton University Press. 
<a href=\"https:\/\/press.princeton.edu\/books\/hardcover\/9780691140148\/how-to-read-historical-mathematics\">https:\/\/press.princeton.edu\/books\/hardcover\/9780691140148\/how-to-read-historical-mathematics<\/a>.<\/p>"},{"title":"Reserve Studies - A Policy Analysis","link":"https:\/\/ztoz.blog\/posts\/reserve-studies\/","pubDate":"Tue, 16 Apr 2024 10:50:52 -0600","guid":"https:\/\/ztoz.blog\/posts\/reserve-studies\/","description":"<p><a href=\"https:\/\/www.caionline.org\/Advocacy\/Priorities\/ReserveStudy\/Pages\/default.aspx\">Twelve states<\/a> require community associations, such as homeowners associations or HOAs, to conduct periodic reserve studies. A reserve study is a financial planning tool for estimating an association&rsquo;s ability to fund current and future planned maintenance costs for large projects, such as roofs, pools, and asphalt roads. An underfunded reserve may lead to special assessments. Yet, due to restrictions on how reserve accounts can be used, an overfunded account ties up funds that may be better used by the community or the individual homeowners. In this post, I describe the reserve study methodology, its strengths and weaknesses, and my recommendations for improvements.<\/p>\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/reserve-studies\/iStock-1291582571-800w.jpg\" alt=\"Tree roots damaging sidewalk because the trees were misplanted\">\n  \n  <figcaption>Sidewalk damage from tree roots<\/figcaption>\n<\/figure>\n\n\n<h1 id=\"conducting-a-reserve-study\">Conducting a Reserve Study<\/h1>\n<p>Community associations have used a variety of methods to maintain shared assets. In 1964, the Federal Housing Administration funded the Urban Land Institute to study the organization and running of American community associations. 
As documented in section 14.52 of (ULI, 1964), about half of the surveyed associations held &ldquo;reserves,&rdquo; financial accounts dedicated to cover the depreciation and replacement cost of certain assets<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>. Homeowners would pay into these reserves every year as part of their assessment fees, and the associations would draw upon the account to pay for major projects.<\/p>\n<p>The study found that some associations used a &ldquo;special assessment&rdquo; mechanism as an alternative, or at least a complement, to a reserve account. When the organization needed large amounts of funds to replace an asset, the homeowners could vote to oblige everyone to pay a special, single year fee. Although the study did not advocate for special assessments becoming common, it did advocate for the <em>option<\/em> to be available and included special assessments in its model covenant agreements.<\/p>\n<p>Given the existence of a reserve account, how much money should go into it? Although (ULI, 1964) does not mention a reserve study per se, it does describe a model based on depreciation from accounting. The study uses the example of a tractor that needs periodic replacement. The cost for a replacement is spread evenly (&ldquo;straight line&rdquo;) over the estimated useful life. 
By changing the term from depreciation to deterioration, which removes tax law quirks, we get an inventory and one of the financial models used in a reserve study.<\/p>\n<p>A basic reserve study consists of:<\/p>\n<ul>\n<li>Physical analysis, including a component inventory, an assessment of each component&rsquo;s current maintenance state, remaining life, and estimated cost of replacement<\/li>\n<li>Financial analysis, including a review of the current reserve fund status and a projection of the balance against current and future maintenance needs<\/li>\n<\/ul>\n<p>Periodically, after the inventory is formed, the practitioner conducting the study will visually inspect components within the inventory and update the remaining life. Components&rsquo; lives and costs are tabulated on a yearly basis and, based on inflation, interest, and reserve contributions, the yearly balances of the reserve fund will be estimated.<\/p>\n<h2 id=\"variation-and-standardization\">Variation and Standardization<\/h2>\n<p>Founded in 1973, the Community Associations Institute (CAI) provides and develops resources for community association management. In 1998, they published the <a href=\"https:\/\/www.reservestudy.com\/wp-content\/uploads\/2019\/02\/NRSS.pdf\">Reserve Study (RS) Standards<\/a> to establish common language and methodology for the practice. Updated in <a href=\"https:\/\/www.caionline.org\/Advocacy\/Priorities\/ReserveStudy\/Documents\/CAI%20Reserve%20Study%20Standards%20May%202023%20-%20FINAL.pdf\">May 2023<\/a>, these standards are influential but do not bear legal weight in most states. 
Other professional organizations, such as the <a href=\"https:\/\/www.apra-usa.com\/\">Association of Professional Reserve Analysts<\/a> and <a href=\"https:\/\/www.capitalbudgeting.org\/\">International Capital Budgeting Institute<\/a> have developed their own, parallel standards.<\/p>\n<p>Statutory requirements for reserves and reserve studies differ state by state (Florida, for example, in 2022 added a structural integrity reserve study), but they share a common approach, often inspired by the CAI standard. Utah, for example, first adopted laws requiring HOAs to conduct a reserve analysis in 2010, but left open the contents of that analysis. (The bill originally called for a &lsquo;reserve study,&rsquo; but the language was changed to &lsquo;reserve analysis&rsquo; in committee.) In 2021, Utah updated the law and specified the contents of the reserve analysis, paralleling the CAI definition.<\/p>\n<p>As another example of a state law, <a href=\"https:\/\/nevada.public.law\/statutes\/nrs_116.31152\">Nevada&rsquo;s Reserve Study Requirements<\/a> (NRS 116.31152) are:<\/p>\n<blockquote>\n<ol start=\"3\">\n<li>The study of the reserves must include, without limitation:\n(a) A summary of an inspection of the major components of the common elements and any other portion of the common-interest community that the association is obligated to maintain, repair, replace or restore;\n(b) An identification of the major components of the common elements and any other portion of the common-interest community that the association is obligated to maintain, repair, replace or restore which have a remaining useful life of less than 30 years;\n(c) An estimate of the remaining useful life of each major component of the common elements and any other portion of the common-interest community that the association is obligated to maintain, repair, replace or restore identified pursuant to paragraph (b);\n(d) An estimate of the cost of maintenance, repair, replacement or restoration of 
each major component of the common elements and any other portion of the common-interest community identified pursuant to paragraph (b) during and at the end of its useful life; and\n(e) An estimate of the total annual assessment that may be necessary to cover the cost of maintaining, repairing, replacement or restoration of the major components of the common elements and any other portion of the common-interest community identified pursuant to paragraph (b), after subtracting the reserves of the association as of the date of the study, and an estimate of the funding plan that may be necessary to provide adequate funding for the required reserves.<\/li>\n<\/ol>\n<\/blockquote>\n<p>Note that the Nevada law does not require any specific methodology for how estimates are obtained nor how they are calculated. While not specifying a specific methodology <em>per se<\/em>, a 2024 update to <a href=\"https:\/\/trackbill.com\/bill\/tennessee-senate-bill-863-real-property-as-enacted-requires-a-board-of-directors-for-a-unit-owners-association-to-either-update-a-reserve-study-or-conduct-a-reserve-study-if-certain-conditions-are-met-amends-tca-title-66-chapter-27\/2347430\/\">Tennessee law<\/a> certainly treats the CAI standards as a de jure standard by defining a reserve study as:<\/p>\n<blockquote>\n<p>&ldquo;Reserve study&rdquo; means an analysis, prepared in conformity with the latest edition\nof the Reserve Study Standards published by the Community Associations Institute, or\nsimilar standards by another nationally recognized organization, by a reserve specialist who is credentialed through the Community Associations Institute or a similarly recognized organization, or a licensed engineer or architect,<\/p>\n<\/blockquote>\n<p>While most states allow anyone to conduct a reserve study, including the community board, some states apply restrictions. 
Florida requires a licensed engineer or architect to conduct the visual inspections of the components, but does not require a license for the other sections. Hawaii requires the study to be reviewed by an &ldquo;independent reserve study preparer.&rdquo; Maryland <a href=\"https:\/\/mgaleg.maryland.gov\/2022RS\/chapters_noln\/Ch_664_hb0107E.pdf\">requires the preparer<\/a> to be one of the following:<\/p>\n<ol>\n<li>A preparer of at least 30 reserve studies within the past 3 years,<\/li>\n<li>A participant in the preparation of at least 30 reserve studies within the past 3 years while employed by a firm that prepares reserve studies,<\/li>\n<li>A holder of a license from the State Board of Architects or State Board for Professional Engineers, or<\/li>\n<li>A reserve specialist designated by the Community Associations Institute or a professional reserve analyst designated by the Association of Professional Reserve Analysts.<\/li>\n<\/ol>\n<p>To be a <a href=\"https:\/\/www.caionline.org\/LearningCenter\/credentials\/Pages\/RS.aspx\">designated reserve specialist<\/a>, one needs three years of practice preparing at least 30 reserve studies, a bachelor&rsquo;s degree, and payment of an application fee plus annual renewal fees. The <a href=\"https:\/\/www.apra-usa.com\/Apply-for-PRA-status\">professional reserve analyst<\/a> designation requires five years of experience, at least fifty reserve studies, a membership fee, and eight hours of annual continuing education.<\/p>\n<h2 id=\"an-example\">An Example<\/h2>\n<blockquote>\n<p>If you are interested in playing with this model or simulating your own inventory, our simulation tool is <a href=\"https:\/\/ztoz.blog\/reserve-form\/\">available online<\/a>. To keep costs low, the hardware is limited. Running the model will take at least twenty seconds to complete. Larger models may take a couple of minutes to complete. For serious use, fork the <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/reserve\">source code<\/a> and run it on your own hardware. 
Locally, models can complete in a few seconds.<\/p>\n<\/blockquote>\n<p>Below is an example inventory, taken from a CAI example. UL is the Useful Life of each component and RUL is the Remaining Useful Life, both in years. An RUL of zero indicates the need for replacement this year. Costs are in USD and &ldquo;present dollars.&rdquo; We&rsquo;ve extended the model by adding lower and upper bounds for the cost of each project\/component, set here at 20% below and 20% above the original cost. (We&rsquo;ve kept RULs constant, but our tool allows low and high RUL values.)<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Component<\/th>\n<th style=\"text-align:right\">UL<\/th>\n<th style=\"text-align:right\">RUL<\/th>\n<th style=\"text-align:right\">Cost_Original<\/th>\n<th style=\"text-align:right\">Cost_Low<\/th>\n<th style=\"text-align:right\">Cost_High<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">Pool Furniture - Replace<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">4600<\/td>\n<td style=\"text-align:right\">3680<\/td>\n<td style=\"text-align:right\">5520<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Pool - Resurface<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">10000<\/td>\n<td style=\"text-align:right\">8000<\/td>\n<td style=\"text-align:right\">12000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Roof - Replace<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:right\">18<\/td>\n<td style=\"text-align:right\">80000<\/td>\n<td style=\"text-align:right\">64000<\/td>\n<td style=\"text-align:right\">96000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Asphalt - Seal<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">5000<\/td>\n<td style=\"text-align:right\">4000<\/td>\n<td style=\"text-align:right\">6000<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:left\">Asphalt - Resurface<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">25000<\/td>\n<td style=\"text-align:right\">20000<\/td>\n<td style=\"text-align:right\">30000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Building - Repaint<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">50000<\/td>\n<td style=\"text-align:right\">40000<\/td>\n<td style=\"text-align:right\">60000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Elevator - Modernize<\/td>\n<td style=\"text-align:right\">20<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">80000<\/td>\n<td style=\"text-align:right\">64000<\/td>\n<td style=\"text-align:right\">96000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Hallways - Refurbish<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">6<\/td>\n<td style=\"text-align:right\">24000<\/td>\n<td style=\"text-align:right\">19200<\/td>\n<td style=\"text-align:right\">28800<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we simulate this model over 30 years, assuming an inflation rate between 2% and 4%, we can plot the expenses per year using a box plot:<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/reserve-studies\/expenses-per-year.svg\" alt=\"Expenses per year, showing many spikes\">\n  \n  <figcaption>Simulated Expenses Per Year<\/figcaption>\n<\/figure>\n\n\n<p>Although expenses may seem rather bumpy in this model, it is likely smoother than real inventories from medium and larger sized associations. This model also lacks the clutter from the many smaller projects that populate a real association&rsquo;s inventory.<\/p>\n<p>In order to meet current and future costs, associations are expected to fund reserves to meet cash flow needs while keeping assessments stable (e.g. increasing with inflation). 
The standards refer to a Fully Funded Balance (FFB) as an indicator of a conservative and healthy level of funding. For each component in the inventory, the FFB is calculated as:<\/p>\n<p>$$ FFB = C \\frac{a}{u} $$<\/p>\n<p>where \\(C\\) is the replacement cost, \\(a\\) is the effective age, and \\(u\\) is the useful life. FFB assumes a linear degradation of the asset until the remaining life hits zero, at which point the full replacement cost is realized. For a site, the aggregate FFB is the sum of the FFB for each component. If an association&rsquo;s reserve balance is equal to the FFB (or within a few percentage points), then the balance is considered &ldquo;fully funded&rdquo;.<\/p>\n<p>Continuing the example, we compute the (most likely) expenses per year and tabulate the FFB at the beginning and end of each year over a ten-year span, in thousands of USD. The contribution is the amount needed to close the gap between the initial FFB, less expenses, and the end-of-year FFB.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Category<\/th>\n<th style=\"text-align:right\">Year 1<\/th>\n<th style=\"text-align:right\">Year 2<\/th>\n<th style=\"text-align:right\">Year 3<\/th>\n<th style=\"text-align:right\">Year 4<\/th>\n<th style=\"text-align:right\">Year 5<\/th>\n<th style=\"text-align:right\">Year 6<\/th>\n<th style=\"text-align:right\">Year 7<\/th>\n<th style=\"text-align:right\">Year 8<\/th>\n<th style=\"text-align:right\">Year 9<\/th>\n<th style=\"text-align:right\">Year 10<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">FFB (Initial)<\/td>\n<td style=\"text-align:right\">154.1<\/td>\n<td style=\"text-align:right\">174.8<\/td>\n<td style=\"text-align:right\">148.4<\/td>\n<td style=\"text-align:right\">142.1<\/td>\n<td style=\"text-align:right\">169.0<\/td>\n<td style=\"text-align:right\">197.5<\/td>\n<td style=\"text-align:right\">114.5<\/td>\n<td style=\"text-align:right\">113.3<\/td>\n<td style=\"text-align:right\">135.9<\/td>\n<td style=\"text-align:right\">166.3<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:left\">Expenses<\/td>\n<td style=\"text-align:right\">4.6<\/td>\n<td style=\"text-align:right\">51.5<\/td>\n<td style=\"text-align:right\">31.8<\/td>\n<td style=\"text-align:right\">0.0<\/td>\n<td style=\"text-align:right\">0.0<\/td>\n<td style=\"text-align:right\">109.7<\/td>\n<td style=\"text-align:right\">28.6<\/td>\n<td style=\"text-align:right\">6.1<\/td>\n<td style=\"text-align:right\">0.0<\/td>\n<td style=\"text-align:right\">0.0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">Contribution<\/td>\n<td style=\"text-align:right\">25.3<\/td>\n<td style=\"text-align:right\">25.1<\/td>\n<td style=\"text-align:right\">25.6<\/td>\n<td style=\"text-align:right\">27.0<\/td>\n<td style=\"text-align:right\">28.5<\/td>\n<td style=\"text-align:right\">26.7<\/td>\n<td style=\"text-align:right\">27.3<\/td>\n<td style=\"text-align:right\">28.8<\/td>\n<td style=\"text-align:right\">30.4<\/td>\n<td style=\"text-align:right\">32.1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">FFB (EoY)<\/td>\n<td style=\"text-align:right\">174.8<\/td>\n<td style=\"text-align:right\">148.4<\/td>\n<td style=\"text-align:right\">142.1<\/td>\n<td style=\"text-align:right\">169.0<\/td>\n<td style=\"text-align:right\">197.5<\/td>\n<td style=\"text-align:right\">114.5<\/td>\n<td style=\"text-align:right\">113.3<\/td>\n<td style=\"text-align:right\">135.9<\/td>\n<td style=\"text-align:right\">166.3<\/td>\n<td style=\"text-align:right\">198.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Although expenses are bumpy (e.g. no expenses in years 4 and 5), contributions are much smoother because they are the aggregate of many individual &ldquo;glide slopes&rdquo;.<\/p>\n<p>This table assumes the association is already fully funding their fund. A practitioner would need to examine the association&rsquo;s current funding and contributions, as well as any existing deferred maintenance or debts in order to complete a financial analysis. 
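<\/p>\n<p>As a check on the Year 1 column, the per-component formula can be applied by hand to the inventory. This sketch assumes that effective age is useful life minus remaining useful life, \\(a = u - RUL\\), which is our reading rather than something stated above, and that the table is in thousands of USD. The roof, for example, contributes:<\/p>\n<p>$$ FFB_{roof} = 80000 \\times \\frac{20 - 18}{20} = 8000 $$<\/p>\n<p>Repeating this for all eight components, in inventory order, and summing gives:<\/p>\n<p>$$ 4600 + 5000 + 8000 + 3000 + 22500 + 45000 + 60000 + 6000 = 154100 $$<\/p>\n<p>This agrees with the 154.1 shown as the initial FFB for Year 1.<\/p>\n<p>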
Since an association may not start with a fully funded reserve, a practitioner would provide advice on how to smoothly escalate contributions over time while avoiding a negative reserve balance.<\/p>\n<h1 id=\"strengths-of-the-methodology\">Strengths of the Methodology<\/h1>\n<p>Annually, a community uses a reserve study for two primary purposes:<\/p>\n<ol>\n<li>Decide how much budget to contribute to the reserve fund, and<\/li>\n<li>Decide which components\/projects to fund this year<\/li>\n<\/ol>\n<p>The first purpose is covered by the financial plan of the study, while the second purpose is handled via the inventory. In some states, like Florida, there may also be a safety inspection, but we will ignore that in this analysis.<\/p>\n<h2 id=\"easy-to-implement-audit-and-understand\">Easy to implement, audit, and understand<\/h2>\n<p>The reserve study methodology is freely available<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup> and, except for the remaining life estimates, requires no skills beyond high school math. The primary output of the study is a table that can be easily audited by walking around the community, and useful life numbers can be compared against (or retrieved from) <a href=\"https:\/\/www.reservedataanalyst.com\/blog\/useful-life\/\">publicly available sources<\/a> or references like <em>The Whitestone Facility Maintenance and Repair Cost Reference 2009-2010<\/em>.<\/p>\n<p>Estimating the remaining life of components requires specialized knowledge and is sometimes called an art rather than a science. However, these estimates simply populate a column in the table and most estimates are not safety-critical. With further development of video-based prognostic systems, we can expect computer-aided decision tools and further automation of the process.<\/p>\n<h2 id=\"portable\">Portable<\/h2>\n<p>The community board can easily switch to a different provider of the study. 
As a work for hire, the results of previous studies are owned by the community association and can be used to &lsquo;jump start&rsquo; a new study. Vendor lock-in is difficult because the study does not include &lsquo;black box&rsquo; or proprietary features, although a study may include &lsquo;value add&rsquo; elements.<\/p>\n<p>Based on member directories for the professional organizations and listed companies by internet search, this appears to be a competitive market, although one with limited price transparency. In most states, associations are free to balance vendors' prices, experience, certifications, and other factors. States with licensing requirements or considering measures to restrict trade should carefully consider the evidence for the claim that restrictions promote quality and safety. The report <a href=\"https:\/\/obamawhitehouse.archives.gov\/sites\/default\/files\/docs\/licensing_report_final_nonembargo.pdf\">Occupational Licensing: A Framework for Policymakers<\/a> is a good resource, detailing the costs and benefits of licensure and the many policy alternatives.<\/p>\n<h2 id=\"reduced-need-of-forecast-accuracy-via-inexpensive-updates\">Reduced need of forecast accuracy via inexpensive updates<\/h2>\n<p>Reserve studies need to be updated periodically and some states mandate an update frequency, typically from one to six years. Studies have different &ldquo;levels&rdquo; of detail and updates in many years may require less effort. Due to these updates, uncertainties in the model are less critical because there are more opportunities for review and revision, particularly as components approach their end of useful life and estimates become more accurate.<\/p>\n<h1 id=\"weaknesses\">Weaknesses<\/h1>\n<p>Although reserve studies are claimed as a best practice (CAI, 2020), we have failed to find any study of their efficacy. This weakness was identified in (Winkour, 1998) and stems from a lack of public data. 
Our preference would be an assessment of their contribution to efficient fiscal management, but lacking data for such a study, our criticisms are based on our experiences in modeling and simulation, rather than historical analysis or comparative analysis of outcomes.<\/p>\n<p>In terms of efficacy, associations have voiced their desires in various surveys. A recent representative one is documented in (Foundation, 2020):<\/p>\n<blockquote>\n<p>\u201cMore than three-quarters (80%) of those surveyed felt it was critical that their association have adequate reserves in the event of a major infrastructure failure or construction need. Nearly half (40%) of those surveyed considered deteriorating infrastructure as a top-ranked concern. More than two-thirds (70%) of survey respondents indicated that maintaining property values was of primary importance.\u201d<\/p>\n<\/blockquote>\n<p>Restated, the desired outcomes of the reserve study process are:<\/p>\n<ol>\n<li>Protection of property values, by avoiding deferred maintenance or a weak fiscal position,<\/li>\n<li>Avoidance of special assessments<sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote-ref\" role=\"doc-noteref\">3<\/a><\/sup>, and<\/li>\n<li>Discovery, prioritization, and remediation of imminent infrastructure failures and decay.<\/li>\n<\/ol>\n<p>An additional outcome in some states is fulfilling regulatory reporting requirements, but paperwork is not a good by itself.<\/p>\n<p>We identify four weaknesses in how the methodology addresses these key desired outcomes. They are:<\/p>\n<ol>\n<li>Lacks a top-down measure or &ldquo;smell test&rdquo;<\/li>\n<li>Fails to model risk<\/li>\n<li>Lacks sensitivity analysis<\/li>\n<li>Fails to stress need to mitigate inflation<\/li>\n<\/ol>\n<h2 id=\"lacks-a-top-down-measure-or-smell-test\">Lacks a top-down measure or &ldquo;smell test&rdquo;<\/h2>\n<p>Reserve studies use a purely bottom-up approach. 
By bottom-up, we mean that the practitioner catalogs individual items and then sums them up to estimate the overall deterioration. While this is a detail-oriented approach, there is always a methodological risk that a component may be forgotten or underestimated. In contrast, a top-down approach will start at the community and make estimates from the top, often parameterised by broad questions such as &ldquo;does the community have a pool?&rdquo;. The top-down estimate may have greater uncertainty and will lack the detailed inventory to satisfy the year-by-year maintenance cycle, but it can be directly compared to other communities to ferret out gross inaccuracies or signal that a community is out-of-line with its peers.<\/p>\n<p>We are not advocating the abandonment of the bottom-up inventory approach for a top-down approach or model, but rather that they should be used together. With two measurements, an association will have better confidence that an analysis is complete and valid. For an example of an existing top-down model, (Whitestone, 2009) includes cost profiles for a variety of building types and typical 50 year maintenance and repair costs and tasks which can be used as benchmarks.<\/p>\n<p>In (CAI, 2020), they state &ldquo;THE MOST CRITICAL decision in conducting a reserve study is selecting which components to include. This requires analysis of the association\u2019s\ngoverning documents, application of the National Reserve Study Standards four-part test, and applicable state statutes and civil codes in addition to the physical inspection of each component.&rdquo; The addition of top-down methods will diminish the criticality of the inventory selection.<\/p>\n<h2 id=\"fails-to-model-risk\">Fails to model risk<\/h2>\n<p>One may be confused by the claim that reserve studies do not model risk. 
After all, (CAI, 2023) speaks of three funding goals and their respective levels of risk:<\/p>\n<ol>\n<li>Baseline Funding - meet immediate cash flow requirements and minimize reserve contributions; highest risk<\/li>\n<li>Threshold Funding - maintain a reserve balance above a specified percentage <em>or dollar amount<\/em><\/li>\n<li>Full Funding - maintain a reserve balance within a few percent of the fully-funded balance (FFB); lowest risk<\/li>\n<\/ol>\n<p><strong>Aside:<\/strong> Contrary to the CAI&rsquo;s definition, threshold funding is really two separate goals. A minimum dollar balance can have a basis as a true risk model, while, as we argue below, a goal based on a percentage of the FFB is not a true risk model but rather an acknowledgement that associations can use cash flow to their advantage.<\/p>\n<p>However, the standards neither define risk nor attempt to quantify it. In this section, we will define risk and uncertainty using generally accepted definitions, describe how we might translate FFB into risk terms and why we disagree with that approach, and then propose ways we can enhance the method to manage risk.<\/p>\n<p>Before going into the reasons why the reserve model is not a risk model, we need to ask: why should an association care? An association is responsible for all community expenses, but the reserve study inventory, as defined by the CAI and by some states, covers only a subset of them. In the CAI&rsquo;s three-part test (CAI, 2023), a project is only included in the study if it can be &ldquo;reasonably anticipated&rdquo; and the cost &ldquo;reasonably estimated.&rdquo; While this restriction limits the scope of the inventory to a more grounded list, it fails to account for realistic obligations of the association. Even when a project can be reasonably anticipated and the cost estimated, the execution of projects has risks and uncertainties that the literature does not address. 
Thus, reserve studies will underestimate financial costs.<\/p>\n<p>In our opinion, a reserve study would ideally inform the association of known planned maintenance items, unplanned maintenance items that are likely to occur and contingencies (i.e. risks), and the ranges of variables (i.e. uncertainty). The methodology, however, often restricts the scope to just the known planned maintenance items. A restriction in scope can make sense if reserve studies are part of a portfolio of tools, but if a portfolio is the intent, the literature and legislation are oddly silent on the issue.<\/p>\n<p>How do we define risks and uncertainty? The two are often conflated, but they are distinct concepts. A risk is a potential event that, if it happens, will have a negative impact (e.g. cost the association money). Uncertainty is the range (or distribution) of possible values for some variable. A component requiring highly premature replacement due to contractor negligence is a risk. The range of potential bids for a job is an uncertainty. Inflation is not a risk as it will happen, but the magnitude of inflation is uncertain<sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote-ref\" role=\"doc-noteref\">4<\/a><\/sup>. A special assessment is not a risk; it is a consequence of insufficient budget.<\/p>\n<p>If we sum all the risks (their probabilities times their magnitudes), we will arrive at a dollar value that expresses the probabilistic cost to the association. 
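<\/p>\n<p>In symbols, using notation that is ours rather than the standards&rsquo;, let \\(p_i\\) be the probability that risk \\(i\\) occurs within the period and \\(m_i\\) its cost if it does. The probabilistic cost is then the expected loss:<\/p>\n<p>$$ E[L] = \\sum_{i} p_i m_i $$<\/p>\n<p>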
An association can bear this risk as part of their reserve, part of their operational budget, or as a potential cost to homeowners and collected as a special assessment.<\/p>\n<h3 id=\"risk-mitigation-and-threshold-funding\">Risk Mitigation and Threshold Funding<\/h3>\n<p>Central to CAI&rsquo;s claim is that Threshold Funding, based on a percentage of the FFB, provides a proportional amount of &ldquo;risk management.&rdquo; As described above, the FFB for a site is the aggregate of the FFB for every component. Let&rsquo;s explore this concept with a simple model of three components with zero uncertainty:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Component<\/th>\n<th style=\"text-align:right\">Repl. Cost<\/th>\n<th style=\"text-align:right\">Degradation %<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">C1<\/td>\n<td style=\"text-align:right\">$5<\/td>\n<td style=\"text-align:right\">10%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">C2<\/td>\n<td style=\"text-align:right\">$3<\/td>\n<td style=\"text-align:right\">50%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">C3<\/td>\n<td style=\"text-align:right\">$7<\/td>\n<td style=\"text-align:right\">80%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The FFB for C1 is \\(5 * 0.1 = 0.5\\), C2 is \\(3 * 0.5 = 1.5\\), and C3 is \\(7 * 0.8 = 5.6\\). 
Summed, the FFB is $7.60.<\/p>\n<p>Within the coming year, we can compute the probabilities for each to fail (using the degradation percentages as probabilities and <em>assuming failures are independent<\/em>), the summed magnitude (cost of replacement), and the weighted magnitude (cost of replacement times the probability).<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:right\">C1 Fail?<\/th>\n<th style=\"text-align:right\">C2 Fail?<\/th>\n<th style=\"text-align:right\">C3 Fail?<\/th>\n<th style=\"text-align:right\">Prob.<\/th>\n<th style=\"text-align:right\">Magnitude<\/th>\n<th style=\"text-align:right\">Weighted M<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0.09<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0.36<\/td>\n<td style=\"text-align:right\">7<\/td>\n<td style=\"text-align:right\">2.52<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0.09<\/td>\n<td style=\"text-align:right\">3<\/td>\n<td style=\"text-align:right\">0.27<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0.36<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">3.60<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0.01<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">0.05<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0.04<\/td>\n<td style=\"text-align:right\">12<\/td>\n<td style=\"text-align:right\">0.48<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0.01<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">0.08<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">0.04<\/td>\n<td style=\"text-align:right\">15<\/td>\n<td style=\"text-align:right\">0.60<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we sum the weighted magnitude, we get $7.60, which is equal to the FFB. Thus, the FFB is the weighted probability of replacing the entire inventory. (This assumes there are no out-of-inventory costs and all projects complete within their allotted budget.)<\/p>\n<p>Let&rsquo;s now sort the probabilities in declining order to arrive at a cumulative distribution function. Because some components are nearing end-of-life and they are more degraded than not, the lowest probability items do not correspond to the highest magnitude items.<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:right\">Prob.<\/th>\n<th style=\"text-align:right\">Magnitude<\/th>\n<th style=\"text-align:right\">Weighted M<\/th>\n<th style=\"text-align:right\">Cum. P<\/th>\n<th style=\"text-align:right\">Cum. 
M<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:right\">0.36<\/td>\n<td style=\"text-align:right\">7<\/td>\n<td style=\"text-align:right\">2.52<\/td>\n<td style=\"text-align:right\">0.36<\/td>\n<td style=\"text-align:right\">2.52<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.36<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">3.6<\/td>\n<td style=\"text-align:right\">0.72<\/td>\n<td style=\"text-align:right\">6.12<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.09<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0<\/td>\n<td style=\"text-align:right\">0.81<\/td>\n<td style=\"text-align:right\">6.12<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.09<\/td>\n<td style=\"text-align:right\">3<\/td>\n<td style=\"text-align:right\">0.27<\/td>\n<td style=\"text-align:right\">0.9<\/td>\n<td style=\"text-align:right\">6.39<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.04<\/td>\n<td style=\"text-align:right\">12<\/td>\n<td style=\"text-align:right\">0.48<\/td>\n<td style=\"text-align:right\">0.94<\/td>\n<td style=\"text-align:right\">6.87<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.04<\/td>\n<td style=\"text-align:right\">15<\/td>\n<td style=\"text-align:right\">0.6<\/td>\n<td style=\"text-align:right\">0.98<\/td>\n<td style=\"text-align:right\">7.47<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.01<\/td>\n<td style=\"text-align:right\">5<\/td>\n<td style=\"text-align:right\">0.05<\/td>\n<td style=\"text-align:right\">0.99<\/td>\n<td style=\"text-align:right\">7.52<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">0.01<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">0.08<\/td>\n<td style=\"text-align:right\">1<\/td>\n<td style=\"text-align:right\">7.6<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As it is unlikely for many assets to fail ahead of schedule (as a reminder, a site-wide disaster would be covered by insurance, not the 
reserve fund), we can capture about 94% of the probability range with $6.87, rather than the full $7.60. Since $6.87 is 90% of the money but yields about 94% of the coverage, it seems we are ahead and that accepting the remaining 6% &ldquo;risk&rdquo; is rational.<\/p>\n<p>However, this logic is fallacious. First, if we simplify the model to a single component, or make one component significantly larger than others (e.g. roofs versus patio furniture), there will be a shortfall. This is because the approach is really a model of cash flow; having multiple independent inflows can &ldquo;free up&rdquo; enough cash to cover a small number of outflows. Depending on the composition of the inventory, a skillful association can make this approach work for a long time, but it is not robust.<\/p>\n<p>Second, note that of the eight results in the table, four have <em>magnitudes<\/em> greater than $7.60, the FFB amount. Realizing any of these scenarios would exhaust the reserve fund. If the association reserved an additional $0.40, then it would cover one more potential result. Ideally, the method would provide a rational basis for balancing the possibility of a special assessment against the cost of saving the extra $0.40 or $2.40.<\/p>\n<p>Third, the approach is a &ldquo;closed universe&rdquo; as it does not seek to cover risks. 
For example:<\/p>\n<ol>\n<li>Methodological risk - a component is missing from the inventory, but must still be covered by the association<\/li>\n<li>Infrastructure risk - a component prematurely needs repair or replacement<\/li>\n<li>Project risk - the execution of a repair or replacement causes other damage or needs to be redone by another vendor due to quality issues (which are not fully mitigated by contractual terms)<\/li>\n<li>Inflation \/ External Factor risk - the availability or cost of materials moves beyond the range captured in the cost uncertainty due to supply chain issues<\/li>\n<li>Uncertainty Methodology - if the low and high marks for a cost represent two sigmas of the range, then about 5% of the costs will be outside the given range<\/li>\n<\/ol>\n<h3 id=\"alternative---value-at-risk\">Alternative - Value at Risk<\/h3>\n<p>Value at Risk (VaR) is a commonly used, although controversial, risk framework. VaR estimates the risk of loss of capital, under normal conditions, within a set amount of time. For instance, looking at the historical movement of a stock, a practitioner might say that 95% of the time, the loss is 4.6% or less on a daily basis, so over the next day, we can expect with 95% likelihood that the stock&rsquo;s value at risk is the (current value) times 4.6% or less. The amount of potential loss will inform how much money needs to be in reserve or hedged.<\/p>\n<p>If we use this framework and lack historical data, we can try a projection instead. The value is the association&rsquo;s property value. Because the property is a deteriorating asset, the expected minimal loss is equal to the FFB for the year. However, there are additional uncertainties and potential risks that may incur additional expenses that are not mitigated by insurance.<\/p>\n<p>Based on the inventory, we know the expected costs for each component in nominal dollars. Sorting the components by cost, an association can pick the n-th percentile cost from the list. 
An association is unlikely to choose an outlier, and will more likely pick a higher-end yet typical project. The association adds the cost for this project as an additional buffer to the reserve fund. Thus, the desired budget is not the FFB, but the FFB plus this n-th project cost. The sum is the value-at-risk.<\/p>\n<p>The effect is that the budget accounts for all planned projects plus one unexpected but typical-sized project. An association will still likely need a special assessment if a large capital project becomes urgent several years ahead of schedule, or if a similar high-magnitude item arises. Unfortunately, the more long-tailed the component costs are, the less amenable they are to straightforward risk management approaches.<\/p>\n<h3 id=\"alternative---double-declining-balance--non-linear-deterioration\">Alternative - Double Declining Balance \/ Non-Linear Deterioration<\/h3>\n<p>If an association is mostly concerned about uncertainties in components&rsquo; remaining lives, one possible technique lifted from accounting is &ldquo;double declining balance.&rdquo; Rather than modeling deterioration as a linear function, this model front-loads the deterioration. If the component ends up needing replacement early, the impact will be less because more has been paid up-front. For a component with a five year useful life and a replacement cost of $1000, the impact of replacing at four years, one year early, would be $200 under the linear model, but about $50 in the double-declining model. Other non-linear models are certainly possible, although an association might find them useful only for particularly high cost components. Residents may also object as the burden of payment is shifted to those with more time before the replacement is due.<\/p>\n<h2 id=\"lacks-sensitivity-analysis\">Lacks sensitivity analysis<\/h2>\n<p>Although some practitioners include some form of sensitivity analysis (SA) within their studies, the standards do not prescribe it. 
Indeed, this follows since the standards do not address uncertainty in estimates. In reality, practitioners do not know every number to the third significant figure. Sensitivity analysis is an important disclosure of the practitioner&rsquo;s confidence in the data. Associations can use SA to prioritize their own efforts.<\/p>\n<p>Sensitivity analysis (SA) is a set of techniques for understanding what &ldquo;drives&rdquo; a model. (Saltelli, 2000) states:<\/p>\n<blockquote>\n<p>\u201cModellers conduct SA to determine:\n(a) if a model resembles the system or process under study;\n(b) the factors that mostly contribute to the output variability and that require additional research to strengthen the knowledge base;\n(c) the model parameters (or parts of the model itself) that are insignificant, and that can be eliminated from the final model;\n(d) if there is some region in the space of input factors for which the model variation is maximum;\n(e) the optimal regions within the space of the factors for use in a subsequent calibration study;\n(f) if and which (group of) factors interact with each other.\u201d<\/p>\n<\/blockquote>\n<p>For our purposes, a reserve study benefits from (b), understanding the drivers of uncertainty and variability, and (c), the significance of elements on the result.<\/p>\n<p>Quantifying (c), the impact of each component (and parameter) on the model, is easiest; many existing studies include a breakdown of the costs over the time horizon using a pie chart or a treemap. In the example model, the primary contributions to total cost are:<\/p>\n<ul>\n<li>r (44%)<\/li>\n<li>Elevator - Modernize (16%)<\/li>\n<li>Building - Repaint (13%)<\/li>\n<li>Roof - Replace (8%)<\/li>\n<li>Hallways - Refurbish (7%)<\/li>\n<\/ul>\n<p>Based on this summary, we can tell that r, the inflation rate, is the primary driver of costs. This is expected since we are looking over a 30 year period. 
Although the inventory shows &ldquo;Elevator - Modernize&rdquo; as having the same cost as &ldquo;Roof - Replace,&rdquo; the elevator will need to be replaced twice as often during the study period as the roof.<\/p>\n<p>Associations should find these lists useful to prioritize their efforts, particularly where to <em>not<\/em> spend their time.<\/p>\n<p>The other relevant form of sensitivity analysis is to quantify the impact on <em>uncertainty<\/em> or variability, rather than the overall cost. If the practitioner does not provide any estimates of uncertainty, then this analysis is impossible. Associations should be suspicious of any study that presents all the estimated costs as having no significant uncertainty and its practitioners as certain, to within a year, of every component&rsquo;s remaining life.<\/p>\n<p>Within the model, there are four variables with uncertainty: the remaining useful life of components, their cost for repair or replacement, the inflation rate, and the interest rate. In our experience, although neither required nor encouraged by the standards, practitioners often provide some estimate of uncertainty for the costs and will plot low and high error bounds of their financial estimates. Uncertainty in remaining useful life can be very significant (e.g. a +\/- one year uncertainty for a ten year component is equal to 10% of the nominal cost), but seems to be rarely recorded.<\/p>\n<p>That said, computing contributions to uncertainty requires advanced mathematics and, in practice, specialized software. For <a href=\"https:\/\/ztoz.blog\/reserve-form\/\">our tool<\/a>, we used the <a href=\"https:\/\/salib.readthedocs.io\/en\/latest\/index.html\">SALib<\/a> library for its implementation of Sobol Sensitivity Analysis (Saltelli, 2006). 
Using this technique, we find that the drivers of uncertainty within the example model are:<\/p>\n<ul>\n<li>r (78%)<\/li>\n<li>Elevator - Modernize (10%)<\/li>\n<li>Building - Repaint (7%)<\/li>\n<\/ul>\n<p>This list is similar to the one prior, although inflation (r) is significantly larger. Although the bounds on r are relatively small versus the +\/- 20% of each component&rsquo;s costs, inflation is more impactful due to its corrosive effect.<\/p>\n<p>If a component is a major driver of cost, but not uncertainty, then it is well understood and the project should be easy to manage. Contra-wise, if a component is a minor driver of cost but a major driver of uncertainty, then it is poorly understood. An association may be free to ignore it until the project comes due. If a component is both a major driver of cost and uncertainty, then an association should invest effort to reduce the uncertainty, perhaps by seeking contractor advice or seeing how other associations have managed a similar situation.<\/p>\n<p>With many practitioners already providing a summary by cost impact, it should be a standard element of reserve reports. The summary prioritizes projects and helps the association board communicate the need for reserve funding to their members.<\/p>\n<p>As an advanced technique, we are hesitant to mandate quantifying impact on uncertainty. Yet without it, or a similar technique, the practitioner is failing to disclose fundamental limitations of their knowledge and ability. This is not an admission of negligence or incompetence, but rather limitations of the state of the art.<\/p>\n<h2 id=\"fails-to-stress-need-to-mitigate-inflation\">Fails to stress need to mitigate inflation<\/h2>\n<p>In the examples so far, you may have noted the high impact of inflation on costs and uncertainty. 
Inflation is highly variable, and thus the impact uncertain, as this <a href=\"https:\/\/inflationdata.com\/Inflation\/Inflation\/DecadeInflation.asp\">table of historical inflation<\/a> illustrates:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:left\">Decade<\/th>\n<th style=\"text-align:right\">Annual Average<\/th>\n<th style=\"text-align:right\">Cumul. by Decade<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">1913-1919<\/td>\n<td style=\"text-align:right\">9.80%<\/td>\n<td style=\"text-align:right\">92.86%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1920-1929<\/td>\n<td style=\"text-align:right\">0.38%<\/td>\n<td style=\"text-align:right\">4.82%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1930-1939<\/td>\n<td style=\"text-align:right\">-1.80%<\/td>\n<td style=\"text-align:right\">-18.60%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1940-1949<\/td>\n<td style=\"text-align:right\">4.86%<\/td>\n<td style=\"text-align:right\">68.57%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1950-1959<\/td>\n<td style=\"text-align:right\">1.82%<\/td>\n<td style=\"text-align:right\">24.58%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1960-1969<\/td>\n<td style=\"text-align:right\">2.45%<\/td>\n<td style=\"text-align:right\">28.23%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1970-1979<\/td>\n<td style=\"text-align:right\">7.25%<\/td>\n<td style=\"text-align:right\">103.45%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1980-1989<\/td>\n<td style=\"text-align:right\">5.82%<\/td>\n<td style=\"text-align:right\">64.41%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">1990-1999<\/td>\n<td style=\"text-align:right\">3.08%<\/td>\n<td style=\"text-align:right\">33.47%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2000-2009<\/td>\n<td style=\"text-align:right\">2.54%<\/td>\n<td style=\"text-align:right\">28.31%<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">2010-2019<\/td>\n<td 
style=\"text-align:right\">1.75%<\/td>\n<td style=\"text-align:right\">19.38%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The &ldquo;annual average&rdquo; is computed as the geometric mean of each year&rsquo;s inflation (CPI-U) within that decade. The cumulative inflation demonstrates the constant erosion effect as, even in the 1950s when the average annual inflation rate was below the 2% target, prices grew by 25%. Over the thirty year time frame of most reserve studies, inflation provides a huge range of possibilities.<\/p>\n<p>As something so variable, one might be inclined to eliminate any projections of inflation as it cannot be &ldquo;reasonably estimated&rdquo; per the three-part test. In fact, an alternative described in (Winokur, 1998) is to revise cost estimates as inflation numbers are known, but otherwise keep the study in nominal dollars. However, in the language of risk management, rather than <em>accept<\/em> inflation, we can work to <em>mitigate<\/em> it.<\/p>\n<p>Associations should invest savings to mitigate inflation. (Frumkin, 2009) states that \u201cMeeting or potentially beating the inflation rate through safe, secure investments should be the board\u2019s ultimate goal.\u201d The financial analysis of a study, which has a multi-decade horizon after all, should similarly advance that goal.<\/p>\n<p>By &ldquo;safe, secure investments&rdquo;, (Frumkin, 2009) advocates the following prioritized elements in an investment policy:<\/p>\n<ol>\n<li>Safety of Principal: Safety of income and safety of principal.<\/li>\n<li>Liquidity<\/li>\n<li>Yield<\/li>\n<\/ol>\n<p>The book then goes on to describe some specific investment strategies that fit this policy, although, as usual, it recommends that boards work with a qualified financial advisor. (Some associations use alternative systems, such as selling bonds, to provide revenue. 
Associations can also own profit-generating assets, so any government seeking to legislate in this area should be aware that there is ongoing innovation.)<\/p>\n<p>The 2023 reserve standards simply require practitioners to describe &ldquo;assumptions utilized for interest and inflation&rdquo; in their report. In sample reports, we have witnessed interest rates often far below assumed inflation rates. Given the massive deleterious impact of inflation on savings without mitigating interest, it seems a dereliction of duty for the practitioner to not, at least, refer the association for some professional help.<\/p>\n<h1 id=\"conclusion-and-recommendations\">Conclusion and Recommendations<\/h1>\n<blockquote>\n<p>Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about safety from mice when there are tigers abroad. (Box, 1976)<\/p>\n<\/blockquote>\n<p>Reserve studies, at least from first principles, purport to be a straight-forward financial planning tool for planned maintenance. Professional organizations advocate that their practice leads to better long-term maintenance and more predictable member assessments. Some states have agreed and have mandated their practice. However, data is sparse on their efficacy.<\/p>\n<p>Research into community associations is often restricted by the lack of public data. Although associations and managers are often surveyed, researchers lack access to data to make efficacy measurements or track decisions. If associations shared reserve studies, they could all benefit from tracking trends in costs and actual life expectancies of key components.<\/p>\n<p>We propose the following changes in the spirit of maintaining the good qualities of the method while mitigating its weaknesses. Association boards can be proactive, rational, and adaptable. 
Per Box&rsquo;s quote above, faults and limitations in a model can be overcome when they are understood and the board has freedom of action.<\/p>\n<ol>\n<li>Associations share data and assist in advancing knowledge<\/li>\n<li>Practitioners quantify impact by cost<\/li>\n<li>Practitioners include uncertainty estimates on costs and remaining life<\/li>\n<li>Professional bodies advocate mitigation of inflation<\/li>\n<li>Associations work to mitigate the impact of inflation on their reserve accounts<\/li>\n<\/ol>\n<p>We also recommend reviewing (Winokur, 1998) which focuses on associations&rsquo; financial governance.<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>(Box, 1976) Box, George E. P. (1976) &ldquo;Science and statistics&rdquo;, Journal of the American Statistical Association, 71 (356): 791\u2013799, doi:10.1080\/01621459.1976.10480949 <a href=\"http:\/\/www-sop.inria.fr\/members\/Ian.Jermyn\/philosophy\/writings\/Boxonmaths.pdf\">PDF<\/a><\/p>\n<p>(CAI, 2020) Foundation for Community Association Research (2020) Best Practices: Reserve Studies\/Management. <a href=\"https:\/\/foundation.caionline.org\/wp-content\/uploads\/2017\/06\/bprs.pdf\">PDF<\/a><\/p>\n<p>(CAI, 2023) Community Associations Institute (2023) Reserve Study Standards. <a href=\"https:\/\/www.caionline.org\/Advocacy\/Priorities\/ReserveStudy\/Documents\/CAI%20Reserve%20Study%20Standards%20May%202023%20-%20FINAL.pdf\">PDF<\/a><\/p>\n<p>(Foundation, 2020) Foundation for Community Association Research (2020) Breaking Point: Examining Aging Infrastructure in Community Associations. <a href=\"https:\/\/foundation.caionline.org\/wp-content\/uploads\/2020\/04\/FoundationAgingInfrastructureReport.pdf\">PDF<\/a><\/p>\n<p>(Frumkin, 2009) Frumkin, M., March, N. (Eds.) (2009) Reserve Funds: How &amp; Why Community Associations Invest Assets, 2nd ed.<\/p>\n<p>(Saltelli, 2000) Saltelli, A., Chan, K., Scott, E.M. (Eds.) (2000) Sensitivity Analysis, Probability and Statistics. 
John Wiley &amp; Sons, Ltd.<\/p>\n<p>(ULI, 1964) Urban Land Institute (1964) The Homes Association Handbook Rev. and enl. ed. Washington: Urban Land Institute. <a href=\"https:\/\/archive.org\/details\/homesassociation0000urba\">Archive<\/a><\/p>\n<p>(Whitestone, 2009) Abate, D., Towers, M., Dotz, R., Romani, L. (2009) The Whitestone Facility Maintenance and Repair Cost Reference 2009-2010, 14th ed. Whitestone Research.<\/p>\n<p>(Winokur, 1998) Winokur, J. (1998) Critical Assessment: The Financial Role of Community Associations. Santa Clara Law Review Vol 38, Iss 4. <a href=\"https:\/\/digitalcommons.law.scu.edu\/cgi\/viewcontent.cgi?article=1434&amp;context=lawreview\">PDF<\/a><\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>In some cases, the reserves only existed to cover inflation, rather than the principal plus inflation costs of a project.&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>ICBI requires their practitioners to use certified software, while CAI and ARPA impose no similar restrictions.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:3\" role=\"doc-endnote\">\n<p>This is not a fully rational goal as an association could excessively mitigate the possibility of a special assessment. 
Ideally, this goal would be &ldquo;two sided&rdquo; and be expressed in terms of efficiency.&#160;<a href=\"#fnref:3\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:4\" role=\"doc-endnote\">\n<p>In the risk management literature, &ldquo;inflation risk&rdquo; is the risk that inflation will exceed some defined threshold.&#160;<a href=\"#fnref:4\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>"},{"title":"How User Groups Made Software Reuse a Reality","link":"https:\/\/ztoz.blog\/posts\/user-group-reuse\/","pubDate":"Tue, 27 Feb 2024 16:29:25 -0700","guid":"https:\/\/ztoz.blog\/posts\/user-group-reuse\/","description":"<p>Before the widespread existence of software repositories like CPAN, NPM, and PyPI, developers seeking to reuse an existing algorithm or library of routines would either check books or journals for code, or, they just might post a classified ad:<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/user-group-reuse\/Wanted-Subroutine-Decuscope-1965-02.png\" alt=\"Wanted: Fixed Point Sine\/Cosine subroutine faster than 10 ms for PDP-14\">\n  \n  <figcaption>Request posted in Decuscope 1965, Vol 4\/Iss 2<\/figcaption>\n<\/figure>\n\n\n<p>User groups provided catalogues of software, from mathematical algorithms to system utilities to games and demos. Leveraging the user group&rsquo;s periodicals, developers could post requests for specific examples of code. Or, more frequently, developers would review catalogs for existing solutions. 
They would contribute by sending their own creations to the group for others to use.<\/p>\n<p>In this article, we will examine how these user groups coordinated development and shared code, how they promoted discoverability of software, and how they attempted to maintain a high bar of quality.<\/p>\n<h1 id=\"origins\">Origins<\/h1>\n<p>While the importance of a set of reusable subroutines to reduce the cost of development was noted in (Goldstine 1947) and the first set of published subroutines came out in (Wheeler 1951), the lack of standardization between computers and sites meant that it was concepts that were being shared, not code. Reuse implied porting code between architectures and languages.<\/p>\n<p>By the mid-1950s, computers had shifted from bespoke research creations, where an organization might build its own computer, to products where manufacturers sold multiple copies of the same model of computer. IBM announced the 701 or &ldquo;Defense Calculator&rdquo; in 1952 and installed it in nineteen sites. For business computing, the IBM 650 first shipped in December 1954 and eventually saw over 2,000 installations.<\/p>\n<p>With multiple instances of the same machine model, programs and practices could be replicated directly between sites.<\/p>\n<p>Faced with a programmer shortage, productive utilization of their 701s as low as 60%, and the coming obsolescence of the 701s by the 704s, a few West Coast aerospace companies decided to cooperate on a Project for the Advancement of Coding Techniques (PACT) (Malahn 1956). Formed in November 1954, they designed and developed a coding preprocessor to automate repetitious coding efforts. By June 1955 an implementation was available. Members of the project reported initial success and started evangelizing both the techniques used in the project and the cooperative nature by September 1955 (Greenwald 1956).<\/p>\n<p>In August 1955, the first cooperative user group, SHARE, was formed. 
The formative group of seventeen installations were all users of the IBM 701 and were anticipating the transition to the 704. The listed advantages for joining included &ldquo;[&hellip;] do considerably less programming and checkout of utility routines, mathematical routines, and complete systems&rdquo;. In addition, the group represented &ldquo;authoritative customer opinion&rdquo; and thus promised greater influence on IBM&rsquo;s product plans (SHARE 1956).<\/p>\n<p>Obligations of SHARE membership included &ldquo;have a cooperative spirit&rdquo; and &ldquo;respect for the competence of other members&rdquo;. Furthermore, each member site was expected to attend meetings with at least two members, one with a technical understanding of the system and another with the ability to commit resources. They also were expected to promptly answer the mail and keep the Secretary informed of any delays in their programming assignments. Programs were distributed as type-written copies of the documentation and code.<\/p>\n<p>The Univac Scientific Exchange, or USE, was founded in January 1956. This group was initially focused on the 1103A, Univac&rsquo;s competitor to the 704. Like PACT and SHARE, they proposed using a &ldquo;minimum standard assembly&rdquo; language and cooperatively developing and sharing programs. The list of proposed programs included function subroutines, matrix, linear programming, tape handlers, and data output. The committee deferred development of data input and post-mortem utilities until a common language had been adopted (USE 1956).<\/p>\n<p>Of the proposed programs, USE volunteers submitted and accepted seven work assignments. USE members checked the programs on a 1103. The subroutines were for exponential calculations, logarithms, square root, sine\/cosine, arc sine, and arc tangent (USE-c 1956).<\/p>\n<p>DECUS or the Digital Equipment Computer Users Society, which eventually became the largest user group, was formed in 1961. 
As DEC&rsquo;s first computer was only delivered in November 1960, this demonstrates that customers saw immediate value in joining these groups. By 1971, there were sixteen operational user groups.<\/p>\n<h1 id=\"controlling-quality\">Controlling Quality<\/h1>\n<p>As user groups assembled their initial catalogs of algorithms, they were concerned about the accuracy, efficiency, and documentation of the submissions. Since authors and their employers were recorded in the catalog, authors were expected to be highly conscientious about their submissions. After code was submitted, it was then normally sent to a second person who would verify the code worked as expected on a different set of hardware. A third person would then verify that both the submitter and checker had performed their duty and signed off on the result.<\/p>\n<p>A typical form for this process is from the SWAP user group (SWAP 1968). SWAP was the user group for Control Data Computer systems.<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/user-group-reuse\/SWAP-Acceptance-Report.jpg\" alt=\"Form documenting tests and hardware used to check code\">\n  \n  <figcaption>Form for certification\/acceptance of M1-ITEK-DATE Algorithm<\/figcaption>\n<\/figure>\n\n\n<p>The expectations for documentation quality at SHARE and USE were robust. 
The proposed documentation sections (USE-b 1956) were:<\/p>\n<ol>\n<li>Identification\n<ol>\n<li>Title<\/li>\n<li>Author - Date<\/li>\n<li>Installation<\/li>\n<\/ol>\n<\/li>\n<li>Purpose<\/li>\n<li>Method\n<ol>\n<li>Accuracy<\/li>\n<li>Range<\/li>\n<li>Derivation or Reference<\/li>\n<\/ol>\n<\/li>\n<li>Usage\n<ol>\n<li>Calling Sequence<\/li>\n<li>Control Data<\/li>\n<li>Operational Procedure<\/li>\n<li>Space Required<\/li>\n<li>Error Codes<\/li>\n<li>Format Received or Generated if an Input-Output Routine<\/li>\n<\/ol>\n<\/li>\n<li>Restrictions\n<ol>\n<li>Components required other than minimum 1103A<\/li>\n<li>Other Programs Required<\/li>\n<li>Data<\/li>\n<li>Card Form, Tape Format<\/li>\n<li>Wiring Diagram or Board Used, if a Print Routine<\/li>\n<\/ol>\n<\/li>\n<li>Coding Information\n<ol>\n<li>Constraints<\/li>\n<li>Erasable Input-Output Locations<\/li>\n<li>Time<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>SHARE required the same list of top-level sections (although in slightly different order), plus a &ldquo;Checkout&rdquo; section to describe tests performed by the author that could be replicated to verify correct duplication or porting of the code.<\/p>\n<p>As an example of timing information, MTI-0, a linear matrix equation solver, provided the machine time in milliseconds as (USE-b 1956):<\/p>\n<blockquote>\n<p>.3n^3 + .9n^2 * m + 1.7n^2 + .3m^2 + 2.5nm + 1.8n + 1.6m + 2.7<\/p>\n<\/blockquote>\n<p>The write-up for MTI-0 required eight typewritten pages. The author took five pages for documentation and three for the source code. Perhaps unusually, the code was both written and checked by the same person, while in the other cases the author observed distinct individuals.<\/p>\n<p>While user groups attempted to hold a high bar for quality, none went so far as to guarantee the fitness of software. 
In a July 1956 memo, SHARE adopted the following disclaimer on all programs:<\/p>\n<blockquote>\n<p>Although this program has been carefully tested by its contributor, no guarantee is made of its correct functioning under all conditions, and no responsibility is taken by him in case of possible failure.<\/p>\n<\/blockquote>\n<p>Although the review system seems to have been robust for SHARE and USE, it seems not all user groups adopted it. DECUS had formed a catalog of about twenty programs early in 1962. However, as reported in the Nov 1962 (Vol I, No 8) issue of Decuscope, the organization had suspended distribution of the tapes due to reported corruption issues. As documented in January 1963, their method of duplicating paper tapes was unreliable, requiring them to reassemble programs. DECUS continued to struggle with paper tape corruption throughout the year as a letter in the Nov 1963 newsletter notes that the tape has garbage at the end, visible labels causing checksum errors, and out of spec spacing of holes.<\/p>\n<p>For DECUS, January 1963 also saw the beginning of their system of certification which required <em>two<\/em> checkers. A letter to the editor asked &ldquo;how to induce users to report program failures and how best to utilize such reports.&rdquo; The author was also concerned about free-loaders and advocated to &ldquo;establish punitive measures for failure to contribute to the &lsquo;certification&rsquo; process.&rdquo;<\/p>\n<p>By 1966, though, DECUS saw a need to distribute software with less overhead. In Vol 5, Issue 5, the editors launched a new section &ldquo;Available from Authors&rdquo; to allow advertisements of software that lacked certification. Authors could advertise software while it was still being debugged or lacked full documentation, but was sufficiently complete to be useful.<\/p>\n<p>A year later, DECUS expanded its advertising policy yet again by allowing commercial software to be listed. 
Only corporations were allowed to submit programs, not individuals. (Individuals could only post no-cost programs.) As a quality measure, DECUS stated it would act as a &ldquo;repository for complaints&rdquo; but stated it &ldquo;will not ordinarily investigate complaints&rdquo;.<\/p>\n<p>The first commercial advertisement might have been in Volume 8, Issue 2 (1969) with a routine for controlling a Calcomp plotter.<\/p>\n<p>In 1979 (Issue 6), the DECUS certification process seems to have gone away, as instructions on submitting programs no longer mention it. Submitting a program required only a small amount of documentation (enough to populate a catalog entry) and specified how to provide the code for reproducibility.<\/p>\n<p>There was a resurgence of quality concerns in 1983 (Issue 4) with the launch of a new &ldquo;DECUS Library Program Quality Assurance&rdquo;. This program&rsquo;s objective was to verify programs worked as advertised, focusing on the PDP-11. Volunteers were again asked to perform the evaluation. A pilot program evaluated 25 programs, finding seventeen still useful and marked for retention while recommending eight for archival. Rather than being a quality assurance program for individual programs, this was a process for cleaning up the catalog. Demonstrating the human discretion required, an archival decision required more context than just the age and lack of updates to a program.<\/p>\n<p>The multipage form for submitting programs in the 1990 DECUS catalog did not include any questions on testing performed by the author or require any checks by third-parties. Further, the role of DECUS was defined as a &ldquo;clearinghouse&rdquo; and they took no position on the quality of software.<\/p>\n<p>This trajectory was influenced by two trends. One, manufacturers were delivering much more software with the computer and thus were responsible for the core functionality. 
For instance, manufacturers supplied operating systems, compilers, and standard libraries. Customers no longer expected to write their own math routines. This also meant customers relied more on the manufacturer and less on each other. Two, the volume of software being developed and shared exceeded the user group&rsquo;s ability to review and control it. Economically, most of the collected fees went for basic distribution and administration of conferences. There were few volunteers to review code and little money to incentivize people.<\/p>\n<h1 id=\"promoting-discoverability\">Promoting Discoverability<\/h1>\n<p>The first computer science textbook, <em>The Preparation of Programs for an Electronic Digital Computer<\/em> included an appendix of programs developed for the EDSAC (Wheeler, 1951). Wheeler et al. organized the programs into a hierarchy based on their application area, each area denoted by a single letter. USE and SHARE adopted very similar schemes (Table 1 and 2, (USTB 1958) (Balstad 1975), respectively).<\/p>\n<p><strong>Table 1: Top-level Revised Univac Scientific Routine Index, 1958<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th>Section<\/th>\n<th>Category<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>Programmed Arithmetic<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>Elementary Functions<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>Polynomials and Special Functions<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>Differential Equations<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>Interpolations and Approximations<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>Matrices, Vectors, Simultaneous Linear Equations<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>Statistical Analysis and Probability<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>Operations Research and Linear Programming<\/td>\n<\/tr>\n<tr>\n<td>9<\/td>\n<td>Input Routines<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>Output Routines<\/td>\n<\/tr>\n<tr>\n<td>11<\/td>\n<td>Executive Routines<\/td>\n<\/tr>\n<tr>\n<td>12<\/td>\n<td>Information 
Processing<\/td>\n<\/tr>\n<tr>\n<td>13<\/td>\n<td>Debugging Routines<\/td>\n<\/tr>\n<tr>\n<td>14<\/td>\n<td>Simulation Programs<\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>Diagnostic Routines<\/td>\n<\/tr>\n<tr>\n<td>16<\/td>\n<td>Service Programs<\/td>\n<\/tr>\n<tr>\n<td>17<\/td>\n<td>All Others<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Table 2: Top-level SHARE Classification Scheme, 1965<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th>Code<\/th>\n<th>Category<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>A<\/td>\n<td>Arithmetic Routines<\/td>\n<\/tr>\n<tr>\n<td>B<\/td>\n<td>Elementary Functions<\/td>\n<\/tr>\n<tr>\n<td>C<\/td>\n<td>Polynomials and Special Functions<\/td>\n<\/tr>\n<tr>\n<td>D<\/td>\n<td>Operations on Functions and Solutions of Differential Equations<\/td>\n<\/tr>\n<tr>\n<td>E<\/td>\n<td>Interpolation and Approximations<\/td>\n<\/tr>\n<tr>\n<td>F<\/td>\n<td>Operations on Matrices, Vectors, and Simultaneous Linear Equations<\/td>\n<\/tr>\n<tr>\n<td>G<\/td>\n<td>Statistical Analysis and Probability<\/td>\n<\/tr>\n<tr>\n<td>H<\/td>\n<td>Operations Research Techniques, Simulation and Management Science<\/td>\n<\/tr>\n<tr>\n<td>I<\/td>\n<td>Input<\/td>\n<\/tr>\n<tr>\n<td>J<\/td>\n<td>Output<\/td>\n<\/tr>\n<tr>\n<td>K<\/td>\n<td>Internal Information Transfer<\/td>\n<\/tr>\n<tr>\n<td>L<\/td>\n<td>Executive Routines<\/td>\n<\/tr>\n<tr>\n<td>M<\/td>\n<td>Data Handling<\/td>\n<\/tr>\n<tr>\n<td>N<\/td>\n<td>Debugging<\/td>\n<\/tr>\n<tr>\n<td>O<\/td>\n<td>Simulation of Computers and Data Processors; Interpreters<\/td>\n<\/tr>\n<tr>\n<td>P<\/td>\n<td>Diagnostics<\/td>\n<\/tr>\n<tr>\n<td>Q<\/td>\n<td>Service or Housekeeping; Programming Aids<\/td>\n<\/tr>\n<tr>\n<td>R<\/td>\n<td>Logical and Symbolic<\/td>\n<\/tr>\n<tr>\n<td>S<\/td>\n<td>Information Retrieval<\/td>\n<\/tr>\n<tr>\n<td>T<\/td>\n<td>Applications and Application-oriented Programs<\/td>\n<\/tr>\n<tr>\n<td>U<\/td>\n<td>Linguistics and Languages<\/td>\n<\/tr>\n<tr>\n<td>V<\/td>\n<td>General Purpose Utility 
Subroutines<\/td>\n<\/tr>\n<tr>\n<td>Z<\/td>\n<td>All Others<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>From the 1950s to the mid-1960s, each user group&rsquo;s focus was on code that a developer could copy into their own programs, rather than free-standing programs. There were full-fledged programs and utilities available, such as assemblers, editors, and maintenance tools, but they were in the minority.<\/p>\n<p>In 1961, the Joint Users Group (JUG) organized itself within the Association for Computing Machinery. JUG sought to close the &ldquo;software gap&rdquo;, the time and effort required between specification and implementation. By 1971, JUG worked with 16 different user groups to assemble a directory, with well-defined metadata, of software available on the various systems. <em>Computer Programs Directory<\/em> was published first in 1971 with a second volume following in 1974. These directories represent a snapshot in time of software being exchanged within user groups as well as what developers thought others might find useful.<\/p>\n<p>The 1974 directory included a classification system (adopted in 1972) that expanded the list with many application areas, but maintained categories from the earlier SHARE and WWG list. There was a need for application areas as user members were increasingly sharing full programs rather than individual routines. In the 61 Education category, the directory lists twenty programs under Demonstrations (61.1), thirty-five programs under Problem Solving (61.2), and twenty-one programs under Record Keeping (61.3). 
Most of these programs were written in BASIC.<\/p>\n<p><strong>Table 3: Top-level Program Library Classification Code, 1974<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th>Code<\/th>\n<th>Category<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>00<\/td>\n<td>Utility (External) Programs<\/td>\n<\/tr>\n<tr>\n<td>01<\/td>\n<td>Utility (Internal) Programs<\/td>\n<\/tr>\n<tr>\n<td>02<\/td>\n<td>Diagnostics<\/td>\n<\/tr>\n<tr>\n<td>03<\/td>\n<td>Programming Systems<\/td>\n<\/tr>\n<tr>\n<td>04<\/td>\n<td>Testing and Debugging<\/td>\n<\/tr>\n<tr>\n<td>05<\/td>\n<td>Executive Routines<\/td>\n<\/tr>\n<tr>\n<td>06<\/td>\n<td>Data Handling<\/td>\n<\/tr>\n<tr>\n<td>07<\/td>\n<td>Input\/Output<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>Systems Analysis<\/td>\n<\/tr>\n<tr>\n<td>11<\/td>\n<td>Simulation of Computer and Components<\/td>\n<\/tr>\n<tr>\n<td>12<\/td>\n<td>Conversion of Programs and Data<\/td>\n<\/tr>\n<tr>\n<td>13<\/td>\n<td>Statistical<\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>Management Science\/Operations Research<\/td>\n<\/tr>\n<tr>\n<td>16<\/td>\n<td>Engineering<\/td>\n<\/tr>\n<tr>\n<td>17<\/td>\n<td>Sciences and Mathematics<\/td>\n<\/tr>\n<tr>\n<td>18<\/td>\n<td>Nuclear Codes<\/td>\n<\/tr>\n<tr>\n<td>19<\/td>\n<td>Financial<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>Cost Accounting<\/td>\n<\/tr>\n<tr>\n<td>21<\/td>\n<td>Payroll and Benefits<\/td>\n<\/tr>\n<tr>\n<td>22<\/td>\n<td>Personnel<\/td>\n<\/tr>\n<tr>\n<td>23<\/td>\n<td>Manufacturing<\/td>\n<\/tr>\n<tr>\n<td>24<\/td>\n<td>Quality Assurance\/Reliability<\/td>\n<\/tr>\n<tr>\n<td>25<\/td>\n<td>Inventory<\/td>\n<\/tr>\n<tr>\n<td>26<\/td>\n<td>Purchasing<\/td>\n<\/tr>\n<tr>\n<td>27<\/td>\n<td>Marketing<\/td>\n<\/tr>\n<tr>\n<td>28<\/td>\n<td>Sales Entered and Billed<\/td>\n<\/tr>\n<tr>\n<td>29<\/td>\n<td>General Business Services<\/td>\n<\/tr>\n<tr>\n<td>30<\/td>\n<td>Demonstration and Games<\/td>\n<\/tr>\n<tr>\n<td>40<\/td>\n<td>Arithmetic Routines<\/td>\n<\/tr>\n<tr>\n<td>41<\/td>\n<td>Elementary 
Functions<\/td>\n<\/tr>\n<tr>\n<td>42<\/td>\n<td>Polynomials and Special Functions<\/td>\n<\/tr>\n<tr>\n<td>43<\/td>\n<td>Operations on Functions and Solutions of Differential Equations<\/td>\n<\/tr>\n<tr>\n<td>44<\/td>\n<td>Interpolation and Approximations<\/td>\n<\/tr>\n<tr>\n<td>45<\/td>\n<td>Operations on Matrices, Vectors, and Simultaneous Linear Equations<\/td>\n<\/tr>\n<tr>\n<td>50<\/td>\n<td>Insurance<\/td>\n<\/tr>\n<tr>\n<td>61<\/td>\n<td>Education<\/td>\n<\/tr>\n<tr>\n<td>62<\/td>\n<td>Literary Data Processing<\/td>\n<\/tr>\n<tr>\n<td>63<\/td>\n<td>Humanities<\/td>\n<\/tr>\n<tr>\n<td>71<\/td>\n<td>Hybrid Computing<\/td>\n<\/tr>\n<tr>\n<td>72<\/td>\n<td>Time Sharing<\/td>\n<\/tr>\n<tr>\n<td>99<\/td>\n<td>Miscellaneous<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>What languages did contributors use? Using a random sampling approach on the 1971 and 1974 JUG directories, the author selected 193 programs and counted them by primary computer language (Table 4). In cases where a language had multiple versions (e.g. FORTRAN II is distinct from Fortran 66), the author simplified to the &ldquo;parent&rdquo; language name. The author also grouped together all assembly languages.<\/p>\n<p><strong>Table 4: Programming Language Count and Percentage from the JUG listings<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th>Language<\/th>\n<th>Count<\/th>\n<th>Percentage<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>FORTRAN<\/td>\n<td>73<\/td>\n<td>38%<\/td>\n<\/tr>\n<tr>\n<td>Assembly (incl. 
MACRO &amp; COMPASS)<\/td>\n<td>49<\/td>\n<td>25%<\/td>\n<\/tr>\n<tr>\n<td>PAL (Pedagogic Algorithmic Language)<\/td>\n<td>20<\/td>\n<td>10%<\/td>\n<\/tr>\n<tr>\n<td><em>Unspecified<\/em><\/td>\n<td>11<\/td>\n<td>6%<\/td>\n<\/tr>\n<tr>\n<td>BASIC<\/td>\n<td>10<\/td>\n<td>5%<\/td>\n<\/tr>\n<tr>\n<td>FOCAL<\/td>\n<td>10<\/td>\n<td>5%<\/td>\n<\/tr>\n<tr>\n<td>LAP6<\/td>\n<td>5<\/td>\n<td>3%<\/td>\n<\/tr>\n<tr>\n<td>ALGOL<\/td>\n<td>4<\/td>\n<td>2%<\/td>\n<\/tr>\n<tr>\n<td>Wang<\/td>\n<td>4<\/td>\n<td>2%<\/td>\n<\/tr>\n<tr>\n<td>COBOL<\/td>\n<td>3<\/td>\n<td>2%<\/td>\n<\/tr>\n<tr>\n<td>ASPER, CODAP-1, CP, DIAL<\/td>\n<td>All n=1<\/td>\n<td>2%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As the majority of computer sites were performing technical work, it is unsurprising that FORTRAN, as the first high-level language targeting engineering and scientific use cases, accounts for nearly 40% of the programs. FORTRAN also had the advantage of being supported on multiple platforms, rather than being a vendor-specific language. Assembly accounted for approximately one quarter of the programs, but by this point, programs shipped in assembly focused on hardware drivers or low-level equipment tests and tools. The large number of programs written in PAL can be attributed to their focus on educational lessons and to the fact that large numbers of lessons could be cranked out with little effort. FOCAL and BASIC competed in the same niche, and were similar syntactically and functionally, but FOCAL was DEC-specific while BASIC was ported between many architectures. LAP6 was the language for the LINC personal workstation. Although there were only 50 workstations built, the LINC community was especially enthusiastic about sharing the software they wrote.<\/p>\n<p>Perhaps to the chagrin of the ACM, ALGOL barely registered even though it was designed for cross-platform use and there were compilers available for most architectures. Similarly, COBOL barely registers on the list. 
ACM anticipated greater sharing of business-related software, as shown by the addition of categories like &ldquo;27 Marketing&rdquo;, but the actual directory showed only minimal activity in such categories. The activity that existed was more likely to be BASIC programs, such as one for handling personnel recruiting workflows.<\/p>\n<p>Wang and the other items in the table were vendor-specific languages.<\/p>\n<p>Since DECUS was the largest user group and contributed the largest number of items to the JUG directory, we would expect the table to be biased towards languages available on the PDP minicomputers.<\/p>\n<h1 id=\"protecting-software-integrity\">Protecting Software Integrity<\/h1>\n<p>The initial threats to software integrity were fairly prosaic: errors introduced by manual copying of code from one medium to another and the degradation of storage over time. To combat this, documentation included steps for verifying and checking out code to ensure it was copied correctly.<\/p>\n<p>As user groups acquired duplicating machinery, they started distributing software via paper tape. However, paper tape was error-prone, as were the technologies used to process it. In the 1966 Volume 5, Issue 2 of DECUScope, a review of the ASR-33 stated it was a &ldquo;weak link in the system in both speed and reliability&rdquo; and &ldquo;is quite capable of preparing a tape with a checksum error&rdquo;. (The Teletype Corporation would improve the reliability of the ASR-33 over the product&rsquo;s nearly two-decade life.)<\/p>\n<p>Magnetic tape was more reliable, but also more expensive. Floppy diskettes were first commercially available in 1973 and by the late 1970s were supported extensively in minicomputer and microcomputer models. In 1978, 5 1\/4-inch diskettes had fallen to <a href=\"https:\/\/en.wikipedia.org\/wiki\/History_of_the_floppy_disk#cite_note-NYT1978-19\">$1.50 in price in bulk<\/a>, or about $7 in 2022 dollars. 
Reliability improved, but replication costs still made mass distribution expensive.<\/p>\n<p>A typical practice, recommended as early as (Wheeler 1951), was to maintain master copies of software which were not to be used directly, but rather copied from time to time to tapes in use.<\/p>\n<p>Although improved media reliability and checksums improved the integrity of software distribution, these measures did not ward against adversarial attacks. In the Winter\/Spring 1990 DECUS program library, the editors included this note about viruses:<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/user-group-reuse\/DECUS-Virus.jpg\" alt=\"States DECUS policy about viruses - will stop distribution and might investigate (see DECUS policy below)\">\n  \n  <figcaption>DECUS policy toward reports about viruses or malware in the catalog<\/figcaption>\n<\/figure>\n\n\n<p>Within this policy we see the continuation of norms from SHARE, that of quality and good behavior coming from the developer&rsquo;s desire to maintain their reputation. This norm was likely reinforced because user groups did meet face-to-face at conferences. However, based on the submission forms, there were few controls against impersonating another person.<\/p>\n<p>Based on documents the author has reviewed, supply chain adversarial attacks were not a concern until the late 80s. Instead, the user groups expected members to follow norms and avoid inadvertent or intentional harm to other members.<\/p>\n<h1 id=\"comparison-to-modern-repositories\">Comparison to Modern Repositories<\/h1>\n<p>In &ldquo;Software Parts Nostalgia&rdquo;, Robert C. 
Glass (Glass 1981) argues that SHARE worked because:<\/p>\n<ol>\n<li>Available to everyone (within a site)<\/li>\n<li>&ldquo;Effective parts taxonomy&rdquo; and &ldquo;effective delivery document&rdquo;<\/li>\n<li>Pride in software authorship<\/li>\n<li>Not available elsewhere<\/li>\n<\/ol>\n<p>Do these factors apply to an online software repository?<\/p>\n<p>A modern software repository is more available than software through a user group, as it simply requires an internet connection; there is no need to prove you own a specific machine or pay any dues. Further, new software and updates can be accessed within seconds. This level of availability may be a handicap as the bandwidth costs can be expensive and the sites operate in a hostile environment with constant denial of service attacks. However, compared to sending physical media, websites are multiple orders of magnitude less expensive.<\/p>\n<p>Modern tooling greatly reduces the friction of adding a dependency to a project and tracking updates, but finding and selecting a library is largely an ad hoc process in which the developer leverages search engines and popularity metrics. There are still attempts at a taxonomy, for example <a href=\"https:\/\/crates.io\/category_slugs\">the crates.io category list<\/a> for Rust, but these are not fine-grained. Modern repositories place greater weight on free-form search than on taxonomies. When user groups were sharing paper catalogs, reverse indexed search was not an option.<\/p>\n<p>The MTI-0 write-up and similar documents in the SHARE and USE binders are models of documentation and are analogous to a Unix man page. However, as these programs are really just functions and typically written in assembly, their scope is limited and much of the documentation would be unnecessary if the code were written in a higher-level language. 
Over time, user groups reduced their expectations on documentation, but user groups also were sharing fewer routines and more stand-alone programs.<\/p>\n<p>Pride is a difficult subject to measure, as are the feelings of responsibility, reputation, and ownership. In the early days of SHARE and USE, a developer would be releasing code to colleagues who met face-to-face at least monthly. Not only could poor-quality code lead to ostracization or reduced influence within the group, but high-quality code could lead to better jobs and enhanced status. By 1981, when Glass was writing, &ldquo;software parts&rdquo; were more likely released commercially by a company. Individual contributions were anonymized, although quality might still be recognized and rewarded within a company.<\/p>\n<p>Alternatives to anonymity, however, quickly emerged with the beginnings of the Free and Open Source movements (GNU was first announced in 1983; OSI formed in 1998). Contributions from individual authors were public and trackable, particularly with software project management sites such as SourceForge, launched in 1999. While public recognition for contributions can be socially rewarding, simply releasing code or a patch is insufficient. Software repositories have ample packages that are buggy, abandoned, and unfit for consumption. Just as DECUS found it needed to remove obsolete software from its catalog, any repository modeled as a clearinghouse will fill with junk over time.<\/p>\n<p>Due to network effects and economic efficiency, there tends to be a single repository per programming language &ndash; e.g. PyPI for Python, Central Repository for Java, RubyGems for Ruby, npm for JavaScript. (An organization will often run a private repository for its internal development, but it will share software via an external repository.) 
The switch from repository per manufacturer to repository per language started with CTAN, the Comprehensive TeX Archive Network, which inspired the very popular CPAN or Comprehensive Perl Archive Network. Officially announced in 1993, CTAN used mirrors and tooling to consolidate TeX macros previously distributed across many sites. The balkanization of software sources, which had greatly increased once software started to be unbundled from hardware in 1969, began to recede with centralized systems (even if they used distributed mirrors).<\/p>\n<h1 id=\"lessons-for-modern-development\">Lessons for Modern Development<\/h1>\n<ul>\n<li>Communities can rapidly fill missing documentation and features<\/li>\n<\/ul>\n<p>If a community can come together and organize, it can effectively defend its interests and create solutions. The computer manufacturers of the time helped the user groups grow by including notices for them when they sold a machine, whether the user group was directly managed by the manufacturer or not.<\/p>\n<ul>\n<li>Mirrors are great for availability<\/li>\n<\/ul>\n<p>The answer to code availability and corruption issues has always been local copies. External dependencies should be stored within a caching proxy or similar mirroring system.<\/p>\n<ul>\n<li>Curatorship is more expensive than development<\/li>\n<\/ul>\n<p>Programmers may be in short supply, but programmers have consistently shown greater willingness to volunteer their programming output than their skills in reviewing and auditing others&rsquo; output. While there are groups that have a tradition of deep review (e.g. Netlib, OpenBSD), the user groups struggled to maintain a similar review system past a small set of &lsquo;standard&rsquo; libraries. 
Instead, it seems less expensive to let the consumers try to determine fitness and quality themselves and attract customers with the quantity of available software.<\/p>\n<h1 id=\"acknowledgements\">Acknowledgements<\/h1>\n<p>I&rsquo;d like to thank the <a href=\"https:\/\/cse.umn.edu\/cbi\">Charles Babbage Institute<\/a> for their assistance with the research.<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>(Balstad 1975): John Bolstad. 1975. A proposed classification for computer program library subroutines. SIGUCCS Newsl. 5, 2 (May 1975), 25\u201337. <a href=\"https:\/\/doi.org\/10.1145\/1098881.1098882\">https:\/\/doi.org\/10.1145\/1098881.1098882<\/a><\/p>\n<p>(Glass 1981): Robert C. Glass. 1981. Software Parts Nostalgia. Datamation, Vol 27, (Nov. 1981), 245-247.<\/p>\n<p>(Goldstine 1947): Herman Goldstine and John von Neumann. Planning and coding of problems for an electronic computing instrument, Vol. 1. Inst. for Advanced Study, Princeton, N.J., April 1, 1947 (69 pp.). (Reprinted in von Neumann&rsquo;s Collected Works, Vol 5, A H Taub, Ed., Pergamon, London, 1963, pp. 80-151.)<\/p>\n<p>(Greenwald 1956): I. D. Greenwald and H. G. Martin. 1956. Conclusions After Using the PACT I Advanced Coding Technique. J. ACM 3, 4 (Oct. 1956), 309\u2013313. <a href=\"https:\/\/doi.org\/10.1145\/320843.320850\">https:\/\/doi.org\/10.1145\/320843.320850<\/a><\/p>\n<p>(Melahn 1956): Wesley S. Melahn. 1956. A Description of a Cooperative Venture in the Production of an Automatic Coding System. J. ACM 3, 4 (Oct. 1956), 266\u2013271. <a href=\"https:\/\/doi.org\/10.1145\/320843.320844\">https:\/\/doi.org\/10.1145\/320843.320844<\/a><\/p>\n<p>(SHARE 1956): John Greenstadt, Editor-in-Chief. SHARE Reference Manual for the IBM 704. 1956. <a href=\"https:\/\/www.piercefuller.com\/scan\/share59.pdf\">https:\/\/www.piercefuller.com\/scan\/share59.pdf<\/a><\/p>\n<p>(SWAP 1968): Users Organization for Control Data Computer Systems, Handbook, Part III: Program Catalog, (June 1968). 
<a href=\"https:\/\/archives.lib.umn.edu\/repositories\/3\/archival_objects\/1616775\">https:\/\/archives.lib.umn.edu\/repositories\/3\/archival_objects\/1616775<\/a><\/p>\n<p>(USE 1956): Minutes of Meeting of USE, Seattle WA. (9-10 January 1956). <a href=\"https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19\">https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19<\/a><\/p>\n<p>(USE-b 1956): Minutes of Meeting of USE, St. Paul MN. (15-16 February 1956). <a href=\"https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19\">https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19<\/a><\/p>\n<p>(USE-c 1956): Minutes of Meeting of USE, Van Nuys CA. (29-30 March 1956). <a href=\"https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19\">https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19<\/a><\/p>\n<p>(USTB 1958): USTB-5 Revised Univac Scientific Routines Index. (January 31 1958). <a href=\"https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19\">https:\/\/archives.lib.umn.edu\/repositories\/3\/resources\/19<\/a><\/p>\n<p>(Wheeler 1951): David John Wheeler, et al. The Preparation of Programs for an Electronic Digital Computer: With Special Reference to the Edsac and the Use of a Library of Subroutines. Reprint of the ed. 1951 ed. Addison-Wesley 1951.<\/p>\n<h1 id=\"appendix\">Appendix<\/h1>\n<h2 id=\"decus-about-viruses-policy-1990\">DECUS &lsquo;About Viruses&rsquo; Policy (1990)<\/h2>\n<p>The DECUS Library is a clearinghouse only. As such, it cannot check its software for proper function, quality of implementation, covert behavior, or other similar issues. 
However, it will not knowingly release software which is known to be harmful to its users.<\/p>\n<p>While we cannot provide absolute assurance against covert behavior, we believe it is rendered less probable because:<\/p>\n<ul>\n<li>Submitters' names remain permanently associated with their submissions; anyone contributing a virus or trojan horse risks his or her reputation.<\/li>\n<li>A large user community is likely to have someone notice covert behavior. Since this can be reported quickly to a central location, there is safety in numbers.<\/li>\n<li>Submitting sources is encouraged, so that covert behavior is harder to hide.<\/li>\n<\/ul>\n<p>If covert behavior is reported in software submitted to the DECUS library, the library will take the following actions:<\/p>\n<ul>\n<li>Suspend distribution of that software.<\/li>\n<li>Investigate the report and if it is found true, remove the offending software from the library.<\/li>\n<li>Report the complete incident to all known sites to whom the software was distributed, with all available information, including where the software came from.<\/li>\n<li>Attempts to remove the objectionable behavior may be made by volunteers, to reclaim what value may exist in the software. Any such reclaimed software will appear only with a full explanation of what was done.<\/li>\n<\/ul>"},{"title":"Method of Differences Add'l Applications","link":"https:\/\/ztoz.blog\/posts\/differences-applications\/","pubDate":"Tue, 20 Feb 2024 17:43:25 -0700","guid":"https:\/\/ztoz.blog\/posts\/differences-applications\/","description":"<p>In my post on the <a href=\"https:\/\/ztoz.blog\/posts\/method-differences\/\">Method of Differences<\/a>, I focused on  using the technique to generate mathematical tables. As some readers noted, there are additional and interesting modern applications of the technique. 
This post will discuss two of them: <em>sequences to functions<\/em>, a method of analyzing sequences, and <em>strength reduction<\/em>, a compiler optimization technique.<\/p>\n<h1 id=\"sequences-to-functions\">Sequences to Functions<\/h1>\n<p>Rather than having a function that you want to evaluate at many positions, let&rsquo;s say you have a sequence of values and you want to know how they are generated. For instance, say you are given the sequence 1, 5, 12, 22, 35, 51, 70, 92, &hellip; If you cannot just look up the sequence in the <a href=\"https:\/\/oeis.org\/search?q=1%2C+5%2C+12%2C+22%2C+35%2C+51%2C+70%2C+92&amp;language=english&amp;go=Search\">Online Encyclopedia of Integer Sequences<\/a>, then the <a href=\"https:\/\/oeis.org\/EIStext.pdf\"><em>Encyclopedia of Integer Sequences<\/em><\/a> recommends the use of &ldquo;Analysis of Differences&rdquo; as the best method to analyze a sequence &ldquo;by hand&rdquo; (section 2.5, page 10).<\/p>\n<p>Since the <em>Encyclopedia<\/em>&rsquo;s explanation is quite clear, rather than repeat it, I suggest reading section 2.5 to understand how to derive the answer of \\(\\frac{1}{2}(n + 1)(3n + 2)\\).<\/p>\n<h1 id=\"strength-reduction\">Strength Reduction<\/h1>\n<p>Strength reduction is a code optimization that replaces expensive operations with less expensive ones. For example, integer multiplication takes <a href=\"http:\/\/ithare.com\/infographics-operation-costs-in-cpu-clock-cycles\/\">about three times<\/a> as long as an integer add, although real-world performance, particularly with modern processors, may make the difference <a href=\"https:\/\/lemire.me\/blog\/2010\/07\/19\/is-multiplication-slower-than-addition\/\">negligible<\/a>. 
The authors of the chapter &ldquo;Reduction of Operator Strength&rdquo; attribute the basis of the technique to Babbage and his use of it in the Difference Engine.<\/p>\n<p>Similar to Bailey&rsquo;s long list of application areas, Allen, Cocke, and Kennedy list scenarios where strength reduction can be applied:<\/p>\n<ol>\n<li>Multiplication of an induction variable by a region constant<\/li>\n<li>Multiplication of an induction variable by another induction variable<\/li>\n<li>Multiplication of an induction variable by itself<\/li>\n<li>Integer division of an induction variable by a region constant<\/li>\n<li>Integer modulo of an induction variable<\/li>\n<li>Exponentiation to an induction variable<\/li>\n<li>Integer addition (eliminating extraneous variables)<\/li>\n<li>General order n polynomials<\/li>\n<li>Trigonometric functions at equidistant points<\/li>\n<li>Continuous differentiable functions<\/li>\n<\/ol>\n<p>The first three cases and the 8th case (which corresponds to the mathematical tables that motivated the Victorians) are reducible to pure additions, while the others will require some other operators.<\/p>\n<p>To demonstrate how it works, we will show a simple example leveraging memory lookups. Memory lookups within arrays are very common operations in scientific computing. Let&rsquo;s assume our code is summing the last column in a 4x3 matrix. 
The matrix is represented in row-major order and contains 32-bit signed integers.<\/p>\n<p>The matrix for our example:<\/p>\n<p>$$\n\\begin{pmatrix}\n2 &amp; 10 &amp; 18 \\\\\n4 &amp; 12 &amp; 20 \\\\\n6 &amp; 14 &amp; 22 \\\\\n8 &amp; 16 &amp; 24\n\\end{pmatrix}\n$$<\/p>\n<p>In memory, assuming the matrix begins at address 1000, the layout will be:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align:right\">Address<\/th>\n<th style=\"text-align:right\">Value<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:right\">1000<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1004<\/td>\n<td style=\"text-align:right\">10<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1008<\/td>\n<td style=\"text-align:right\">18<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1012<\/td>\n<td style=\"text-align:right\">4<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1016<\/td>\n<td style=\"text-align:right\">12<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1020<\/td>\n<td style=\"text-align:right\">20<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1024<\/td>\n<td style=\"text-align:right\">6<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1028<\/td>\n<td style=\"text-align:right\">14<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1032<\/td>\n<td style=\"text-align:right\">22<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1036<\/td>\n<td style=\"text-align:right\">8<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1040<\/td>\n<td style=\"text-align:right\">16<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:right\">1044<\/td>\n<td style=\"text-align:right\">24<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Thus, the address (A) of a cell within the matrix is the starting address (M) plus the zero-indexed row (r) multiplied by the size of a value in bytes (s) times the number of columns (C) plus the zero-indexed column (c) multiplied by the size of the value:<\/p>\n<p>$$\nA = M + r \\times s \\times C + 
c \\times s\n$$<\/p>\n<p>So, the address at zero-indices (1, 2) is 1000 + 1 * 4 * 3 + 2 * 4 or 1020, which holds the value 20.<\/p>\n<p>If we simplify by replacing known values with constants, we get:<\/p>\n<p>$$\nA = M + r \\times s \\times C + c \\times s \\\\\nA = 1000 + 12r + 4c\n$$<\/p>\n<p>So, to look up a value, we need two multiplications and two additions.<\/p>\n<p>To sum the last column in the zero-indexed matrix, a C snippet could be (using i and j instead of r and c to be more idiomatic):<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\"><span style=\"color:#66d9ef\">int<\/span> sum <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n<span style=\"color:#66d9ef\">int<\/span> j <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n<span style=\"color:#66d9ef\">for<\/span>(i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">4<\/span>; i<span style=\"color:#f92672\">++<\/span>) {\n\tsum <span style=\"color:#f92672\">+=<\/span> <span style=\"color:#f92672\">*<\/span>(matrix <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">12<\/span><span style=\"color:#f92672\">*<\/span>i <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">4<\/span><span style=\"color:#f92672\">*<\/span>j); <span style=\"color:#75715e\">\/\/ matrix[i][2]\n<\/span><span style=\"color:#75715e\"><\/span>}\n<span style=\"color:#66d9ef\">return<\/span> sum;\n<\/code><\/pre><\/div><p>The 4*2 would be pre-computed by most compilers since the column is fixed in this example.<\/p>\n<p>Using a strength reduction, we can insert a temporary variable, t1, that is initialized to the base address. 
Then, since i is being incremented by one each time and the &ldquo;stride&rdquo; is 12, we can eliminate the multiplication of i by 12 and just increment t1 by 12 directly.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-c\" data-lang=\"c\"><span style=\"color:#66d9ef\">int<\/span> sum <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n<span style=\"color:#66d9ef\">int<\/span> j <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">2<\/span>;\n<span style=\"color:#66d9ef\">int<\/span> t1 <span style=\"color:#f92672\">=<\/span> matrix <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">4<\/span><span style=\"color:#f92672\">*<\/span>j;\n<span style=\"color:#66d9ef\">for<\/span>(i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">4<\/span>; i<span style=\"color:#f92672\">++<\/span>) {\n\tsum <span style=\"color:#f92672\">+=<\/span> <span style=\"color:#f92672\">*<\/span>(t1); <span style=\"color:#75715e\">\/\/ memory load from address t1\n<\/span><span style=\"color:#75715e\"><\/span>\tt1 <span style=\"color:#f92672\">=<\/span> t1 <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">12<\/span>;\n}\n<span style=\"color:#66d9ef\">return<\/span> sum;\n<\/code><\/pre><\/div><p>Thus, while the original code in the loop would be performing (at least) 4 multiplications and 12 additions, the strength reduced code is performing only 12 additions.<\/p>\n<p>The FORTRAN I compiler, completed in 1957, included strength reduction to transform subscript calculations into index register increments and decrements. In F. E. 
Allen&rsquo;s <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/1500774.1500875\">&ldquo;A technological review of the FORTRAN I Compiler&rdquo;<\/a>, she states that this was one of the most impactful optimizations.<\/p>\n<h2 id=\"scalar-evolution\">Scalar Evolution<\/h2>\n<p>If you look at the list of optimizations in the LLVM compiler, you will find <a href=\"https:\/\/llvm.org\/docs\/Passes.html#scalar-evolution-scalar-evolution-analysis\">Scalar Evolution<\/a> listed. Scalar Evolution is a way to model how the value of a scalar changes with the execution of code, particularly within a loop. It is based on the mathematical model of Chains of Recurrences, which is a generalized form of forward differencing. The theory has continued to evolve and improve.<\/p>\n<p>Some additional references:<\/p>\n<ul>\n<li><a href=\"https:\/\/bohr.wlu.ca\/ezima\/papers\/ISSAC94_p242-bachmann.pdf\">Chains of Recurrences - a method to expedite the evaluation of closed form functions<\/a><\/li>\n<li><a href=\"https:\/\/kristerw.blogspot.com\/2019\/04\/how-llvm-optimizes-geometric-sums.html?m=1\">How LLVM Optimizes Power Sums<\/a><\/li>\n<li><a href=\"http:\/\/cri.ensmp.fr\/~pop\/gcc\/mar04\/slides.pdf\">Scalar Evolutions<\/a><\/li>\n<\/ul>\n<h1 id=\"references-and-further-reading\">References and Further Reading<\/h1>\n<p>Donald Knuth. 1998. &ldquo;Tabulating polynomial values&rdquo;, section 4.6.4, <em>The Art of Computer Programming<\/em>, Volume 2, 3rd Edition: Seminumerical Algorithms.<\/p>\n<p>John Cocke and Ken Kennedy. 1977. An algorithm for reduction of operator strength. Commun. ACM 20, 11 (Nov. 1977), 850\u2013856. <a href=\"https:\/\/doi.org\/10.1145\/359863.359888\">https:\/\/doi.org\/10.1145\/359863.359888<\/a><\/p>\n<p>Steven Muchnick and Neil Jones. 1981. <em>Program Flow Analysis: Theory and Applications<\/em>. Prentice-Hall, Inc.<\/p>\n<p>Includes<\/p>\n<ul>\n<li>Allen, Cocke, and Kennedy. &ldquo;Reduction of Operator Strength.&rdquo;<\/li>\n<\/ul>\n<p>Steven S. Muchnick. 
1997. <em>Advanced Compiler Design and Implementation<\/em>. Morgan Kaufmann.<\/p>"},{"title":"Method of Differences","link":"https:\/\/ztoz.blog\/posts\/method-differences\/","pubDate":"Fri, 02 Feb 2024 11:14:53 -0700","guid":"https:\/\/ztoz.blog\/posts\/method-differences\/","description":"<p>The &ldquo;Method of Differences&rdquo; is a mathematical technique for reducing the computation of polynomials to repeated addition. Once the system is set up, relatively unskilled human computers can populate dense mathematical tables. This is the &ldquo;difference&rdquo; in Charles Babbage&rsquo;s <em>Difference Engine<\/em>, which aimed to automate the creation and printing of these tables. I have created a playground where you can experiment with the method, and in this article I aim to explain how to apply the technique, why the Difference Engine earned public funding, and the limitations of the method.<\/p>\n<blockquote>\n<p>Try it yourself: <a href=\"https:\/\/ztoz.blog\/mdifferences\/\">Method of Differences Playground<\/a><\/p>\n<\/blockquote>\n<h1 id=\"how-it-works\">How it works<\/h1>\n<p>In his 1822 letter, Charles Babbage states that using the engine that he has just finished, he has constructed tables of <a href=\"https:\/\/mathworld.wolfram.com\/SquareNumber.html\">square numbers<\/a>: \\(x^2\\), <a href=\"https:\/\/mathworld.wolfram.com\/TriangularNumber.html\">triangular numbers<\/a>: \\(x^2\/2 + x\/2\\), and the expression \\(x^2 + x + 41\\).<\/p>\n<p>Let&rsquo;s use the example \\(x^2 + x + 41\\).<\/p>\n<p>Since the example is already a polynomial, we can compute the Initial Values. The highest power of the polynomial is 2, so we need 3 (2 + 1) rows. 
First, we&rsquo;ll compute the expression at x = 0, 1, and 2 using standard methods:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">47<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Because the highest power of the polynomial is 2, there are two difference (\\(D\\)) columns we need to compute. We will append the two columns to the right of the table:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\"><\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\"><\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">47<\/td>\n<td style=\"text-align:right\"><\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Then, using <em>simple differences<\/em>, we will compute the \\(D^1\\) column by subtracting each \\(f(x)\\) value from the \\(f(x)\\) value in the row below it:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">43 - 41 = 2<\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">47 - 43 = 4<\/td>\n<td 
style=\"text-align:right\"><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">47<\/td>\n<td style=\"text-align:right\"><\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Recursively, we will then do the same for \\(D^2\\). If there were more difference columns, the process would be repeated.<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">4 - 2 = 2<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">4<\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">47<\/td>\n<td style=\"text-align:right\"><\/td>\n<td style=\"text-align:right\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The first row of this table becomes our Initial Values: 41, 2, 2.<\/p>\n<p>Now, to compute the function, we can use repeated addition. We initialize the table with the Initial Values like so:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Then, each new cell equals the cell above it plus the cell above and to its right. 
The last column on the right will stay constant.<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">41 + 2 = 43<\/td>\n<td style=\"text-align:right\">2 + 2 = 4<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Append another row:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">4<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">43 + 4 = 47<\/td>\n<td style=\"text-align:right\">4 + 2 = 6<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Repeating the process, the table becomes:<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(x\\)<\/th>\n<th style=\"text-align:right\">\\(f(x)\\)<\/th>\n<th style=\"text-align:right\">\\(D^1\\)<\/th>\n<th style=\"text-align:right\">\\(D^2\\)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td style=\"text-align:right\">41<\/td>\n<td style=\"text-align:right\">2<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">43<\/td>\n<td style=\"text-align:right\">4<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">47<\/td>\n<td 
style=\"text-align:right\">6<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td style=\"text-align:right\">53<\/td>\n<td style=\"text-align:right\">8<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td style=\"text-align:right\">61<\/td>\n<td style=\"text-align:right\">10<\/td>\n<td style=\"text-align:right\">2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"non-polynomials\">Non-Polynomials<\/h2>\n<p>If your expression is not a polynomial, it may be possible to convert it to one or approximate it as a polynomial. For example, using <a href=\"https:\/\/mathworld.wolfram.com\/TaylorSeries.html\">Taylor series<\/a>, \\(\\sin(x)\\) can be approximated as:<\/p>\n<p>$$\\frac{x^9}{362880} - \\frac{x^7}{5040} + \\frac{x^5}{120} - \\frac{x^3}{6} + x$$<\/p>\n<p>Once in a polynomial form, you can use the Method of Differences to compute a table of values. However, as an approximation, the table will contain errors and, as the number of rows grows, the error will increase.<\/p>\n<h2 id=\"playground-syntax\">Playground Syntax<\/h2>\n<p>The <a href=\"https:\/\/ztoz.blog\/mdifferences\/\">Method of Differences Playground<\/a> parses expressions using SymPy.<\/p>\n<p>The expression requires and permits only a single variable: <code>x<\/code>.<\/p>\n<p>Exponents are expressed with double asterisks, i.e. <code>x**4<\/code> means raise <code>x<\/code> to the 4th power.<\/p>\n<p>As of the date of this post, logarithms are not supported. Trigonometric functions are supported.<\/p>\n<h1 id=\"why-did-people-care\">Why did people care?<\/h1>\n<p>Mathematical tables were deeply important for both practical and academic reasons. Looking up a value was far faster and less error-prone than calculating a value from scratch. 
Francis Baily, later head of the Royal Astronomical Society, presented a long list of mathematical tables which were amenable to this technique:<\/p>\n<table>\n<thead>\n<tr>\n<th>Examples<\/th>\n<th>Examples<\/th>\n<th>Examples<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Products of numbers<\/td>\n<td>Square numbers<\/td>\n<td>Cube numbers<\/td>\n<\/tr>\n<tr>\n<td>Higher powers<\/td>\n<td>Square and cube roots<\/td>\n<td>Reciprocals<\/td>\n<\/tr>\n<tr>\n<td>Sine, Cosine, Tangent<\/td>\n<td>Logarithms<\/td>\n<td>Logarithmic sine, cosine, tangent, cotangent<\/td>\n<\/tr>\n<tr>\n<td>Hyperbolic logarithm<\/td>\n<td>Astronomical<\/td>\n<td>Navigation<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Populating the tables was tedious work, but since it was just repeated addition, it was well suited to computers. At this time, a computer was a human that did calculations. If you were starting a new project, you would hire a number of computers and set them on the same task, in parallel. The project lead could then check the outputs of the computers against each other, which was a fair way to establish reliability. Since an error in one addition carries forward into every later row, binary search would be an effective way to locate the first faulty row. At the end, the tables would be copied over to the publisher, at which point the editor would need to perform additional checks for correctness.<\/p>\n<p>As you can see, the process still allows many avenues for errors as well as requiring significant intellectual labor. Babbage&rsquo;s Difference Engine promised to eliminate the intellectual labor by replacing the human computers and eliminate the copying errors by producing the tables automatically. 
These two elements together were considered necessary by Babbage and he argued against separating the computation from printing, even though that might have led to an earlier deliverable.<\/p>\n<p>Beyond the immediate practical aspects, the rapid and inexpensive production of tables would encourage the exploration of new mathematical functions. For example, interpreting polar coordinates is much aided by tables of trigonometric functions. For a deeper discussion on the context of the Difference Engine, see Doron D. Swade&rsquo;s article <em>Calculating Engines: Machines, mathematics, and misconceptions<\/em> in <em>Mathematics in Victorian Britain<\/em>.<\/p>\n<p><strong>Update 2024-02-20:<\/strong> I&rsquo;ve described <a href=\"https:\/\/ztoz.blog\/posts\/differences-applications\">additional applications of the method<\/a>.<\/p>\n<h1 id=\"limitations\">Limitations<\/h1>\n<p>Since many functions, including all the trigonometric and logarithmic functions, can only be approximated by polynomials, the method of differences will produce errors in the output. Not all errors are fatal, though. For example, as B.H. Babbage documents, the Engine was to use 20 places of figures and handle six orders of differences. Many errors could be minor, and so by truncating the answer to fewer digits, the answers would be correct. Mathematically, one could also re-center tables on new initial values, which could eliminate accumulated errors in previous rows.<\/p>\n<p>While the method of differences is a useful technique for generating tables, it is not general-purpose. One can&rsquo;t use it to interpolate a value using Newton&rsquo;s Method, determine if a number is prime, tabulate a census, or choose different tax rates based on a category. The Analytical Engine (theoretically) could, which is why its potential value was so much greater.<\/p>\n<h1 id=\"further-reading\">Further Reading<\/h1>\n<p><em>The Works of Charles Babbage: Vol 2, The Difference Engine and Table Making<\/em>. 
Edited by Martin Campbell-Kelly. London, William Pickering. 1989.<\/p>\n<p>Includes<\/p>\n<ul>\n<li><em>A Note Respecting the Application of Machinery to the Calculation of Astronomical Tables<\/em>, by Charles Babbage, 14 June 1822<\/li>\n<li><em>On Mr. Babbage&rsquo;s New Machine for Calculating and Printing Mathematical and Astronomical Tables<\/em> by Francis Baily, 28 November 1823<\/li>\n<li><em>Babbage&rsquo;s Calculating Machine or Difference Engine<\/em> by B.H. Babbage, 1872<\/li>\n<\/ul>\n<p><em>Babbage Difference Engine #2 - How to Initialize the Machine - Including Calculating Initial Values<\/em>. Ed Thelen. <a href=\"https:\/\/ed-thelen.org\/bab\/bab-intro.html\">Website<\/a><\/p>\n<p><em>Method of Differences<\/em>. Boi (\ubcf4\uc774) et al. <a href=\"https:\/\/brilliant.org\/wiki\/method-of-differences\/\">Website<\/a><\/p>\n<p><em>Mathematics in Victorian Britain<\/em>. 2011. Oxford England: Oxford University Press.<\/p>\n<p><a href=\"https:\/\/www.sciencemuseum.org.uk\/objects-and-stories\/charles-babbages-difference-engines-and-science-museum\">Charles Babbage&rsquo;s Difference Engines and the Science Museum<\/a><\/p>\n"},{"title":"Konrad Zuse's Early Computers (Review)","link":"https:\/\/ztoz.blog\/posts\/zuse-early-computers\/","pubDate":"Sat, 02 Dec 2023 16:39:27 -0700","guid":"https:\/\/ztoz.blog\/posts\/zuse-early-computers\/","description":"<p>Although historians are usually pushed to decide questions of &ldquo;who was first,&rdquo; the question of <em>impact<\/em> is more interesting. Konrad Zuse has a claim to the first computer, the first high-level computer language, and the first European commercial computer sale, but his relative isolation working in Germany in the 30s and 40s limited his visibility and impact. It wasn&rsquo;t until the 1970s that his innovations became more widely known. Ra\u00fal Rojas has long contributed to the historical appreciation of Zuse and the remarkable modern nature of his machines. 
This volume collects and makes available in English many of his previous articles on the subject.<\/p>\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/zuse-early-computers\/978-3-031-39876-6.webp\" alt=\"Book cover\">\n  \n\n<blockquote>\n<p>Ra\u00fal Rojas. <em>Konrad Zuse&rsquo;s Early Computers: The Quest for the Computer in Germany<\/em>. Springer 2023. <a href=\"https:\/\/doi.org\/10.1007\/978-3-031-39876-6\">Publisher Link<\/a><\/p>\n<\/blockquote>\n<p>The book is primarily a technical history, discussing how each of Zuse&rsquo;s early computers (Z1, Z2, Z3, and Z4) was designed and implemented, including the post-war effort to add conditional jumps to the Z4. The author also provides one of the clearest explanations of Plankalk\u00fcl, Zuse&rsquo;s high-level, symbolic programming language with an unusual two-dimensional syntax (likely inspired by engineering spreadsheet computation forms). The author also covers the S1 and S2, special-purpose computing machines for finding control corrections caused by aircraft wing manufacturing defects. Zuse&rsquo;s efforts towards chess playing software are explained. Finally, the book contains two chapters providing historical context on the world-wide efforts to build computers and insight into the politics of Zuse and his close associates.<\/p>\n<p>Rojas' explanations are very clear, improved by a large number of illustrations and diagrams. The tone is measured and he provides evidence for his claims, and the limits of those claims. For instance, the Z4, built between 1942 and 1945, lacked any instructions for jumps, conditional or not. Zuse was aware of the limitation (as noted in his journals of the time), but since the Z4 only had 64 words of memory, he felt it was an acceptable limitation. After the war, the ETH university leased the machine in 1950, but with the proviso that Zuse add conditional jumping. Using only microinstructions, he was able to add test operations (e.g. 
x equals zero, x is greater than zero) and a subroutine jump instruction. Although the jump was still limited, limiting programs to a single loop, this was sufficient for the use cases of the day. Rojas discusses how this limitation factors into the debate over the first computer practically, comparatively (comparing it to the ENIAC and the IBM\/Harvard Mark I), and theoretically (how this limitation can be ignored in terms of Turing&rsquo;s infinite tape).<\/p>\n<p>(If you are interested in how the Z4 was used at ETH, I recommend reading the chapter &ldquo;Computer Development in Switzerland&rdquo; in Herbert Bruderer&rsquo;s <em>Milestones in Analog and Digital Computing<\/em>, Vol 2.)<\/p>\n<p>In terms of the book&rsquo;s goal &mdash; describe the progression of Zuse&rsquo;s thinking and the development of his computers through the specification of the Plankalk\u00fcl language &mdash; it succeeds in its purpose. We learn little about Zuse himself, but that is not the point of the book. Due to the after-effects of the war, Zuse was penniless and lost many years before he could restart his efforts and build new computers. In the meantime, the U.S. and British efforts quickly iterated on new designs and established new markets. While Zuse was not unknown &mdash; he met Turing and Plankalk\u00fcl was referenced by the ALGOL committee &mdash; his impact was limited, perhaps partially due to his patents being disregarded by the German government and his doctoral thesis not being published.<\/p>\n<p>However, since the matter of impact is all post-war, I think the book ended too early. Zuse&rsquo;s story is intertwined with post-war politics and geography, the influence of patents, and the beginning of the commercial computer industry. 
None of these are technical topics or within the author&rsquo;s expression of scope, but by not including them, the reader needs to read another volume to get the full story.<\/p>\n<p>Also, the book could be improved by the addition of some color plates as some of the photos and diagrams have more detail than is comfortable in grey-scale. Most of the diagrams are high-quality and easy to read.<\/p>\n<p>Recommended. At $38 for the hardbound edition, this is a good value for an academic work.<\/p>"},{"title":"Monsters, Aliens, and Holes in the Ground (Review)","link":"https:\/\/ztoz.blog\/posts\/monsters-holes-ground\/","pubDate":"Thu, 09 Nov 2023 19:26:29 -0700","guid":"https:\/\/ztoz.blog\/posts\/monsters-holes-ground\/","description":"<p>For those interested in the history of tabletop roleplaying game design, <em>Monsters, Aliens, and Holes in the Ground<\/em> is an excellent, near encyclopedic treatment. The author, Stu Horvath, documents major and minor game systems, how they innovated or were influenced by other game systems, and how the systems expanded with settings and adventures. Horvath covers the 1970s through the 2010s and, reading through the entries, chronicles the major changes in design theory from simulation, to forms of narrative, to today&rsquo;s many experimental systems.<\/p>\n<p><img src=\".\/9780262048224.avif\" alt=\"Book Cover\" title=\"Book Cover with alien, barbarian, scholar, and dungeon entrance\"><\/p>\n<blockquote>\n<p>Stu Horvath and Kyle Patterson. <em>Monsters, Aliens, and Holes in the Ground: A Guide to Tabletop Roleplaying Games from D&amp;D to Mothership<\/em>. MIT Press 2023. <a href=\"https:\/\/mitpress.mit.edu\/monsters-aliens-and-holes-in-the-ground\/\">Publisher Link<\/a><\/p>\n<\/blockquote>\n<p>I bought the standard edition and the book is weighty and beautiful. In fact, the book is a bit <em>too<\/em> heavy and can be uncomfortable to read in a chair. 
It runs over 400 pages on heavy paper stock, and almost all entries are annotated with at least one full-color image of the book being discussed. Each decade is introduced by a thematic two-page illustration by Kyle Patterson of the major game design thrusts.<\/p>\n<p>Organized by the publication date, the text supports a linear read-through as game systems influence later books. Horvath is fairly opinionated about what works and didn&rsquo;t work within a design. This improves the readability, although, as an academic imprint, some of his arguments deserve greater support. (Horvath does pull from game journalists and many interviews with game designers.) I was pleased to finally learn the background of titles like <em>Skyrealms of Jorune<\/em>, which was advertised consistently in <em>Dragon Magazine<\/em>, but was sufficiently weird to remain obscure.<\/p>\n<p>Although the book documents an extensive number of systems, it is not exhaustive. The author is open about this fact in the introduction. First, a system may not be included if he has never heard of it. Second, there are many systems (particularly foreign language) that he has not been able to acquire. Third, he excludes some systems due to their subject matter or his disagreements with the creators. Of these, he only specifies <em>Empire of the Petal Throne<\/em> as eliminated. Thus, for some of the rare games in my collection that are not mentioned in the book, I do not know if their absence is intentional or inadvertent.<\/p>\n<p>Additionally, the main threads of game design are described over many individual game entries. I think the book, and his arguments, would have been improved by a summary tying the major milestones together.<\/p>\n<p>I recently visited a half-dozen game shops across three states and I was disappointed in the small RPG selection at the stores. If it&rsquo;s not D&amp;D 5e, they are unlikely to have it. 
In comparison, growing up, every store stocked multiple major systems and many had a deep catalog of used titles. So, I appreciate books like this that show the sheer range of the hobby and how designers have influenced and built upon each other&rsquo;s innovation and experiments. Recommended.<\/p>\n"},{"title":"Shared Fantasy: Role-Playing Games as Social Worlds (Review)","link":"https:\/\/ztoz.blog\/posts\/shared-fantasy\/","pubDate":"Sat, 09 Sep 2023 15:59:21 -0600","guid":"https:\/\/ztoz.blog\/posts\/shared-fantasy\/","description":"<p><em>Shared Fantasy<\/em> is an ethnographic study of fantasy roleplayers in the Minnesota area from 1977 to 1979 by Gary Alan Fine. As this predates the moral panic of the early 1980s (James Egbert disappeared in 1979; the movie <em><a href=\"https:\/\/www.imdb.com\/title\/tt0084314\/\">Mazes and Monsters<\/a><\/em> was released in 1982) and the resultant explosion in popularity of the field, it also serves as a historical artifact of the hobby&rsquo;s early days. Does this book provide a better historical understanding of roleplaying games? Will reading this book make you a better player? My answers are yes and no, respectively.<\/p>\n<blockquote>\n<p>Fine GA. <em>Shared Fantasy : Role-Playing Games As Social Worlds<\/em>. Chicago: University of Chicago Press; 1983.<\/p>\n<\/blockquote>\n<p>As an ethnographic study, Fine played in a large number of roleplaying sessions across multiple game systems (Dungeons &amp; Dragons, Chivalry &amp; Sorcery, Traveller, and Empire of the Petal Throne), gamemastered a few sessions, and interviewed players and prominent game designers. He spent most of his time at the Golden Brigade clubhouse, but also played in private games and participated in two conventions. 
Within the book, he covers group dynamics, individual motivations to play, how players approach their characters, and how players and gamemasters approach their collective, shared worlds.<\/p>\n<p>Fine separates players into two categories. The first plays a character that is largely an extension of themselves and attempts to &ldquo;win&rdquo;. The second type plays a character, and the player tries to play the character accurately, even if the character choices would disagree with the player&rsquo;s choices. Fine notes that the players in both cases have no trouble distinguishing reality from the game. Historically, Fine is witnessing the growing sophistication and tailoring of content to different styles of play, as Chivalry &amp; Sorcery catered to fans who wanted a more &ldquo;accurate&rdquo; medieval simulation and Empire catered to those interested in exploring a far more alien world with non-traditional influences. However, contemporary gaming literature was documenting more nuanced categories of motivations, as the literature <a href=\"https:\/\/tvtropes.org\/pmwiki\/pmwiki.php\/Main\/PlayerArchetypes\">continues to do today.<\/a><\/p>\n<p>Based on contemporary reviews, Fine&rsquo;s accounts of the players&rsquo; militarism aroused the most interest. While he noted that <em>characters<\/em> often killed, tortured, and committed sexual violence without remorse, the <em>players<\/em> were law-abiding and sometimes pacifists in real-life. (Today, this character behavior would earn the epithet <a href=\"https:\/\/old.reddit.com\/r\/DnD\/comments\/6asjyq\/murder_hobos_what_are_they\/\">murderhobo<\/a>.) In the methodological appendix, the author notes his discomfort at their actions in the game and their crude table talk, but that he played along to avoid seeming weird. A contemporary review criticizes the author for missing an opportunity to do a psychological study of the participants (versus just interviewing them and asking why they did such-and-such). 
From a gaming perspective, it is unclear if there were any consequences for the characters&rsquo; actions to dissuade them from being sociopathic. Based on my experiences, what he describes is more players testing boundaries than a cultural practice. As evidence, when younger children are at the table, or the table has more mixed sex representation, the players self-censor to be less violent and more chaste.<\/p>\n<p>As dictated by sociological models, the author describes status structures within the group, which he bases on experience, role, and age. Players with more experience (e.g. sessions) in the game are afforded more respect, have more influence in determining the group&rsquo;s actions, and can select to play roles more to their liking. Gamemasters are afforded the most respect. I do not think this analysis was particularly useful, but it was likely necessary for publication. A contemporary reviewer simultaneously criticized the book for including some content to justify the scholarly nature, while also not including analyses per several psychological and sociological models&hellip; so it seems he can&rsquo;t win.<\/p>\n<p>This excess of methodology can be seen in the &ldquo;babysitting crisis&rdquo;. After the club is mentioned in the local newspaper, the gaming club saw parents dumping children at the club to play. Older members and the leadership saw this as an imposition &mdash; that they were being used for free babysitting. The leaders adopted a policy that they would not allow the kids to stay later than the city&rsquo;s curfew (10 pm) and, although it seems they never had to enforce the policy, they reduced the number of kids through the mere threat.<\/p>\n<p>The book analyzes this situation in terms of power and status, where the older, higher status members used their privilege to enact ageist policies. However, as any person who has tried to teach young children rules to a game has learned, maturity matters. 
Immature or bad players can easily ruin the game for the entire group. In fact, this was a recent thread in <a href=\"https:\/\/kenzerco.com\/knights-of-the-dinner-table\/\"><em>Knights of the Dinner Table<\/em><\/a> where the players were concerned about including a younger player into a tense, high-stakes scenario. The player concerns were not around abstract notions of status, but practical concerns about maturity, knowledge of the rules, and gaming late into the night. By trying to examine everything from a sociological perspective, the analysis sometimes misconstrues more practical reality.<\/p>\n<p>While the book contains many interesting anecdotes and is one of the deeper looks into the hobby at the time from the view of a semi-detached player, the gaming literature at the time was far ahead in terms of understanding players and sharing concrete advice on how to evolve and improve games.<\/p>"},{"title":"Godot4 HTML5 Export Development Server","link":"https:\/\/ztoz.blog\/posts\/godot4-html5-export\/","pubDate":"Mon, 14 Aug 2023 14:05:53 -0600","guid":"https:\/\/ztoz.blog\/posts\/godot4-html5-export\/","description":"<p>Godot 4&rsquo;s <a href=\"https:\/\/docs.godotengine.org\/en\/4.1\/tutorials\/export\/exporting_for_web.html\">HTML5 export<\/a> uses WebAssembly, WebGL, and SharedArrayBuffers. Browsers require a <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/Security\/Secure_Contexts\">secure context<\/a> for these features to be available, which requires sending certain HTTP headers when serving the game content. 
For development, there are two straight-forward ways to serve the content: Godot&rsquo;s Python http server and Miniserve.<\/p>\n<h1 id=\"godots-python-http-server\">Godot&rsquo;s Python HTTP Server<\/h1>\n<p>The Godot team has written a script, leveraging Python&rsquo;s existing http server implementation, which adds the necessary HTTP headers.<\/p>\n<p>Download <a href=\"https:\/\/raw.githubusercontent.com\/godotengine\/godot\/master\/platform\/web\/serve.py\"><code>serve.py<\/code><\/a><\/p>\n<p>Run it via:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-sh\" data-lang=\"sh\">python3 serve.py -r game_directory\n<\/code><\/pre><\/div><h1 id=\"miniserve\">Miniserve<\/h1>\n<p><a href=\"https:\/\/github.com\/svenstaro\/miniserve\">Miniserve<\/a> is a self-contained command-line tool for serving HTTP content. As a self-contained binary, it may be easier to install and manage than a full Python installation.<\/p>\n<p>You can run it via:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-sh\" data-lang=\"sh\">.\/miniserve --header Cross-Origin-Embedder-Policy:require-corp --header Cross-Origin-Opener-Policy:same-origin game_directory\n<\/code><\/pre><\/div>"},{"title":"Drawing Down the Moon (Review)","link":"https:\/\/ztoz.blog\/posts\/drawing-down-moon-review\/","pubDate":"Fri, 04 Aug 2023 15:32:53 -0600","guid":"https:\/\/ztoz.blog\/posts\/drawing-down-moon-review\/","description":"<p><em>Drawing Down the Moon: Magic in the Ancient Greco-Roman World<\/em> is an academic history of the perspective and practice of magic. Broad and detailed, Edmonds covers multiple types of magic, discussing the who, what, why, and how of each, and attempting to place the magic within the broader culture practice. 
Themes carried throughout include what counts as magic and how that magic might be embedded within the culture.<\/p>\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/drawing-down-moon-review\/drawing-down-the-moon-cover.jpg\" alt=\"Cover of Drawing Down the Moon\">\n  \n\n<p><a href=\"https:\/\/press.princeton.edu\/books\/hardcover\/9780691156934\/drawing-down-the-moon\">Publisher Website<\/a><\/p>\n<p>Since this is an unusual work for me, I should establish the context of this review. First, I have no training in the history of magic. My history of science training included material on astrology and alchemy, in terms of their influence on the development of mathematics, astronomy, and chemistry, but only at the undergraduate level. I have two interests in the book: one, a lay interest in the subject matter amid my general interest in Greco-Roman studies and two, evaluating whether the material could be useful in a roleplaying game supplement. Thus, while I am unqualified to speak to the finer academic debates within the text, I can represent the layperson.<\/p>\n<p>What counts as magic? This is a deep question, because the ancients treated some magic as &ldquo;natural&rdquo;, some was integrated into religious practices, while other aspects were considered very unnatural. Edmonds proposes the definition:<\/p>\n<blockquote>\n<p>Magic is a discourse pertaining to non-normative ritualized activity, in which the deviation from the norm is most often marked in terms of the perceived efficacy of the act, the familiarity of the performance within the cultural tradition, the ends for which the act is performed, or the social location of the performer. (pg 5)<\/p>\n<\/blockquote>\n<p>This definition is tested multiple times within the book and proves fairly robust. 
Note that this definition does not include &ldquo;going against the natural order&rdquo; since there wasn&rsquo;t a scientific viewpoint to violate.<\/p>\n<p>Edmonds begins the book with chapters on the definition of magic and the broader nature of magic in the Greco-Roman world. Afterwards, he covers curses, love charms, healing and protective spells, the relationship between magic and religion, divination, astrology, alchemy, and theurgy with a detailed chapter on each. Finishing the book is a conclusion on the definition of magic. The first two chapters are rough-going to the layperson because they concern many subtle academic debates. The heart of the book, the discussions of different types of magic, is much more engaging.<\/p>\n<p>A key part of the analysis is the attempt at obtaining an emic perspective. There are two types of perspectives: <em>etic<\/em>, which is the exterior, scholarly perspective of magic, its practices, and its role within the culture, while <em>emic<\/em> is the cultural insider&rsquo;s perspective. Separated by 2000 years, the emic perspective is lost to us, but we can try to understand aspects through literary and other historical artifacts. The author quotes many poems, letters, and legal documents to provide insights into the emic perspective.<\/p>\n<p>Although the chapters on each type of magic may quote spells and rituals, they are neither catalogs nor manuals. For instance, the chapter on astrology is insufficiently detailed for someone to draw their own horoscope, but it does cover what horoscopes, tools, and other references come to us, how astrology was likely practiced, the theories underlying astrology and some of the criticism of the time, and what may have driven people to pay for horoscopes (or outlaw them). While appropriate for a historical work, this is less useful for background material for a game.<\/p>\n<p>While I found much of the book interesting, the depth of the material often exceeded my patience. 
I do not think this is a good first book on the subject or for readers unused to graduate level texts, but I think it is a good way to launch into more detailed studies or obtain an understanding of the current state of the field.<\/p>"},{"title":"Range to Dice Notation","link":"https:\/\/ztoz.blog\/posts\/range-to-dice-notation\/","pubDate":"Thu, 06 Jul 2023 22:19:25 -0600","guid":"https:\/\/ztoz.blog\/posts\/range-to-dice-notation\/","description":"<p>Before dice notation was adopted, early roleplaying games described dice rolls using range notation. For example, <code>3-18<\/code> indicated rolling three six-sided dice or <code>3d6<\/code> in modern notation. Converting a range to dice notation requires a little thought, so I&rsquo;m going to describe a way to solve the conversion programmatically.<\/p>\n<h1 id=\"background\">Background<\/h1>\n<p>Dice notation is a succinct domain specific language to represent discrete probability distributions and is heavily used in tabletop roleplaying games. For example, <code>4d10 + 3<\/code> means rolling four ten-sided dice, summing the result, and adding three. Over time, roleplaying systems have introduced many mechanics and esoteric distributions such that multiple attempts at a universal language have been developed (e.g. <a href=\"https:\/\/wiki.roll20.net\/Dice_Reference\">roll20<\/a>, <a href=\"https:\/\/www.ianhunter.ie\/GNOLL\/\">gnoll<\/a>). Dice notation replaced an earlier system of <a href=\"http:\/\/playingattheworld.blogspot.com\/2013\/08\/the-origins-of-dice-notation.html\">specifying ranges<\/a>.<\/p>\n<p>Range specifications were compact but ambiguous and sometimes confusing. 
For example, the Holmes edition monster manual specified the damage dealt by different types of giants as:<\/p>\n<table>\n<thead>\n<tr>\n<th>Giant Type<\/th>\n<th>Damage<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Hill<\/td>\n<td>2 - 16<\/td>\n<\/tr>\n<tr>\n<td>Stone<\/td>\n<td>3 - 18<\/td>\n<\/tr>\n<tr>\n<td>Frost<\/td>\n<td>4 - 24<\/td>\n<\/tr>\n<tr>\n<td>Fire<\/td>\n<td>5 - 30<\/td>\n<\/tr>\n<tr>\n<td>Cloud<\/td>\n<td>6 - 63<\/td>\n<\/tr>\n<tr>\n<td>Storm<\/td>\n<td>7 - 42<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>When the gamemaster came to make a roll, they would need to translate these ranges into the number of dice to roll. Dividing the second number by the first yields a convenient solution for every giant but the cloud giant &ndash; 2d8, 3d6, 4d6, 5d6, and 7d6. The range 6 to 63 violates this trend as 63 isn&rsquo;t divisible by 6. The game designer probably expected the gamemaster to roll <code>3d20 + 3<\/code>, but there are many possibilities. Weatherspud found <a href=\"https:\/\/gist.github.com\/weatherspud\/f09dcafcba571ff554876a3dc3e162f0\">164 solutions<\/a> if multiple dice types are allowed. The solution I describe below finds eight under a different set of constraints: <code>1d2*57-51<\/code>, <code>57d2-51<\/code>, <code>3d2*19-51<\/code>, <code>19d2*3-51<\/code>, <code>1d4*19-13<\/code>, <code>19d4-13<\/code>, <code>1d20*3+3<\/code>, and <code>3d20+3<\/code>.<\/p>\n<h1 id=\"problem-assumptions-and-constraints\">Problem Assumptions and Constraints<\/h1>\n<p>A range is denoted by two integers, <em>a<\/em> and <em>b<\/em>, which define an inclusive range. The value <em>a<\/em> is less than or equal to <em>b<\/em>.
(I&rsquo;ll be following a convention where lower-case variables are known constants and upper-case variables are unknown.)<\/p>\n<p>We will restrict solutions to:<\/p>\n<p>$$\nNdD \\times Q + C\n$$<\/p>\n<p>where:<\/p>\n<ul>\n<li><em>N<\/em> is the number of dice to roll (a non-negative integer).<\/li>\n<li><em>D<\/em> is the number of sides of the die and is restricted to the set {2, 4, 6, 8, 10, 12, 20, 30, 100}. Each die is numbered from 1 to the number of faces, inclusive.<\/li>\n<li><em>Q<\/em> is a multiplication constant (typically equal to 1) and is restricted to positive integers.<\/li>\n<li><em>C<\/em> is an additive constant and may be any integer (including negative).<\/li>\n<\/ul>\n<p>Restricting solutions to a single type of die matches the game&rsquo;s need to move along quickly without diverting the players into a long-winded math exercise. However, some <a href=\"https:\/\/www.athenopolis.net\/2016\/12\/cloud-giant-damage.html\">studies of this problem<\/a> have not imposed this restriction and allowed multiple types of dice.<\/p>\n<p>A solution means that the dice roll&rsquo;s minimum and maximum values will equal the range&rsquo;s <code>a<\/code> and <code>b<\/code> value. A range does not need to be &ldquo;dense&rdquo;; for example, the range [1000, 4000] can be met by <code>1d4 * 1000<\/code>, even though the values rolled will never equal 1001, 2052, or 3999.<\/p>\n<p>Solutions may not be unique. For example, the range [3, 12] may be solved with either <code>3d4<\/code> or <code>1d10 + 2<\/code>. 
Depending on the context, a game designer may strongly prefer a normal-like distribution over a uniform distribution, or vice versa, which is one of the reasons range notation was replaced by dice notation.<\/p>\n<h2 id=\"minimum-and-maximum-rolls\">Minimum and Maximum Rolls<\/h2>\n<p>Given a solution of N, D, Q, and C as presented above, the minimum value that can be rolled is <code>NQ + C<\/code> and the maximum value is <code>NQD + C<\/code>.<\/p>\n<p>So,<\/p>\n<p>$$\nNQ + C = a\n$$\n$$\nNQD + C = b\n$$<\/p>\n<h1 id=\"exhaustive-search\">Exhaustive search?<\/h1>\n<p>In <a href=\"https:\/\/www.athenopolis.net\/2016\/12\/cloud-giant-damage.html\">Weather Spud&rsquo;s posting<\/a>, they treat the problem as an unbounded knapsack problem. (Their solution includes neither multipliers nor two-sided dice, but they do allow multiple dice of different types.) Can we use a simpler approach?<\/p>\n<p>The equations:<\/p>\n<p>$$\nNQ + C = a\n$$\n$$\nNQD + C = b\n$$<\/p>\n<p>have four unknown integer values and two known integer values. If we introduce a new variable, \\(X = NQ\\), and let \\(y = b - a\\), then we can solve the simultaneous equations:<\/p>\n<p>$$\na = X + C\n$$\n$$\nb = XD + C\n$$\n$$\ny = b - a\n$$\n$$\ny = XD + C - X - C\n$$\n$$\ny = XD - X\n$$\n$$\ny = X(D - 1)\n$$<\/p>\n<p>which leaves a single equation with two unknowns, X and D, of which D is restricted to nine values.
Iterating through all possible values that let X be an integer, we can then enumerate all possible N and Q values for X where N and Q are divisors of X.<\/p>\n<p>The relevant Python code:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\">y <span style=\"color:#f92672\">=<\/span> b <span style=\"color:#f92672\">-<\/span> a\npotential_nqs <span style=\"color:#f92672\">=<\/span> [y <span style=\"color:#f92672\">\/\/<\/span> (side <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#66d9ef\">for<\/span> side <span style=\"color:#f92672\">in<\/span> SIDES <span style=\"color:#66d9ef\">if<\/span> y <span style=\"color:#f92672\">%<\/span> (side <span style=\"color:#f92672\">-<\/span> <span style=\"color:#ae81ff\">1<\/span>) <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>]\n<span style=\"color:#66d9ef\">for<\/span> nq <span style=\"color:#f92672\">in<\/span> potential_nqs:\n    c <span style=\"color:#f92672\">=<\/span> a <span style=\"color:#f92672\">-<\/span> nq\n    d <span style=\"color:#f92672\">=<\/span> (b <span style=\"color:#f92672\">-<\/span> c) <span style=\"color:#f92672\">\/\/<\/span> nq\n\n    <span style=\"color:#66d9ef\">for<\/span> n, q <span style=\"color:#f92672\">in<\/span> _divisor_pairs(nq):\n        <span style=\"color:#66d9ef\">if<\/span> n <span style=\"color:#f92672\">!=<\/span> q:\n            <span style=\"color:#75715e\"># yield both n, q and q, n pairs since the solutions are distinct<\/span>\n            <span style=\"color:#66d9ef\">yield<\/span> Solution(n, d, q, c)\n            <span style=\"color:#66d9ef\">yield<\/span> Solution(q, d, n, c)\n        <span style=\"color:#66d9ef\">else<\/span>:\n            <span style=\"color:#66d9ef\">yield<\/span> Solution(n, d, q, c)\n<\/code><\/pre><\/div><div 
class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">_divisor_pairs<\/span>(d: int) <span style=\"color:#f92672\">-&gt;<\/span> typing<span style=\"color:#f92672\">.<\/span>Iterable[typing<span style=\"color:#f92672\">.<\/span>Tuple[int, int]]:\n    <span style=\"color:#e6db74\">&#34;&#34;&#34;\n<\/span><span style=\"color:#e6db74\">    Yield all (a, b) pairs such that a, b are positive integers, a &lt;= b, and ab = d.\n<\/span><span style=\"color:#e6db74\">\n<\/span><span style=\"color:#e6db74\">    &gt;&gt;&gt; list(_divisor_pairs(1))\n<\/span><span style=\"color:#e6db74\">    [(1, 1)]\n<\/span><span style=\"color:#e6db74\">    &gt;&gt;&gt; list(_divisor_pairs(2))\n<\/span><span style=\"color:#e6db74\">    [(1, 2)]\n<\/span><span style=\"color:#e6db74\">    &gt;&gt;&gt; list(_divisor_pairs(4))\n<\/span><span style=\"color:#e6db74\">    [(1, 4), (2, 2)]\n<\/span><span style=\"color:#e6db74\">    &gt;&gt;&gt; list(_divisor_pairs(12))\n<\/span><span style=\"color:#e6db74\">    [(1, 12), (2, 6), (3, 4)]\n<\/span><span style=\"color:#e6db74\">\n<\/span><span style=\"color:#e6db74\">    :param d: a positive integer\n<\/span><span style=\"color:#e6db74\">    :return: an iterator of (a, b) tuples\n<\/span><span style=\"color:#e6db74\">    &#34;&#34;&#34;<\/span>\n    <span style=\"color:#66d9ef\">assert<\/span> isinstance(d, int), <span style=\"color:#e6db74\">&#39;d must be an integer&#39;<\/span>\n    <span style=\"color:#66d9ef\">assert<\/span> d <span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>, <span style=\"color:#e6db74\">&#39;d must be positive&#39;<\/span>\n\n    <span style=\"color:#66d9ef\">for<\/span> a <span style=\"color:#f92672\">in<\/span> range(<span style=\"color:#ae81ff\">1<\/span>, math<span 
style=\"color:#f92672\">.<\/span>floor(math<span style=\"color:#f92672\">.<\/span>sqrt(d)) <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>):\n        <span style=\"color:#66d9ef\">if<\/span> d <span style=\"color:#f92672\">%<\/span> a <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">0<\/span>:\n            <span style=\"color:#66d9ef\">yield<\/span> a, d <span style=\"color:#f92672\">\/\/<\/span> a\n<\/code><\/pre><\/div><blockquote>\n<p><strong>On Python Iterators<\/strong><\/p>\n<p>Iterators (continuation style) are great because they allow the consumer to pull the amount of data they need. This <em>might<\/em> reduce memory consumption versus building up a data structure of all solutions, but continuations are not free so, as usual, your use case matters.<\/p>\n<\/blockquote>\n<p>Given the range (3, 18), the algorithm finds eight solutions:<\/p>\n<pre tabindex=\"0\"><code>1d2*15-12, 15d2-12, 3d2*5-12, 5d2*3-12, 1d4*5-2, 5d4-2, 1d6*3, 3d6\n<\/code><\/pre><p>Of these, 3d6 is the &ldquo;preferred&rdquo; solution. Two-sided dice appear rarely in Dungeons &amp; Dragons and multipliers are similarly rare for small numbers, so I&rsquo;d expect players to only use 3d6 or 5d4-2. (Comparing these distributions on <a href=\"https:\/\/anydice.com\/program\/3065c\">anydice.com<\/a>, we see a player might slightly prefer rolling <code>5d4-2<\/code> as the results are more likely to be in the safe mid area.)<\/p>\n<p>Algorithmically, the run-time is dominated by the <code>_divisor_pairs<\/code> computation, which requires on the order of the square root of y modulus checks. Modulus is one of the more expensive operators, but for realistic ranges in roleplaying games, this algorithm is quite fast.
Python&rsquo;s timeit <code>timeit.timeit('list(range2dice.solve(1000, 4000))', setup='import range2dice')<\/code> reported a little more than 17 seconds to execute one million times on my HP zBook G9.<\/p>\n<p>The full code is available at <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/range2dice\/-\/blob\/master\/range2dice\/__init__.py\">my gitlab repository<\/a>.<\/p>\n"},{"title":"Black on Red: My 44 Years Inside the Soviet Union (Review)","link":"https:\/\/ztoz.blog\/posts\/black-on-red\/","pubDate":"Fri, 23 Jun 2023 16:09:19 -0600","guid":"https:\/\/ztoz.blog\/posts\/black-on-red\/","description":"\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/black-on-red\/Black-on-Red-page-103-1024x809.jpeg\" alt=\"Robinson after being elected to the Moscow Soviet, book page 103\">\n  \n  <figcaption>Robinson after being elected to the Moscow Soviet, book page 103<\/figcaption>\n<\/figure>\n\n\n<p>Robert Robinson was a black machinist who accepted an offer from the Soviets to leave his job at the Ford plant to come to Russia to help train their machinists in the 1930s. Accepting the deal, he pursued his engineering career and invented many productivity improving tools, acquired a mechanical engineering degree, survived the Purges and World War II, and after the war, spent decades trying to leave. He eventually escaped in 1974 via Uganda, spent some time as a professor, and ultimately regained citizenship in the United States.<\/p>\n<blockquote>\n<p>Robert Robinson and Jonathan Slevin. 1988. <em>Black on Red : My 44 Years Inside the Soviet Union : An Autobiography<\/em>. Washington D.C: Acropolis Books.<\/p>\n<\/blockquote>\n<p>As an autobiography, this book details Robinson&rsquo;s experience with Soviet racism towards him as a black foreigner (even after he gained Soviet citizenship), the surveillance and restrictions on his freedom, as well as the successes and disappointments within his engineering career. 
Most of the book focuses on his experiences during the 30s and 40s which is when he witnessed the most radical changes in Soviet society. There is a poignant chapter on the Space Race and the Soviet worker&rsquo;s collapse in morale as the United States pulled ahead. Of the three themes, most reviewers focus on the racism (since his status as a black engineer in Moscow was unique) and his interactions in Soviet Russia (which were far less unique as surveillance and suspicion was pervasive), so I choose to focus on his engineering experience.<\/p>\n<p>Unlike many people who traveled to the Soviet Union in the early 1930s, Robinson was not a fellow traveler, but rather a man who wanted to pursue his chosen profession &mdash; engineering. He stayed highly apolitical, partly following his own stated nature and also to avoid helping or angering the authorities. In this, he spent the majority of his life in a ball bearings plant in Moscow, side-by-side with factory workers.<\/p>\n<p>While he was born in Jamaica, Robinson grew up in Cuba. He demonstrated an early talent with machines and became a trained tool maker. Looking for better opportunities, he came to Detroit. Although his resume opened many doors, they were quickly closed once the hiring staff saw him. The trick was you needed an in first, so he took a job sweeping the floor at the Ford plant (a position open to blacks) and then took the internal technical training course. After passing that course (which was redundant given his prior training), he was promoted to a tool making job, although he had to fight sabotage from other workers displeased with his presence. He was the single black among 700 toolmakers.<\/p>\n<p>In April 1930, a Soviet delegation came to the plant to recruit workers for a one year contract. The terms were lucrative (about twice the pay, free housing, free transport to and back, and other amenities) and offered other opportunities, such as tuition for a  university education. 
Robinson had spent three years at Ford and these terms were a promotion &ndash; he would be training others in his craft, he had a path towards an engineering degree, and the Soviets promised a lack of Jim Crow. These terms were far better than he could arrange in the United States and he agreed to go.<\/p>\n<p>His initial experiences were positive as he was recognized for his work and found promotions. In contrast to the Americans that traveled to Russia with him, he found the Russians welcoming and friendly, although very poor. He advised other blacks curious about following him that the Russian standard of living was a step below what they were used to, lacking amenities such as reliable hot and cold running water. He renewed his contract many times and returned to Russia after a trip back to the United States in 1933.<\/p>\n<p>Although his intention was to return to the United States within a few years, he found this path blocked. He was being used as a tool of propaganda by the Soviets, and eventually the United States ordered him back home. (The U.S. National Museum of American Diplomacy has documented their <a href=\"https:\/\/diplomacy.state.gov\/stories\/robert-robinson-and-bill-davis\/\">version of the events<\/a> along with additional internal documents.) With the Depression continuing, and his role with the Soviets leading him to be blacklisted in Detroit, he felt there were no employment possibilities awaiting him back in America. He appealed to the Soviet government and, after being promised that he could later drop his Soviet citizenship and become American again, became a Soviet citizen. After this, he was trapped.<\/p>\n<p>Throughout his life, Robinson found solace in building, designing, and engineering. His machine shop was a place where he could lose himself within his work and he recounts several inventions that improved productivity at the plant.
He continued to work after retirement in Russia, even with a secured pension.<\/p>\n<p>One of the productivity improvements he describes involved indicators. The plant owned a small number of mechanical indicators, which were used to ensure the ball bearings were within tolerances. Given the small number of indicators, workers had to queue and wait for them, drastically reducing their output. The indicators (purchased many years ago) were also in Imperial units, but since the specifications were in metric, the measurements had to be translated.<\/p>\n<p>Robinson designed a new indicator that he thought the plant could build. The senior management disagreed, saying the tolerances in his design were beyond their ability. As he was often obliged to do, Robinson worked on the indicators outside of his scheduled hours and was able to manufacture fourteen specimens. He presented them to management and, after they were validated, the new indicators were adopted by the workers and productivity improved.<\/p>\n<p>Financially, for Robinson&rsquo;s initiative and additional work (which he estimates at 105 working days), he was awarded a twelve-day vacation pass and 400 rubles. (Robinson&rsquo;s account of the resulting vacation is humorous in how regimented and inadequate Soviet rest homes were.) He equates the monetary value to $156 in 1945 dollars. For a median American mechanical engineer, this was <a href=\"https:\/\/fraser.stlouisfed.org\/title\/employment-outlook-engineers-employment-trends-outlook-earnings-occupational-mobility-4384?start_page=56\">less than two weeks of pay<\/a>.<\/p>\n<p>Unfortunately, the indicators were not to last. After four months, all of the devices were collected and shipped off to be recalibrated. They were dismantled and doused in gasoline, so reassembling them would have required substantial work, except that they were then lost.
The devices likely made their way to the Kalibr Factory, which started manufacturing similar devices soon after.<\/p>\n<p>I found his life story to often follow the trajectory of <em>The Ghost of the Executed Engineer<\/em>. <em>Ghost<\/em> is a historical work that attempts to answer the question &ldquo;Given how many engineers and scientists held power in the USSR, why did they trail the West?&rdquo; In the early days of the USSR, government policy encouraged Western experts to come in and train workers on Western approaches. The government also invested in Western equipment (Robinson notes that his factory was still using the equipment from the 30s when he left in the 1970s). Large engineering efforts used humane approaches and volunteers.<\/p>\n<p>However, coincident with the purges, the Soviets adopted a more nationalist approach, believing their techniques and accomplishments superior to the West&rsquo;s. Foreigners were no longer welcome but suspect; Western training was a mark of corruption. Projects were given less funding and turned to more manual processes and away from machines (e.g. using criminal labor to dig trenches rather than excavators). Economically, the piece-work system (workers were paid by output rather than by salary) discouraged productivity and quality investments. The purges virtually eliminated the older intelligentsia, destroying vast amounts of experience and talent.<\/p>\n<p>Fighting these incentives, Robinson&rsquo;s ambition pushed him to obtain a degree in mechanical engineering, an effort that coincided with World War II. To take classes, he traveled six miles across Moscow to the Moscow Evening Institute while the city was being bombed. Moscow had a curfew at 2100, but since his classes ended at 2145, he had to walk the way back.
He recounts that he was arrested 14 times for breaking curfew.<\/p>\n<p>As this was a Soviet engineering degree, it only included two &ldquo;liberal education&rdquo; classes: a history of the Communist party and a course on political economy. Robinson laments that many of his fellow students were narrow-minded and lacked curiosity. As <em>Ghost<\/em> describes, this was the desired goal per Soviet policy, even though it crippled the typical engineer&rsquo;s ability to be flexible and innovative.<\/p>\n<p>The degree program was also used as a cudgel for industrial policy. A month after being told he had successfully graduated, Robinson went to pick up his diploma. (Soviet diplomas were thick packets of material that described your entire period in college; these could be demanded at will by employers.) Robinson was told that he could only be given his diploma if he accepted a new job with a lesser title than his current one; this new job paid 800 rubles (two-thirds his current pay) and required living in an apartment with three other men (he had a private apartment). He refused, and push-back from his plant director eventually allowed him to pick up his diploma. Many of his fellow students were not so lucky; those who refused the job offers became blacklisted from engineering jobs.<\/p>\n<p>The book is riveting and is a useful companion to other works on Soviet history, engineering policy, and the experience of the factory working class.
436 pages.<\/p>\n<h1 id=\"where-to-read-it\">Where to read it<\/h1>\n<p><em>Black on Red<\/em> is out-of-print but can be found in used book stores, <a href=\"https:\/\/www.worldcat.org\/title\/16900995\">many libraries<\/a>, and digitally via the <a href=\"https:\/\/openlibrary.org\/works\/OL4977726W\/Black_on_Red\">Open Library<\/a>.<\/p>\n"},{"title":"Availability, Fire Safety, and the IBM 704","link":"https:\/\/ztoz.blog\/posts\/availability-704\/","pubDate":"Thu, 06 Apr 2023 11:15:09 -0600","guid":"https:\/\/ztoz.blog\/posts\/availability-704\/","description":"<p>The SHARE organization was organized in 1955 to share operational knowledge and computer programs for operators of the IBM 704. &ldquo;Operational knowledge&rdquo; included monthly availability reports and, perhaps unexpectedly, safety advice.<\/p>\n<p>Each site reported its own availability via a common form. Each installation was given a two-letter identifier (which, as SHARE grew to more installations, became a branding problem for some sites). The NS site was a government-run facility. This is a typical report from them:<\/p>\n\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/availability-704\/ns-1956-02thru06.jpg\" alt=\"NS Availability 1956 February through June\">\n  \n\n<p>The NS site reported some of the highest availability scores; a 93.6% score was very good, comparatively, particularly since it was over multiple shifts. Compute time covered the actual productive use of the computer, while checkout represented time spent in development and testing activities.<\/p>\n<p>The RAND Corporation sold some of the computer time to other customers. This report from July shows only a small fraction of time sold to other customers (&ldquo;Good Time&rdquo;), but other months showed higher percentages. RAND seemed to struggle with availability as they were usually below 80%. 
Based on my review, the various sites seemed to average 80%.<\/p>\n\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/availability-704\/rand-1956-07.jpg\" alt=\"RAND Availability 1956-07\">\n  \n\n<p>The most interesting report was this one from RAND in May of 1956:<\/p>\n\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/availability-704\/rand-1956-05-fire.jpg\" alt=\"RAND Availability 1956-05\">\n  \n\n<p>The footnote &ldquo;Main frame fire&rdquo; was elaborated in a letter from Irwin Greenwald that also provides some useful fire tips:<\/p>\n\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/availability-704\/rand-1956-fire-pg1.jpg\" alt=\"RAND Fire Suggestions\">\n  \n\n\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/availability-704\/rand-1956-fire-pg2.jpg\" alt=\"RAND Fire Suggestions\">\n  \n\n"},{"title":"Surviving the Blizzard: An Application of Markov Chains ","link":"https:\/\/ztoz.blog\/posts\/surviving-blizzard\/","pubDate":"Mon, 02 Jan 2023 17:16:12 -0700","guid":"https:\/\/ztoz.blog\/posts\/surviving-blizzard\/","description":"<p>In the <em>Call of Cthulhu<\/em> <a href=\"https:\/\/www.drivethrurpg.com\/product\/1677\/Blood-Brothers-II\">adventure<\/a> &ldquo;Chateau of Blood&rdquo;, the characters face a choice between spending the day trapped in an ominous chateau where they will likely be attacked by monsters and attempting to navigate a treacherous mountain trail as a ferocious blizzard blows. Since the adventure is inspired by <a href=\"https:\/\/www.hammerfilms.com\/\">Hammer films<\/a>, the characters are encouraged to stay inside the chateau, find clues, and face the monsters rather than go out into almost certain doom. However, as this is a Hammer film, few are expected to survive the supernatural horrors. What <em>are<\/em> the characters&rsquo; chances in the storm?
We&rsquo;ll use Markov chains and a short Rust program to work out the odds.<\/p>\n<h1 id=\"formulating-the-approach\">Formulating the Approach<\/h1>\n<p>If the characters decide to leave the (relative) safety of the chateau and attempt to travel to Karlensburg (the closest town) while the blizzard rages, the adventure describes the mechanics thusly:<\/p>\n<blockquote>\n<p>In these conditions, Karlensburg is a ten-hour walk and a seven-hour horseback ride. Travel by coach is impossible. Roll D100 each hour. Initial CON x5 roll or lose 1 hit point the first hour; in the second hour, the lose for a missed CON roll is 2 hit points, in the third hour 3 hit points, and so on. Further, each time a CON roll is failed, thereafter lower the multiplier by one: thus one failed CON roll lowers the multiplier to x4. [1]<\/p>\n<\/blockquote>\n<p>At this point in the adventure, characters may have suffered damage. They may or may not have access to horses, depending on earlier events.<\/p>\n<p>What are the key aspects of this mechanic?<\/p>\n<ol>\n<li>There are four numbers that matter: the character&rsquo;s CON score (which is constant), the number of hours spent traveling, the number of failed rolls, and the amount of damage accumulated.<\/li>\n<li>Trips conclude when the number of hours match the mode of travel. A successful trip is one where damage accumulated is less than the character&rsquo;s hit points.<\/li>\n<li>Each roll of the dice equals an hour of travel.<\/li>\n<li>The die roll is the only mechanism that leads to a change in state.<\/li>\n<\/ol>\n<p>This is a <em>stochastic process<\/em>; a random variable (the die roll) solely maps the event (the success or failure determination) to different numbers (the state) at different times. Since the next state can be computed given only the current state, this process is a <em>first-order Markov process<\/em>. 
Furthermore, since time moves discretely and state is discrete, this is a <em>Discrete-time Markov Chain<\/em> [2].<\/p>\n<p>This is useful because various facts about discrete-time Markov chains can be computed efficiently, including the probability of various states.<\/p>\n<p>We can express the Markov property as:<\/p>\n<p>$$\nP[X_k=j | X_{k-1} = i] = p_{ij}\n$$<\/p>\n<p>That is, the probability of state <em>X<\/em> at time <em>k<\/em>, given the state at time <em>k-1<\/em>, is equal to the probability stored in the matrix <em>p<\/em> at index i, j.<\/p>\n<p>The matrix <em>p<\/em> must satisfy the conditions [3]:<\/p>\n<ol>\n<li>\\(0 \\le p_{ij} \\le 1\\)<\/li>\n<li>\\(\\underset{j}{\\sum} p_{ij} = 1, i=1, 2, \\mathellipsis, n,\\) which follows from the fact that the states are mutually exclusive and collectively exhaustive<\/li>\n<\/ol>\n<p>If we want to know the probability of being in a state <em>m<\/em> after <em>n<\/em> transitions given an initial state <em>I<\/em> (a column vector), we can compute it using <em>p<\/em> with:<\/p>\n<p>$$\nm = p^n * I\n$$<\/p>\n<p>where <em>m<\/em> will be a column vector containing the probability of each state.<\/p>\n<h1 id=\"implementing-a-solution\">Implementing a Solution<\/h1>\n<p>Since we have decided to use Markov chains, we need a matrix implementation which supports products. This is not a hard reach. Furthermore, because the transition matrix will only contain a maximum of \\(2^{hours}\\) entries, but be of size \\(2^{hours} \\times 2^{hours}\\), we would prefer using sparse matrices to conserve memory. 
After first considering running the calculation within the browser via WASM, I decided that the &ldquo;universe of solutions&rdquo; was small enough that I should instead just write a file with all solutions.<\/p>\n<p>A 6x6 subset of the transition matrix <em>p<\/em>:<\/p>\n<p>$$\n\\begin{pmatrix}\n0\t&amp;\t0\t&amp;\t0\t&amp; \t0 \t&amp;\t0\t&amp;\t0 \t\\\\\n0.6\t&amp;\t0\t&amp;\t0\t&amp;\t0\t&amp;\t0\t&amp;\t0\t\\\\\n0.4 &amp;\t0\t&amp; \t0\t&amp;\t0\t&amp;\t0\t&amp;\t0\t\\\\\n0\t&amp;\t0.6\t&amp;\t0\t&amp;\t0\t&amp;\t0\t&amp;\t0\t\\\\\n0\t&amp;\t0\t&amp;\t0.48\t&amp;\t0\t&amp;\t0\t&amp;\t0\t\\\\\n0\t&amp;\t0.4\t&amp;\t0\t&amp;\t0\t&amp;\t0\t&amp;\t0\n\\end{pmatrix}\n$$<\/p>\n<p>I chose Rust for &ldquo;reasons&rdquo; and, browsing lib.rs, picked <a href=\"https:\/\/docs.rs\/sprs\/latest\/sprs\/#\">sprs<\/a>, a &ldquo;sparse linear algebra library&rdquo;. Alternatively, I could have used Mathematica (which I did use for validation and visualization), Python and <a href=\"https:\/\/docs.scipy.org\/doc\/scipy\/reference\/sparse.html\">scipy<\/a>, Scala, or JavaScript and <a href=\"https:\/\/mathjs.org\/docs\/datatypes\/matrices.html\">math.js<\/a>. If I were feeling very industrious, I could have used either of the sparse libraries that come with Netlib <a href=\"https:\/\/netlib.org\/sparse\/index.html\">sparse<\/a>\/<a href=\"https:\/\/netlib.org\/sparse-blas\/index.html\">sparse-blas<\/a>.
The requirements for this implementation are very trivial.<\/p>\n<p>The <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/karlensburg\/-\/blob\/main\/src\/main.rs#L168\"><code>main<\/code><\/a> function loops through all possible CON, hit point, and travel mode possibilities, solving each and outputting the results to stdout formatted as a CSV file:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">main<\/span>() {\n    println!(<span style=\"color:#e6db74\">&#34;hours,CON,hp,alive,dead,alive + dead&#34;<\/span>);\n    <span style=\"color:#66d9ef\">for<\/span> con <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#ae81ff\">3<\/span><span style=\"color:#f92672\">..=<\/span><span style=\"color:#ae81ff\">18<\/span> {\n        <span style=\"color:#66d9ef\">for<\/span> hp <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#ae81ff\">1<\/span><span style=\"color:#f92672\">..=<\/span><span style=\"color:#ae81ff\">20<\/span> {\n            <span style=\"color:#66d9ef\">for<\/span> hours <span style=\"color:#66d9ef\">in<\/span> vec![<span style=\"color:#ae81ff\">7<\/span>, <span style=\"color:#ae81ff\">10<\/span>] {\n                <span style=\"color:#66d9ef\">let<\/span> character <span style=\"color:#f92672\">=<\/span> Character { con, hp };\n                <span style=\"color:#66d9ef\">let<\/span> (states, mat) <span style=\"color:#f92672\">=<\/span> build_transition_matrix(<span style=\"color:#f92672\">&amp;<\/span>character, hours);\n                <span style=\"color:#66d9ef\">let<\/span> (alive, dead) <span style=\"color:#f92672\">=<\/span> survival(hours, character, states, mat);\n                println!(<span style=\"color:#e6db74\">&#34;{},{},{},{},{},{}&#34;<\/span>, hours, con, hp, alive, dead, alive <span 
style=\"color:#f92672\">+<\/span> dead);\n            }\n        }\n    }\n}\n<\/code><\/pre><\/div><p>When we build the transition matrix, we need to decide how we are going to map an index within the matrix to a state. Within the program, <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/karlensburg\/-\/blob\/main\/src\/main.rs#L7\">TravelState<\/a> is the &ldquo;friendly&rdquo; way to track a character&rsquo;s progress. I originally planned to map the states using a bitmap approach &mdash; use 4 bits for hours, 3 bits for fails, and 5 bits for damage, all packed within an unsigned 16-bit integer. Although the representation can be more clever and save a few bits, we still ended up with unrepresentable states. The transition matrix must have at least one value in every column (each column must sum to 1), so we either need to manually insert dummy values into non-representable states or map indices in a different way.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#75715e\">#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]<\/span>\n<span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">TravelState<\/span> {\n    <span style=\"color:#e6db74\">\/\/\/ hours spent within the blizzard [0, 7 or 10]\n<\/span><span style=\"color:#e6db74\"><\/span>    hours: <span style=\"color:#66d9ef\">u8<\/span>,\n    <span style=\"color:#e6db74\">\/\/\/ total number of times PC has failed the CON check [0, 5]\n<\/span><span style=\"color:#e6db74\"><\/span>    fails: <span style=\"color:#66d9ef\">u8<\/span>,\n    <span style=\"color:#e6db74\">\/\/\/ accumulated hit point damage from the blizzard [0, 18] or hit points of character\n<\/span><span style=\"color:#e6db74\"><\/span>    damage: <span style=\"color:#66d9ef\">u8<\/span>,\n}\n\n<span style=\"color:#66d9ef\">impl<\/span> TravelState {\n    <span 
style=\"color:#e6db74\">\/\/\/ Initial state for characters\n<\/span><span style=\"color:#e6db74\"><\/span>    <span style=\"color:#66d9ef\">const<\/span> INIT: <span style=\"color:#a6e22e\">TravelState<\/span> <span style=\"color:#f92672\">=<\/span> TravelState {\n        hours: <span style=\"color:#ae81ff\">0<\/span>,\n        fails: <span style=\"color:#ae81ff\">0<\/span>,\n        damage: <span style=\"color:#ae81ff\">0<\/span>,\n    };\n}\n<\/code><\/pre><\/div><p>I decided to export a <code>states<\/code> vector where the index in the vector matched the transition matrix index. In this way, the matrix could have as small a dimension as possible. Thus, <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/karlensburg\/-\/blob\/main\/src\/main.rs#L74\"><code>build_transition_matrix<\/code><\/a> returns both a transition matrix and a states vector for interpreting the states.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#e6db74\">\/\/\/ Return a vector of travel states and a Markov transition matrix, representing a character\n<\/span><span style=\"color:#e6db74\">\/\/\/ travelling through the blizzard to Karlensburg. The sparse matrix values are transition\n<\/span><span style=\"color:#e6db74\">\/\/\/ probabilities, represented as float in the range 0 to 1.0 (although zeroes are not stored\n<\/span><span style=\"color:#e6db74\">\/\/\/ explicitly in the matrix). 
The indices of the travel states correspond to the row and column\n<\/span><span style=\"color:#e6db74\">\/\/\/ indices of the transition matrix.\n<\/span><span style=\"color:#e6db74\">\/\/\/\n<\/span><span style=\"color:#e6db74\">\/\/\/ Every state leads to two additional states until the character has finished the journey, after\n<\/span><span style=\"color:#e6db74\">\/\/\/ either seven or ten transitions, in which case the states transition to themselves. For a seven\n<\/span><span style=\"color:#e6db74\">\/\/\/ hour journey, there will be strictly less than 1 + 2 + 4 + ... + 128 = 255 states. For ten\n<\/span><span style=\"color:#e6db74\">\/\/\/ hours, 1023.\n<\/span><span style=\"color:#e6db74\"><\/span><span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">build_transition_matrix<\/span>(\n    character: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">Character<\/span>,\n    hours: <span style=\"color:#66d9ef\">u8<\/span>,\n) -&gt; (Vec<span style=\"color:#f92672\">&lt;<\/span>TravelState<span style=\"color:#f92672\">&gt;<\/span>, sprs::CsMatI<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">f32<\/span>, <span style=\"color:#66d9ef\">usize<\/span><span style=\"color:#f92672\">&gt;<\/span>) {\n    <span style=\"color:#75715e\">\/\/ Since multiple transitions may lead to the same state, prev_states\/next_states is a set\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#75715e\">\/\/ rather than a list so we avoid duplicates.\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> prev_states: <span style=\"color:#a6e22e\">BTreeSet<\/span><span style=\"color:#f92672\">&lt;<\/span>TravelState<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> BTreeSet::new();\n    prev_states.insert(TravelState::INIT);\n    <span style=\"color:#66d9ef\">let<\/span> <span 
style=\"color:#66d9ef\">mut<\/span> next_states: <span style=\"color:#a6e22e\">BTreeSet<\/span><span style=\"color:#f92672\">&lt;<\/span>TravelState<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> BTreeSet::new();\n\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> vertices: Vec<span style=\"color:#f92672\">&lt;<\/span>TravelState<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> Vec::new();\n    vertices.push(TravelState::INIT);\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> edges: Vec<span style=\"color:#f92672\">&lt;<\/span>(TravelState, TravelState)<span style=\"color:#f92672\">&gt;<\/span> <span style=\"color:#f92672\">=<\/span> Vec::new();\n\n    <span style=\"color:#66d9ef\">for<\/span> hour <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#ae81ff\">1<\/span><span style=\"color:#f92672\">..=<\/span>hours {\n        <span style=\"color:#66d9ef\">for<\/span> prev_state <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#f92672\">&amp;<\/span>prev_states {\n            <span style=\"color:#66d9ef\">let<\/span> success <span style=\"color:#f92672\">=<\/span> TravelState {\n                hours: <span style=\"color:#a6e22e\">hour<\/span>,\n                fails: <span style=\"color:#a6e22e\">prev_state<\/span>.fails,\n                damage: <span style=\"color:#a6e22e\">prev_state<\/span>.damage,\n            };\n            <span style=\"color:#66d9ef\">let<\/span> fail <span style=\"color:#f92672\">=<\/span> TravelState {\n                hours: <span style=\"color:#a6e22e\">hour<\/span>,\n                fails: <span style=\"color:#a6e22e\">prev_state<\/span>.fails <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>,\n                damage: <span style=\"color:#a6e22e\">prev_state<\/span>.damage <span style=\"color:#f92672\">+<\/span> hour,\n            };\n\n     
       vertices.push(success);\n            edges.push((<span style=\"color:#f92672\">*<\/span>prev_state, success));\n            next_states.insert(success);\n\n            vertices.push(fail);\n            edges.push((<span style=\"color:#f92672\">*<\/span>prev_state, fail));\n            next_states.insert(fail);\n        }\n\n        mem::swap(<span style=\"color:#f92672\">&amp;<\/span><span style=\"color:#66d9ef\">mut<\/span> prev_states, <span style=\"color:#f92672\">&amp;<\/span><span style=\"color:#66d9ef\">mut<\/span> next_states);\n        next_states.clear();\n    }\n    <span style=\"color:#75715e\">\/\/ for the final hour, the states should transition to themselves\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">for<\/span> prev_state <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#f92672\">&amp;<\/span>prev_states {\n        edges.push((<span style=\"color:#f92672\">*<\/span>prev_state, <span style=\"color:#f92672\">*<\/span>prev_state));\n    }\n    prev_states.clear();\n\n    vertices.sort();\n    vertices.dedup();\n\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> mat <span style=\"color:#f92672\">=<\/span> sprs::TriMatBase::with_capacity((vertices.len(), vertices.len()), edges.len());\n    <span style=\"color:#66d9ef\">for<\/span> (src, dst) <span style=\"color:#66d9ef\">in<\/span> edges {\n        mat.add_triplet(\n            vertices.binary_search(<span style=\"color:#f92672\">&amp;<\/span>dst).unwrap(),\n            vertices.binary_search(<span style=\"color:#f92672\">&amp;<\/span>src).unwrap(),\n            character.prob(src, dst),\n        );\n    }\n\n    (vertices, mat.to_csr())\n}\n<\/code><\/pre><\/div><p>Using a breadth-first approach, I generate states (vertices within the graph) for each hour traveled. I also generate the edges, although the edges aren&rsquo;t labeled with probabilities yet. 
Terminal states in the transition matrix must be represented as transitioning to themselves, so I have to include some special handling of that case. Finally, we build the sparse matrix by looping through the edge list and calculating the probability of traversal. The probabilities are stored as <code>f32<\/code> or 32-bit floats. Given the small range, I thought of using <code>f8<\/code> or one of the reduced bit formats, but since we are using sparse matrices and we do not need to worry about running in memory constrained browsers, it would be a trivial optimization.<\/p>\n<p>The actual computation of the probability is handled within <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/karlensburg\/-\/blob\/main\/src\/main.rs#L137\"><code>survival<\/code><\/a>. The initial state vector has a 1 in the first index (corresponding to the initial state being stored first in the vector), representing that, absent a transition, there is a 100% chance of being in the initial state. Then, we perform the multiplications and sum the results based on whether a particular travel state would indicate that the character survived or not.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#e6db74\">\/\/\/ Compute the survival probability (probability of surviving alive, probability of dying in the\n<\/span><span style=\"color:#e6db74\">\/\/\/ blizzard) for a character travelling the specified number of hours. 
The `states` and `transition`\n<\/span><span style=\"color:#e6db74\">\/\/\/ arguments should be the output of the `build_transition_matrix` function.\n<\/span><span style=\"color:#e6db74\"><\/span><span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">survival<\/span>(\n    hours: <span style=\"color:#66d9ef\">u8<\/span>,\n    character: <span style=\"color:#a6e22e\">Character<\/span>,\n    states: Vec<span style=\"color:#f92672\">&lt;<\/span>TravelState<span style=\"color:#f92672\">&gt;<\/span>,\n    transition: <span style=\"color:#a6e22e\">sprs<\/span>::CsMatI<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">f32<\/span>, <span style=\"color:#66d9ef\">usize<\/span><span style=\"color:#f92672\">&gt;<\/span>,\n) -&gt; (<span style=\"color:#66d9ef\">f32<\/span>, <span style=\"color:#66d9ef\">f32<\/span>) {\n    <span style=\"color:#66d9ef\">let<\/span> initial <span style=\"color:#f92672\">=<\/span> sprs::CsVec::new(states.len(), vec![<span style=\"color:#ae81ff\">1<\/span>], vec![<span style=\"color:#ae81ff\">1.0<\/span><span style=\"color:#66d9ef\">f32<\/span>]);\n\n    <span style=\"color:#75715e\">\/\/ Within a Markov chain, the probability after n transitions of a transition matrix A\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#75715e\">\/\/ is equal to A**n * x, where x is in the initial state column vector.\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> m <span style=\"color:#f92672\">=<\/span> transition;\n    <span style=\"color:#66d9ef\">for<\/span> _ <span style=\"color:#66d9ef\">in<\/span> <span style=\"color:#ae81ff\">0<\/span><span style=\"color:#f92672\">..<\/span>hours {\n        m <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>m <span style=\"color:#f92672\">*<\/span> <span style=\"color:#f92672\">&amp;<\/span>m;\n    }\n    m <span 
style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">&amp;<\/span>m <span style=\"color:#f92672\">*<\/span> <span style=\"color:#f92672\">&amp;<\/span>initial.col_view();\n\n    <span style=\"color:#75715e\">\/\/ Why treat the two probabilities as independent? This is a way to verify the process has\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#75715e\">\/\/ resulted in values where p and not p is equal to 1 (or sufficiently close).\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> alive_prob <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0.0<\/span>;\n    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> dead_prob <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0.0<\/span>;\n    <span style=\"color:#66d9ef\">for<\/span> (prob, (row, _)) <span style=\"color:#66d9ef\">in<\/span> m.iter() {\n        <span style=\"color:#66d9ef\">if<\/span> character.is_dead(states[row]) {\n            dead_prob <span style=\"color:#f92672\">+=<\/span> prob;\n        } <span style=\"color:#66d9ef\">else<\/span> {\n            alive_prob <span style=\"color:#f92672\">+=<\/span> prob;\n        }\n    }\n\n    (alive_prob, dead_prob)\n}\n<\/code><\/pre><\/div><h1 id=\"the-odds\">The Odds<\/h1>\n<p>Assuming each character has taken no prior damage (not a sure thing in this adventure), the probability of successfully making the trip to Karlensburg is:<\/p>\n<table>\n<thead>\n<tr>\n<th>Character Name<\/th>\n<th>CON<\/th>\n<th>HP<\/th>\n<th>P(success) | 7 hr<\/th>\n<th>P(success) | 10 hr<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Anton Berrudzeck<\/td>\n<td>14<\/td>\n<td>15<\/td>\n<td>59<\/td>\n<td>16<\/td>\n<\/tr>\n<tr>\n<td>Boris Sturkl<\/td>\n<td>15<\/td>\n<td>15<\/td>\n<td>67<\/td>\n<td>24<\/td>\n<\/tr>\n<tr>\n<td>Hans 
Winklemann<\/td>\n<td>13<\/td>\n<td>11<\/td>\n<td>33<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>Johann<\/td>\n<td>18<\/td>\n<td>18<\/td>\n<td>93<\/td>\n<td>68<\/td>\n<\/tr>\n<tr>\n<td>Belinda Chadaver<\/td>\n<td>14<\/td>\n<td>12<\/td>\n<td>46<\/td>\n<td>13<\/td>\n<\/tr>\n<tr>\n<td>Beatrice Chadaver<\/td>\n<td>16<\/td>\n<td>16<\/td>\n<td>78<\/td>\n<td>36<\/td>\n<\/tr>\n<tr>\n<td>Phillipe d&rsquo;Isigny<\/td>\n<td>11<\/td>\n<td>12<\/td>\n<td>23<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>Yurik Drozopczech<\/td>\n<td>14<\/td>\n<td>14<\/td>\n<td>56<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As you can see from the percentages above, the seven hour trip is treacherous but survivable for a few characters. The ten hour trip is dangerous even for Johann. The probability of the entire party making the seven hour journey successfully is a mere <em>0.5%<\/em>, and for the ten hour trip it is 0.00004%.<\/p>\n<p>If we plot the survivability matrix for all CON scores 3 to 18 and hit points of 1 to 20 (below; <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/karlensburg\/-\/blob\/main\/probabilities.csv\">csv<\/a>), we can see that the probabilities are pretty grim, particularly if characters have taken damage before.<\/p>\n<p><img src=\".\/TravelKarlensburg.png\" alt=\"Survivability Matrix CONxHP\" title=\"Survivability Probability CON x HP\"><\/p>\n<p>Of course, the alternative may provide even less of a chance of survival&hellip;<\/p>\n<h1 id=\"references\">References<\/h1>\n<p>[1] Love, Penelope. 
&ldquo;Chateau of Blood.&rdquo; <em>Blood Brothers 2<\/em>, edited by Lynn Willis, Chaosium Inc., 1992, 23-37.<\/p>\n<p>[2] Ibe, Oliver C. &ldquo;Introduction to Markov Processes.&rdquo; <em>Markov Processes for Stochastic Modeling<\/em>, Elsevier Academic Press, 2009, 45-46.<\/p>\n<p>[3] Ibe, Oliver C. &ldquo;Discrete-Time Markov Chains.&rdquo; <em>Markov Processes for Stochastic Modeling<\/em>, Elsevier Academic Press, 2009, 55-58.<\/p>"},{"title":"Slaying Dragon","link":"https:\/\/ztoz.blog\/posts\/slaying-dragon\/","pubDate":"Tue, 27 Dec 2022 11:15:27 -0700","guid":"https:\/\/ztoz.blog\/posts\/slaying-dragon\/","description":"<p>Slaying the Dragon: A Secret History of Dungeons &amp; Dragons\nby Ben Riggs<\/p>\n<p><a href=\"https:\/\/www.writerbenriggs.com\/slaying-the-dragon\">Author&rsquo;s website<\/a><\/p>\n<p><em>Slaying the Dragon<\/em> is a business history of TSR, focusing on the Lorraine Williams period (1985-1997). During this period, TSR recovered from the excesses of the Gygax\/Blume Brothers period, launched the 2nd edition rules and a novels line, and attempted pivots and expansions of its IP. However, the sales decline continued until the company was purchased by Wizards of the Coast. Ben Riggs chronicles the company&rsquo;s history, key products, and strategic mistakes.<\/p>\n<p>The book is laid out in four parts: The Rise and Fall of Gary Gygax, The Williams Era, Slapping a Band-Aid on Failures, and The Preacher&rsquo;s Son Performs a Resurrection. The second and third parts are the heart of the book and the most successful. The key challenge for TSR during the Williams era was to arrest the downward slide in sales. Williams had prevented the collapse of the company and made it more stable, but revenue continued to drop. 
Organizing his argument thematically, Riggs covers some missteps (such as TSR West and the attempt to enter comic publishing), but his thesis is that three factors caused the decline:<\/p>\n<ol>\n<li>Lack of talent management<\/li>\n<li>Fish-bait Strategy fueling decline in sales<\/li>\n<li>Random House contract leading to untenable debt<\/li>\n<\/ol>\n<p>Although TSR found and matured many creative geniuses, TSR had difficulty retaining talent. This was partly due to policies that could not scale rewards to a person&rsquo;s impact (in the 70s, authors could gain royalties on sales, but this was not an option in the 80s or 90s) and partly due to personnel issues which led to acrimonious departures.<\/p>\n<p>The Fish-bait Strategy hypothesized that new settings would attract new players, similar to how fish are only attracted to specific types of bait. This strategy led to the release of multiple D&amp;D settings such as Al-Qadim, Dark Sun, Planescape, and Birthright. However, rather than attracting new players, these settings came to <em>split<\/em> the customer base, who would not purchase products for settings they didn&rsquo;t play. Due to TSR&rsquo;s policies that hid sales data, the fact that this strategy was not working was hidden from middle management and, year after year, they kept on dividing their market.<\/p>\n<p>Random House was TSR&rsquo;s key distributor and their channel for selling to the key bookstore market. (TSR had their own channels to the hobby store market, but internally they prized the bookstore market far more.) Riggs discovers a key financial element in TSR&rsquo;s downfall in their odd contract with Random House. Unlike in a traditional distribution contract, TSR was paid in loans from Random House at the beginning of the year based on their shipped product. Because it was based on shipped product and not sold product, TSR was insulated from poor sales, but paid for it in further debt. 
Across a number of chapters, Riggs covers the implications of this contract and how it warped TSR&rsquo;s operations. As with many debtors, things could appear fine until the debt was called, at which point TSR had an existential crisis.<\/p>\n<p>For TSR, that existential crisis was solved by being purchased by Wizards of the Coast, which was heavily capitalized by sales of <em>Magic: The Gathering<\/em>. The last part of the book covers the sale and the early steps Peter Adkison, WoTC&rsquo;s CEO, took to heal the business.<\/p>\n<p>The first part of the book is the weakest. This part of the company&rsquo;s history was covered in far greater depth and analysis in Jon Peterson&rsquo;s <a href=\"https:\/\/ztoz.blog\/posts\/game-wizards\"><em>Game Wizards<\/em><\/a>. (Peterson is referenced as a source by Riggs, but not the Game Wizards book itself.) Further, Riggs&rsquo;s text suffers from some clunky passages such as &ldquo;What did the bankers think of this company that had an entire hall filled with windup toys, marching and buzzing and glowing, that seemed to serve no discernable purpose? Were company resources invested in this? Was there some wage slave whose only job was to wind the toys and then hide themselves?&rdquo; (page 25). Since the company did acquire loans, it is unclear if TSR was leading the zeitgeist towards more casual companies as typified by start-ups or secured the loans in spite of the unprofessional atmosphere.<\/p>\n<p>The book contains color stills of select pieces of artwork, TSR offices, Gen Cons, and employees. Since the book is organized more thematically than chronologically, I think the book would have been better served by a unified timeline. For example, chapter 11&rsquo;s &ldquo;The Sage of TSR West&rdquo;, which covers TSR&rsquo;s attempt to become an independent comic (book\/module) publisher, uses vague dates of &ldquo;late &rsquo;80s and early &rsquo;90s&rdquo; (1989 to 1991 are defensible dates). 
Given the recurrence of characters from earlier chapters such as &ldquo;D&amp;D and Living the Riot Life in LA&rdquo; and how this parallels other attempts at expanding revenue such as the book line, a timeline would help clarify if the various efforts were part of a coherent strategy or a scattershot of tactics.<\/p>\n<p>Methodologically, the book draws heavily on oral histories from many of the participants. (Although the author mentions other sources, like court cases and contracts, they are not &ldquo;inserted&rdquo; into the document as in Peterson&rsquo;s book.) One strength is that differences in recall are noted and the author seems to treat every perception with skepticism. Unfortunately, Lorraine Williams herself refused to be interviewed. As the author notes, since TSR kept sales and operations data from their employees, we still have minimal insight into many of the decisions. The best data we have comes from the due diligence documents from the Wizards sale.<\/p>"},{"title":"Simplifying Logic","link":"https:\/\/ztoz.blog\/posts\/slack-notification\/","pubDate":"Tue, 15 Nov 2022 13:19:49 -0600","guid":"https:\/\/ztoz.blog\/posts\/slack-notification\/","description":"<p><a href=\"https:\/\/slack.engineering\/reducing-slacks-memory-footprint\/\">Slack&rsquo;s business logic for showing notifications<\/a> periodically appears in LinkedIn posts and such. In Slack&rsquo;s original post, this diagram was meant to illustrate what logic was being transferred from the multiple Slack clients to the server, but the diagram has since been taken as an example of <a href=\"https:\/\/www.linkedin.com\/posts\/dmitry-samoylovskikh_as-we-move-closer-to-the-launch-of-uniborn-activity-6989304570250158080-m17N\">product complexity<\/a> and <a href=\"https:\/\/www.linkedin.com\/posts\/markseo_dev-webdevelopment-programming-activity-6944989675979739136-TmFt\">how development is harder than it may sound<\/a>. 
In contrast, I think the diagram obscures the intended logic, but the logic itself is not complex. I&rsquo;ll describe how to simplify it and where diagrams might not be the right approach to convey requirements.<\/p>\n<p>The diagram in question:<\/p>\n<p><img src=\"notification-logic.jpeg\" alt=\"Notification Logic\" title=\"Slack's logic for showing notifications\"><\/p>\n<p>First, some praise. The diagram is well laid out, with only a few cases of lines crossing each other. The text is succinct and largely clear. The representation is also compact and there&rsquo;s good reason to believe the diagram is complete (at least at the time of publication<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>). I also appreciate the actual blog post; moving logic to the server tends to reduce bugs and improve velocity via re-use and eliminating duplication. Reducing memory footprint is also a user benefit. Finally, the post was not about the business logic itself, so the diagram comes to us without context.<\/p>\n<p>The blog post describes the diagram as a &ldquo;workflow diagram&rdquo;, but since there is no <em>work<\/em> being described (e.g. fetching of data), only decisions, this is closer to a decision graph. (Some nodes are given multiple parents, so as written, this is not a tree, but it could be translated into one.) The graph has two terminal nodes, &ldquo;Yes&rdquo; and &ldquo;No&rdquo;, to represent whether a notification should be sent to the user on the running device. Other nodes represent a decision, mostly boolean, although some nodes have multiple enumerated values. Edges are labeled with the value. (The purple &ldquo;Pref Value&rdquo; nodes could be written as edges instead.)<\/p>\n<p>I&rsquo;m going to ignore the logic after the &ldquo;Yes&rdquo; terminal. 
This is the sole part of the diagram that deals with temporal matters (mobile push timeouts) and requires a note that discusses interactions about message state changes between clients. This part of the business logic should have been addressed separately, probably via a sequence diagram.<\/p>\n<p>Some nodes (e.g. the &ldquo;Mentions&rdquo; blocks) have multiple non-exclusive children. This creates ambiguity in how we interpret the semantics, particularly since the diagram does not follow a formal standard. For instance, if a message contains both an @here and a highlighted word, it is possible to follow the logic such that we arrive at both the YES and NO states<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup>. Given the use case, I believe the intended semantics is to resolve to YES if possible. Thus, the execution semantics suggest we can have multiple &ldquo;tokens&rdquo; representing the current decision state and if any arrive at the YES terminal state, we will notify.<\/p>\n<p>Interestingly, the graph is cyclic. A direct message within a thread that the user has unsubscribed from leads to a loop. I think this is an error in the diagram, probably caused by the author trying to represent the diagram&rsquo;s support for multiple concurrent states. Without this single case, the diagram is acyclic.<\/p>\n<p>A large part of the visual complexity in the diagram comes from the fact that a user may have both channel preferences and global preferences. 
This leads to many shared nodes and represents an opportunity for simplification.<\/p>\n<h1 id=\"simplifying\">Simplifying<\/h1>\n<p>First, we re-draw the diagram with the intent of capturing the same logic, but in an editable diagram that will show the future simplifications:<\/p>\n<p><img src=\"SlackNotificationLogic-OrigRedrawn.drawio.png\" alt=\"Notification diagram redrawn\" title=\"Redrawn version of Slack's logic for notifications\"><\/p>\n<p>Slack&rsquo;s diagram is effectively a representation of propositional logic; discrete values that can be combined logically (and, or, not, &hellip;). Focusing on the bottom of the figure, we can create a table of the variables:<\/p>\n<table>\n<thead>\n<tr>\n<th>Var<\/th>\n<th>Definition<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>A<\/td>\n<td>Channel notification pref is &ldquo;Nothing&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>B<\/td>\n<td>Channel notification pref is &ldquo;Everything&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>C<\/td>\n<td>Channel notification pref is &ldquo;Mentions&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>D<\/td>\n<td>Channel notification pref is &ldquo;Default&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>E<\/td>\n<td>Global notification pref is &ldquo;Never&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>F<\/td>\n<td>Global notification pref is &ldquo;All&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>G<\/td>\n<td>Global notification pref is &ldquo;Mentions&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>H<\/td>\n<td>Device is Mobile<sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote-ref\" role=\"doc-noteref\">3<\/a><\/sup><\/td>\n<\/tr>\n<tr>\n<td>J<\/td>\n<td>Message is a @here<\/td>\n<\/tr>\n<tr>\n<td>K<\/td>\n<td>Message is a direct message (DM)<\/td>\n<\/tr>\n<tr>\n<td>L<\/td>\n<td>Message is a @mention<\/td>\n<\/tr>\n<tr>\n<td>M<\/td>\n<td>Message is a highlight word<\/td>\n<\/tr>\n<tr>\n<td>N<\/td>\n<td>Message is a comment on a file<\/td>\n<\/tr>\n<tr>\n<td>P<\/td>\n<td>User presence active?<\/td>\n<\/tr>\n<tr>\n<td>Q<\/td>\n<td>Thread 
message?<\/td>\n<\/tr>\n<tr>\n<td>R<\/td>\n<td>User subscribed?<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Variables A through D and E through G are mutually exclusive within each group; each preference can only be in one of the states. Furthermore, the logic only cares about global notification preferences if channel notification preferences are set to &ldquo;Default&rdquo;. Since messages can contain content that matches multiple forms (e.g. a direct message can also mention a name, include a term that the user has chosen as a highlighted word, <em>and<\/em> be a comment on a file), J through N are not mutually exclusive variables. Still, the repeated structure suggests we can simplify by creating &ldquo;functions&rdquo; that encapsulate repeated logic:<\/p>\n<table>\n<thead>\n<tr>\n<th>Var<\/th>\n<th>Definition<\/th>\n<th>Note<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>T<\/td>\n<td>not Q or (Q and R)<\/td>\n<td>User has not unsubscribed from thread<\/td>\n<\/tr>\n<tr>\n<td>X<\/td>\n<td>A or (D and E)<\/td>\n<td>Unified preference is &ldquo;Do not notify&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>Y<\/td>\n<td>B or (D and F)<\/td>\n<td>Unified preference is &ldquo;Always notify&rdquo;<\/td>\n<\/tr>\n<tr>\n<td>Z<\/td>\n<td>C or (D and G)<\/td>\n<td>Unified preference is &ldquo;Mentions&rdquo;<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>With our functions and the nine non-preference variables at the bottom of the diagram, we have 13 variables. This is too many variables for a Karnaugh Map, which is our usual go-to means to simplify logic. However, we can visually trace through the preference logic. The &ldquo;Everything\/All&rdquo; and &ldquo;Nothing\/Never&rdquo; paths are easy to verify as equivalent. 
The &ldquo;Mentions&rdquo; paths are unfortunately harder to follow.<\/p>\n<p>&ldquo;Mentions&rdquo; is where the concurrent states allowed by the diagram&rsquo;s language complicate things, because we want to equate a <em>channel<\/em> preference for Mentions to the <em>global<\/em> preference of Mentions, but the Global logic elevates direct messages and highlight words as parallel choices to the preference value.<\/p>\n<p>A likely error in the figure is the &ldquo;Highlight Words (mobile)&rdquo; path under the Global notification pref. The next block in the path checks for either @mention or a highlighted word, and will then trigger a notification if T is true. However, since there is a parallel path under Global notification for Mentions, and that path will notify on an @mention regardless of the value of T, this implies unnecessary checks if a message contains both an @mention and a highlighted word. Removing the @mention line eliminates an unnecessary complication on the mobile side.<\/p>\n<p>With that, we want to prove that:<\/p>\n<pre tabindex=\"0\"><code>channel notification pref set to Mentions and at least one of:\n\tK and T\n\tL or N\n\tJ and P\n\tM and T\n<\/code><\/pre><p>is equivalent to:<\/p>\n<pre tabindex=\"0\"><code>channel notification pref set to Default and at least one of:\n\tG and ((M and T) or (J and P) or (L or N))\n\tH and K and T\n\tH and M and T\n<\/code><\/pre><p>The Mentions logic aligns <em>except<\/em> for direct messages. If the channel notification pref is set to Mentions, a direct message will notify as long as T is also true. However, if there is no channel preference (so Default), then a direct message on the desktop &hellip; is not represented. 
Since direct messages, based on personal experience, can notify on the desktop, I&rsquo;m obliged to suspect another error in the diagram and that &ldquo;DMs (mobile)&rdquo; is really just &ldquo;DMs&rdquo;.<\/p>\n<p>With these corrections and simplifications, we can present an updated diagram as:<\/p>\n<p><img src=\"SlackNotificationLogic-Simplified.drawio.png\" alt=\"Notification diagram simplified\" title=\"Simplifying diagram by combining repeated logic\"><\/p>\n<p>Alternatively, we could represent this logic (again focusing on the bottom) in pseudo-logic or code. Decision logic is often well represented by pattern matching, so I present an alternate representation:<\/p>\n<pre tabindex=\"0\"><code>case X =&gt; False\ncase Y =&gt; T\ncase Z if highlight word or direct message =&gt; T\ncase Z if @here =&gt; user presence active and T\ncase Z =&gt; @mention or (desktop device and comment on file)\n<\/code><\/pre><h1 id=\"diagrams-for-requirements\">Diagrams for Requirements<\/h1>\n<p>In general, I&rsquo;m a fan of visual languages to describe activities and interactions. I&rsquo;m less of a fan of using diagrams to convey structure in software (e.g. UML class diagrams, entity relationship diagrams) or decision logic because these tend to become eye charts. When representing decisions visually, keep these factors in mind:<\/p>\n<ol>\n<li>\n<p>Flowcharts\/decision trees impose precedence. The diagram language states that you check one factor before following a branch to check another. In contrast, the reading order of symbolic logic is understood as not imposing precedence, unless parentheses are used. Sometimes the order matters and sometimes it is arbitrary. I have seen diagrams created that show the optimal order of checks (based on the cost of making them), but often the order is meaningless. 
Implementers of the logic need to understand if the diagram&rsquo;s order is intentional or not.<\/p>\n<\/li>\n<li>\n<p>Unless the diagram allows hierarchy or authors are willing to place complex checks into a node, repeated logic may appear within the diagram. This can lead to duplicative code, which increases both the risk of bugs and the cost of change.<\/p>\n<\/li>\n<li>\n<p>If someone wanted 100% coverage for the decision logic, they could face writing 2<sup>19<\/sup> (or more) tests, depending on how they assign variables. If they just follow the diagram so as to eliminate all the don&rsquo;t care states, they will need to trace through the diagram over and over. Since the diagram does not follow a formal representation, there are no tools to aid in processing or translating the logic. In contrast, if the logic were built from a library of test cases or a formal representation (e.g. a programming language), the authors could leverage software tools.<\/p>\n<\/li>\n<\/ol>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Since the publication in 2017, Slack has shifted the data model from channels of various types to a single model of conversations.&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>In the @here path, if the user presence is not active, the @here path will resolve to no. In the highlighted word path, if the message is not threaded or the message is threaded and the user is subscribed, this will resolve to yes.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:3\" role=\"doc-endnote\">\n<p>We are modeling device types as H=1 meaning mobile and H=0 meaning desktop. 
The diagram only provides for the two states, but this may not be a good forward-looking modeling choice.&#160;<a href=\"#fnref:3\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>"},{"title":"A Pain Scale for On-call?","link":"https:\/\/ztoz.blog\/posts\/pain-scale-for-oncall\/","pubDate":"Mon, 19 Sep 2022 14:15:52 -0600","guid":"https:\/\/ztoz.blog\/posts\/pain-scale-for-oncall\/","description":"<p>Being on-call is often a necessary part of the job, particularly for engineers in a SaaS business. The burden of operations often negatively impacts morale and productivity. If we were to estimate the impact on a team, we could build a model based on incident frequency, ticket severity, time of alert versus sleeping schedules, and other operational metrics. Alternatively, we can ask the on-call engineers directly, which should be more accurate, and use the metrics to help drive improvements. This article describes an on-call experience program intended to be integrated into an operational review system.<\/p>\n<h1 id=\"what-causes-pain\">What causes pain<\/h1>\n<p>On-call related pain in a software-as-a-service business comes from a variety of sources including:<\/p>\n<ol>\n<li>Loss of freedom (need to be available to respond to an alert\/tethered to a phone)<\/li>\n<li>Unplanned interruptions from alerts or new tickets<\/li>\n<li>Loss of sleep due to alerts (or stress inducing insomnia)<\/li>\n<li>Escalations and stress from needing to respond rapidly<\/li>\n<li>Discomfort from having to fix problems in unfamiliar or unpleasant systems<\/li>\n<\/ol>\n<p>The pain is both physiological and psychological, but also very subjective and, crucially, not directly measurable.<\/p>\n<h1 id=\"is-it-all-pain\">Is it all pain?<\/h1>\n<p>No! On-call can be enjoyable. First, joining on-call is a sign of trust from the team that the individual can handle problems on their own. It is a graduation event. 
Secondly, on-call (ideally) is a time to focus on improvements that help both the clients and the team. Since the scope of improvements is usually small, an engineer can start and finish the effort within a few days. The ability to quickly iterate and make improvements can be very rewarding. On-call can also be a break from other projects, but a &ldquo;blessed&rdquo; break that is still helping the team.<\/p>\n<h1 id=\"experience-program-objectives\">Experience Program Objectives<\/h1>\n<p>An experience program is a set of activities designed to help measure and drive toward some ideal experience. An ideal on-call experience should be motivating and show clear improvements in availability, reliability, and operational cost. A new on-call experience program, and its related metrics, should be:<\/p>\n<ol>\n<li><em>Actionable<\/em> &mdash; both individuals and teams should be able to leverage the metrics to make on-call more pleasant and effective<\/li>\n<li><em>Reliable<\/em> &mdash; repeated measurements under similar situations will yield similar measures; this is necessary if individual scores will be aggregated<\/li>\n<li><em>Valid<\/em> &mdash; validity takes many forms:\n<ol>\n<li>face validity &mdash; program participants and those examining the results will accept the test as an acceptable and agreeable approach<\/li>\n<li>content validity &mdash; content of the program will fit within the overall operational on-call program<\/li>\n<li>concurrent validity &mdash; results from this program will correlate with future programs<\/li>\n<li>construct validity &mdash; results of this program will correlate with operational signals in an anticipated way; for example, the &ldquo;alert&rdquo; positive affect will be negatively correlated with at-night alerts and &ldquo;nervous&rdquo; will correlate with the use of new, high-risk maintenance actions<\/li>\n<\/ol>\n<\/li>\n<li><em>Ratio-like\/Scalar-like<\/em> &mdash; the affect metrics are naturally ordinal (ordered), 
but may not necessarily be scalar (subject to addition\/subtraction\/percentage change). If we can demonstrate that we can treat the results numerically, the results can then be treated as trends and can be compared across teams<\/li>\n<\/ol>\n<p>Details for how we measure and implement these objectives are in the <a href=\"#program-validation\">Validation<\/a> section below.<\/p>\n<h1 id=\"design\">Design<\/h1>\n<p>This design is based on the <a href=\"https:\/\/depts.washington.edu\/uwcssc\/sites\/default\/files\/Positive%20and%20Negative%20Affect%20Schedule%20%28PANAS%29%20Short%20Form.pdf\">I-PANAS-SF<\/a> (International Positive and Negative Affect Schedule Short Form) methodology for measuring emotional affect.<\/p>\n<p>On the day after an individual has been on-call for seven or more hours, they will be prompted to answer this survey:<\/p>\n<blockquote>\n<p>Question: Thinking about your experience during yesterday&rsquo;s on-call, to what extent did you generally feel:<\/p>\n<\/blockquote>\n<p>This will be followed by ten multiple-choice questions. The categories (affect categories) are Upset, Hostile, Alert, Ashamed, Inspired, Nervous, Determined, Attentive, Afraid, and Active. For each category, the respondent may answer never, 1 through 5 inclusive, or always. 
This question may be asked via a matrix-type question:<\/p>\n<p><img src=\"..\/pain-scale-survey-question.png\" alt=\"affect-question in a matrix\" title=\"Example matrix presentation\"><\/p>\n<h2 id=\"scoring\">Scoring<\/h2>\n<p>A choice of &ldquo;never&rdquo; has a value of 0; &ldquo;always&rdquo; has a value of 6.<\/p>\n<p><strong>Positive Affect<\/strong> (PA) is the sum of values for Alert, Inspired, Determined, Attentive, and Active.<\/p>\n<p><strong>Negative Affect<\/strong> (NA) is the sum of values for Upset, Hostile, Ashamed, Nervous, and Afraid.<\/p>\n<p>The range of PA and NA is 0 to 30 (inclusive).<\/p>\n<h2 id=\"applying-the-program-as-an-individual\">Applying the program as an individual<\/h2>\n<p>An individual may find value using this program, even if they do not share data with anyone else. As a form of <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3444304\/\">affect labeling<\/a> or putting feelings into words, the process has been shown to promote emotional regulation. This can <a href=\"https:\/\/expertprogrammanagement.com\/2021\/02\/the-panas-scale\/\">help individuals<\/a> reduce stress and mentally focus on constructive solutions.<\/p>\n<p>Studies suggest PANAS scores are stable over time and, although they are influenced by the immediate emotional state, they will regress to an individual&rsquo;s long-term emotional state. By recording their affect while not on-call, an individual can start to quantify their &ldquo;base&rdquo; position and detect how much variability they see in their life, which can help in overall emotional tracking and control.<\/p>\n<h2 id=\"applying-the-program-within-a-team\">Applying the program within a team<\/h2>\n<p>Even if scores have not been statistically shown as comparable between individuals, as part of their normal operational review cadence, a team can examine the scores (both in aggregate and individual affect) for the previous term. 
The team should discuss any unusually strong affect, whether positive or negative. For instance, the affects of nervous and afraid may reveal risks that the individual feels are not being mitigated.<\/p>\n<p>The team should focus on <em>learnings<\/em> and <em>actions<\/em> before attempting to do any statistical analysis. The program requires reflection from the team to do any good, and the use of quantitative measures is intended to promote thinking rather than to be an end in itself.<\/p>\n<h2 id=\"applying-the-program-between-teams\">Applying the program between teams<\/h2>\n<p>Although it would be tempting to use the PA and NA scores as a proxy signal for manager effectiveness, the raw scores are subject to both individual and cultural differences. Teams sharing their lessons and actions will be more useful.<\/p>\n<p>Over time, if teams show scores are reliable, then it may be possible to transform the data into a normalized and standardized form that can be compared between teams. I anticipate that not all teams are equal and that teams involved in certain types of operations will cluster around certain values, similar to how mean NPS varies between industries.<\/p>\n<h1 id=\"rationale-and-major-alternatives-considered\">Rationale and Major Alternatives Considered<\/h1>\n<h2 id=\"i-panas-sf-and-top-level-design-decisions\">I-PANAS-SF and Top-level Design Decisions<\/h2>\n<p>The Positive and Negative Affect Schedule (PANAS) is a standardized, well-studied and well-used test for measuring emotional affect. The I-PANAS-SF is a modification of PANAS that reduces the number of questions by eliminating questions that are correlated together as well as changing some of the affect terminology to make scores more consistent across multiple cultures.<\/p>\n<p>I chose a metric based on emotional affect because it can capture both positive and negative feelings and is multi-dimensional, which I anticipate will yield more learnings than a straight intensity score. 
My initial focus was on measuring pain, but I expect focusing on pain as the measure will bias results negatively. Furthermore, from my experience, I sometimes looked forward to being on-call, and I think an experience program should try to capture the &ldquo;full&rdquo; experience.<\/p>\n<p>Creating new affect categories requires significant data collection, so using existing scales is economically the right choice. Many companies have internationally distributed teams, so using a scale that has been studied cross-culturally and has high-quality translations available should improve the test&rsquo;s reliability.<\/p>\n<h2 id=\"numeric-rating-scale-nrs-11-and-other-measures-of-pain-intensity\">Numeric Rating Scale (NRS-11) and other measures of pain intensity<\/h2>\n<p>The NRS-11 is a commonly used test for measuring pain intensity. The test uses an eleven-point scale, anchored at zero for &ldquo;no pain&rdquo; and ten for &ldquo;worst pain imaginable&rdquo;, and can be self-administered.<\/p>\n<p>Administering the NRS-11 would be similar to a Net Promoter Score (NPS) or satisfaction score, for which there are ample commercial solutions. However, there are several limitations in this approach:<\/p>\n<ol>\n<li>NRS-11 has not shown &ldquo;ratio&rdquo; properties, so changes in the score over time should not be interpreted as linear. (Some measures that are similar to NRS-11 have shown ratio properties, so a correction may be possible.)<\/li>\n<li>As a measure of pain intensity, this test is unable to measure positive aspects of on-call<\/li>\n<li>As a single value, this test does not capture any of the flavor of the pain. 
The test could be expanded with a free-form response (similar to NPS), but free-form data is difficult to standardize across teams.<\/li>\n<\/ol>\n<h2 id=\"mcgill-pain-questionnaire-and-other-measures-of-pain-affect\">McGill Pain Questionnaire and other measures of pain affect<\/h2>\n<p>The McGill Pain Questionnaire is a common and well-studied test for pain affect. The test lists various adjectives (affects), with terms grouped by intensity, and the respondent notes which adjectives apply.<\/p>\n<p>Structured for physical pain, about one-third of the affects listed apply to on-call. The test is also purely focused on pain, so it does not track any positive affects or emotions. The pain categories that apply to on-call seem to have analogous affects on the I-PANAS-SF.<\/p>\n<h2 id=\"affect-grid\">Affect Grid<\/h2>\n<p>Affect Grid is a single question test where the respondent marks their current emotional state on a two-axis grid, where the dimensions are pleasant\/unpleasant and arousal\/sedation. Since it only requires a single mark by the respondent, it is one of the fastest tests to conduct. However, I felt it eliminated too much of the flavor of the response.<\/p>\n<h1 id=\"rationale-for-secondary-design-elements\">Rationale for Secondary Design Elements<\/h1>\n<p><strong>Prompting for data the day after<\/strong><\/p>\n<p>Our memories about feelings tend to dampen over time, so ideally we will collect data soon after the event. Collecting after a day, rather than after the rotation, helps standardize time series analysis and comparison against other teams (which may not have the same rotations) and allows teams to see the variance in scores and the impact of events, like incidents, more directly. Alternatively, we could collect data <em>during<\/em> the on-call (similar to an Experience Sampling Method), which might be the most accurate way of measurement. 
However, since on-call is usually interrupt-driven and the on-call subject may be in the middle of resolving higher-priority matters, we should not add to the burden; hence, we allow the individual to provide the data on their own schedule.<\/p>\n<p><strong>Prompting for data only if on-call duration is at least seven hours<\/strong><\/p>\n<p>It is fairly common for someone to override another person&rsquo;s schedule. In my experience, if the override is fairly short, then the person taking the override is likely just monitoring for alerts and is thus experiencing a subset of the full on-call experience. Eliminating these periods from data collection should enhance data quality and reduce fatigue from answering the survey.<\/p>\n<p><strong>Participants<\/strong><\/p>\n<p>The expectation is that participants in this program will be adult professionals, most likely with college education, or at least literate. The individuals will be familiar with using a computer and the question will be provided in one of their &ldquo;work place&rdquo; languages. Individuals within a pool may differ by gender and nationality.<\/p>\n<p>Based on these assumptions, we need a test that is robust\/valid across different population groups, although each group will consist of literate adults.<\/p>\n<h1 id=\"program-validation\">Program Validation<\/h1>\n<h2 id=\"demonstrating-reliability\">Demonstrating Reliability<\/h2>\n<p>There are four primary ways to evaluate reliability: test-retest, parallel forms, split-half, and internal consistency. We posit that the &ldquo;true on-call experience&rdquo; is measured on the working days (this would be the weekdays for a weekly on-call cadence) and that, although each day is unique, they can be treated as multiple repeated &ldquo;samplings&rdquo; of an experience. 
Thus, we can use the test-retest methodology for evaluating reliability.<\/p>\n<p>Grouping consecutive days with recorded values from the same individual into pairs \\((x, y)\\), and pooling the pairs for individuals on the same team, we calculate the Pearson product-moment correlation coefficient via:<\/p>\n<p>$$\nr = \\frac{\\sum_{i=1}^{n} (\\frac{x_i - \\bar{x}}{s_x})(\\frac{y_i - \\bar{y}}{s_y})}{n - 1}\n$$<\/p>\n<p>where \\(n\\) is the number of pairs, \\(\\bar{x}\\) and \\(\\bar{y}\\) are the sample means, and \\(s_x\\) and \\(s_y\\) are the sample standard deviations of the x and y values. The correlation coefficient \\(r\\) can be considered the reliability metric.<\/p>\n<p>We aim for reliability of 0.7 or greater, which is the expected reliability of personality tests. Collecting 30 observations per individual on the team (which, given observations are collected daily, implies at least 30 days of on-call) should be sufficient for starting statistical analysis.<\/p>\n<h2 id=\"demonstrating-validity\">Demonstrating Validity<\/h2>\n<h3 id=\"face-validity\">Face Validity<\/h3>\n<p>Face validity requires participants to take the effort seriously. This program is not designed to handle adversarial participants. This article, by being open and available to participants, should aid in face validity since participants can understand the intent, design decisions, and proposed analysis process. Informed participants can decide if the approach is sound and potentially useful.<\/p>\n<h3 id=\"content-validity\">Content Validity<\/h3>\n<p>Content validity will be shown if teams use and continue to use the program within their operational program. If the program generates actionable insight, it will be shown in their weekly reports.<\/p>\n<h3 id=\"concurrent-validity\">Concurrent Validity<\/h3>\n<p>Since the author is unaware of existing on-call experience metric programs, concurrent validity is an aspirational goal that future experience programs will have statistically similar measurability. 
If there were existing programs, part of the validation effort would be to compare results against those programs.<\/p>\n<h3 id=\"construct-validity\">Construct Validity<\/h3>\n<p>I hypothesize that:<\/p>\n<ol>\n<li>PA will correlate with the closing of operational improvement tickets (tickets created by the team to improve operations)<\/li>\n<li>NA will correlate with incidents<\/li>\n<li>The affects of Nervous and Afraid will increase ahead of major operational items (e.g. non-standard database migrations, data center moves)<\/li>\n<li>The Alert affect will negatively correlate with off-hours alerts (although some bucketing\/combining of alerts by time may be required)<\/li>\n<\/ol>\n<h1 id=\"further-reading\">Further Reading<\/h1>\n<p>Dennis C. Turk (Ed.) and Ronald Melzack (Ed.). 2001. <em>Handbook of Pain Assessment<\/em> (2nd. ed.). The Guilford Press, New York, NY.<\/p>\n<p>John Rust and Susan Golombok. 1999. <em>Modern Psychometrics: The Science of Psychological Assessment<\/em> (2nd. ed.). Routledge, New York, NY.<\/p>"},{"title":"COBOL's Map Reduce","link":"https:\/\/ztoz.blog\/posts\/cobol-map-reduce\/","pubDate":"Mon, 29 Aug 2022 16:29:13 -0600","guid":"https:\/\/ztoz.blog\/posts\/cobol-map-reduce\/","description":"<p>COBOL is for Big Data. Well, sort of. A while back, I noticed that the COBOL SORT verb was overpowered. Rather than just sorting an array of items or even sorting a file, it included a generalized ability to stream in arbitrary inputs &mdash; <code>() =&gt; Stream[T]<\/code> &mdash;, where T is a key\/value pair, and process the outputs in order &mdash; <code>SortedStream[T] =&gt; ()<\/code>. This power is useful if you are writing a map-reduce program, but excessive for sorting. 
So, let&rsquo;s look at how this was implemented, why it was part of the design, and what prevented COBOL from jump-starting the <em>distributed<\/em> map-reduce revolution.<\/p>\n<h1 id=\"map-reduce\">Map-Reduce<\/h1>\n<p>Many real-world tasks involve <em>mapping<\/em> the input to a key\/value pair, and then <em>reducing<\/em> sequences of values that share the same key to some useful value. The Census, for example, involves transforming a census record into the pair state\/number of people per record, and then summing the number of people per record into a number of people per state. Before automation, census workers performed this task in a distributed fashion by tallying small stacks of records themselves and then combining their tallies to produce the final results.<\/p>\n<p>When Google&rsquo;s <a href=\"https:\/\/research.google\/pubs\/pub62\/\">&ldquo;MapReduce: Simplified Data Processing on Large Clusters&rdquo;<\/a> paper was released, I was surprised by the attention it received. In my scientific computing class we had covered the model in the first week and I think we implemented a version in MPI as one of the initial homework assignments. What I hadn&rsquo;t realized was that the technology was greatly democratizing the programming model; there was a lot of value in the 80% solution.<\/p>\n<p>COBOL&rsquo;s design for the SORT was meant for map-reduce applications. Although the specification did not state it explicitly, COBOL textbooks of the time used map-reduce examples, just without using the term map-reduce specifically.<\/p>\n<h1 id=\"cobols-sort-verb\">COBOL&rsquo;s SORT Verb<\/h1>\n<p>I&rsquo;ve written a census-like application to demonstrate the map-reduce nature of SORT. COBOL has two kinds of SORT: file sort and table sort. Table sort is straightforward and acts like sorting an array. 
File sort supports several variants, but <code>INPUT PROCEDURE<\/code> and <code>OUTPUT PROCEDURE<\/code> are the two options that provide the necessary power.<\/p>\n<blockquote>\n<p>On COBOL Verisimilitude: The SORT verb was added to COBOL 61, so I&rsquo;ve tried to keep my discussion specific to COBOL of that era. However, my code is in &lsquo;free style&rsquo; so the code&rsquo;s indentation is freer than allowed at the time. I use mixed-case to make the code easier to read, although it is against the standard (the specification&rsquo;s character set did not include lower case) and ahistorical (computer systems used 6-bit characters). I&rsquo;m also using PERFORM and in-line code, which weren&rsquo;t standardized until 1985. I&rsquo;ve verified the code is at least 1985 compliant by using the <code>-free -std=cobol85<\/code> flags for GnuCOBOL 2.2.0.<\/p>\n<\/blockquote>\n<p>The entire program is around 90 lines. (The word frequency C++ program is around 50 lines without comments and whitespace.) About a third of the lines define the program&rsquo;s I\/O and data structures, including file schemas. COBOL unifies the definition of data with the declaration of storage for that data, so you&rsquo;ll see data moved into the definitions.<\/p>\n<p>The phrase <code>ORGANIZATION LINE SEQUENTIAL<\/code> means the file uses some character sequence to delimit records (lines); since I ran this on a Unix system, it&rsquo;s the newline character. <code>PIC A(x)<\/code> indicates an alphabetic field that is x characters in length; <code>9(x)<\/code> is a numeric field with x digits (including sign), and <code>Z(x)<\/code> means leading zeros are replaced by spaces in the field (otherwise, numbers will be padded with zeros). 
File names are hard-coded for simplicity.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-cobol\" data-lang=\"cobol\"><span style=\"color:#75715e\">IDENTI<\/span>FICATION <span style=\"color:#66d9ef\">DIVISION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tPROGR<\/span>AM-ID. MapReduce.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">ENVIRO<\/span>NMENT <span style=\"color:#66d9ef\">DIVISION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tINPUT<\/span><span style=\"color:#f92672\">-<\/span>OUTPUT <span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tFILE<\/span><span style=\"color:#f92672\">-<\/span>CONTROL.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tSEL<\/span>ECT input-file <span style=\"color:#66d9ef\">ASSIGN<\/span> <span style=\"color:#66d9ef\">TO <\/span><span style=\"color:#e6db74\">&#34;census.txt&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tOR<\/span>GANIZATION <span style=\"color:#66d9ef\">LINE<\/span> <span style=\"color:#66d9ef\">SEQUENTIAL<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span 
style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tSEL<\/span>ECT sort-work-file <span style=\"color:#66d9ef\">ASSIGN<\/span> <span style=\"color:#66d9ef\">TO <\/span><span style=\"color:#e6db74\">&#34;sort.txt&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tOR<\/span>GANIZATION <span style=\"color:#66d9ef\">LINE<\/span> <span style=\"color:#66d9ef\">SEQUENTIAL<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tSEL<\/span>ECT reduce-output-file <span style=\"color:#66d9ef\">ASSIGN<\/span> <span style=\"color:#66d9ef\">TO <\/span><span style=\"color:#e6db74\">&#34;report.txt&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tOR<\/span>GANIZATION <span style=\"color:#66d9ef\">LINE<\/span> <span style=\"color:#66d9ef\">SEQUENTIAL<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">DATA D<\/span>IVISION.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tFILE <\/span><span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tFD\ti<\/span>nput-file.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t01 
<\/span>census-record.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> household <span style=\"color:#66d9ef\">PIC A(8)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> region1 <span style=\"color:#66d9ef\">PIC A(2)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> heads <span style=\"color:#66d9ef\">PIC 9(3)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tSD\ts<\/span>ort-work-file.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t01 <\/span>reduce-input.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> region2 <span style=\"color:#66d9ef\">PIC A(2)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> heads-by-household <span style=\"color:#66d9ef\">PIC 9(3)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tFD\tr<\/span>educe-output-file.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t01 <\/span>population-by-region-output.<span 
style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> region3 <span style=\"color:#66d9ef\">PIC A(2)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t05<\/span> heads-by-region <span style=\"color:#66d9ef\">PIC Z(4)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tWORKI<\/span>NG-STORAGE <span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t01 w<\/span>s-census-record.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t05 <\/span>ws-household <span style=\"color:#66d9ef\">PIC A(8)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t05 <\/span>ws-region1 <span style=\"color:#66d9ef\">PIC A(2)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t05 <\/span>ws-heads <span style=\"color:#66d9ef\">PIC 9(3)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t01 w<\/span>s-eof <span style=\"color:#66d9ef\">PIC A(1)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span 
style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t01 l<\/span>ast-region-seen <span style=\"color:#66d9ef\">PIC A(2)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t01 r<\/span>educe-eof <span style=\"color:#66d9ef\">PIC A(1)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t01 r<\/span>egion-tally <span style=\"color:#66d9ef\">PIC 9(4)<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><\/code><\/pre><\/div><p>The <code>SORT<\/code> is the driver for the map-reduce process. <code>SORT<\/code> is told the location of a &lsquo;work file&rsquo; to be used for its own processing; the underlying SORT machinery does not <em>have<\/em> to use it, but can perform the work in-memory if the work is small enough. Depending on the type of sort work file, the machinery may use different algorithms. The source of data for <code>SORT<\/code> is <code>map-function<\/code>, which will <code>RELEASE<\/code> records (which must conform to the schema defined in <code>reduce-input<\/code>). The <code>map-function<\/code> is responsible for handling its own I\/O. Similarly, <code>reduce-function<\/code> accepts records one-by-one via <code>RETURN<\/code> from the work file and writes to a report file a record once it has processed all of a single key. 
The end-of-region procedure is a helper for the reduce-function.<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-cobol\" data-lang=\"cobol\"><span style=\"color:#75715e\">PROCED<\/span>URE <span style=\"color:#66d9ef\">DIVISION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tmain <\/span><span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tDISPL<\/span>AY <span style=\"color:#e6db74\">&#34;Map Reduce in COBOL&#34;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tSORT <\/span>sort-work-file <span style=\"color:#66d9ef\">ON<\/span> <span style=\"color:#66d9ef\">ASCENDING<\/span> <span style=\"color:#66d9ef\">KEY<\/span> region2<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tINPU<\/span>T <span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#66d9ef\">IS<\/span> map-function<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tOUTP<\/span>UT <span style=\"color:#66d9ef\">PROCEDURE<\/span> <span style=\"color:#66d9ef\">IS<\/span> reduce-function.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t\t<span 
style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tSTOP <\/span><span style=\"color:#66d9ef\">RUN\n<\/span><span style=\"color:#66d9ef\">\t<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tmap-f<\/span>unction <span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tOPEN <\/span><span style=\"color:#66d9ef\">INPUT<\/span> input-file.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tPERFO<\/span>RM <span style=\"color:#66d9ef\">UNTIL<\/span> ws-eof<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;Y&#39;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tREAD<\/span> input-file <span style=\"color:#66d9ef\">INTO<\/span> ws-census-record<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tAT <\/span><span style=\"color:#66d9ef\">END<\/span> <span style=\"color:#66d9ef\">MOVE <\/span><span style=\"color:#e6db74\">&#39;Y&#39;<\/span> <span style=\"color:#66d9ef\">TO<\/span> ws-eof<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span 
style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tNOT<\/span> <span style=\"color:#66d9ef\">AT<\/span> <span style=\"color:#66d9ef\">END \n<\/span><span style=\"color:#66d9ef\">\t\t\t<\/span>\t<span style=\"color:#66d9ef\">MOVE<\/span> ws-region1 <span style=\"color:#66d9ef\">TO<\/span> region2<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tMO<\/span>VE ws-heads <span style=\"color:#66d9ef\">TO<\/span> heads-by-household<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tRE<\/span>LEASE reduce-input<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tEND<\/span><span style=\"color:#f92672\">-<\/span>READ<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tEND-P<\/span>ERFORM<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tCLOSE<\/span> input-file<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\treduc<\/span>e-function <span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span 
style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tOPEN <\/span><span style=\"color:#66d9ef\">OUTPUT<\/span> reduce-output-file.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tMOVE <\/span><span style=\"color:#e6db74\">&#39;  &#39;<\/span> <span style=\"color:#66d9ef\">TO<\/span> last-region-seen.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tPERFO<\/span>RM <span style=\"color:#66d9ef\">UNTIL<\/span> reduce-eof<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">&#39;Y&#39;<\/span><span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tRETU<\/span>RN sort-work-file<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tAT <\/span><span style=\"color:#66d9ef\">END<\/span> <span style=\"color:#66d9ef\">MOVE <\/span><span style=\"color:#e6db74\">&#39;Y&#39;<\/span> <span style=\"color:#66d9ef\">TO<\/span> reduce-eof<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\tNOT<\/span> <span style=\"color:#66d9ef\">AT<\/span> <span style=\"color:#66d9ef\">END\n<\/span><span style=\"color:#66d9ef\">\t\t\t<\/span>\t<span style=\"color:#66d9ef\">IF<\/span> last-region-seen <span style=\"color:#66d9ef\">IS<\/span> <span style=\"color:#f92672\">NOT<\/span> <span style=\"color:#f92672\">EQUAL<\/span> <span style=\"color:#66d9ef\">TO <\/span><span 
style=\"color:#e6db74\">&#39;  &#39;<\/span> <span style=\"color:#f92672\">AND \n<\/span><span style=\"color:#f92672\">\t\t\t\t<\/span>\tlast-region-seen <span style=\"color:#66d9ef\">IS<\/span> <span style=\"color:#f92672\">NOT<\/span> <span style=\"color:#f92672\">EQUAL<\/span> <span style=\"color:#66d9ef\">TO<\/span> region2<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\t\tP<\/span>ERFORM end-of-region <span style=\"color:#66d9ef\">END-IF\n<\/span><span style=\"color:#66d9ef\">\t\t\t<\/span>\t<span style=\"color:#66d9ef\">MOVE<\/span> region2 <span style=\"color:#66d9ef\">TO<\/span> last-region-seen<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tMO<\/span>VE region2 <span style=\"color:#66d9ef\">TO<\/span> region3<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\t\t\tAD<\/span>D heads-by-household <span style=\"color:#66d9ef\">TO<\/span> region-tally<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\t\tEND-<\/span>RETURN<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tEND-P<\/span>ERFORM.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tPERFO<\/span>RM end-of-region.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span 
style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tCLOSE<\/span> reduce-output-file<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tend-o<\/span>f-region <span style=\"color:#66d9ef\">SECTION<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tMOVE <\/span>region-tally <span style=\"color:#66d9ef\">TO<\/span> heads-by-region.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tWRITE<\/span> population-by-region-output <span style=\"color:#66d9ef\">BEFORE<\/span> <span style=\"color:#66d9ef\">ADVANCING<\/span> <span style=\"color:#ae81ff\">1 <\/span><span style=\"color:#66d9ef\">LINE<\/span> <span style=\"color:#66d9ef\">END-WRITE<\/span>.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span><span style=\"color:#75715e\">\tMOVE <\/span><span style=\"color:#66d9ef\">ZERO<\/span> <span style=\"color:#66d9ef\">TO<\/span> region-tally<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><span style=\"color:#960050;background-color:#1e0010\"><\/span>\t.<span style=\"color:#960050;background-color:#1e0010\">\n<\/span><\/code><\/pre><\/div><h1 id=\"importance\">Importance<\/h1>\n<p>COBOL was designed for batch, serial processing of records. Given the limitations of the computers of the day, almost all datasets processed would exceed the core storage available (e.g. 
the IBM 360\/50, released in 1965, only had between 64 and 512 kB of memory). Processing had to be performed in a streaming fashion.<\/p>\n<p>Since the programmer might not know the number of keys in advance, sorting the data first allowed the program to handle an arbitrary number of keys with minimal overhead, since it could process all records with the same key in a single pass. COBOL lacked an associative array mechanism at the time, so even if you knew the keys beforehand, it could be painful to program out-of-order processing. Data could be sorted by a different program (or a device) beforehand, but there wasn&rsquo;t a good way to represent &ldquo;this program needs sorted data&rdquo;. Since the programmer and operator tended to be separate roles, removing this potential error mode was desirable as computer time was very expensive.<\/p>\n<p>Finally, writing a sorting algorithm <em>in<\/em> COBOL was pretty painful given the limited array (table) functionality and the inability to create reusable user functions.<\/p>\n<p>Adding the SORT verb was such a convenience that the ACM <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/366552.366588\">published an article about it<\/a>.<\/p>\n<h1 id=\"delegation-of-concerns-and-distributed-processing\">Delegation of Concerns and Distributed Processing<\/h1>\n<p>Apart from the fact that software engineers are trained to loathe COBOL, why wasn&rsquo;t it a model for later big data systems?<\/p>\n<ol>\n<li>COBOL and RDBMS became closely related<\/li>\n<li>COBOL lacked traction in small systems<\/li>\n<li>The design of INPUT and OUTPUT PROCEDUREs prevented delegation of concerns<\/li>\n<\/ol>\n<p>First, COBOL was an innovator in schematized hierarchical data and its strength was always in processing records rather than streams of bytes. Relational databases were the natural stores of this data as it moved away from tapes and punched cards. COBOL could delegate the computation of data to the database. 
After all,<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-sql\" data-lang=\"sql\"><span style=\"color:#66d9ef\">SELECT<\/span> region, <span style=\"color:#66d9ef\">SUM<\/span>(heads) <span style=\"color:#66d9ef\">FROM<\/span> census <span style=\"color:#66d9ef\">GROUP<\/span> <span style=\"color:#66d9ef\">BY<\/span> region <span style=\"color:#66d9ef\">ORDER<\/span> <span style=\"color:#66d9ef\">BY<\/span> region\n<\/code><\/pre><\/div><p>is a lot shorter than the program above. An organization, faced with lots of data, would be more likely to scale the database rather than the programs accessing the database, which were probably just handling business logic and reporting. This is similar to DeWitt and Stonebraker&rsquo;s <a href=\"http:\/\/craig-henderson.blogspot.com\/2009\/11\/dewitt-and-stonebrakers-mapreduce-major.html\">early criticism<\/a> of MapReduce, which they felt lacked novelty versus prior work in RDBMSs and was a backwards step away from schematized data.<\/p>\n<p>Second, although COBOL was available on small systems such as personal computers, its strength was in mainframes. Mainframes specialize in high-throughput, multiplexed I\/O, which is accomplished by being very &ldquo;wide&rdquo; in processing. This leads to vertical scaling, as opposed to horizontal scaling, as <a href=\"https:\/\/www.ibm.com\/docs\/en\/zos-basic-skills?topic=it-mainframe-strengths-scalability\">IBM continues to advertise<\/a>. Culturally and economically (prices are set per CPU), there wasn&rsquo;t an incentive to change designs to spread work over a multitude of machines.<\/p>\n<p>Third, the protocols between the SORT machinery and the INPUT and OUTPUT procedures prevent (easy) parallelization. The INPUT procedure is entirely responsible for managing its I\/O and thus the SORT machinery has no knowledge of what files are being read. 
An implementation of the INPUT procedure could distribute work itself, although that would be onerous. If all the records were distributed and then needed to be RELEASEd on a single node (for the sort), this would create a huge bottleneck. The OUTPUT procedure is guaranteed to be RETURNed records in order, which would lead to high levels of idleness for parallel workers. The OUTPUT procedure could broadcast the values to distribute work, but again that would be an onerous responsibility to put on a procedure versus having the system provide that functionality.<\/p>\n<p>However, these issues can be overcome. SORT already has variants that are given the list of input and output files, and it knows the keys used for the sort order, which could be assumed to be the key for distributed processing. The input and output procedures could be replaced by references to user-defined intrinsic functions (added to the language in 1989), although a version that used procedures should be possible by defining a proper protocol. These functions\/procedures would then perform just the mapping and reducing (e.g. <code>T =&gt; U<\/code> and <code>(U, W) =&gt; W<\/code>), without any direct involvement in I\/O and without any guarantees on the order of invocation. A sort data (SD) variant could be added to specifically support the distributed sorting application.<\/p>\n<p>Given the economics and culture, I don&rsquo;t see a plausible alternate history where distributed map-reduce COBOL programs rose to prominence in the 1990s. 
However, the map-reduce architecture was quite close to one that supported distributed processing.<\/p>\n"},{"title":"Hero Code Examples for Programming Languages: Revisiting Trabb-Pardo_Knuth","link":"https:\/\/ztoz.blog\/posts\/tpk\/","pubDate":"Tue, 16 Aug 2022 14:38:12 -0600","guid":"https:\/\/ztoz.blog\/posts\/tpk\/","description":"<p>&ldquo;The best way to grasp the spirit of a programming language is to read example programs&rdquo; as Knuth wrote in &ldquo;The Early Development of Programming Languages&rdquo; <sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>. Today, it is common practice to provide an example program on the front page of a programming language&rsquo;s marketing site (i.e., a &ldquo;hero code example&rdquo; as it were). Since the objective of the example is to showcase the language&rsquo;s value versus others, what coding tasks best fulfill that? Beyond Hello World, which is the most popular example coding task, there is little agreement between sites. I survey existing practice, discuss the utility of the Trabb-Pardo-Knuth algorithm, and make recommendations on tasks that can demonstrate value versus the &ldquo;next best alternative&rdquo;.<\/p>\n<p>Showing a complete example program at the beginning of a language&rsquo;s documentation has a long, but not universal, tradition. The 1956 FORTRAN Programmer&rsquo;s Reference Manual illustrated the new language with an example of finding the maximum element in an array on the very first page of chapter 1. The 1964 Dartmouth BASIC Manual demonstrated solving two simultaneous linear equations on page 3. Iverson&rsquo;s 1962 <em>A Programming Language<\/em> computes the area of a circle on page 2. 
However, the 1962 COBOL specification did not include a full program <em>anywhere<\/em> in the document and the LISP I Programmer&rsquo;s Manual took 23 pages until you saw something close to a full example of a program.<\/p>\n<h2 id=\"survey-of-current-practice\">Survey of Current Practice<\/h2>\n<p>Based on 33 websites for general purpose languages that featured at least one coding example on the landing page (pulling from posts to Lobste.rs and HN where a language was advertised), I recorded 71 tasks across 91 code examples. The most popular tasks and their rough Rosetta Code equivalents are:<\/p>\n<ol>\n<li>&ldquo;Hello World&rdquo; with 7 examples <a href=\"https:\/\/rosettacode.org\/wiki\/Hello_world\/Text\">Hello World\/Text<\/a><\/li>\n<li>Defining a data class\/record\/class with 5 examples <a href=\"https:\/\/rosettacode.org\/wiki\/Classes\">Classes<\/a> <a href=\"https:\/\/rosettacode.org\/wiki\/Compound_data_type\">Compound Data Type<\/a><\/li>\n<li>&ldquo;Hello World&rdquo; variants (e.g. cases where the example demonstrated string concatenation or other additional computation prior to displaying the message) with 4 examples<\/li>\n<li>&ldquo;Hello World&rdquo; provided via a web call with 3 examples <a href=\"https:\/\/rosettacode.org\/wiki\/Hello_world\/Web_server\">Hello World\/Web Server<\/a><\/li>\n<li>Fibonacci with 3 examples <a href=\"https:\/\/rosettacode.org\/wiki\/Fibonacci_sequence\">Fibonacci Sequence<\/a><\/li>\n<\/ol>\n<p>There were 2 examples each of straightforward equations\/math computations, array processing, counting the number of occurrences of a value in a list, and computing factorials. All other tasks had a single example. 
Some tasks were about the ecosystem rather than the language and demonstrated features such as unit tests, dependency management, or builds.<\/p>\n<p>I only encountered one instance where a site <em>solely<\/em> used Hello World, rather than as one example among many.<\/p>\n<h2 id=\"limitations-of-hello-world-for-modern-languages\">Limitations of Hello World for Modern Languages<\/h2>\n<p>Hello World as the introductory code example was popularized by K&amp;R&rsquo;s <em>The C Programming Language<\/em> (with the first Hello World appearing (obliquely) in a <a href=\"https:\/\/www.bell-labs.com\/usr\/dmr\/www\/bintro.html\">B language tutorial<\/a>). In K&amp;R&rsquo;s second edition, the virtues of Hello World are described thus:<\/p>\n<blockquote>\n<p>This is the big hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where your output went. With these mechanical details mastered, everything else is comparatively easy.<\/p>\n<\/blockquote>\n<p>Hello World has a didactic purpose in demonstrating that the <em>infrastructure works<\/em>.<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup> When this was written in 1978, these mechanical details were particularly difficult, and they remain a hurdle today: there is a reason most programmers are given a day to set up a new computer, given how many things need to be downloaded and configured before they are productive.<\/p>\n<p>However, Hello World demonstrates very little about the programming language itself. The example shows a minimal skeleton of a program, how to import a module (if I\/O is provided via a module), and a minimal example of formatted I\/O (i.e. the newline) and strings. Given that a language designer needs to differentiate their language, this example provides little opportunity to do so.<\/p>\n<p>To demonstrate, here are over 300 examples drawn from Rosetta Code for Hello World. 
This list limits the examples to those which take a single line. I imagine your eyes will glaze over with how many languages are so close to <code>print &quot;Hello, World!&quot;<\/code> (you may need to open the image in a new tab).<\/p>\n\n\n\n<figure>\n  \n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/tpk\/Hello_World_Single_Line.png\" alt=\"Three columns of Hello World programs from Rosetta Code\">\n  \n  <figcaption>Single Line Hello World Examples<\/figcaption>\n<\/figure>\n\n\n<p>So, what alternative tasks may be useful?<\/p>\n<h2 id=\"trabb-pardo-knuth-tpk-algorithm\">Trabb-Pardo-Knuth (TPK) Algorithm<\/h2>\n<p>The Trabb-Pardo-Knuth (TPK) algorithm is a coding task used to illustrate and compare early programming languages. The historical survey <sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup> covered languages from Zuse&rsquo;s Plancalculus (1945) to IAL in 1958, and included an example implementation (as far as possible) in twenty languages. TPK is an artificial task designed to demonstrate multiple aspects of a programming language. 
The definition of the task is given by the ALGOL60 program:<\/p>\n<pre tabindex=\"0\"><code class=\"language-algol60\" data-lang=\"algol60\">TPK: begin integer i; real y; real array a[0:10];\n\treal procedure f(t); real t; value t;\n\t\tf := sqrt(abs(t)) + 5 x t\ud83e\udc153;\n\tfor i := 0 step 1 until 10 do read(a[i]);\n\tfor i := 10 step -1 until 0 do\n\t\tbegin y := f(a[i]);\n\t\t\tif y &gt; 400 then write(i, 'TOO LARGE')\n\t\t\t\t\t   else write(i, y);\n\t\tend\n\tend TPK.\n<\/code><\/pre><p>In English, the task is to read 11 numbers, compute f() for each, and in reverse order, print the index and f() result unless f() is over 400, in which case print &lsquo;TOO LARGE&rsquo;.<\/p>\n<p>The Computer History Museum has a <a href=\"https:\/\/www.computerhistory.org\/collections\/catalog\/102622137\">lecture from Donald Knuth &ndash; A Dozen Precursors of Fortran<\/a> that discusses the TPK algorithm and early programming languages. The TPK discussion starts at 10:50.<\/p>\n<p>TPK illustrates:<\/p>\n<ol>\n<li>Function definition and invocation<\/li>\n<li>Mathematical expressions (involving precedence and operations beyond arithmetic)<\/li>\n<li>Data types (not all languages supported data types, so this was a &lsquo;best effort&rsquo; requirement)<\/li>\n<li>Loops\/iteration<\/li>\n<li>Conditionals<\/li>\n<li>I\/O (not all languages supported I\/O, so this was a &lsquo;best effort&rsquo; requirement)<\/li>\n<li>Program structure (e.g. program entry point)<\/li>\n<\/ol>\n<p>In the lecture, Knuth states the objective of the task was &ldquo;a short program that does nothing important, but [&hellip;] almost everything that a language has to do in general&rdquo;.<\/p>\n<h2 id=\"advantages-and-limitations-of-tpk-for-modern-languages\">Advantages and Limitations of TPK for Modern Languages<\/h2>\n<p>The task definition is sufficiently vague that an implementer can fit in additional language features. 
For example, the test and handling for the overflow condition can be converted into an Optional type. Explicit for-loops and indexing can be replaced by iterators and more functional-style streaming operations.<\/p>\n<p>A major focus in the survey is the computation. In the survey, many of the languages had &hellip; involved&hellip; syntax for mathematical expressions. Parsing a general expression was a difficult problem because 1) most parsing theory hadn&rsquo;t been discovered yet and 2) the hardware was limited. However, the international effort in the late 50s towards languages that supported algebraic expressions was successful. Today, Lisp- and Forth-descended languages may still have prefix and postfix notation, but infix has won. Math functions may be hidden within a module and exponentiation may be either a library function or built into the language, but math expressions in languages look very similar.<\/p>\n<p>Measured against &ldquo;everything that a language has to do&rdquo; in typical modern programming tasks, TPK falls short: it neither requires a compound data type nor lends itself to any sort of object. The task does not require error handling and would not benefit from concurrency (although both could be added to the example).<\/p>\n<h2 id=\"recommendation\">Recommendation<\/h2>\n<p>Since Hello World has didactic value rather than marketing value, it is best placed in the tutorial\/documentation portion of the website, rather than on the front page.<\/p>\n<p>TPK, while an artificial task, demonstrates a large number of language features succinctly and different implementations can be compared easily.<\/p>\n<p>Since TPK does not involve any compound data types or data modeling, a good secondary task is modeling <a href=\"https:\/\/rosettacode.org\/wiki\/Arithmetic\/Complex\">Complex numbers<\/a> or <a href=\"https:\/\/rosettacode.org\/wiki\/Quaternion_type\">Quaternions<\/a>, which can showcase object-oriented features and definition of new operators. 
Since both are still very mathematical, an alternative like <a href=\"https:\/\/rosettacode.org\/wiki\/Search_a_list_of_records\">Search a List of Records<\/a> includes strings and numbers, data structures and (potentially) compound types, as well as the ability to showcase higher-order functions.<\/p>\n<p>I suspect that any example for concurrency will need to be tailored to the specific concurrency models implemented, as well as differences between support via language versus libraries. Given how many sites use concurrency in an example (and its relevance to SaaS developers), there is evident marketing value in showing those features, but the use cases are very divergent.<\/p>\n<h1 id=\"references\">References<\/h1>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Donald E. Knuth, Luis Trabb Pardo. &ldquo;The Early Development of Programming Languages.&rdquo; <em>Encyclopedia of Computer Science and Technology<\/em>, edited by\nJ. Belzer, A. G. Holzman, and A. Kent, Dekker, 1977, Vol. 6, pp. 419\u2013493. <a href=\"https:\/\/doi.org\/10.1016\/B978-0-12-491650-0.50019-8\">doi<\/a> <a href=\"https:\/\/archive.org\/details\/DTIC_ADA032123\">archive\/DTIC<\/a> <a href=\"https:\/\/www-cs-faculty.stanford.edu\/~knuth\/cl.html\">website<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Special:BookSources?isbn=1-57586-382-0\">collected papers isbn search<\/a>&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>This purpose was lost on the critics who complained the example failed to demonstrate object-oriented programming. 
See <a href=\"https:\/\/doi.org\/10.1145\/383845.383874\">Technical opinion: Hello, world considered harmful<\/a> and the works that cite it.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>"},{"title":"Rosetta Code","link":"https:\/\/ztoz.blog\/posts\/rosetta\/","pubDate":"Fri, 05 Aug 2022 12:56:42 -0600","guid":"https:\/\/ztoz.blog\/posts\/rosetta\/","description":"<p><a href=\"https:\/\/rosettacode.org\/wiki\/Rosetta_Code\">Rosetta Code<\/a> is a programming chrestomathy wiki, that is, it is a site with lots of examples of tasks completed in multiple programming languages. The site has been around since 2007 and now has 1,100+ tasks and 100,000+ code submissions over 900+ languages. To help other researchers, I&rsquo;m publishing an export of the code samples as a sqlite database via <a href=\"https:\/\/dbhub.io\/jeffrey\/rosettacode.sqlite\">DBHub.io<\/a> and the source code via <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/rosetta2sqlite\">Gitlab<\/a>.<\/p>\n<h1 id=\"chrestomathy\">Chrestomathy?<\/h1>\n<p>The OED defines chrestomathy as:<\/p>\n<blockquote>\n<p>A collection of choice passages from an author or authors, <em>esp.<\/em> one compiled to assist in the acquirement of a language.<\/p>\n<\/blockquote>\n<p>For example, if you were interested in how languages differed in their APIs for renaming files, you might find the <a href=\"https:\/\/rosettacode.org\/wiki\/Rename_a_file\">rename a file task<\/a> and look at some examples:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-lisp\" data-lang=\"lisp\">(<span style=\"color:#a6e22e\">rename-file<\/span> <span style=\"color:#e6db74\">&#34;input.txt&#34;<\/span> <span style=\"color:#e6db74\">&#34;output.txt&#34;<\/span>)\n(<span style=\"color:#a6e22e\">rename-file<\/span> <span 
style=\"color:#e6db74\">&#34;docs&#34;<\/span> <span style=\"color:#e6db74\">&#34;mydocs&#34;<\/span>)\n(<span style=\"color:#a6e22e\">rename-file<\/span> <span style=\"color:#e6db74\">&#34;\/input.txt&#34;<\/span> <span style=\"color:#e6db74\">&#34;\/output.txt&#34;<\/span>)\n(<span style=\"color:#a6e22e\">rename-file<\/span> <span style=\"color:#e6db74\">&#34;\/docs&#34;<\/span> <span style=\"color:#e6db74\">&#34;\/mydocs&#34;<\/span>)\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-ocaml\" data-lang=\"ocaml\"><span style=\"color:#66d9ef\">let<\/span> () <span style=\"color:#f92672\">=<\/span>\n  Sys.rename <span style=\"color:#e6db74\">&#34;input.txt&#34;<\/span> <span style=\"color:#e6db74\">&#34;output.txt&#34;<\/span><span style=\"color:#f92672\">;<\/span>\n  Sys.rename <span style=\"color:#e6db74\">&#34;docs&#34;<\/span> <span style=\"color:#e6db74\">&#34;mydocs&#34;<\/span><span style=\"color:#f92672\">;<\/span>\n  Sys.rename <span style=\"color:#e6db74\">&#34;\/input.txt&#34;<\/span> <span style=\"color:#e6db74\">&#34;\/output.txt&#34;<\/span><span style=\"color:#f92672\">;<\/span>\n  Sys.rename <span style=\"color:#e6db74\">&#34;\/docs&#34;<\/span> <span style=\"color:#e6db74\">&#34;\/mydocs&#34;<\/span><span style=\"color:#f92672\">;<\/span>\n<\/code><\/pre><\/div><div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-smalltalk\" data-lang=\"smalltalk\"><span style=\"color:#e6db74\">&#39;input.txt&#39;<\/span> <span style=\"color:#a6e22e\">asFilename<\/span> <span style=\"color:#a6e22e\">renameTo:<\/span> <span style=\"color:#e6db74\">&#39;output.txt&#39;<\/span>.\n<span style=\"color:#e6db74\">&#39;docs&#39;<\/span> <span style=\"color:#a6e22e\">asFilename<\/span> <span style=\"color:#a6e22e\">renameTo:<\/span> <span 
style=\"color:#e6db74\">&#39;mydocs&#39;<\/span>.\n<span style=\"color:#e6db74\">&#39;\/input.txt&#39;<\/span> <span style=\"color:#a6e22e\">asFilename<\/span> <span style=\"color:#a6e22e\">renameTo:<\/span> <span style=\"color:#e6db74\">&#39;\/output.txt&#39;<\/span>.\n<span style=\"color:#e6db74\">&#39;\/docs&#39;<\/span> <span style=\"color:#a6e22e\">asFilename<\/span> <span style=\"color:#a6e22e\">renameTo:<\/span> <span style=\"color:#e6db74\">&#39;\/mydocs&#39;<\/span>\n<\/code><\/pre><\/div><p>These three examples (in Lisp, OCaml, and Smalltalk) demonstrate similarities and differences in how the languages treat the REPL\/program entry point, namespaces and modules for functions, string definition, and statement parsing.<\/p>\n<h1 id=\"licensing\">Licensing<\/h1>\n<p>The Rosetta Code content is licensed under the <a href=\"https:\/\/www.gnu.org\/licenses\/old-licenses\/fdl-1.2.html\">GNU Free Documentation License 1.2<\/a>. Based on my reading of the terms, that would cover the contents of the database as well.<\/p>\n<p>The script is released under the MIT License.<\/p>\n<h1 id=\"database-overview\">Database Overview<\/h1>\n<p>The database is hosted by DBHub.io and is available <a href=\"https:\/\/dbhub.io\/jeffrey\/rosettacode.sqlite\">here<\/a>. You can browse and analyze the data within the site or download the database (282 MB).<\/p>\n<p>DBHub.io is a cloud storage and analytics solution for SQLite databases provided by the same folks as <a href=\"https:\/\/sqlitebrowser.org\/\">DB Browser for SQLite<\/a>.<\/p>\n<p>I also have an approximately ten percent sample of the data hosted <a href=\"https:\/\/dbhub.io\/jeffrey\/code-10.sqlite\">here<\/a>. The sample is 27 MB.<\/p>\n<p>The database has three tables: <em>task<\/em>, <em>language<\/em>, and <em>submission<\/em>.<\/p>\n<p>The <code>task<\/code> describes what the program should do. 
The <code>content<\/code> field is close to the raw data if you need to look at the source data.<\/p>\n<p>The <code>language<\/code> is a collection of language keys used in the submissions or code blocks. Some language names have been normalized for compatibility; C++ and C# are cpp and csharp, for example. The keys are drawn directly from the Rosetta Code site.<\/p>\n<p>The <code>submission<\/code> is a collection of code samples. A given submission will be linked to a specific task and language, and may include a description (drawn from the preface) and output to provide extra context. The database only contains code samples that were explicitly marked as code; the loose nature of the wiki means that some code samples are embedded in explanatory text without any markup. The export script does not attempt to find these code instances.<\/p>\n<h2 id=\"example-hello-world\">Example: Hello World<\/h2>\n<p>The classic Hello World! programs are in task id 1514. To fetch the ten longest code examples for printing Hello, World, you can use the query:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-sql\" data-lang=\"sql\"><span style=\"color:#66d9ef\">SELECT<\/span> <span style=\"color:#66d9ef\">language<\/span>.<span style=\"color:#66d9ef\">key<\/span>, code \n\t<span style=\"color:#66d9ef\">FROM<\/span> submission \n\t<span style=\"color:#66d9ef\">JOIN<\/span> <span style=\"color:#66d9ef\">language<\/span> <span style=\"color:#66d9ef\">ON<\/span> submission.<span style=\"color:#66d9ef\">language<\/span><span style=\"color:#f92672\">=<\/span><span style=\"color:#66d9ef\">language<\/span>.id \n\t<span style=\"color:#66d9ef\">WHERE<\/span> task<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1514<\/span> \n\t<span style=\"color:#66d9ef\">ORDER<\/span> <span style=\"color:#66d9ef\">BY<\/span> <span 
style=\"color:#66d9ef\">length<\/span>(code) <span style=\"color:#66d9ef\">DESC<\/span> \n\t<span style=\"color:#66d9ef\">LIMIT<\/span> <span style=\"color:#ae81ff\">10<\/span>\n<\/code><\/pre><\/div><p>As you might imagine, the longest programs are for joke languages and very low level languages like assembly.<\/p>\n<h1 id=\"script\">Script<\/h1>\n<p>The <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/rosetta2sqlite\">rosetta2sqlite script<\/a> is written in Python, targeting version 3.8 or higher, with no external dependencies. The script is structured as a fairly standard extract-transform-load script and includes unit tests. The script uses data classes and type hints, both of which I&rsquo;ve found to be very useful in my professional work.<\/p>\n<p>The README file in the project contains more details on the database schema and input format.<\/p>\n<p>Processing the full export takes slightly less than two hours.<\/p>\n<h2 id=\"why-no-external-dependencies\">Why no external dependencies?<\/h2>\n<p>There are some Python libraries that might have simplified parsing the MediaWiki text, but none passed my personal threshold for supply chain risk for this project. Research software tends to decay rapidly as it is usually secondary to the author&rsquo;s purpose. Further, as an ETL script, it is even more &ldquo;one-and-done&rdquo; because my purpose only needs a snapshot of data, not a continual feed. 
Thus, maintenance (either by me or those who fork it for their own needs) is greatly simplified:<\/p>\n<ol>\n<li>The only updates are to the language and the standard library<\/li>\n<li>All the code is in one place<\/li>\n<li>It avoids bit rot from Python&rsquo;s ever-evolving approach to dependencies<\/li>\n<\/ol>"},{"title":"PLATO and Approximate String Matching","link":"https:\/\/ztoz.blog\/posts\/plato-approximate\/","pubDate":"Sun, 24 Jul 2022 19:41:33 -0600","guid":"https:\/\/ztoz.blog\/posts\/plato-approximate\/","description":"<p>In 1986, a survey of approximate string matching algorithms found that the algorithm used in the PLATO computer-aided instruction system placed third in quality (F1 score) and ran an order of magnitude faster than the best-performing algorithms. The PLATO algorithm used an unusual dimensional projection approach, while the competing algorithms of the day used edit distances or phonetics. However, this approach failed to catch on and the design is rarely mentioned in the literature. 
This video describes how the algorithm worked and some reasons why it became forgotten.<\/p>\n<script>!function(r,u,m,b,l,e){r._Rumble=b,r[b]||(r[b]=function(){(r[b]._=r[b]._||[]).push(arguments);if(r[b]._.length==1){l=u.createElement(m),e=u.getElementsByTagName(m)[0],l.async=1,l.src=\"https:\/\/rumble.com\/embedJS\/u13csgf\"+(arguments[1].video?'.'+arguments[1].video:'')+\"\/?url=\"+encodeURIComponent(location.href)+\"&args=\"+encodeURIComponent(JSON.stringify([].slice.apply(arguments))),e.parentNode.insertBefore(l,e)}})}(window, document, \"script\", \"Rumble\");<\/script>\n<div id=\"rumble_v1avh3j\"><\/div>\n<script>\nRumble(\"play\", {\"video\":\"v1avh3j\",\"div\":\"rumble_v1avh3j\"});<\/script>"},{"title":"Quality of early 1970s approximate string matching algorithms","link":"https:\/\/ztoz.blog\/posts\/nesbit\/","pubDate":"Thu, 16 Jun 2022 12:55:33 -0600","guid":"https:\/\/ztoz.blog\/posts\/nesbit\/","description":"<p>Approximate string matching algorithms (ASMA) determine if two strings are the same, &ldquo;close enough&rdquo;, or are distinct. Spelling error detection and correction were early applications of these algorithms, and today ASMAs (or similar techniques) are used in natural language understanding applications. Hundreds of papers have been written on the subject (<a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/356827.356830\">1980 survey<\/a>, <a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/375360.375365\">2001 survey<\/a>), but this post is focused on the subset of algorithms that were considered commercially viable for the field of computer-assisted instruction (CAI), an early hardware\/software-as-a-service business.<\/p>\n<h1 id=\"background\">Background<\/h1>\n<p>Between 1985 and 1986, John C. 
Nesbit wrote a pair of articles<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup><sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup> for the <em>Journal of Computer-Based Instruction<\/em> examining the accuracy of ASMAs. His interest was using the algorithms for student response matching in computer-aided instruction. A student response is a typed answer to a question posed by an educational program. For example, a developer posing the question &ldquo;who was the second president of the United States?&rdquo; might wish to accept variants such as:<\/p>\n<ol>\n<li>John Adams<\/li>\n<li>John A Adams<\/li>\n<li>John A. Adams<\/li>\n<li>President Adams<\/li>\n<li>Pres. Adams<\/li>\n<li>Adams<\/li>\n<\/ol>\n<p>If the student responded with &ldquo;Jon Adams&rdquo; (misspelling John), the developer may wish that answer to be accepted or to coax the student to fix their answer rather than rejecting it outright. Similarly, if the student answered &ldquo;Thomas Jefferson&rdquo; (or variants thereof), the developer may want to reject the answer with a suggestive message such as &ldquo;Jefferson was the third president&rdquo;. As you are likely intuiting, the sheer range of possible variants would be very tedious for the developer to enumerate unless the CAI system added labor-saving techniques.<\/p>\n<p>Some CAI systems would go beyond simple spell-correction and provide response normalization features. For instance, PLATO would treat &ldquo;three&rdquo; and &ldquo;3&rdquo; as equivalent, as well as &ldquo;1\/2&rdquo;, &ldquo;0.5&rdquo;, &ldquo;2\/4&rdquo;, and &ldquo;50\/100&rdquo;. There were also facilities to treat mathematical expressions such as &ldquo;100 + (a * b)\/(c + d)&rdquo; as equivalent to all the other possible forms. 
Given the limited CPU and memory available (and the fact that response matching was a very common operation), these features used a number of tricks and heuristics, rather than implementing a computer algebra system.<\/p>\n<p>Nesbit selected a group of ten algorithms that were either being used for CAI applications or had performed well in previous studies. He implemented and evaluated all of the algorithms on a DEC VAX-11\/780 except for PLATO, which ran on its native CDC CYBER 170\/720.<\/p>\n<p>There were two operative theories for how words became misspelled. The first theory was that words were often misspelled such that they were correct phonetically. The second theory was that misspellings came from random errors that drop, insert, substitute, and sometimes transpose characters. Blair, PLANIT, Soundex, and Symonds were based on the phonetic theory. Alberga, EDIST1, EDIST2, and Damerau were based on edit distance. Damerau-Symonds combined the two theories. PLATO projected words into a bitstring via a variety of feature extraction rules.<\/p>\n<h1 id=\"results\">Results<\/h1>\n<p>Table 1 is based on the four figures from (Nesbit 1986). We extracted the Type I and Type II errors from the figures via visual examination and, based on data about the four datasets (Table 2), computed an F1 score. For the algorithms Alberga, EDIST1, and EDIST2, since the figures included a range of values, we took the value closest to the origin. 
Table 3 summarizes the results for each algorithm using the harmonic mean of the F1 score from each of the four datasets.<\/p>\n<h2 id=\"table-1-qualitative-measurements-from-nesbit-1986-plus-f1-score\">Table 1: Qualitative measurements from (Nesbit 1986) plus F1 Score<\/h2>\n<table>\n<thead>\n<tr>\n<th>Dataset<\/th>\n<th>Algorithm<\/th>\n<th>Type I Error: FP %<\/th>\n<th>Type II Error:      FN %<\/th>\n<th>F1<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Blair<\/td>\n<td>Alberga<\/td>\n<td>6<\/td>\n<td>6<\/td>\n<td>0.94<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Blair<\/td>\n<td>24<\/td>\n<td>4<\/td>\n<td>0.87<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau<\/td>\n<td>27<\/td>\n<td>0<\/td>\n<td>0.86<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau-Symonds<\/td>\n<td>6<\/td>\n<td>4<\/td>\n<td>0.95<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST1<\/td>\n<td>2<\/td>\n<td>3<\/td>\n<td>0.98<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST2<\/td>\n<td>2<\/td>\n<td>2<\/td>\n<td>0.98<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLANIT<\/td>\n<td>8<\/td>\n<td>19<\/td>\n<td>0.87<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLATO<\/td>\n<td>6<\/td>\n<td>2<\/td>\n<td>0.96<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Soundex<\/td>\n<td>6<\/td>\n<td>21<\/td>\n<td>0.86<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Symonds<\/td>\n<td>17<\/td>\n<td>2<\/td>\n<td>0.91<\/td>\n<\/tr>\n<tr>\n<td>Damerau<\/td>\n<td>Alberga<\/td>\n<td>9<\/td>\n<td>4<\/td>\n<td>0.88<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Blair<\/td>\n<td>29<\/td>\n<td>3<\/td>\n<td>0.66<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau<\/td>\n<td>18<\/td>\n<td>0<\/td>\n<td>0.79<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau-Symonds<\/td>\n<td>9<\/td>\n<td>4<\/td>\n<td>0.88<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST1<\/td>\n<td>2<\/td>\n<td>2<\/td>\n<td>0.97<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST2<\/td>\n<td>2<\/td>\n<td>3<\/td>\n<td>0.96<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLANIT<\/td>\n<td>30<\/td>\n<td>21<\/td>\n<td>0.60<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLATO<\/td>\n<td>12<\/td>
\n<td>2<\/td>\n<td>0.85<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Soundex<\/td>\n<td>21<\/td>\n<td>14<\/td>\n<td>0.71<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Symonds<\/td>\n<td>46<\/td>\n<td>2<\/td>\n<td>0.49<\/td>\n<\/tr>\n<tr>\n<td>Masters<\/td>\n<td>Alberga<\/td>\n<td>11<\/td>\n<td>8<\/td>\n<td>0.93<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Blair<\/td>\n<td>34<\/td>\n<td>4<\/td>\n<td>0.86<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau<\/td>\n<td>28<\/td>\n<td>0<\/td>\n<td>0.90<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau-Symonds<\/td>\n<td>14<\/td>\n<td>5<\/td>\n<td>0.93<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST1<\/td>\n<td>7<\/td>\n<td>4<\/td>\n<td>0.96<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST2<\/td>\n<td>5<\/td>\n<td>4<\/td>\n<td>0.97<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLANIT<\/td>\n<td>19<\/td>\n<td>18<\/td>\n<td>0.85<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLATO<\/td>\n<td>13<\/td>\n<td>4<\/td>\n<td>0.94<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Soundex<\/td>\n<td>12<\/td>\n<td>21<\/td>\n<td>0.85<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Symonds<\/td>\n<td>22<\/td>\n<td>3<\/td>\n<td>0.91<\/td>\n<\/tr>\n<tr>\n<td>Nesbit<\/td>\n<td>Alberga<\/td>\n<td>29<\/td>\n<td>21<\/td>\n<td>0.81<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Blair<\/td>\n<td>59<\/td>\n<td>2<\/td>\n<td>0.74<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau<\/td>\n<td>47<\/td>\n<td>2<\/td>\n<td>0.82<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Damerau-Symonds<\/td>\n<td>20<\/td>\n<td>17<\/td>\n<td>0.86<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST1<\/td>\n<td>21<\/td>\n<td>17<\/td>\n<td>0.86<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>EDIST2<\/td>\n<td>14<\/td>\n<td>12<\/td>\n<td>0.90<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLANIT<\/td>\n<td>17<\/td>\n<td>37<\/td>\n<td>0.75<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>PLATO<\/td>\n<td>27<\/td>\n<td>8<\/td>\n<td>0.88<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Soundex<\/td>\n<td>17<\/td>\n<td>20<\/td>\n<td>0.85<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Symonds<\/td>\n<td>27<\/td>\n<td>7<\/td>\n<td>0.89<\/td>\n<\/tr>\n<\/
tbody>\n<\/table>\n<h2 id=\"table-2-datasets-used-in-nesbit-1986\">Table 2: Datasets used in (Nesbit 1986)<\/h2>\n<table>\n<thead>\n<tr>\n<th>Dataset<\/th>\n<th>Words<\/th>\n<th>Misspellings<\/th>\n<th>Total<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Blair<\/td>\n<td>117<\/td>\n<td>117<\/td>\n<td>234<\/td>\n<td>Common misspellings<\/td>\n<\/tr>\n<tr>\n<td>Damerau<\/td>\n<td>41<\/td>\n<td>44<\/td>\n<td>85<\/td>\n<td>Newspaper errors<\/td>\n<\/tr>\n<tr>\n<td>Masters<\/td>\n<td>179<\/td>\n<td>320<\/td>\n<td>499<\/td>\n<td>Grades 8, 12, 16<\/td>\n<\/tr>\n<tr>\n<td>Nesbit<\/td>\n<td>213<\/td>\n<td>524<\/td>\n<td>737<\/td>\n<td>Grades 2-6<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"table-3-harmonic-mean-of-f1-scores\">Table 3: Harmonic Mean of F1 Scores<\/h2>\n<table>\n<thead>\n<tr>\n<th>Algorithm<\/th>\n<th>Harmonic Mean<\/th>\n<th>Publication Year<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Alberga<\/td>\n<td>0.89<\/td>\n<td>1967<\/td>\n<\/tr>\n<tr>\n<td>Blair<\/td>\n<td>0.77<\/td>\n<td>1960<\/td>\n<\/tr>\n<tr>\n<td>Damerau<\/td>\n<td>0.84<\/td>\n<td>1964<\/td>\n<\/tr>\n<tr>\n<td>Damerau-Symonds<\/td>\n<td>0.90<\/td>\n<td>1970<\/td>\n<\/tr>\n<tr>\n<td>EDIST1<\/td>\n<td>0.94<\/td>\n<td>1974<\/td>\n<\/tr>\n<tr>\n<td>EDIST2<\/td>\n<td>0.95<\/td>\n<td>1983<\/td>\n<\/tr>\n<tr>\n<td>PLANIT<\/td>\n<td>0.75<\/td>\n<td>1966<\/td>\n<\/tr>\n<tr>\n<td>PLATO<\/td>\n<td>0.91<\/td>\n<td>1972 \/ 1985<sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote-ref\" role=\"doc-noteref\">3<\/a><\/sup><\/td>\n<\/tr>\n<tr>\n<td>Soundex<\/td>\n<td>0.81<\/td>\n<td>1930<sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote-ref\" role=\"doc-noteref\">4<\/a><\/sup><\/td>\n<\/tr>\n<tr>\n<td>Symonds<\/td>\n<td>0.75<\/td>\n<td>1970<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>In descending order of harmonic mean F1 score, the algorithms ranked EDIST2, EDIST1, PLATO, Damerau-Symonds, Alberga, Damerau, Soundex, Blair, and then PLANIT and Symonds tied for last.<\/p>\n<h1 id=\"conclusion\">Conclusion<\/h1>\n<p>By 
the early 1970s, edit distance based algorithms were showing their qualitative superiority to phonetically based algorithms. Furthermore, edit distance based algorithms were more amenable to multiple languages, while the phonetic algorithms evaluated were all tuned for English text (although all the datasets used English examples). However, edit distance algorithms were more costly to compute than the others in the sample (except Damerau, which restricted the error count to one). EDIST1 and EDIST2 both required computing the edit distance, which cost \\(O(mn)\\) time and \\(O(mn)\\) space where <em>m<\/em> and <em>n<\/em> were the lengths of the two strings, although in 1980 a solution with \\(O(m)\\) space was found (Navarro 2001). Solutions with \\(O(kn)\\) time were found in the 1980s and 1990s, where <em>k<\/em> was the error threshold. Today, edit distance is the standard technique for handling errors in information retrieval, conversational analytics, and bioinformatics.<\/p>\n<p>The PLATO algorithm stands apart in the group due to its excellent performance, both in terms of quality and computational speed. Its design was inspired by a numerical taxonomy approach and uses techniques from cluster analysis. Due to the novelty and effectiveness of the approach, we will discuss details of the implementation in a future production.<\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Nesbit, John C. &ldquo;Approximate string matching in response analysis.&rdquo; Journal of Computer-Based Instruction 12.3 (1985): 71-75. <a href=\"https:\/\/archive.org\/details\/sim_journal-of-computer-based-instruction_summer-1985_12_3\/page\/n15\/mode\/1up\">archive.org<\/a>&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>Nesbit, John C. 
&ldquo;The accuracy of approximate string matching algorithms.&rdquo; Journal of Computer-Based Instruction 13.3 (1986): 80-83. <a href=\"https:\/\/archive.org\/details\/sim_journal-of-computer-based-instruction_summer-1986_13_3\/page\/n18\/mode\/1up\">archive.org<\/a>&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:3\" role=\"doc-endnote\">\n<p>The PLATO matching algorithm evolved over time. 1972 was the date of the first publication describing its design in the TUTOR language, although some version of the algorithm may date back as far as 1965. The algorithm that was tested by Nesbit was based on the 1985\/1986 version.&#160;<a href=\"#fnref:3\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:4\" role=\"doc-endnote\">\n<p>Soundex was first patented in 1918. The U.S. Census uses a version called &ldquo;American Soundex&rdquo; which was defined in 1930.&#160;<a href=\"#fnref:4\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>\n"},{"title":"Computing's Woodstock - Los Alamos 1976","link":"https:\/\/ztoz.blog\/posts\/woodstock\/","pubDate":"Wed, 08 Jun 2022 10:46:55 -0600","guid":"https:\/\/ztoz.blog\/posts\/woodstock\/","description":"<p>In 1976, Los Alamos hosted the International History of Computing Conference where the pioneering efforts of computing were discussed, often by the pioneers themselves. The Computer History Museum has posted <a href=\"https:\/\/computerhistory.org\/playlists\/international-research-conference-on-the-history-of-computing\/\">videos from the talks<\/a>, which cover important machines, languages, people, institutions, and practices invented in the first 30 years of the craft. 
Dag Spicer, the senior curator at CHM, has written a <a href=\"https:\/\/computerhistory.org\/blog\/computings-woodstock\/\">blog post<\/a> giving an overview and background of the event.<\/p>"},{"title":"Dawn of Computer Music","link":"https:\/\/ztoz.blog\/posts\/dawn-computer-music\/","pubDate":"Thu, 26 May 2022 12:58:44 -0600","guid":"https:\/\/ztoz.blog\/posts\/dawn-computer-music\/","description":"<p>The first song composed and played by a computer was <em>The Silver Scale<\/em> on May 17, 1957. This video discusses the physics and hardware necessary for this\ninvention, as well as the domain specific language, MUSIC-V, that arose\nfor efficient programming of sound waves.<\/p>\n<script>!function(r,u,m,b,l,e){r._Rumble=b,r[b]||(r[b]=function(){(r[b]._=r[b]._||[]).push(arguments);if(r[b]._.length==1){l=u.createElement(m),e=u.getElementsByTagName(m)[0],l.async=1,l.src=\"https:\/\/rumble.com\/embedJS\/u13csgf\"+(arguments[1].video?'.'+arguments[1].video:'')+\"\/?url=\"+encodeURIComponent(location.href)+\"&args=\"+encodeURIComponent(JSON.stringify([].slice.apply(arguments))),e.parentNode.insertBefore(l,e)}})}(window, document, \"script\", \"Rumble\");<\/script>\n<div id=\"rumble_v12hc5d\"><\/div>\n<script>\nRumble(\"play\", {\"video\":\"v12hc5d\",\"div\":\"rumble_v12hc5d\"});<\/script>"},{"title":"Fifty years of strings: Language design and the string datatype","link":"https:\/\/ztoz.blog\/posts\/strings-history\/","pubDate":"Mon, 02 May 2022 10:53:44 -0600","guid":"https:\/\/ztoz.blog\/posts\/strings-history\/","description":"<p><a href=\"https:\/\/ztoz.blog\/strings-history-draft.pdf\">PDF Version<\/a><\/p>\n<h1 id=\"introduction\">INTRODUCTION<\/h1>\n<p>In a programming language, a string is a sequence of characters that\noften represents some message to the user or contains input such as a\nname or the contents of a file. 
Common operations on strings include\ncomparison and collation, extraction of substrings (tokenization),\npattern matching, and data cleanup such as trimming whitespace. In the\n1960s, language designers started to add first-class string datatypes to\ntheir languages as business computing and interactive use cases\nexpanded. Although programmers could use numerical arrays as strings (as\nwas the practice in Fortran at the time), character-based strings\nsimplified programmers\u2019 jobs through re-use of library functions and\nreduction of errors caused by accidental intermixing of numeric and\ncharacter-based arrays. Today, a language implementation may feature\nhundreds of built-in functions for string manipulation, presentation,\nand localization.<\/p>\n<p>Language designers have flexibility around the in-memory representation\nand storage of the string, encoding of characters, breadth of built-in\nfunctionality, and facilities for supporting a multi-lingual\nenvironment. Their choices impact performance, security, and usability.\nOver the past 50 years, designers have explored many options and\nalthough there is convergence in a number of areas, there is still room\nfor innovation.<\/p>\n<h2 id=\"what-is-a-character\">WHAT IS A CHARACTER?<\/h2>\n<p>A string is a sequence of characters. However, a character is both a\nphysical and semantic concept. Physically, characters are stored as one\nor more integral numbers with an implicit or explicit character set. A\ncharacter set is a mapping of numbers to a semantic meaning.\nSemantically, a character may map to a graphical symbol or to a\ngrapheme. A symbol has a visual appearance, but a grapheme is the\nsmallest functional unit in a language and may not necessarily have a\nsingle visual representation. The semantics of a character depend on the\ncharacter set. (Characters may also have only semantic meaning to a\ncomputer, such as the line-feed control character in ASCII. 
Some\nlanguages, like C, use the <code>char<\/code> data type to also store random bytes,\nwhich has implications on memory representation as noted below.)<\/p>\n<p>To illustrate the distinction, the abstract character \u201creturn\u201d has a\nsingle grapheme (Unicode U+8FD4), but the visual representation is\ndifferent in Chinese, Japanese, and Korean (see\nFigure 1). A character set based on graphemes would\nhave a single entry for return while one based on symbols would have at\nleast five entries. Unicode adopted the grapheme approach but the\ncomplexities of that approach have left indelible traces in programming\nlanguages that sought early Unicode interoperability (see\nSection <a href=\"#MULTILINGUAL\">5<\/a>).<\/p>\n<p><strong>Figure 1: Differences for the same Unicode character (U+8FD4) in regional versions of Source Han Sans<\/strong><\/p>\n<p><img src=\"Source_Han_Sans_Version_Difference.svg\" alt=\"Differences for the same Unicode character (Emphrase 2019)\"><\/p>\n<h1 id=\"timeline-and-major-design-choices\">TIMELINE AND MAJOR DESIGN CHOICES<\/h1>\n<p>Table 1 lists languages with their release year,\nin-memory representation of strings, character types, and whether\nstrings are immutable. For inclusion in this timeline, we selected\nlanguages that achieved popularity, were intended for industrial use\nacross multiple problem domains, and where the designer had latitude in\ndesign decisions. This list excludes many languages that run on the Java\nVirtual Machine or Common Language Runtime as those languages would\ninherit that environment\u2019s standard string representation. Spreadsheets\nwere also excluded because their design was not intended to be\nmulti-domain. The date for each language is based on their first major\npublic release or when there was a significant change in their handling\nof strings. 
Programmers could acquire libraries that expanded string\nfunctionality or implemented a different approach from the underlying\nlanguage, but our focus is on choices made by language designers. For\nexample, the C++ standard used C strings until the 98 version introduced\n<code>std::string<\/code>, but there were many common libraries that added\nalternative string support to C++.<\/p>\n<p><strong>Table 1: String Representations in Popular Languages<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>COBOL<\/th>\n<th>Pascal (UCSD)<\/th>\n<th>C (K&amp;R)<\/th>\n<th>Smalltalk<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Release Year<\/td>\n<td>1968<\/td>\n<td>1978<\/td>\n<td>1978<\/td>\n<td>1980<\/td>\n<\/tr>\n<tr>\n<td>Representation<\/td>\n<td>FL<\/td>\n<td>LP<\/td>\n<td>NT<\/td>\n<td>RE<\/td>\n<\/tr>\n<tr>\n<td>Character Type<\/td>\n<td>B<\/td>\n<td>B<\/td>\n<td>B<\/td>\n<td>B<\/td>\n<\/tr>\n<tr>\n<td>Immutable?<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td>Reference<\/td>\n<td>(Data Systems Languages, n.d.)<\/td>\n<td>(Shillington and Ackland, n.d.)<\/td>\n<td>(Kernighan and Ritchie 1978)<\/td>\n<td>(Goldberg and Robson 1983)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>Objective-C<\/th>\n<th>Perl 1.0<\/th>\n<th>Python 1.0<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Release Year<\/td>\n<td>1986<\/td>\n<td>1987<\/td>\n<td>1994<\/td>\n<\/tr>\n<tr>\n<td>Representation<\/td>\n<td>NT<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<\/tr>\n<tr>\n<td>Character Type<\/td>\n<td>B<\/td>\n<td>B<\/td>\n<td>B<\/td>\n<\/tr>\n<tr>\n<td>Immutable?<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>Yes<\/td>\n<\/tr>\n<tr>\n<td>Reference<\/td>\n<td>(Cox 1986)<\/td>\n<td>(Wall 1987)<\/td>\n<td>(Rossum 1994)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>Ada 95<\/th>\n<th>Java 1.0<\/th>\n<th>C++ 98<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Release 
Year<\/td>\n<td>1995<\/td>\n<td>1996<\/td>\n<td>1998<\/td>\n<\/tr>\n<tr>\n<td>Representation<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<\/tr>\n<tr>\n<td>Character Type<\/td>\n<td>BW<\/td>\n<td>U16e<\/td>\n<td>BW<\/td>\n<\/tr>\n<tr>\n<td>Immutable?<\/td>\n<td>No<\/td>\n<td>Yes<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td>Reference<\/td>\n<td>(\u201cAda Reference Manual: ISO\/IEC 8652:1995(E) with Technical Corrigendum 1\u201d 1995)<\/td>\n<td>(Gosling, Joy, and Steele 1996)<\/td>\n<td>(Stroustrup 1997)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>C# 1.0<\/th>\n<th>Go<\/th>\n<th>Swift 1.0<\/th>\n<th>Rust 1.0<\/th>\n<th>Swift 5<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Release Year<\/td>\n<td>2002<\/td>\n<td>2012<\/td>\n<td>2014<\/td>\n<td>2015<\/td>\n<td>2019<\/td>\n<\/tr>\n<tr>\n<td>Representation<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<td>RE<\/td>\n<\/tr>\n<tr>\n<td>Character Type<\/td>\n<td>U16<\/td>\n<td>B\/U8<\/td>\n<td>U16<\/td>\n<td>U8<\/td>\n<td>U8<\/td>\n<\/tr>\n<tr>\n<td>Immutable?<\/td>\n<td>Yes<\/td>\n<td>Yes<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td>Reference<\/td>\n<td>(Hejlsberg et al. 
2008)<\/td>\n<td>(Pike 2013)<\/td>\n<td>(Apple 2021)<\/td>\n<td>(Klabnik and Nichols 2018)<\/td>\n<td>(Ilseman 2019)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Representations:<\/p>\n<dl>\n<dt>FL<\/dt>\n<dd>\n<p>fixed length array (whitespace padded)<\/p>\n<\/dd>\n<dt>LP<\/dt>\n<dd>\n<p>array of characters prefixed with length (length is inline)<\/p>\n<\/dd>\n<dt>NT<\/dt>\n<dd>\n<p>array of characters terminated by NUL or other sentinel value<\/p>\n<\/dd>\n<dt>RE<\/dt>\n<dd>\n<p>record (or richer) encapsulation of the in-memory representation<\/p>\n<\/dd>\n<\/dl>\n<p>Designers that chose a record for the in-memory representation often\nused the opportunity to store additional data such as pre-computed\nhashes of the content or specialize the representation such that short\nstrings will take less space. For interoperability, designers may also\nchoose to include a NUL byte within the array, or store a length-prefix\nbyte, even if the language does not expose this fact.<\/p>\n<p>Character types:<\/p>\n<dl>\n<dt>B<\/dt>\n<dd>\n<p>Byte (Older languages may support 6-bit bytes)<\/p>\n<\/dd>\n<dt>BW<\/dt>\n<dd>\n<p>Byte (Octet) or Wide Character<\/p>\n<\/dd>\n<dt>U16e<\/dt>\n<dd>\n<p>16-bit Unicode, prior to creation of surrogate pairs (UCS-2)<\/p>\n<\/dd>\n<dt>U16<\/dt>\n<dd>\n<p>16-bit Unicode (UTF-16)<\/p>\n<\/dd>\n<dt>U8<\/dt>\n<dd>\n<p>8-bit Unicode (UTF-8)<\/p>\n<\/dd>\n<\/dl>\n<p>If a designer decides that strings will follow a specific character\nencoding, they need to decide how strictly that guarantee will be met.\nPrior to the mid-1990s, designers allowed strings to hold arbitrary\nbytes. An individual program could enforce that a string held bytes that\nwere valid per some encoding and character set, but that was not a\nlanguage concern. Since neither ISO 646 nor ISO 8859 character sets,\nwhich include ASCII, prohibited any specific byte values or sequences,\nconformance was a non-issue. 
With the adoption of Unicode, though, which\ndid include invalid sequences, language designers had a choice whether a\nstring was always valid or whether an initialized string\u2019s operations\ncould be in an error state. In Go\u2019s case, the string type can store\narbitrary bytes but the bytes are processed as if they are encoded with\nUTF-8 (Pike 2013) which may lead to runtime errors. In contrast, Rust\nenforces that strings are always valid so once a string is constructed,\nthere are no runtime errors (\u201cString in Std::String - Rust,\u201d n.d.).\nSince there are performance costs for validating UTF-8 encoding, and\nsome use cases require working with potentially bad or corrupt data,\nthis is an implementation trade-off between supporting some use cases\nmore directly and at higher performance versus type safety and avoidance\nof run-time errors.<\/p>\n<h1 id=\"in-memory-representation\">IN-MEMORY REPRESENTATION<\/h1>\n<p>All the languages in Table 1 leverage an array to store a string (although\nthat fact may not be exposed and, in Swift\u2019s case, there are multiple\npotential backing stores for a string), so the in-memory representation\nof strings is a reflection of how the language represents arrays. The\narray may be storing bytes or wide characters, but the storage is\ncontiguous.<\/p>\n<p>Not all languages choose to use an array as the backing store.\nHistorically, some languages packed characters within a word (e.g.\n36-bit architectures included 6, 7, and 9 bit packing schemes) but with\nbyte-addressing and richer type systems designers have stopped using\nthis approach. 
More recently, Haskell\u2019s designers implemented the core\n<code>String<\/code> type as a linked list of Unicode code points, but Haskell\npractitioners tend to use alternative packed-array based representations\nfor better performance (Haskell 2021) as the linked list representation requires\nseveral words of memory per character and most operations on a list are\nmore expensive than on an array.<\/p>\n<p>One of the earliest representations of strings in a higher-level\nlanguage was the Hollerith datum or constant, as implemented in Fortran\nI. Syntactically, the punch cards featured an <em>n<\/em>H followed by <em>n<\/em>\ncharacters. The remaining <em>n<\/em> characters were stored as an array of\nintegers, padded if necessary to the given length, as Fortran I did not\nhave a character type. (Fortran 77 normalized the syntax with quoted\nstrings.) COBOL 68 similarly represented strings as padded fixed-length\nsequences, but featured alphanumeric data types \u2014 characters and numbers\nwere distinct.<\/p>\n<p>Pascal (as initially defined by Jensen and Wirth) and Ada 83 represented\nstrings as fixed-length arrays. Both of these languages featured a\nstrong type system and chose to incorporate the length of the array\nwithin the type definition. Thus, an array of 64 characters was a\ndifferent data type from an array of 128 characters. Lacking a way to\nreference a generic array of characters, programmers were restricted from\nwriting generic string functions. To work around these limitations,\nimplementations, such as the influential UCSD version, added\nnon-standard functionality. UCSD used length-prefix arrays, which proved\npopular and saw adoption in multiple Pascal implementations. Later\nversions of Pascal and Ada added standardized ways to interact with\nvariable length strings. More recently, Go also chose to include the\nlength of the array in its type definition. 
However, Go\u2019s use of slices,\nor views into the array, permits the creation of generic functions.<\/p>\n<p>Length-prefix arrays are arrays of characters where the first\ncharacter\u2019s ordinal value is interpreted as the length of the string.\nSo-called \u201cPascal Strings\u201d, these could range up to 255 characters (the\nmaximum value of an unsigned byte). The term \u201cPascal string\u201d is a misnomer\nboth because it was popularized by the UCSD implementation of Pascal,\nnot by the Pascal language standard itself, and because the\ntechnique predated the language and implementation by at least a decade.\nIn 1968, the Multics PL\/I compiler used the length-prefix technique to\nstore the length of varying length strings (Freiburghouse 1969). The\nPL\/I language used the length for array bounds checking.<\/p>\n<p>A distinction between length-prefix and Hollerith constants is that the\nlength-prefix is interpreted as a machine integer while Hollerith\nconstants were interpreted as human-input digits. Net strings are a\nmodern version of Hollerith constants. Since multiple digits are allowed\nfor Hollerith constants (and net strings), much larger lengths can be\nmodeled but at a greater cost in storage of the length.<\/p>\n<p>While the length prefix representation is efficient in storage, requiring\njust a single byte, maintains locality of definition, and is still\nwidely used in \u201con the wire\u201d protocols (see appendix in (Lucks, Grosch,\nand K\u00f6nig 2017)), it has limitations. The straightforward encoding of\nthe length as a single byte limited strings to a maximum length of 255\ncharacters. (Some implementations with wide characters allowed lengths\nup to 65,535. A more complex representation could allow for greater\nrange by using multiple bytes, but although the approach was discussed,\nit was not widely adopted.) 
Furthermore, because the length was inline\nto the array, programmers had to be careful not to invalidate the length\nand to start processing after the first index.<\/p>\n<p>In the 1980s, implementations started to converge on NUL terminated\nstrings. A NUL terminated string is an array of characters ended by the\nNUL or zero value character. (Languages could use other symbols for\ntermination, such as B\u2019s sequence of \u201c*e\u201d (Ritchie 1993), but NUL was\nconvenient and widely available (Plauger 1991).) In contrast to\nlength-prefix arrays, this representation allows for arbitrarily long\nstrings and removes the need for special logic around the start of the\narray. Pointers can also index into the array to model a changing start\nposition without requiring copying of the data, thus rendering many\nparsing and scanning operations cheaper. However, computing the length\nof a string requires a linear search for the terminator symbol whereas\nthe length-prefix representation has constant time lookup.<\/p>\n<p>Dennis Ritchie defended the design choice in C in his History of\nProgramming Languages presentation with:<\/p>\n<blockquote>\n<p>Aside from one special rule about initialization by string literals,\nthe semantics of strings are fully subsumed by more general rules\ngoverning all arrays, and as a result the language is simpler to\ndescribe and to translate than one incorporating the string as a\nunique data type. Some costs accrue from its approach: certain string\noperations are more expensive than in other designs because\napplication code or a library routine may occasionally search for the\nend of a string, because few built-in operations are available, and\nbecause the burden of storage management for strings falls more\nheavily on the user. Nevertheless, C\u2019s approach to strings works\nwell. (Ritchie 1993)<\/p>\n<\/blockquote>\n<p>Although NUL terminated strings are associated with C, C did not\noriginate them. 
A likely influence on C\u2019s use of NUL terminated strings\nwas the existence of the ASCIZ (ASCII with a zero suffix byte) assembly\nmacro instruction introduced for the PDP-10 (as part of MACRO-10) and\ncarried forward for the PDP-11. This macro created NUL terminated\nstrings. Architecturally, a NUL terminated string operates similarly to reading\nfrom a tape where the quantity of data is unknown but there is a\nterminating symbol at the end.<\/p>\n<p>The inability to represent the NUL byte embedded within a string was\nseen as an acceptable restriction for some languages (e.g. in C, the\nprogrammer could leverage the underlying array and avoid NUL-expecting\nlibrary functions). Python, which did not differentiate between byte\nstrings and strings in its initial type system, both stored an explicit\nlength in the <code>objstring<\/code> record and terminated the internal character\narray with a NUL in order to be interoperable as well as handle\narbitrary byte sequences. The fact that the strings were NUL terminated\nis not exposed within the language itself, but is a useful property for\nPython modules that leverage CPython\u2019s internal API.<\/p>\n<p>Neither representation above supports encoding the <em>capacity<\/em> of the\nunderlying buffer or array (in cases where the current length of the\nstring does not match the capacity). If the underlying array\u2019s capacity\ncannot be checked (as in C for dynamically allocated arrays), programs\nin the language become vulnerable to buffer overflows and buffer\nover-reads. These are vulnerabilities where a program can access a\nmemory location outside the bounds of the array, which can lead to\nintentional or unintentional memory reads or memory corruption.<\/p>\n<p>Array bounds checking was initially seen as a way of protecting a\nprogrammer from their own mistakes, but the Morris worm demonstrated\npublicly the information security implications of unchecked\naccess (Cowan et al. 2000). 
The need for better security led to a\nresurgence of \u201cchecked\u201d languages that included features such as bounds\nchecking.<\/p>\n<p>By storing information about a string \u201cout-of-band\u201d from the underlying\nbuffer, languages can support flexible use cases efficiently and provide\nbetter security. For example, in the 1972 version of SNOBOL (Griswold\n1972), strings were represented by a \u201cqualifier\u201d, a record with\n&lt;<em>T<\/em>, <em>F<\/em>, <em>V<\/em>, <em>O<\/em>, <em>L<\/em>&gt; fields. The T and F fields stored the\ndata type and metadata about the type, V was a memory pointer to an\narray of characters, O was an offset within that array, and L was the\nlength of the access. Access to a string was always made through a\nqualifier. This representation allowed slices of strings to be\nrepresented without requiring any new copying or allocation. This was an\nimportant capability due to SNOBOL\u2019s heavy use of pattern matching. In\ncomparison, neither the length-prefix nor the NUL terminated in-band\nrepresentation allows a slice to have the same ergonomics as the base\nstring.<\/p>\n<p>For languages where arrays can be queried for their capacity, this\nout-of-band mechanism is the array itself. For instance, the Java\nvirtual machine has an instruction to return the length of an array. The\nstorage of array lengths is abstracted away. With the capacity\npersisted, languages can check array accesses and provide safety\nguarantees. (Bounds checking can sometimes be achieved statically at\ncompile time rather than at run time.) However, because the in-memory\nrepresentation is encapsulated rather than being exposed in a known\naddressable manner, interoperability between processes requires more\ncomplex solutions than just <code>mmap<\/code>\u2019ing shared memory and exchanging\naddresses. 
Since security requirements have pushed for greater isolation\nbetween processes and more safety guarantees, this is another trade-off\nlanguage designers make between performance, interoperability, and\nsecurity.<\/p>\n<p>The debate over safety, necessary use cases, and the cost of array\nbounds checking rages on. As a recent example of a systems language,\nRust implements bounds-checking on arrays but includes escape hatches to\nallow for arbitrary and unchecked memory access, thus demonstrating the\nrange of options available to designers.<\/p>\n<h1 id=\"built-in-functionality\">BUILT-IN FUNCTIONALITY<\/h1>\n<p>The ergonomics of strings are driven by their ease of use and included\nfunctionality. As soon as languages started to support strings as\nfirst-class values, they included standard functionality such as string\nlength computation, access to individual characters, and concatenation.<\/p>\n<p>The 1965 manual for PL\/I running on the IBM System\/360 lists the following\ngeneric string functions: substring, string index (contains), length,\nhigh and low (uppercase and lowercase conversion), concatenation,\ncomparison\/collating, repeat (repeated self-concatenation), and several\nfunctions for conversions between character strings and bit strings.\nThis is a significant list of functionality; for comparison, COBOL in\n1968 only included concatenation, a method to count the number of\noccurrences of a substring, and a tokenization routine. PL\/I is often\ncriticized as a very large language (for its time) and most languages\nthat followed in the next decade provided a more limited set of\nfunctions. In fact, the first release of Ada in 1983 included no string\nfunctions at all (the language design of arrays rendered generic string\nfunctions infeasible). 
Objective-C in 1986 had functionality on\nrelease similar to that of C from nearly a decade earlier, supporting comparison,\nconcatenation, length, and replace functions.<\/p>\n<p>From the late 1980s on, string functionality started to standardize, with\nfunctions for comparison, concatenation, contains, length, replace,\nsubstring or slices, tokenization, and trimming of whitespace appearing\nin all the surveyed languages. In contrast to Ada 83, Ada 95 includes\neverything from the previous list except tokenization.<\/p>\n<p>Independent of other design choices, a language designer may decide to\nuse immutable strings. An immutable string is one where operations do\nnot change the underlying data, but only change a view of the string or\ncreate a copy. Immutability simplifies performance improvements such as\ncaching, re-use of strings (in cases where duplicates are likely), and\nconcurrent access to the data. Immutability carries a trade-off that\ncertain operations, like replacement, can be more expensive in time and\nmemory.<\/p>\n<p>SNOBOL included immutable strings in the 1960s (in this case, to save\nmemory by allowing many partial views of a string) but their adoption\nexpanded in the 1990s. Both Python and its predecessor ABC had immutable\nstrings (Rossum 2009). In Java\u2019s case, immutability was born of\nsecurity requirements \u2014 if a string (e.g. a file path) could change\nduring a virtual machine authorization check, it could be used to defeat\nthe check (\u201cJames Gosling on Java, May 2001: A Conversation with Java\u2019s\nCreator, James Gosling\u201d 2001). As arrays are not immutable, this is an\nindication that language designers saw value in distinguishing use cases\nof strings versus arrays of characters.<\/p>\n<p>Although immutable strings have been adopted in many languages, there is\nnot an engineering consensus on their use. 
The two most recent language\nexamples in the list, Swift and Rust, use mutable strings (although they\nhave various immutable views on a string). Both of these languages\ninstead support a &ldquo;mixed-mode&rdquo; where mutability can be used if\nnecessary, but mutability is not the default access pattern.<\/p>\n<h1 id=\"MULTILINGUAL\">MULTILINGUAL SUPPORT<\/h1>\n<p>By the 1970s, computer systems were converging on an 8-bit byte or octet.\n(Trigraph support in C, for support of 6-bit source character sets, is a\nrelic from this era (Brodie 1992).) With different hardware\nmanufacturers implementing their own character sets, sharing data\nbetween systems was difficult. The U.S. government put its weight\nbehind the ASCII character set in order to establish some baseline of\ninteroperability. Although ASCII was sufficient for most U.S.\napplications, it required modification to work even for other\nEnglish-speaking countries. To support their own languages or add\ncharacters for specialized applications like box drawing, governments\nand manufacturers implemented new code pages. These alternative\ncharacter sets tended to be similar to ASCII and simply changed a few\ncharacters (e.g. ISO\/IEC 646 used ASCII as a baseline but created space\nfor \u201cnational characters\u201d that could represent certain letters or\ncurrency symbols). (Meanwhile, IBM systems stuck to EBCDIC, which\nrequired the creation of parallel national code pages.) Character sets\nsuch as ISO\/IEC 8859 also started to use the full 8-bit space. The eight-bit\nspace of ISO\/IEC 8859 was sufficient for locales using the Latin script,\nbut the space was too small for languages such as Chinese, Japanese, and\nKorean.<\/p>\n<p>To support the thousands of characters required for Chinese, Japanese,\nand Korean, designers followed one of two implementation strategies:\nwide characters or multi-byte characters. 
(The terminology is standard\nbut confusing, since both encodings require multiple bytes.\nAlternatively, the two strategies could be called fixed-length and\nvariable-length.) The wide character strategy expanded all characters\nfrom an 8-bit to a 16-bit value while the multi-byte strategy used shift\ncharacters to use one, two, or potentially more bytes per character. The\nadvantage of wide characters is simplicity \u2014 computing the octet length\nis straightforward and access to a character is a simple array look-up \u2014\nwhile the advantage of multi-byte is more efficient use of space when\nthe range of values is not used uniformly. If the consuming program\nused the wrong character set, a wide character encoding could contain\nNUL bytes, leading to early truncation errors for programs using NUL\nterminated strings.<\/p>\n<p>Although code pages allowed tailoring of computer systems for specific\nlocales, there were problems with interoperability. First, a file reader\nhad to know a priori the code page for a file as there was no way to\nidentify the code page being used within the encoding. Second,\nconversions between code pages were lossy because characters in the\nsource might be missing from the destination. Third, each major vendor\nmaintained its own database of code pages and those databases\ndiverged, even with multi-vendor attempts at cooperation. Thus, files\nmight not even be portable between computer systems within the same\nlocale. This was a major tax on internationalization of software and the\nincreasing need to share digital data globally.<\/p>\n<p>In 1988, Joe Becker started discussing Unicode publicly, first at\nUniforum in Dallas, Texas, and later released the Unicode \u201988 proposal\nat ISO WG2 (Wideburg 1998). \u201cUnique, universal, and uniform\u201d, the new\ncharacter set was proposed as one that would \u201cencompass the characters\nof all the world\u2019s living languages\u201d (Becker 1988). 
Becker proposed\nusing a fixed width 16-bit wide character encoding, and argued that the\nsimplicity of the representation was a better tradeoff than the storage\nsavings from a multi-byte encoding. As there was an open debate on\nwhether 16 bits would be sufficient for CJK languages as well as all\nother modern scripts, Becker made two arguments:<\/p>\n<ol>\n<li>\n<p>A character is not a glyph. Becker argued that a grapheme approach\nwould fit within the space budget. Nations could map the semantic\nmeanings to their language\u2019s glyphs, thus handling graphical\nsymbol resolution at the typeface level.<\/p>\n<\/li>\n<li>\n<p>Modern use only. Since the scope of the project is to handle only\nwidely used languages in current use, many characters can be ignored.<\/p>\n<\/li>\n<\/ol>\n<p>Unicode starts to win against competing standards and the two-volume\ndefinition of Unicode 1.0 is published in 1992. Version 1.0 describes a\nsingle wide character encoding (where each value corresponds to a single\ncode point). To handle byte order ambiguity, the standard includes byte\norder marks. Version 2.0, published in 1996, incorporates the UTF-7\nand UTF-8 encodings, and a surrogate mechanism to encode characters\noutside the 16-bit range. The surrogate mechanism is meant to \u201callow\nrepresentation of rare characters in future extensions of the Unicode\nstandard\u201d. At this point, there are no characters defined in the\nsurrogate area. The addition of surrogates effectively switches the wide\ncharacter encoding to a multi-byte encoding, although implementations\nmade before the change (or implementations that deliberately ignore\neverything outside Unicode\u2019s basic multilingual plane) remain wide\ncharacter based. The former encoding becomes the UCS-2 encoding while\nthe UTF-16 encoding includes support for surrogates.<\/p>\n<p>The use of surrogates, but without assigning any code points to that\nplane, continued through version 3.0 published in 1999. 
Version 3.1 in\n2001 added 42,711 characters to the \u201cCJK Unified Ideographs Extension B\u201d\nsurrogate plane. This was approximately two-thirds of the original 16-bit space\nof 65,536 and double the originally allocated amount of 20,940\ncharacters for Chinese, Japanese, and Korean ideographs. Since the\noriginal 16-bit design objective was no longer tenable, version 4.0 in\n2003 carried a significant design change. In the previous two versions,\nthe very first design principle (of 10) was \u201cSixteen-bit character\ncodes\u201d with the scope of Unicode defined as only covering modern-use\ncharacters. Version 4 changed this design principle (the only one to\nchange) to \u201cUniversality\u201d and expanded the scope of Unicode to include\nhistoric writing systems.<\/p>\n<p>These changes rendered the simple UCS-2 encoding untenable if a program\nwanted to support Unicode and target the world\u2019s markets. Languages that\nwere designed in the interim required some redesign. For example, Ada 95\nintroduced wide characters to the language and fixed the width at\n16 bits; this necessitated the addition of \u201cwide wide\u201d characters in the\n2003 version of the language. Java kept 16 bits for individual\ncharacters, but expanded the <code>Character<\/code> functionality to accept 32-bit\nints as code points and implemented surrogate support. C# followed a\nsimilar strategy of using UTF-16 and surrogate pairs. For C and C++,\n<code>wchar_t<\/code>\u2019s width was always implementation-defined (e.g. Windows used\n16 bits, GNU libc used 32 bits), so this avoided changes to the\nlanguage. 
However, the ambiguity of an implementation-defined width led to\n<code>char8_t<\/code>, <code>char16_t<\/code>, and <code>char32_t<\/code> being added to the C++ standard.<\/p>\n<p>The UTF-8 encoding was designed in 1992 by Rob Pike and Ken Thompson\nwith the following goals (Pike 2003):<\/p>\n<blockquote>\n<ol>\n<li>Compatibility with historical file systems:<\/li>\n<\/ol>\n<p>Historical file systems disallow the null byte and the ASCII slash\ncharacter as a part of the file name.<\/p>\n<ol start=\"2\">\n<li>Compatibility with existing programs:<\/li>\n<\/ol>\n<p>The existing model for multibyte processing is that ASCII does not\noccur anywhere in a multibyte encoding. There should be no ASCII code\nvalues for any part of a transformation format representation of a\ncharacter that was not in the ASCII character set in the UCS\nrepresentation of the character.<\/p>\n<ol start=\"3\">\n<li>\n<p>Ease of conversion from\/to UCS.<\/p>\n<\/li>\n<li>\n<p>The first byte should indicate the number of bytes to follow in a\nmultibyte sequence.<\/p>\n<\/li>\n<li>\n<p>The transformation format should not be extravagant in terms of\nnumber of bytes used for encoding.<\/p>\n<\/li>\n<li>\n<p>It should be possible to find the start of a character efficiently\nstarting from an arbitrary location in a byte stream.<\/p>\n<\/li>\n<\/ol>\n<\/blockquote>\n<p>The UTF-8 encoding was first used in the Plan 9 operating system and saw\nbroad adoption in Unix systems early on. By the 2010s, the UTF-8\nencoding had won both as the majority encoding on\nthe internet and as an internal representation within programming languages. Go\nand Rust were launched with UTF-8 support, and although Swift initially\nused UTF-16 for backwards compatibility with Objective-C, it switched to\nUTF-8 in 2019 (Ilseman 2019).<\/p>\n<h2 id=\"other-attempts\">OTHER ATTEMPTS<\/h2>\n<p>Unicode was not the only multilingual character set being developed in\nthe 1980s. 
ISO sought to create a Universal Coded\nCharacter Set (UCS) and published a draft of ISO 10646 in 1990. This\ndraft included multiple planes of characters and anticipated storage of\nmillions of characters.<\/p>\n<p>The TRON character set (first published in English in 1987 with work\nstarting in Japan in 1984) was a sort of meta character set that could\nsubsume existing encodings. TRON was a multibyte standard that included\ncharacters to switch between languages and a number of features to\nsupport Asian languages. The encoding allowed use of ISO 8859-1 (the\nmost popular code page for English at the time), Braille, and multiple\nexisting Japanese and Korean encodings.<\/p>\n<p>For political and economic reasons, neither of these encodings won,\nwith the former eventually aligning with Unicode and the latter being\nlargely limited to hardware running TRON. Technically, though, they both\nanticipated using a multibyte encoding and the need for more than a\n16-bit space.<\/p>\n<h1 id=\"impact-of-unicode-on-standard-functionality\">IMPACT OF UNICODE ON STANDARD FUNCTIONALITY<\/h1>\n<p>In the Objective-C library reference, the documentation on strings\nstates that a language needs to provide only string length and character\nat index functions, since all the other functions can be built on these\ntwo primitives. Although it was once possible to have individual\ndevelopers fill in missing string functionality, with the adoption of\nUnicode, standard functions have become much more complex.<\/p>\n<p>In terms of the primitives, string length is an ambiguous measurement\ndue to the multi-byte encoding and semantic subtleties around the\ndefinition of a character. For instance, is the length a measurement of\nthe buffer size, the number of individual code points, or the number of\nglyphs? Programmers need to understand the distinctions and how to apply\nthem to their use cases. 
Languages such as Swift and Rust include\ndistinct API calls to differentiate these measurements. String equality is\nsimilarly complex. Two byte sequences may differ yet yield equivalent\nUnicode sequences. Computing equality requires first normalizing the\nbyte sequences; the documentation for this process runs over 30 pages.\nThese complexities require a greater effort by language implementors to\ncorrectly handle the standard.<\/p>\n<h1 id=\"conclusion\">CONCLUSION<\/h1>\n<p>Programmers today can expect rich functionality and a high degree of\nsupport for multilingual text from programming languages. Language\ndesigners face an increased burden to add initial support for these\nfeatures due to the complex interoperability requirements, but continue\nto explore the performance and safety boundaries enabled by\nencapsulation of the string machinery.<\/p>\n<h1 id=\"acknowledgment\" class=\"unnumbered\">ACKNOWLEDGMENT<\/h1>\n<p>We thank Bryan Short and Catherine Williams for providing time for this\nstudy.<\/p>\n<h1 id=\"references\">REFERENCES<\/h1>\n<p>\u201cAda Reference Manual: ISO\/Iec 8652:1995(E) with Technical Corrigendum\n1.\u201d 1995. International Organization for Standardization.<\/p>\n<p>Apple. 2021. \u201cSwift String.\u201d 2021.\n<a href=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">https:\/\/developer.apple.com\/documentation\/swift\/string<\/a>.<\/p>\n<p>Becker, Joseph D. 1988. \u201cUnicode 88.\u201d August 28, 1988.\n<a href=\"http:\/\/www.unicode.org\/history\/Unicode88.pdf\">http:\/\/www.unicode.org\/history\/Unicode88.pdf<\/a>.<\/p>\n<p>Brodie, Jim. 1992. \u201cTrigraphs: The Search for Alternatives.\u201d <em>The\nJournal of C Language Translation<\/em> 3 (4): 310\u201318.<\/p>\n<p>Cowan, C., F. Wagle, Calton Pu, S. Beattie, and J. Walpole. 2000.\n\u201cBuffer Overflows: Attacks and Defenses for the Vulnerability of the\nDecade.\u201d In <em>Proceedings Darpa Information Survivability Conference and\nExposition. 
DISCEX\u201900<\/em>, 2:119\u201329 vol.2.\n<a href=\"https:\/\/doi.org\/10.1109\/DISCEX.2000.821514\">https:\/\/doi.org\/10.1109\/DISCEX.2000.821514<\/a>.<\/p>\n<p>Cox, Brad J. 1986. <em>Object Oriented Programming: An Evolutionary\nApproach<\/em>. USA: Addison-Wesley Longman Publishing Co., Inc.<\/p>\n<p>Data Systems Languages, Conference on. n.d. <em>CODASYL Cobol: Journal of\nDevelopment 1968<\/em>. National Bureau of Standards, United States.\n<a href=\"https:\/\/archive.org\/embed\/codasylcoboljour00conf\">https:\/\/archive.org\/embed\/codasylcoboljour00conf<\/a>.<\/p>\n<p>Emphrase. 2019. \u201cSource Han Sans Version Difference.\u201d\n<a href=\"https:\/\/commons.wikimedia.org\/wiki\/File:Source_Han_Sans_Version_Difference.svg\">https:\/\/commons.wikimedia.org\/wiki\/File:Source\\_Han\\_Sans\\_Version\\_Difference.svg<\/a>.<\/p>\n<p>Freiburghouse, R. A. 1969. \u201cThe Multics Pl\/1 Compiler.\u201d In <em>Proceedings\nof the November 18-20, 1969, Fall Joint Computer Conference<\/em>, 187\u201399.\nAFIPS \u201969 (Fall). New York, NY, USA: Association for Computing\nMachinery. <a href=\"https:\/\/doi.org\/10.1145\/1478559.1478581\">https:\/\/doi.org\/10.1145\/1478559.1478581<\/a>.<\/p>\n<p>Goldberg, Adele, and David Robson. 1983. <em>Smalltalk-80: The Language and\nIts Implementation<\/em>. USA: Addison-Wesley Longman Publishing Co., Inc.<\/p>\n<p>Gosling, James, Bill Joy, and Guy L. Steele. 1996. <em>The Java Language\nSpecification<\/em>. 1st ed. USA: Addison-Wesley Longman Publishing Co., Inc.<\/p>\n<p>Griswold, Ralph. 1972. <em>The Macro Implementation of Snobol4; a Case\nStudy of Machine-Independent Software Development<\/em>.<\/p>\n<p>Haskell 2021. April 21, 2021. <a href=\"https:\/\/wiki.haskell.org\/Performance\/Strings\">https:\/\/wiki.haskell.org\/Performance\/Strings<\/a>.<\/p>\n<p>Hejlsberg, Anders, Mads Torgersen, Scott Wiltamuth, and Peter Golde.\n2008. <em>The c# Programming Language<\/em>. 3rd ed. 
Addison-Wesley\nProfessional.<\/p>\n<p>Ilseman, Michael. 2019. \u201cUTF-8 String.\u201d March 20, 2019.\n<a href=\"https:\/\/swift.org\/blog\/utf8-string\/\">https:\/\/swift.org\/blog\/utf8-string\/<\/a>.<\/p>\n<p>\u201cJames Gosling on Java, May 2001: A Conversation with Java\u2019s Creator,\nJames Gosling.\u201d 2001. June 2001.\n<a href=\"https:\/\/www.artima.com\/articles\/james-gosling-on-java-may-2001#part13\">https:\/\/www.artima.com\/articles\/james-gosling-on-java-may-2001#part13<\/a>.<\/p>\n<p>Kernighan, B. W., and D. M. Ritchie. 1978. <em>The c Programming Language<\/em>.\nUSA: Prentice-Hall, Inc.<\/p>\n<p>Klabnik, Steve, and Carol Nichols. 2018. <em>The Rust Programming\nLanguage<\/em>. USA: No Starch Press.<\/p>\n<p>Lucks, Stefan, Norina Marie Grosch, and Joshua K\u00f6nig. 2017. \u201cTaming the\nLength Field in Binary Data: Calc-Regular Languages.\u201d In <em>2017 Ieee\nSecurity and Privacy Workshops (Spw)<\/em>, 66\u201379.\n<a href=\"https:\/\/doi.org\/10.1109\/SPW.2017.33\">https:\/\/doi.org\/10.1109\/SPW.2017.33<\/a>.<\/p>\n<p>Pike, Rob. 2003. \u201cUTF-8 History.\u201d April 30, 2003.\n<a href=\"https:\/\/www.cl.cam.ac.uk\/~mgk25\/ucs\/utf-8-history.txt\">https:\/\/www.cl.cam.ac.uk\/~mgk25\/ucs\/utf-8-history.txt<\/a>.<\/p>\n<p>\u2014\u2014\u2014. 2013. \u201cStrings, Bytes, Runes and Characters in Go.\u201d October 23,\n2013. <a href=\"https:\/\/blog.golang.org\/strings\">https:\/\/blog.golang.org\/strings<\/a>.<\/p>\n<p>Plauger, P.J. 1991. \u201cCharacter Sets and c.\u201d <em>The Journal of C Language\nTranslation<\/em> 3 (3): 169\u201376.<\/p>\n<p>Ritchie, Dennis M. 1993. \u201cThe Development of the c Language.\u201d In <em>The\nSecond Acm Sigplan Conference on History of Programming Languages<\/em>,\n201\u20138. HOPL-Ii. New York, NY, USA: Association for Computing Machinery.\n<a href=\"https:\/\/doi.org\/10.1145\/154766.155580\">https:\/\/doi.org\/10.1145\/154766.155580<\/a>.<\/p>\n<p>Rossum, Guido van. 1994. 
\u201cPython Release 1.0.1.\u201d<\/p>\n<p>\u2014\u2014\u2014. 2009. \u201cEarly Language Design and Development.\u201d February 3, 2009.\n<a href=\"https:\/\/python-history.blogspot.com\/2009\/02\/early-language-design-and-development.html\">https:\/\/python-history.blogspot.com\/2009\/02\/early-language-design-and-development.html<\/a>.<\/p>\n<p>Shillington, Keith Allan, and Gillian M. Ackland. n.d. <em>UCSD Pascal 1.5\nManual<\/em>.<\/p>\n<p>\u201cString in Std::String - Rust.\u201d n.d.\n<a href=\"https:\/\/doc.rust-lang.org\/std\/string\/struct.String.html\">https:\/\/doc.rust-lang.org\/std\/string\/struct.String.html<\/a>.<\/p>\n<p>Stroustrup, Bjarne. 1997. <em>The C++ Programming Language, Third Edition<\/em>.\n3rd ed. USA: Addison-Wesley Longman Publishing Co., Inc.<\/p>\n<p>Wall, Larry. 1987. \u201cPerl Kit, Version 1.0.\u201d<\/p>\n<p>Wideburg, Laura. 1998. \u201cEarly Years of Unicode.\u201d 1998.\n<a href=\"https:\/\/www.unicode.org\/history\/earlyyears.html\">https:\/\/www.unicode.org\/history\/earlyyears.html<\/a>.<\/p>\n"},{"title":"Format Specifiers","link":"https:\/\/ztoz.blog\/posts\/format-specifiers\/","pubDate":"Thu, 28 Apr 2022 12:58:44 -0600","guid":"https:\/\/ztoz.blog\/posts\/format-specifiers\/","description":"<p>Programming languages use format specifiers to control the formatting of numbers and text. 
We look at their invention in FORTRAN I, adoption in BCPL, and three innovative attempts to include them in ALGOL60.<\/p>\n<script>!function(r,u,m,b,l,e){r._Rumble=b,r[b]||(r[b]=function(){(r[b]._=r[b]._||[]).push(arguments);if(r[b]._.length==1){l=u.createElement(m),e=u.getElementsByTagName(m)[0],l.async=1,l.src=\"https:\/\/rumble.com\/embedJS\/u13csgf\"+(arguments[1].video?'.'+arguments[1].video:'')+\"\/?url=\"+encodeURIComponent(location.href)+\"&args=\"+encodeURIComponent(JSON.stringify([].slice.apply(arguments))),e.parentNode.insertBefore(l,e)}})}(window, document, \"script\", \"Rumble\");<\/script>\n<div id=\"rumble_vzhebw\"><\/div>\n<script>\nRumble(\"play\", {\"video\":\"vzhebw\",\"div\":\"rumble_vzhebw\"});<\/script>"},{"title":"Page Ranges","link":"https:\/\/ztoz.blog\/posts\/page-ranges\/","pubDate":"Sat, 02 Apr 2022 13:54:34 -0600","guid":"https:\/\/ztoz.blog\/posts\/page-ranges\/","description":"<p>In my <a href=\"https:\/\/ztoz.blog\/posts\/interviewing-questions\/\">earlier post on interviewing<\/a>, I discussed aspects of a coding interview question that I thought provided better signal on the candidate than algorithm-focused questions. In this post, I&rsquo;ll provide an example of a fairly open-ended interview question, a solution, and a transcript between the interviewer and interviewee as material for analyzing signal.<\/p>\n<h1 id=\"page-ranges-a-question\">Page Ranges, a Question<\/h1>\n<p>When a user prints a document, they might not want to print every page. The user interface allows them to write a &lsquo;page range specification&rsquo; to express the range of pages to print. For example, <code>1-3, 5<\/code> will print pages 1, 2, 3, and 5. 
Your task is to write code that will:<\/p>\n<ol>\n<li>Parse a string containing a page range specification<\/li>\n<li>Report if the string is valid or invalid<\/li>\n<li>Include functionality to report if a page should be printed or not<\/li>\n<\/ol>\n<p>You do not need to write unit tests, debug your code, or write documentation.<\/p>\n<p>After you write your code, you will perform a code review of it. Provide positive and negative feedback. Requirements, design, and implementation are all fair game.<\/p>\n<h2 id=\"detailed-specs\">Detailed Specs<\/h2>\n<p>Pages may be numbered from 1 to 999999. Your program may be given the maximum number of pages within the document, which will not exceed 999999.<\/p>\n<p>Ranges are inclusive, i.e. <code>1-3<\/code> means print pages 1, 2, and 3.<\/p>\n<p>A valid specification string will contain only white space, digits, hyphens, and commas. Commas separate ranges only; numbers do not include separators.<\/p>\n<p>We will assume the code will be part of a library or a larger program, so you do not need to write a <code>main<\/code> or the calling code.<\/p>\n<h1 id=\"a-solution\">A Solution<\/h1>\n<p>In Scala:<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-scala\" data-lang=\"scala\">\n<span style=\"color:#66d9ef\">import<\/span> scala.util.Try\n\n<span style=\"color:#66d9ef\">trait<\/span> <span style=\"color:#a6e22e\">PageRange<\/span> <span style=\"color:#f92672\">{<\/span>\n  <span style=\"color:#66d9ef\">def<\/span> contains<span style=\"color:#f92672\">(<\/span>pageNo<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Int<\/span><span style=\"color:#f92672\">)<\/span><span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Boolean<\/span>\n<span style=\"color:#f92672\">}<\/span>\n\n<span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#66d9ef\">class<\/span> <span 
style=\"color:#a6e22e\">DiscretePageRange<\/span><span style=\"color:#f92672\">(<\/span>ranges<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Vector<\/span><span style=\"color:#f92672\">[<\/span><span style=\"color:#66d9ef\">Range<\/span><span style=\"color:#f92672\">])<\/span> <span style=\"color:#66d9ef\">extends<\/span> <span style=\"color:#a6e22e\">PageRange<\/span> <span style=\"color:#f92672\">{<\/span>\n  <span style=\"color:#66d9ef\">override<\/span> <span style=\"color:#66d9ef\">def<\/span> contains<span style=\"color:#f92672\">(<\/span>pageNo<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Int<\/span><span style=\"color:#f92672\">)<\/span><span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Boolean<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">{<\/span>\n    ranges<span style=\"color:#f92672\">.<\/span>exists <span style=\"color:#f92672\">{<\/span> <span style=\"color:#66d9ef\">_<\/span><span style=\"color:#f92672\">.<\/span>contains<span style=\"color:#f92672\">(<\/span>pageNo<span style=\"color:#f92672\">)<\/span> <span style=\"color:#f92672\">}<\/span>\n  <span style=\"color:#f92672\">}<\/span>\n<span style=\"color:#f92672\">}<\/span>\n\n<span style=\"color:#66d9ef\">object<\/span> <span style=\"color:#a6e22e\">PageRange<\/span> <span style=\"color:#f92672\">{<\/span>\n  <span style=\"color:#66d9ef\">private<\/span> <span style=\"color:#66d9ef\">val<\/span> singleNumber <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#e6db74\">&#34;&#34;&#34;\\s*([0-9]+)\\s*&#34;&#34;&#34;<\/span><span style=\"color:#f92672\">.<\/span>r\n  <span style=\"color:#66d9ef\">private<\/span> <span style=\"color:#66d9ef\">val<\/span> pairNumber <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#e6db74\">&#34;&#34;&#34;\\s*([0-9]+)\\s*-\\s*([0-9]+)\\s*&#34;&#34;&#34;<\/span><span style=\"color:#f92672\">.<\/span>r\n\n  <span style=\"color:#66d9ef\">private<\/span> 
<span style=\"color:#66d9ef\">def<\/span> checkRange<span style=\"color:#f92672\">(<\/span>page<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Int<\/span><span style=\"color:#f92672\">,<\/span> maxPageNo<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Int<\/span><span style=\"color:#f92672\">)<\/span><span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Unit<\/span> <span style=\"color:#f92672\">=<\/span> <span style=\"color:#f92672\">{<\/span>\n    <span style=\"color:#66d9ef\">if<\/span> <span style=\"color:#f92672\">(<\/span>page <span style=\"color:#f92672\">&lt;<\/span> <span style=\"color:#ae81ff\">1<\/span><span style=\"color:#f92672\">)<\/span> <span style=\"color:#f92672\">{<\/span>\n      <span style=\"color:#66d9ef\">throw<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">Exception<\/span><span style=\"color:#f92672\">(<\/span><span style=\"color:#e6db74\">&#34;page must be at least 1&#34;<\/span><span style=\"color:#f92672\">)<\/span>\n    <span style=\"color:#f92672\">}<\/span>\n    <span style=\"color:#66d9ef\">if<\/span> <span style=\"color:#f92672\">(<\/span>page <span style=\"color:#f92672\">&gt;<\/span> maxPageNo<span style=\"color:#f92672\">)<\/span> <span style=\"color:#f92672\">{<\/span>\n      <span style=\"color:#66d9ef\">throw<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">Exception<\/span><span style=\"color:#f92672\">(<\/span><span style=\"color:#e6db74\">s&#34;page must be equal or less than <\/span><span style=\"color:#e6db74\">$maxPageNo<\/span><span style=\"color:#e6db74\">&#34;<\/span><span style=\"color:#f92672\">)<\/span>\n    <span style=\"color:#f92672\">}<\/span>\n  <span style=\"color:#f92672\">}<\/span>\n\n  <span style=\"color:#66d9ef\">def<\/span> from<span style=\"color:#f92672\">(<\/span>spec<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">String<\/span><span 
style=\"color:#f92672\">,<\/span> maxPageNo<span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Int<\/span><span style=\"color:#f92672\">)<\/span><span style=\"color:#66d9ef\">:<\/span> <span style=\"color:#66d9ef\">Try<\/span><span style=\"color:#f92672\">[<\/span><span style=\"color:#66d9ef\">PageRange<\/span><span style=\"color:#f92672\">]<\/span> <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#a6e22e\">Try<\/span> <span style=\"color:#f92672\">{<\/span>\n    <span style=\"color:#66d9ef\">val<\/span> ranges <span style=\"color:#66d9ef\">=<\/span> spec<span style=\"color:#f92672\">.<\/span>split<span style=\"color:#f92672\">(<\/span><span style=\"color:#e6db74\">&#39;,&#39;<\/span><span style=\"color:#f92672\">).<\/span>map <span style=\"color:#f92672\">{<\/span> s <span style=\"color:#66d9ef\">=&gt;<\/span>\n      <span style=\"color:#f92672\">(<\/span>pairNumber<span style=\"color:#f92672\">.<\/span>findFirstMatchIn<span style=\"color:#f92672\">(<\/span>s<span style=\"color:#f92672\">),<\/span> singleNumber<span style=\"color:#f92672\">.<\/span>findFirstMatchIn<span style=\"color:#f92672\">(<\/span>s<span style=\"color:#f92672\">))<\/span> <span style=\"color:#66d9ef\">match<\/span> <span style=\"color:#f92672\">{<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#f92672\">(<\/span><span style=\"color:#a6e22e\">Some<\/span><span style=\"color:#f92672\">(<\/span>pair<span style=\"color:#f92672\">),<\/span> <span style=\"color:#66d9ef\">_<\/span><span style=\"color:#f92672\">)<\/span> <span style=\"color:#66d9ef\">=&gt;<\/span>\n          <span style=\"color:#66d9ef\">val<\/span> pageL <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#a6e22e\">Integer<\/span><span style=\"color:#f92672\">.<\/span>parseInt<span style=\"color:#f92672\">(<\/span>pair<span style=\"color:#f92672\">.<\/span>group<span style=\"color:#f92672\">(<\/span><span style=\"color:#ae81ff\">1<\/span><span 
style=\"color:#f92672\">))<\/span>\n          <span style=\"color:#66d9ef\">val<\/span> pageR <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#a6e22e\">Integer<\/span><span style=\"color:#f92672\">.<\/span>parseInt<span style=\"color:#f92672\">(<\/span>pair<span style=\"color:#f92672\">.<\/span>group<span style=\"color:#f92672\">(<\/span><span style=\"color:#ae81ff\">2<\/span><span style=\"color:#f92672\">))<\/span>\n          checkRange<span style=\"color:#f92672\">(<\/span>pageL<span style=\"color:#f92672\">,<\/span> maxPageNo<span style=\"color:#f92672\">)<\/span>\n          checkRange<span style=\"color:#f92672\">(<\/span>pageR<span style=\"color:#f92672\">,<\/span> maxPageNo<span style=\"color:#f92672\">)<\/span>\n\n          <span style=\"color:#a6e22e\">Range<\/span><span style=\"color:#f92672\">.<\/span>inclusive<span style=\"color:#f92672\">(<\/span>pageL<span style=\"color:#f92672\">,<\/span> pageR<span style=\"color:#f92672\">)<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#f92672\">(<\/span><span style=\"color:#66d9ef\">_<\/span><span style=\"color:#f92672\">,<\/span> <span style=\"color:#a6e22e\">Some<\/span><span style=\"color:#f92672\">(<\/span>single<span style=\"color:#f92672\">))<\/span> <span style=\"color:#66d9ef\">=&gt;<\/span>\n          <span style=\"color:#66d9ef\">val<\/span> page <span style=\"color:#66d9ef\">=<\/span> <span style=\"color:#a6e22e\">Integer<\/span><span style=\"color:#f92672\">.<\/span>parseInt<span style=\"color:#f92672\">(<\/span>single<span style=\"color:#f92672\">.<\/span>group<span style=\"color:#f92672\">(<\/span><span style=\"color:#ae81ff\">1<\/span><span style=\"color:#f92672\">))<\/span>\n          checkRange<span style=\"color:#f92672\">(<\/span>page<span style=\"color:#f92672\">,<\/span> maxPageNo<span style=\"color:#f92672\">)<\/span>\n\n          <span style=\"color:#a6e22e\">Range<\/span><span style=\"color:#f92672\">.<\/span>inclusive<span 
style=\"color:#f92672\">(<\/span>page<span style=\"color:#f92672\">,<\/span> page<span style=\"color:#f92672\">)<\/span>\n        <span style=\"color:#66d9ef\">case<\/span> <span style=\"color:#f92672\">(<\/span><span style=\"color:#a6e22e\">None<\/span><span style=\"color:#f92672\">,<\/span> <span style=\"color:#a6e22e\">None<\/span><span style=\"color:#f92672\">)<\/span> <span style=\"color:#66d9ef\">=&gt;<\/span>\n          <span style=\"color:#66d9ef\">throw<\/span> <span style=\"color:#66d9ef\">new<\/span> <span style=\"color:#a6e22e\">Exception<\/span><span style=\"color:#f92672\">(<\/span><span style=\"color:#e6db74\">s&#34;Bad format for <\/span><span style=\"color:#e6db74\">$s<\/span><span style=\"color:#e6db74\">&#34;<\/span><span style=\"color:#f92672\">)<\/span>\n      <span style=\"color:#f92672\">}<\/span>\n    <span style=\"color:#f92672\">}.<\/span>toVector\n\n    <span style=\"color:#a6e22e\">DiscretePageRange<\/span><span style=\"color:#f92672\">(<\/span>ranges<span style=\"color:#f92672\">)<\/span>\n  <span style=\"color:#f92672\">}<\/span>\n<span style=\"color:#f92672\">}<\/span>\n<\/code><\/pre><\/div><h1 id=\"transcript\">Transcript<\/h1>\n<p><em>(I is the interviewer, S is the subject or interviewee)<\/em><\/p>\n<p>I: Now that you have a solution, can you act as a code reviewer? Consider not only the code, but also the design and requirements.<\/p>\n<p>S: From the requirements side, there seems to be an assumption that the caller will query each page of the document. A more natural API may be for the caller to get an iterator that returns each page number to be printed.<\/p>\n<p>S: Also, the specification is very open. Ranges may overlap, may be in any order, and there is no constraint on the length. If the interface was more restrictive, users might be better protected from their own mistakes.<\/p>\n<p>S: From a class design perspective, we have an interface, a concrete implementation, and a companion object to handle construction. 
Since the strings may be invalid, the companion object is an idiomatic way to separate the construction concerns from the &lsquo;runtime&rsquo; concerns. Having an interface may be overkill; it is uncertain if there would be a need for multiple implementations.<\/p>\n<p>I: The user interface may provide convenience options to print all even pages or odd pages. Does that influence your opinion?<\/p>\n<p>S: Maybe. Odd and even methods on the companion object can return the same concrete class, as <code>Range<\/code> supports steps &mdash; <code>odd()<\/code> is <code>Range.inclusive(1, maxPageNo, 2)<\/code>. Specializing the class for a vector with a single element would just eliminate a single memory indirection.<\/p>\n<p>I: Okay, what else in the code review?<\/p>\n<p>S: The <code>contains<\/code> method does a linear search of the ranges for the first range that contains the page, or returns false. Positively, the code is very short and direct. Negatively, this can be an <code>O(P*N)<\/code> algorithm where P is the number of pages in the document and N is the number of ranges.<\/p>\n<p>S: One solution would be to change the interface to an iterator-like design so it just returns the next valid page. That&rsquo;ll be linear. Another solution would be to sort the ranges and use a binary search, although if the number of ranges is small the cost will be effectively the same. Third, we could change the data structure to allow a constant-time lookup. A bitset, where each page is represented by a single bit, would take very little memory and support efficient look-ups and scans. Fourth, there are data structures meant for storing ranges, but that seems overkill for this application unless we had one at hand.<\/p>\n<p>I: How about the construction code?<\/p>\n<p>S: This is user-facing code, yet the error messages don&rsquo;t say how to fix the problem and don&rsquo;t include information about the location of the error. 
They can be improved, but rich error reporting is hard to include in an hour of coding.<\/p>\n<p>S: In Scala, companion objects often build the object via an <code>apply()<\/code> method so it will look like normal object creation. In this case, since we are returning a Try of the object, to signify that construction may fail, I think it is appropriate to use a named function instead.<\/p>\n<p>S: The parsing code goes through the string multiple times &mdash; we first split, and then apply two regular expressions to each substring. The regular expression could be written to handle all the groups in a single invocation, although that might become hard to read. The <code>[0-9]<\/code> in the regular expressions could be replaced by <code>\\d<\/code>. The regular expressions could also be dropped and the code could use a simple state machine. It would be more code, but error reporting could be better since it would retain all the context. A parser library could also be used, but adding a dependency to handle a format this simple is likely overkill, unless we already had it.<\/p>\n<p>S: Storing a vector of ranges is pretty intensive memory-wise. There are more efficient ways to store the data, such as bitsets or even arrays of numbers, but page ranges are likely going to be short-lived and, outside of very long examples, we&rsquo;re talking hundreds of kilobytes or less of RAM.<\/p>\n<p>I: To summarize, how well does this code meet the requirements and use case?<\/p>\n<p>S: Overall, the requirements and use case don&rsquo;t seem that performance sensitive, so my code review would focus on UX, clarity, and maintainability. If this were a feature that the business wanted to invest in heavily and polish, then we&rsquo;d want more specific error messages. That could lead to a change in approach for parsing. Similarly, if the use case routinely handled very complex page ranges, then performance may become important. 
Otherwise, the design is fairly modular and separates concerns. Unit testing will be straightforward and the code doesn&rsquo;t involve anything outside the standard library. So, it&rsquo;s close to approved.<\/p>\n"},{"title":"Infinite Race","link":"https:\/\/ztoz.blog\/posts\/infinite-race\/","pubDate":"Sun, 30 Jan 2022 17:36:03 -0700","guid":"https:\/\/ztoz.blog\/posts\/infinite-race\/","description":"<p><em>Infinite Race<\/em> is a mathematical racing game. Two players compete to move their piece the greatest distance along the number line by choosing how the dice will fall. However, the rolls must fit a distribution perfectly, so the player&rsquo;s luck must revert to the mean. The strategy is to select a sequence of moves which allows one player to break free of the pack.<\/p>\n<h1 id=\"setup\">Setup<\/h1>\n<p>Each player specifies an infinite sequence of moves at the beginning of play, where the move depends on the time step (a non-negative integer) and no other inputs. A move may be zero. For the binary version of the game, a move may either be zero or one step forward. The moves must fit the game version&rsquo;s statistical distribution perfectly, although the player chooses the interval for when the conformance will be tested. For the binary version of the game, the distribution is uniform and expects an equal number of no moves (zeros) and forward moves (ones). The player&rsquo;s chosen interval is known as the period.<\/p>\n<h1 id=\"movement\">Movement<\/h1>\n<p>Players move simultaneously. If both players are moving to the same destination (integer position on the number line), one of the players may be bumped back. The player that was initially behind the other player will cause the other player to be bumped back one position. If both players were at equal positions initially, neither player is bumped back.<\/p>\n<p>For example, player A is at position 3 and player B is at position 4. 
Player A is moving forward three positions while B is moving forward two; this means both will end up at position 6. Since A started at a position before B, A will bump back B. Thus, A will end at position 6 and B at position 5.<\/p>\n<h1 id=\"winning\">Winning<\/h1>\n<p>Winning a race requires escaping the sinusoidal back-and-forth between both players. If a player manages to create a lead greater than their period, they win. In calculus terms, if we compute the quotient of the two players&rsquo; positions, as the time step goes to infinity, the quotient will either go to infinity or zero, while a non-winning race will go to one.<\/p>\n<h1 id=\"simulator\">Simulator<\/h1>\n<p>I&rsquo;ve written an in-browser <a href=\"https:\/\/infiniterace.ztoz.blog\">game simulator<\/a>.<\/p>\n<p>The simulator is <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/infinite-race\">written<\/a> as a Rust library, compiled into WASM, and then run in the browser wrapped in a little bit of vanilla JavaScript.<\/p>\n<blockquote>\n<p>Why? I anticipate using and expanding the Rust library into other non-WASM environments, but WASM allows me to re-use the code inside the browser. Rust has one of the best code re-use stories of any major language and the type system and language design provide obstacles against sloppy coding.<\/p>\n<\/blockquote>\n"},{"title":"Update to Game of Ur Background","link":"https:\/\/ztoz.blog\/posts\/ur-background\/","pubDate":"Fri, 17 Dec 2021 11:16:48 -0700","guid":"https:\/\/ztoz.blog\/posts\/ur-background\/","description":"<p>I&rsquo;ve made an update to the <a href=\"https:\/\/ztoz.blog\/posts\/game-of-ur-background\/\">Game of Ur - Background<\/a> post. 
A reader took up my challenge to find a closed form solution for the number of valid game placements and <a href=\"https:\/\/distracted-jackson-c5763c.netlify.app\/math\/ur.html\">found a solution<\/a>.<\/p>"},{"title":"Finding Sierpi\u0144ski in the Oddest Places","link":"https:\/\/ztoz.blog\/posts\/finding-sierpinski\/","pubDate":"Thu, 16 Dec 2021 17:23:08 -0700","guid":"https:\/\/ztoz.blog\/posts\/finding-sierpinski\/","description":"<p>This is a tale of finding fractals in an unexpected place and why the appearance makes sense in hindsight.<\/p>\n<p>I was looking for patterns in valid board game configurations, specifically the middle contesting row in the Game of Ur. In the game, there are eight slots either of the two players may occupy, but a slot may only be occupied by at most one player. I represented the available slots as two bit strings, with each bit representing a single slot. A valid placement is one where the two corresponding bits are both false or only one is true, but never both are true. Formally, this is equivalent to the bitwise formula \\(\\neg (p \\land q)\\) or \\(p \\uparrow q\\).<\/p>\n<p>Plotting the p and q bitstrings as axes within a matrix, I saw:<\/p>\n<p><img src=\"sierpinski256.png\" alt=\"Sierpinski Triangle in 256x256 matrix form\"><\/p>\n<p>I recognized this set of repeated triangles as a Sierpi\u0144ski Triangle (or Gasket). Described by Wac\u0142aw Franciszek Sierpi\u0144ski in 1915 (mathematically; there may have been artistic representations in the 13th century by the Cosmati<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>), the Gasket is an iterative structure with infinite scaling. The Triangle is considered a fractal (and one of the first discussed in any primer), but the discovery predates the field of fractal geometry. 
A typical way to describe the construction<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup> is to inscribe a triangle with vertices (0,0), (0,1), and (1,0). Then, inscribe three triangles via the transformations:<\/p>\n<ol>\n<li>\\((x,y) \\rightarrow (\\frac{1}{2}x, \\frac{1}{2}y) \\)<\/li>\n<li>\\((x,y) \\rightarrow (\\frac{1}{2}x + \\frac{1}{2}, \\frac{1}{2}y) \\)<\/li>\n<li>\\((x,y) \\rightarrow (\\frac{1}{2}x, \\frac{1}{2}y + \\frac{1}{2}) \\)<\/li>\n<\/ol>\n<p>Animating this construction (note that this figure has been rotated versus the matrix orientation):<\/p>\n<p><img src=\"https:\/\/www.wolframcloud.com\/obj\/85625c5a-11ad-47cd-9b03-755d36b4c3ea\" alt=\"Construction of a Sierpinski Triangle\" title=\"Construction of Sierpinski Triangle\"><\/p>\n<p>Of course, the Triangle&rsquo;s representation can also be rotated and projected without impacting the structure. Equilateral constructions are very popular.<\/p>\n<p>There are many ways to construct the Triangle, including many that aren&rsquo;t graphical. The Triangle is registered as <a href=\"https:\/\/oeis.org\/A001317\">OEIS A001317 Sierpi\u0144ski&rsquo;s triangle (Pascal&rsquo;s triangle mod 2) converted to decimal<\/a>. The catalog entry contains 16 formulas and 17 links. Due to the link to Pascal&rsquo;s Triangle, which has a tremendous number of constructions, this is a small subset of the known set.<\/p>\n<p>Looking through the catalog, I tried to find a formula that corresponded to how I was generating the Triangle but didn&rsquo;t find one. 
The exclusive-or method &mdash; \\(a_{n+1} = a_n \\oplus 2*a_n\\) &mdash; seemed like a good candidate, but I did not find an algebraic equivalence.<\/p>\n<p>Since my formulation wasn&rsquo;t generating integers but rather a binary matrix, I found a different formulation that was defined for binary matrices:<\/p>\n<p>$$a_{i=1,j=*}=1$$<\/p>\n<p>$$a_{i=*,j=1}=1$$<\/p>\n<p>$$a_{i,j} = a_{i-1,j} \\oplus a_{i,j-1}$$<\/p>\n<p>This formulation can be easier to see in smaller matrices:<\/p>\n<p><img src=\"sierpinski8.png\" alt=\"Sierpinski Triangle in 8x8 matrix form\"><\/p>\n<p>I was still having difficulty finding an equivalence between this formula and mine, although the above seemed closer. In particular, I started thinking of the representations of binary numbers and how my ordering incorporated binary operations. Thinking that I might have discovered a novel representation, I expanded my research and found that, yes, someone had discussed this construction before (and they didn&rsquo;t even treat it as novel). Alas.<\/p>\n<p><em>Fractal Images of Formal Systems<\/em><sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote-ref\" role=\"doc-noteref\">3<\/a><\/sup> explores the intersection of formal systems (e.g. propositional calculus) and fractals via visualizations. (Academic works tend to be what they say on the tin.) In one of the examples, the authors explore computing NAND (<em>not and<\/em>; often depicted via the Sheffer stroke \\(\\uparrow\\)) of two sentences of propositions. They found that arranging these sentences in ascending order and plotting the results yielded a Sierpi\u0144ski Triangle. This approach is analogous to my application of NAND&rsquo;ing two binary strings.<\/p>\n<p>So, why is the Triangle generated by the NAND of two bitstrings, projected onto a two-dimensional matrix? 
Drawing upon the discussion in <a href=\"http:\/\/pgrim.org\/fractal\/Sierpins.html\"><em>Fractal Images<\/em><\/a>:<\/p>\n<p>First, we can consider one of the classic constructions (similar to before): For any given triangle, the interior points can be divided into two classes. The first class of points are ones where, if we measure the distance from them to the closest vertex of the surrounding triangle, a doubling of that distance would be outside the triangle. The second class of points are the ones where the doubling of the distance would be inside the triangle. The first class will form an empty inverted triangle in the center. Of the points in the second class, we can repeat the process by separating points that would be outside the triangle by twice doubling the distance versus ones that would be inside. A Sierpi\u0144ski Triangle is formed via the infinite repetition of this process.<\/p>\n<p>How can we map the above construction of the Triangle into our NAND matrix world? Consider the NAND logic inscribed on a unit square, where the true values are connected to form a triangle:<\/p>\n<p><img src=\"sierpinski-unit.png\" alt=\"NAND Logic within a Unit Square\"><\/p>\n<p>If we implement the doubling logic within this unit square, the logic acts as:<\/p>\n<table>\n<thead>\n<tr>\n<th>Closest Vertex<\/th>\n<th>Translation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>A (Origin)<\/td>\n<td>\\( (x,y) \\rightarrow (2x, 2y) \\)<\/td>\n<\/tr>\n<tr>\n<td>B (1,0)<\/td>\n<td>\\( (x,y) \\rightarrow (1 - 2(1-x), 2y) \\rightarrow (2x - 1, 2y) \\)<\/td>\n<\/tr>\n<tr>\n<td>C (0,1)<\/td>\n<td>\\( (x,y) \\rightarrow (2x, 1 - 2(1-y)) \\rightarrow (2x, 2y - 1) \\)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>(It may be helpful to think of an inward pointing vector from the vertex.)<\/p>\n<p>What does doubling mean in terms of numbers encoded as binary strings? Multiplying a binary number by 2 is equivalent to a left shift (and halving is equivalent to a right shift). 
Subtracting a one will invert the lowest bit (among other potential changes).<\/p>\n<p>Shifts either insert a zero at the lowest bit or remove the lowest bit from the string. A pair of zeros will not change the result of a NAND computation, so if two strings are valid, we can left shift them infinitely with no change. Similarly, if two strings are valid under NAND, stripping the lowest bits will not change the result of the computation.<\/p>\n<p>Subtracting the one will eliminate any one-bits on the lowest bit. If the lowest bit is not set, this leads to an arithmetic subtraction that may invert a number of bits depending on the original string (borrowing). How does borrowing avoid violating the NAND rules? For the two original strings, there are three possibilities for the lowest bits: x=0 y=0, x=0 y=1, and x=1 y=0. (The case of x=1 y=1 would be invalid per NAND.) For the second and third case, the subtraction will borrow from a placement of 1 leading to a valid string. For the first case, rule A does not require borrowing. Thus, the recursive nature of the construction remains valid under the rules of NAND and the Triangle is approximated by NANDing binary numbers.<\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Brunori P., Magrone P., Lalli L.T. (2019) Imperial Porphiry and Golden Leaf: Sierpinski Triangle in a Medieval Roman Cloister. In: Cocchiarella L. (eds) ICGG 2018 - Proceedings of the 18th International Conference on Geometry and Graphics. ICGG 2018. Advances in Intelligent Systems and Computing, vol 809. Springer, Cham. <a href=\"http:\/\/www.formulas.it\/sito\/wp-content\/uploads\/2021\/03\/B-M-TedeschiniLalli_authorversion.pdf\">Author PDF<\/a>&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>Falconer, Kenneth. <em>Fractals: A very short introduction<\/em>. 
OUP Oxford, 2013.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:3\" role=\"doc-endnote\">\n<p>Denis, Paul St, and Patrick Grim. &ldquo;Fractal images of formal systems.&rdquo; Journal of philosophical logic 26.2 (1997): 181-222. <a href=\"https:\/\/philpapers.org\/archive\/DENFIO.pdf\">PDF<\/a> <a href=\"http:\/\/pgrim.org\/fractal\/index.html\">HTML<\/a>&#160;<a href=\"#fnref:3\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>\n"},{"title":"How to Solve Incidents - a research proposal","link":"https:\/\/ztoz.blog\/posts\/how-to-solve-incidents\/","pubDate":"Fri, 03 Dec 2021 15:09:56 -0700","guid":"https:\/\/ztoz.blog\/posts\/how-to-solve-incidents\/","description":"<p><a href=\"https:\/\/press.princeton.edu\/books\/paperback\/9780691164076\/how-to-solve-it\"><em>How to Solve It<\/em><\/a>, but for operational incidents.<\/p>\n<p>Engineers with site reliability responsibilities are often faced with operational issues (incidents) that have an unknown cause and uncertain solution. <em>How to Solve It<\/em> is a classic work describing heuristics for solving mathematical problems. There have been adaptations of this book for different domains, but none yet for software-intensive operations. Within this study, we aim to build a taxonomy or categorization of techniques\/heuristics in practice for understanding and overcoming operational issues. The catalog will be based on a survey of experienced engineers and real incident logs and postmortems from a variety of organizations. Readers will be able to leverage the catalog to solve tricky and novel production issues.<\/p>\n<h1 id=\"research-plan\">Research Plan<\/h1>\n<p>This project will use both survey and field study methods to catalog &ldquo;in practice&rdquo; incident mitigation techniques and heuristics. 
We expect the project to take between 6 and 9 months (to the point of a submission-ready article), although the timeline may be extended due to data access issues since operational data is often sensitive and negotiations may be prolonged.<\/p>\n<h2 id=\"data-collection\">Data Collection<\/h2>\n<p>We will reach out to organizations (primarily SaaS or computing-intensive companies) for:<\/p>\n<ol>\n<li>Interviews with senior engineers &mdash; how do they approach an unfamiliar production problem? Do they use any explicit mental models, heuristics, or existing processes?<\/li>\n<li>Access to incident logs, chat\/ticket threads, and\/or root cause analyses. In short, documents describing how the team works to resolve incidents, preferably &ldquo;raw&rdquo;. The research team will then:\n<ol>\n<li>Document problem-solving threads by incident &mdash; i.e., faced with <em>x<\/em> situation, attempted <em>y<\/em> or sought to determine <em>z<\/em> status. Was this successful?<\/li>\n<li>Document what mental models (if any) were being used. Were these mental models correct?<\/li>\n<li>Note: Raw incident logs are preferred to RCAs. For example, <a href=\"https:\/\/slack.engineering\/what-happened-during-slacks-dnssec-rollout\/\">Slack&rsquo;s DNSSEC RCA<\/a> is very detailed on the cause of the problem, the prior attempts to prevent an outage, and actions to correct the outage once it happened, but is light on details about specific troubleshooting steps and unsuccessful threads. RCAs can provide useful data, but usually gloss over the troubleshooting activity.<\/li>\n<\/ol>\n<\/li>\n<li>Access to &ldquo;Try if everything else fails&rdquo;-style runbooks (if they exist).<\/li>\n<\/ol>\n<p>A suitable organization should have at least 30 relevant incidents. We are targeting at least 10 organizations within the study.<\/p>\n<p>A sample of incidents will be excluded from the coding effort to form an &ldquo;out of band&rdquo; pool. 
We will also augment the out of band pool with publicly documented incident RCAs.<\/p>\n<h3 id=\"data-protection\">Data Protection<\/h3>\n<p>We anticipate that all organizations that participate in the study will be acknowledged within the paper and that individuals interviewed may be quoted with their consent. The paper will not link heuristics to specific companies or incidents; heuristics and techniques will be generalized and anonymized. A heuristic must be practiced by at least two organizations to be referenced.<\/p>\n<p>We expect to sign NDAs or similar legal agreements to protect sensitive data and may accept reasonable constraints on access and storage of any shared data.<\/p>\n<p>As members of the Association for Computing Machinery, we are bound by the Code of Ethics, in particular <a href=\"https:\/\/www.acm.org\/code-of-ethics#h-1.7-honor-confidentiality.\">section 1.7 Honor Confidentiality<\/a>:<\/p>\n<blockquote>\n<p>Computing professionals are often entrusted with confidential information such as trade secrets, client data, nonpublic business strategies, financial information, research data, pre-publication scholarly articles, and patent applications. Computing professionals should protect confidentiality except in cases where it is evidence of the violation of law, of organizational regulations, or of the Code. In these cases, the nature or contents of that information should not be disclosed except to appropriate authorities. A computing professional should consider thoughtfully whether such disclosures are consistent with the Code.<\/p>\n<\/blockquote>\n<h2 id=\"create-taxonomycategorization\">Create taxonomy\/categorization<\/h2>\n<p>We expect to code the collected data via a faceted coding approach. An example of faceted coding is Ranganathan&rsquo;s five core categories, typically labeled PMEST. 
P is the personality or most specific subject, M is the material or component, E is the energy or activity, operation, or process, S is the space or location, and T is the time. A specific heuristic may not have all five facets.<\/p>\n<p>As an example, let&rsquo;s say that within an incident the engaging team has detected a drop in production rate from a distributed group of queue-based workers. For coding, the personality is the queue-based job workers. The material could be the workers, the queue, the underlying compute infrastructure, a worker dependency, or even ancillary items like the logging and monitoring subsystem. The energy will be the specific action, such as restarting workers, checking connections to the queue, or verifying health of dependencies.<\/p>\n<p>The coded values can then be organized into a taxonomy by grouping similar items by facet and frequency.<\/p>\n<p>Why go through the effort to create a taxonomy? <em>How to Solve It<\/em> organizes the heuristics alphabetically, <em>How to Solve It: Modern Heuristics<\/em> organizes items hierarchically by academic discipline, and <em>Discussion of the Method: Conducting the Engineer&rsquo;s Approach to Problem Solving<\/em> lacks any reference-able organization. There are two reasons to go through the effort: construction and reference. Construction-wise, a taxonomy of heuristics can identify gaps in the catalog (similar to how gaps in the Periodic Table pointed to undiscovered elements). Reference-wise, a good organization will make it easier for readers to apply the catalog to incidents.<\/p>\n<blockquote>\n<p><em>Typology versus Taxonomy.<\/em> Both are terms for categorization, but in some domains a typology is conceptual while a taxonomy is empirical. Typologies are organized by dimensions that a human specifies, while a taxonomy is created via cluster analysis based on natural attributes or dimensions selected by humans. 
Since heuristics lack natural attributes, the more correct term for this activity is typology, albeit more obscure.<\/p>\n<\/blockquote>\n<h2 id=\"testing-the-taxonomy\">Testing the Taxonomy<\/h2>\n<p>Ideally, the true test of efficacy would track reduction in repair time due to improved problem solving with and without the catalog. However, this would require controlled experiments or a large statistical study, both of which are studies in their own right. Instead, the validity of the catalog will be evaluated for (practical) completeness and likelihood of utility.<\/p>\n<p>Completeness will be evaluated via out-of-band examples. For the incidents and public RCAs marked as out-of-band during the data collection phase, we will map the incident&rsquo;s narrative to the coded catalog to check for any absences in the catalog or difficulties in following the organization. Out-of-band examples may be added to the catalog (assuming they meet our inclusion threshold), but their initial absence is a useful statistical indicator of completeness.<\/p>\n<p>Likelihood of utility will be evaluated via an outside group of both inexperienced and experienced engineers who will be asked to use the catalog on prior incidents (of their choosing) and current incidents and record their experiences. (See <a href=\"https:\/\/accidental-taxonomist.blogspot.com\/2013\/03\/testing-taxonomies.html\">Use testing<\/a> for more context.) Their experiences will be recorded via survey and aggregated.<\/p>\n<h1 id=\"hypothetical-categorization\">Hypothetical Categorization<\/h1>\n<p>To provide an idea of what the categorization might look like:<\/p>\n<ol>\n<li>Have you tried turning it off and on again? 
(Blind luck strategies)\n<ol>\n<li>Restarting the complaining service<\/li>\n<li>Restarting the complaining service&rsquo;s dependencies<\/li>\n<li>Redeploying<\/li>\n<li>Moving to another host \/ cluster<\/li>\n<li>Rollback<\/li>\n<li>Search internet for similar error messages<\/li>\n<\/ol>\n<\/li>\n<li>Configuration management\n<ol>\n<li>Restore previous configuration<\/li>\n<li>Attempt to isolate problem to certain instances \/ shard<\/li>\n<li>Verify configuration same between instances<\/li>\n<li>Restore previous configuration (dependency)<\/li>\n<\/ol>\n<\/li>\n<li>Capacity management\n<ol>\n<li>Increase allocation<\/li>\n<li>Decrease allocation<\/li>\n<li>Rebalance among shards<\/li>\n<\/ol>\n<\/li>\n<li>Traffic management\n<ol>\n<li>Throttle traffic source<\/li>\n<li>Eliminate traffic source<\/li>\n<li>Route traffic to another shard<\/li>\n<li>Verify traffic along route<\/li>\n<li>Eliminate traffic magnification<\/li>\n<\/ol>\n<\/li>\n<li>Broaden scope\n<ol>\n<li>Orchestration issue<\/li>\n<li>Operating System\/Host Issue<\/li>\n<li>DNS \/ Service Lookup Issue<\/li>\n<li>Network Issue<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>Potentially, the categorization could become a decision tree, depending on the statistical strength of the data.<\/p>\n<p>Each entry would contain a description, including any known failure modes and additional context.<\/p>\n<h1 id=\"prior-art-and-why-this-is-novel\">Prior Art (and why this is novel)<\/h1>\n<p><a href=\"https:\/\/press.princeton.edu\/books\/paperback\/9780691164076\/how-to-solve-it\">How to Solve It<\/a> by G.Polya<\/p>\n<p>This is a classic work on mathematical methods to solve problems. Although some of the heuristics described (e.g. Do you know a related problem, Draw a figure, Generalization) are applicable beyond mathematical problems, most of the heuristics fall within the mathematics domain. 
In contrast, the proposed project targets the operation of software-intensive systems and also seeks to quantitatively find which heuristics seem to work most often.<\/p>\n<p><a href=\"https:\/\/link.springer.com\/book\/10.1007\/978-3-662-07807-5\">How to Solve It: Modern Heuristics<\/a> by Z. Michalewicz and B. Fogel<\/p>\n<p>This book follows a similar approach to <em>How to Solve It<\/em>, but changes the domain to computer science algorithms by discussing approaches like constraint solving, search, and evolution. None of the heuristics are particularly relevant to operations.<\/p>\n<p><a href=\"https:\/\/sre.google\/sre-book\/effective-troubleshooting\/\">Effective Troubleshooting<\/a> by Chris Jones (Google&rsquo;s SRE Book); <a href=\"https:\/\/www.atlassian.com\/incident-management\">Atlassian Incident Management Handbook<\/a><\/p>\n<p>Both focus on the management of incidents (e.g. communication, maintaining focus, leadership) and the general process of solving problems rather than specific actions or techniques the engineering group can perform, particularly if the cause is novel. Although a scientific or structured approach certainly aids in mitigating issues, our work will be focused on specific techniques to be used within that greater framework.<\/p>\n<p><a href=\"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/3368089.3417054\">How to mitigate the incident? An effective troubleshooting guide recommendation technique for online service systems<\/a><sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>\nThis paper studies &ldquo;troubleshooting guides&rdquo; (TSGs) at Microsoft and develops a recommendation system for suggesting specific actions operators can take based on an incident description. Examining incident records, the authors find that a <em>lower bound<\/em> of 27.2% of incidents had an existing TSG and that a large proportion (~60%) of incidents contained novel failures (no existing or relevant guide). 
The proposed work is not focused on making it easier to find existing runbooks for recurring or known failure modes, but on shortening the time to restore for novel or ambiguous failure modes, which the paper suggests are the majority of cases within their sample.<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup><\/p>\n<h1 id=\"publication-venue\">Publication Venue<\/h1>\n<p>Peer reviewed with free access to the final paper (e.g. Open Access). Potential venues include SREcon, DevOpsDays, and magazines with broader scope like <em>Communications of the ACM<\/em> (Contributed Articles section).<\/p>\n<p>If there is a large amount of material, this could become a book or long-form website, but we would prefer to start with an article.<\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Jiajun Jiang, Weihai Lu, Junjie Chen, Qingwei Lin, Pu Zhao, Yu Kang, Hongyu Zhang, Yingfei Xiong, Feng Gao, Zhangwei Xu, Yingnong Dang, and Dongmei Zhang. 2020. How to mitigate the incident? an effective troubleshooting guide recommendation technique for online service systems. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE 2020). Association for Computing Machinery, New York, NY, USA, 1410\u20131420. DOI:https:\/\/doi.org\/10.1145\/3368089.3417054&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>As noted in the paper, Microsoft does not require TSGs to be developed or maintained. 
We expect the availability of runbooks to vary widely between organizations based on their culture.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>\n"},{"title":"Game Wizards","link":"https:\/\/ztoz.blog\/posts\/game-wizards\/","pubDate":"Tue, 23 Nov 2021 11:50:58 -0700","guid":"https:\/\/ztoz.blog\/posts\/game-wizards\/","description":"<p>Game Wizards: The Epic Battle for Dungeons &amp; Dragons\nby Jon Peterson<\/p>\n<p><a href=\"https:\/\/mitpress.mit.edu\/books\/game-wizards\">Publisher&rsquo;s Website<\/a><\/p>\n<p><em>Game Wizards<\/em> is a business history spanning the creation of Dungeons &amp; Dragons and TSR to the firing of Gary Gygax in 1985. The book has a special focus on the legal battles between Gary Gygax and Dave Arneson around creator and royalty rights for D&amp;D, but also covers the rivalry between Gen Con and Origins and the overall growth of TSR. Jon Peterson assembles the story from a broad range of sources, including media portrayals, fanzines, interviews, legal documents, unexpected artifacts, and <a href=\"http:\/\/playingattheworld.blogspot.com\/2021\/09\/game-wizards-tsr-financials.html\">recreations of financial statements<\/a>.<\/p>\n<p>Peterson lays out the pivotal and mundane operations by year. This approach allows a very deep recalling of each year&rsquo;s happenings and an &ldquo;uncompressing&rdquo; of history. As D&amp;D takes off in the early 1980s, TSR becomes profligate with spending and bad habits become entrenched. By maintaining a constant velocity through history, the narrative is less of a &ldquo;bubble burst&rdquo; business story and more a nuanced look at the mechanics of selling and producing the games. 
The book&rsquo;s detail will make it a primary resource for later historians and should be of interest to game entrepreneurs for details on marketing, distribution, and rights management.<\/p>\n<p>The result is far less hagiographic than <em>Empire of Imagination: Gary Gygax and the Birth of Dungeons &amp; Dragons<\/em> (2015) and showcases more of the personality issues, lousy business practices, and messiness inherent in starting a business within a novel market. The book spends little time on game design or why roleplaying games were successful or fun; for example, non-gamers may be mystified why modules were so lucrative as the reader is assumed to understand the mechanics of hosting a gaming session. However, audience restrictions aside, the book is a very valuable addition to the relatively young field of game studies and should be very useful for comparative studies of how other entertainment industries managed their growth and talent.<\/p>"},{"title":"Engagement Insider Threat","link":"https:\/\/ztoz.blog\/posts\/engagement-insider-threat\/","pubDate":"Mon, 15 Nov 2021 12:38:16 -0600","guid":"https:\/\/ztoz.blog\/posts\/engagement-insider-threat\/","description":"<p>Warning: Contains spoilers for season 6, episode 5 (&ldquo;Tethics&rdquo;) of the show <em>Silicon Valley<\/em>, which aired in 2019.<\/p>\n<p>In &ldquo;Tethics&rdquo;, two managers of Pied Piper, Gilfoyle and Monica, are threatened by HR due to their low employee engagement scores &mdash; there&rsquo;s a consensus among the employees that the two managers are jerks. They both promise to raise their scores, but stupidly promise to raise the scores from &ldquo;hate&rdquo; to &ldquo;love&rdquo; within a week. Faced with this impossible task, they hatch a plan to use social engineering to steal employees&rsquo; passwords and then, using the compromised employees&rsquo; accounts, modify the engagement scores to show that employees now love them. How realistic is this plan? 
How can we mitigate these kinds of attacks? Let&rsquo;s use a threat modeling approach to answer both.<\/p>\n<p>More formally, the attackers&rsquo; objective is to modify the engagement data to benefit themselves. As managers within the company, they will know:<\/p>\n<ol>\n<li>what system is being used to collect and analyze the data (we will assume this is commercial, non-self-hosted SaaS software),<\/li>\n<li>who (in their company) would be administering the project,<\/li>\n<li>who would be providing engagement scores (the respondents), and<\/li>\n<li>the high-level project plan and process.<\/li>\n<\/ol>\n<p>The data needs to be modified surreptitiously; if the engagement data is deleted or is exposed as doctored, the attackers may have gained a short reprieve but haven&rsquo;t accomplished their goal of appearing as better managers. In addition, we will assume the attackers do not have advanced technical skills to attack the SaaS software directly<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup> and the time budget is closer to days than months.<\/p>\n<p>The proposed attack is two-stage:<\/p>\n<ol>\n<li>Obtain employee passwords by using social engineering\/knowledge of the employees&rsquo; backgrounds to guess passwords<\/li>\n<li>Leveraging those accounts, change the employees&rsquo; ratings for the managers in the engagement monitoring system<\/li>\n<\/ol>\n<p>In the STRIDE threat model<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup>, this is a Spoofing threat (obtaining account access) that can then allow a Tampering threat (modifying engagement score data). The assets to be protected are passwords in the first stage and the employee ratings in the second. 
Spoofing threats are mitigated via authentication and tampering threats are mitigated via ensuring data integrity.<\/p>\n<p>(If you find this kind of analysis interesting, I strongly recommend Adam Shostack&rsquo;s <em>Threat Modeling<\/em><sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote-ref\" role=\"doc-noteref\">3<\/a><\/sup> book. Also, Ross Anderson&rsquo;s <em>Security Engineering<\/em><sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote-ref\" role=\"doc-noteref\">4<\/a><\/sup> covers a broader systems-approach to security; the two books are complementary.)<\/p>\n<h1 id=\"spoofing-mitigations\">Spoofing mitigations<\/h1>\n<p>While the employee engagement responses may be anonymized or otherwise have some sort of privacy protections, the actual act of providing feedback is not anonymous. Instead, the collection system requires some ability to know who the respondent is in order to tie the responses to the respondent&rsquo;s place in the organization. (Engagement studies often include questions specific to certain roles or organizations and results need to be aggregated under the appropriate manager.) Thus, the collection process requires respondent authentication. Within the episode, the authentication is handled via passwords, so that is where we&rsquo;ll focus our model.<\/p>\n<h2 id=\"password-policyrecovery-policy\">Password policy\/recovery policy<\/h2>\n<p>Using personal information to aid in guessing a user&rsquo;s password is not a novel attack; it&rsquo;s a standard <a href=\"https:\/\/tvtropes.org\/pmwiki\/pmwiki.php\/Main\/ThePasswordIsAlwaysSwordfish\">trope in fiction<\/a> but one that still works in the real world. As a straightforward attack, it relies on users choosing poor passwords and on the systems being vulnerable to guessing attacks. 
Better passwords can be encouraged via the use of password managers (which reduce the pain of a unique password per site), single sign-on (which increases the value of a strong password), or policies that encourage higher entropy passwords (although onerous policies can encourage users to use their creativity to perversely choose worse passwords).<\/p>\n<p>Additionally, if the system locks out (even temporarily) an account after a number of bad logins, this will slow the rate at which an attacker can guess passwords. Within this scenario, the managers do not have a large amount of time to guess passwords and there is no indication they can guess passwords &ldquo;off-line&rdquo;, so a good authentication system can throttle their attack.<\/p>\n<p>An alternative interpretation of their attack is they are not attacking passwords directly, but rather using personal details to attack the password recovery mechanism. Password recovery mechanisms typically require the user to answer a handful of personal questions and, if they answer correctly, the system will initiate a password change. Recovery questions often suffer from a low number of possible answers and answers can often be sourced with some public background checks. Good password recovery systems will notify the user (ideally through multiple channels) about the password change, so this can cause the attack to become known if the users see the notifications and decide to flag them as improper. However, not all recovery systems include these features, so the attackers could maintain their cover.<\/p>\n<h2 id=\"two-factor-authentication\">Two-factor authentication<\/h2>\n<p>The three factors are something you know, something you have, and something you are. Two-factor authentication requires two of these factors, normally a password and a key (often via a multi-factor authentication app on a phone or a physical device). 
If providing an engagement score requires two-factor authentication, then the cost of spoofing dramatically increases because a password, by itself, is not useful. However, this also increases the cost to the respondent (and the complexity of the collection) since the respondent will need to go through a longer and more error-prone authentication process. If this creates too much hassle, there will be a decrease in the response rate which may invalidate the utility of the engagement project.<\/p>\n<p>A single sign-on (SSO) system, which may require an onerous authentication process but will re-use the authentication across multiple systems, can spread the pain sufficiently to mitigate the friction concerns above. However, SSO is more complex to set up and not all vendors support it.<\/p>\n<h2 id=\"alternative-authentication-approaches\">Alternative authentication approaches<\/h2>\n<p>Could we mitigate the attack by not using passwords at all? In <em>The quest to replace passwords<\/em><sup id=\"fnref:5\"><a href=\"#fn:5\" class=\"footnote-ref\" role=\"doc-noteref\">5<\/a><\/sup>, the authors review three dozen approaches against two dozen criteria and find that none of the proposed alternatives to passwords is inarguably superior across usability, deployability, and security dimensions. There are trade-offs in all approaches and all examined approaches have at least one feature that is &ldquo;worse&rdquo; than passwords.<\/p>\n<p>For example, take the &ldquo;One Time Password over Email&rdquo; approach<sup id=\"fnref:6\"><a href=\"#fn:6\" class=\"footnote-ref\" role=\"doc-noteref\">6<\/a><\/sup>. In this approach, users submit their email address to a server which persists one-half of a split token as a cookie and sends the other half to the user&rsquo;s email. The user supplies the email token and the server combines and verifies the two tokens. 
Versus passwords, this scheme has the advantages that the user does not need to remember any long-term tokens and that it is resilient to someone watching the user type, to both throttled and unthrottled guessing, to release of information from other parties, and to phishing. However, this scheme is inefficient to use<sup id=\"fnref:7\"><a href=\"#fn:7\" class=\"footnote-ref\" role=\"doc-noteref\">7<\/a><\/sup>, is incompatible with existing password-based systems, is immature versus other schemes, requires trusting a third-party (email system and intermediaries), and allows linking of accounts between disparate systems.<\/p>\n<p>For the employee engagement application, the fact that emails are required is likely a non-issue because the system probably already knows quite a bit more about the employee; it&rsquo;s already a PII-sensitive application. Adding friction to the login process is non-ideal as the project needs to maintain response rates, although employee engagement projects tend to have motivated respondents. The immaturity of the scheme <em>may<\/em> be acceptable for the respondent use case. Respondents don&rsquo;t need typical user accounts as they are only supplying data to the system. However, using a novel system will require more development effort to implement, may incur a larger user training cost, and will have a higher cost in security audits.<\/p>\n<h1 id=\"tampering-mitigations\">Tampering mitigations<\/h1>\n<p>Tampering attacks involve modifying the asset under protection, in this case, the response data. (Tampering could also involve modifying software or hardware within the data flow, but those kinds of attacks would be beyond this kind of attacker.) 
Maintaining the integrity of the data relies on either preventing improper modification or having a path to restore the proper data after an attack.<\/p>\n<h2 id=\"enforce-deadline-on-updates\">Enforce deadline on updates<\/h2>\n<p>In a standard employee engagement study, there will be a collection period followed by an analysis period where the project team assembles and disseminates the reports. If updates to responses are restricted to the collection period, this reduces the window of opportunity for attacks.<\/p>\n<h2 id=\"measure-trends-rather-than-moments\">Measure trends rather than moments<\/h2>\n<p>In the episode, they only need to have good scores on the next review date. This means that actions will be taken on anomalous readings; the attackers only need to adjust scores for one review. If the project instead looked at trends over time, the attackers would need to adjust scores over multiple phases, which would greatly increase the chance that their attack would be noticed.<\/p>\n<h2 id=\"report-back-to-the-respondents-on-updates\">Report back to the respondents on updates<\/h2>\n<p>If changes in data are notified to multiple systems, it can make tampering more apparent. However, notifications and logs are only useful if people look at them, and there is a cost to monitoring, particularly if updates are expected in the normal course of business (i.e. the false positive rate is significant). One way to counter the economic cost is to notify the respondent, as their volume of notifications will be small. Any additional notifications seen by them will likely be treated as suspicious. Additionally, some respondents will like a copy for their own records, making a security feature a user feature as well. Sending the respondent an email copy of whatever is submitted is an easy mitigation to improve detection of tampering. 
However, this approach also allows a new vector of information disclosure, since attackers can now access the response data from other data sources. An employee&rsquo;s company email account is generally more accessible to management and IT staff than an engagement response database.<\/p>\n<h2 id=\"use-robust-aggregate-statistics\">Use robust aggregate statistics<\/h2>\n<p>The attackers&rsquo; goal is to increase their engagement score, which is an aggregate of individual scores. Which aggregate is used can greatly influence how the attack must be performed. If individual responses include a score, typical aggregates include mean and median. Alternatively, the program may use a system like Net Promoter Score\u00ae (where responses are classified into the three categories of promoter, detractor, and passive and the derived score is the delta of the percentage of promoters versus percentage of detractors) or some linear regression model. The choice of the aggregate determines how robust the calculation is towards outliers, and the more <a href=\"https:\/\/en.wikipedia.org\/wiki\/Robust_statistics\">robust the statistic<\/a>, the more expensive the attack. (The use of &ldquo;outlier&rdquo; here is informal; the SaaS software presumably restricts values to a valid range, so someone can&rsquo;t break the calculation by setting an individual&rsquo;s score to a million.)<\/p>\n<p>For example, let&rsquo;s say the score is based on the average (arithmetic mean) of individual employees&rsquo; scores. The sample includes 15 employees who have given the scores {1,1,1,2,2,3,3,3,3,3,4,4,5,5,6}. The mean of these scores is 3.1 and the median is 3. The attackers have a good idea of who will be giving the worst scores. By changing the scores to {10,10,10,2,2,10,3,3,3,3,10,4,5,5,6}, they&rsquo;ve managed to improve the mean to 5.7 and the median to 5. As the mean is less robust, the attackers can improve it (here by 2.6) more easily by modifying one-third of the scores. 
The median is more robust and has only increased by two.<\/p>\n<p>If clamping is not enforced, then the attackers need to modify only one score to set the mean score to any arbitrary value. To modify the median to an arbitrary value, they would need to modify one half of all scores.<\/p>\n<h2 id=\"retain-historical-data-and-report-on-it\">Retain historical data (and report on it)<\/h2>\n<p>There are two vectors for a data change: an updated response from the respondent and an administrative edit. Particularly for this kind of study, respondents may naturally update their responses if they remember some new detail or if tempers cool after a rash response. Thus, collection systems often allow updates to a response, even after the respondent has hit &lsquo;submit&rsquo;. On the administrative side, project administrators may edit the data to correct errors introduced in the project setup, may perform some data clean-up such as re-scaling values, or may edit the data for broader release<sup id=\"fnref:7\"><a href=\"#fn:7\" class=\"footnote-ref\" role=\"doc-noteref\">7<\/a><\/sup> or upon request by the respondent. Reports and analytics resulting from the project should reflect the &ldquo;latest&rdquo; data and incorporate all edits.<\/p>\n<p>For both vectors, the appropriate security feature that the SaaS developers and project administrators should maintain is &ldquo;non-repudiation&rdquo;. This feature means that someone can say &ldquo;I didn&rsquo;t do that&rdquo; and the system can prove them wrong (or at least provide evidence that they did). 
In banking applications, this means someone can&rsquo;t withdraw some money and then protest they didn&rsquo;t do the withdrawal and should get the money back; the bank&rsquo;s records will identify who performed the withdrawal, where it was done, and likely include photographic evidence of the act.<\/p>\n<p>Non-repudiation requires both that data be logged or persisted appropriately and that the details of the changes be available and evident to the project administrators. On the persistence side, developers will often use an immutable data design or event log to provide that audit trail, with some sort of &ldquo;latest&rdquo; snapshot to make analytical processing efficient. On the presentation side, this can be a challenge since there is often a &ldquo;natural&rdquo; amount of updates within a project that can obscure nefarious activity. Highlighting altered data within a data grid or showing the number of versions of each row are two straightforward ways of exposing this data, as long as the volume of changes is small. In the case of a high volume of change, there is an opportunity for a system to provide anomaly or fraud detection to help administrators identify odd patterns.<\/p>\n<h1 id=\"next-best-alternative-threats\">&ldquo;Next Best Alternative&rdquo; Threats<\/h1>\n<p>Attackers are rational and will attempt to use the least risky and least expensive attacks available to them. Defenders, faced with their own resource limitations, should invest to raise the costs of attacks beyond the attackers&rsquo; acceptable threshold, but not over-invest such that the investment exceeds the risk. If the above mitigations are successful against the managers&rsquo; attack, what would the managers do instead?<\/p>\n<p>First, instead of attacking respondents&rsquo; accounts, they could attack a project administrator&rsquo;s account. Using the admin account, they could then edit the data or delete responses. 
Administrator accounts are more likely to include additional authentication protections than respondents&rsquo; accounts, so this approach is riskier, although once the account is accessed the attackers have greater power to accomplish their goals. Protecting the data against administrative modifications relies on strategies similar to those in &ldquo;retain historical data&rdquo;.<\/p>\n<p>Secondly, they could try to discredit the study or process itself. This is not a typical &ldquo;security&rdquo; attack, but if sufficient doubt is placed on the accuracy of the study, then the engagement scores will be ignored or given little weight. In the context of the episode, though, they are well known for being anti-social, so they would need to argue that although the results match expectations, they are somehow invalid. Within the episode, the attackers use an alternative form of this attack by making the HR lead&rsquo;s engagement scores terrible. The HR lead then cancels the study.<\/p>\n<p>Thirdly, an alternative form of discrediting could be accomplished through a denial of service attack. If the software becomes unavailable, the schedule for the study can be disrupted or random failures could cast doubt on the SaaS vendor&rsquo;s ability to collect all the relevant data or present the complete picture. While a denial of service attack launched directly against the SaaS vendor would be very public and risky, a company&rsquo;s network can be configured (perhaps through HTTP proxies) to randomly drop packets or requests to the vendor&rsquo;s sites, thus making the failures look like symptoms of overloaded or unreliable software. 
Based on our profile of the attackers, this would be outside their ability unless they are part of an IT group or could gain support from someone managing the network.<\/p>\n<h1 id=\"conclusion\">Conclusion<\/h1>\n<p>By incorporating standard security controls and data integrity controls into their software, developers can raise the cost and difficulty of the attack beyond the attackers&rsquo; abilities. Since the Pied Piper company is run so shabbily, it is believable that their in-house engagement software could suffer from such an attack, but since these styles of attacks are well-studied and have off-the-shelf mitigations, they should not work against a professional SaaS vendor&rsquo;s solution.<\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>This is in contrast to the show, in which one of the managers, Gilfoyle, is an extremely skilled engineer. The episode does not explain why Gilfoyle does not attempt to change the data within the database as the episode implies their collection systems are in-house rather than hosted by a vendor.&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>Shostack, Adam. <em>Threat modeling: Designing for security.<\/em> John Wiley &amp; Sons, 2014.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:3\" role=\"doc-endnote\">\n<p>ibid.&#160;<a href=\"#fnref:3\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:4\" role=\"doc-endnote\">\n<p>Anderson, Ross. <em>Security engineering: a guide to building dependable distributed systems.<\/em> John Wiley &amp; Sons, 2020.&#160;<a href=\"#fnref:4\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:5\" role=\"doc-endnote\">\n<p>Bonneau, Joseph, et al. 
&ldquo;The quest to replace passwords: A framework for comparative evaluation of web authentication schemes.&rdquo; <em>2012 IEEE Symposium on Security and Privacy.<\/em> IEEE, 2012. <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/QuestToReplacePasswords.pdf\">pdf<\/a>&#160;<a href=\"#fnref:5\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:6\" role=\"doc-endnote\">\n<p>Van Der Horst, Timothy W., and Kent E. Seamons. &ldquo;Simple authentication for the web.&rdquo; <em>2007 Third International Conference on Security and Privacy in Communications Networks and the Workshops-SecureComm 2007.<\/em> IEEE, 2007. <a href=\"https:\/\/isrl.byu.edu\/pubs\/saw_TechReport%20(revised%20March%202007).pdf\">pdf<\/a>&#160;<a href=\"#fnref:6\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:7\" role=\"doc-endnote\">\n<p>Every time a user attempts to login, they will need to wait for an email (or some similar transmission).&#160;<a href=\"#fnref:7\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>"},{"title":"Why I like lobste.rs","link":"https:\/\/ztoz.blog\/posts\/why-lobste.rs\/","pubDate":"Sun, 07 Nov 2021 18:41:05 -0700","guid":"https:\/\/ztoz.blog\/posts\/why-lobste.rs\/","description":"<p>Lobste.rs is a &ldquo;computing focused community&rdquo; and this <a href=\"https:\/\/lobste.rs\/s\/ydno8w\/tree_structure_file_systems\">post<\/a> is a good example of why I like it:<\/p>\n<ul>\n<li>Thinking about the data model of file systems and alternatives is a pretty geeky subject, but impactful<\/li>\n<li>Comments are respectful and build on the post<\/li>\n<li>Comments add historical context<\/li>\n<\/ul>"},{"title":"Updated Rss Feed","link":"https:\/\/ztoz.blog\/posts\/updated-rss-feed\/","pubDate":"Sun, 31 Oct 2021 13:47:17 
-0600","guid":"https:\/\/ztoz.blog\/posts\/updated-rss-feed\/","description":"<p>I&rsquo;ve updated the RSS feed to include the full content of the articles rather than just the summary.<\/p>\n"},{"title":"Royal Game of Ur (Background)","link":"https:\/\/ztoz.blog\/posts\/game-of-ur-background\/","pubDate":"Fri, 29 Oct 2021 14:40:31 -0600","guid":"https:\/\/ztoz.blog\/posts\/game-of-ur-background\/","description":"<p><strong>Updated 2021-12-17: A closed-form solution has been submitted<\/strong><\/p>\n<p>The <em>Royal Game of Ur<\/em>, also known as the <em>Game of Twenty Squares<\/em>, is one of the oldest known board games, with attestations from 3000 to 250 BC. It was an enduringly popular game, played across multiple civilizations and across the Middle East. It is a racing game: two players compete to move their pieces along and off the board, where a player&rsquo;s own pieces can act as road blocks to their own movement but can also send the other player&rsquo;s pieces back to the beginning. In this article (part I), I present the rules for the game and discuss the game complexity metrics. In part II, I will discuss solving the game by computing the optimal strategy.<\/p>\n<h1 id=\"rules\">Rules<\/h1>\n<p>Contrary to claims that the rules have been fully discovered, we are still unsure how this game was played historically. Indeed, with a lifespan nearing 3000 years and play spanning different cultures and geographies, there may have been many variants of the rules and changes over time. We know that in the later period, the board was straightened out (see figure). The path around the board, significance of the rosettes, and how to interpret dice rolls are all educated guesses. My copy of the game from <a href=\"https:\/\/mastersofgames.com\">Masters Traditional Games<\/a> lists three routes and four rule variants. 
The path denoted in the figure is the &ldquo;RC Bell&rdquo; route.<\/p>\n<p><img src=\"game-of-ur-board.png\" alt=\"img\" title=\"Game of Twenty Squares boards\"><\/p>\n<h2 id=\"standard-version\">&ldquo;Standard&rdquo; version<\/h2>\n<p>The &ldquo;standard&rdquo; version is the set of rules you are likely to find implemented in electronic versions. This version of the rules was established by Irving Finkel based on archaeological copies of the board, inferences from the Itti-Marduk-bal\u0101\u1e6du tablet, and experimentation. I imagine most people became aware of the game through the YouTube match between Irving Finkel and Tom Scott:<\/p>\n<iframe width=\"560\" height=\"315\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/WZskjLq040I\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe>\n<p>(User tracking should be disabled unless you play the video.)<\/p>\n<p>Since analyzing a half-dozen variants of the game would be tedious, I&rsquo;ll be focusing on this set of rules.<\/p>\n<p>The game&rsquo;s objective is for a player to move their seven pieces along their route and off the board before the other player can do the same.<\/p>\n<ol>\n<li>All pieces begin in the &ldquo;starting pool&rdquo; and are not on the board.<\/li>\n<li>On a player&rsquo;s turn, they roll four tetrahedral dice to determine how far they can move a piece. Each die allows only a result of zero or one, so a throw yields a result between 0 and 4 inclusive. (A zero result is a lost turn.) In Johnstone\/standard dice notation <sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>, this is a roll of 4d2 - 4.<\/li>\n<li>Only one piece can be moved per roll and a piece must move forward. The moved piece may be one already on the board or one in the starting pool. 
If no pieces can be moved, the player loses the turn.<\/li>\n<li>A piece cannot move forward onto a piece owned by the same player and only one piece may occupy a square at a time. If a piece lands on a square occupied by the opposing player, the opposing player&rsquo;s piece is placed back in the starting pool (off the board). However, the middle rosette square is &ldquo;safe&rdquo; and a piece on this square cannot be ejected from the board.<\/li>\n<li>If the piece lands on a rosette, the player must roll again. The next move does not need to use the same piece.<\/li>\n<li>Pieces are removed from the board on an exact roll.<\/li>\n<\/ol>\n<p>Humorously, in the video Finkel makes a big scene about Tom Scott calculating the odds for a roll of the dice and proclaims he&rsquo;s never had any use for maths. But in his own paper, he works out the odds for the various rolls.<\/p>\n<h2 id=\"itti-marduk-bal\u0101\u1e6du---gambling-version\">Itti-Marduk-bal\u0101\u1e6du - Gambling version<\/h2>\n<p>Irving Finkel&rsquo;s <em>On the rules for the Royal Game of Ur<\/em> <sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup> describes a different, more complex game based on a tablet written by Itti-Marduk-bal\u0101\u1e6du. Since the game in the tablet is a modification of a &ldquo;base&rdquo; game, the standard version is (partly) defined by reverse-engineering the tablet&rsquo;s version of the rules. In contrast to the standard version, the tablet version has five labeled pieces and a mechanism to bet or gamble during play. This version is played on the elongated or revised board.<\/p>\n<p>Each player has five pieces, labeled Swallow, Storm-bird, Raven, Rooster, and Eagle respectively.<\/p>\n<p>There are two types of dice: a standard tetrahedral 1d4 and a yes\/no die or 1d2. The rules for how to interpret the dice from the tablet allow many options, but for the sake of choosing a single option, a player will first roll the 1d4. 
They <em>may<\/em> then roll the yes\/no die. If the roll is &lsquo;no&rsquo;, the turn is lost. If the roll is &lsquo;yes&rsquo;, the 1d4 result is transformed to 1-&gt;5, 2-&gt;6, 3-&gt;7, and 4-&gt;10.<\/p>\n<p>Each player starts with 15 counters and the game pool starts with 20 counters.<\/p>\n<p>Similar to the standard version, players move their pieces around the board and attempt to exit all pieces before their opponent. While playing, players may gain points by landing on the rosettes and lose points by skipping over them. A player also loses points based on pieces remaining on the board at the end of the game. The winner is the one with the most points at the end.<\/p>\n<ol>\n<li>To launch a piece onto the board, a specific roll must be made. The swallow requires a 2, the storm-bird a 5, the raven a 6, the rooster a 7, and the eagle a 10. When launched, the swallow appears on the first rosette (position 4), while the others start at positions 5, 6, 7, and 10 respectively.<\/li>\n<li>Pieces must be launched in sequence. If a player has the swallow and storm-bird in play (or in the completed pool), they can only launch the raven (on a 6); rolling a 7 or 10 does not launch the rooster or eagle. If the storm-bird and raven are in play, but the swallow is sent back to the starting pool, the swallow will need to be re-launched before the rooster can be launched.<\/li>\n<li>When a player lands a piece on the rosette, they win from the game pool a number of counters equal to the piece&rsquo;s value. 
The swallow has a value of 3, the storm-bird, raven, and rooster have a value of 4, and the eagle has a value of 5.<\/li>\n<li>If a piece moves over the rosette without landing on it, the player loses a number of counters equal to the piece&rsquo;s value.<\/li>\n<li>Rosette positions are safe; a piece cannot be knocked off the position by another piece.<\/li>\n<\/ol>\n<p>At the end of the game, a player loses a number of counters equal to the sum of the values of their pieces on the board.<\/p>\n<h1 id=\"state-representation-for-the-standard-version\">State representation for the standard version<\/h1>\n<p>On the original board, a piece moves from the &ldquo;ready pool&rdquo; (position 0) through the first four safe positions, then through eight contended spaces, and finally through two safe positions before exiting the board to the &ldquo;completed pool&rdquo; (position 15). Thus, a position can be encoded in 4 bits for the 16 potential locations. Since each player has seven pieces, a player&rsquo;s pieces can be encoded in 7 * 4 bits and both players&rsquo; pieces can be encoded in 2 * 7 * 4 or 56 bits.<\/p>\n<p>Alternatively, we can model the game from the perspective of the squares rather than the pieces. The 14 middle positions (which have a capacity of one) can be modeled in 14 bits and the number of pieces available in the starting pool requires 3 bits. (The number of pieces in the ending pool can be calculated as the difference between the total number of pieces and the number in the starting pool plus the number on the board.) This requires 17 bits of storage per player or 34 bits total.<\/p>\n<p>Naively, the first representation allows \\(2^{56}=72,057,594,037,927,936\\) possible states, or approximately 72 quadrillion. 
This is a big number (half-way to the age of the Earth in seconds), but this representation also contains <em>many<\/em> symmetrical games (piece 1 on position 3 and piece 2 on position 6 is identical to piece 1 being on position 6 and piece 2 on position 3) and invalid games (multiple pieces on the same position).<\/p>\n<p>The alternative representation allows for \\(2^{34} = 17,179,869,184\\) possible states or approximately 17 billion. This representation also contains <em>many<\/em> invalid games.<\/p>\n<p>Can we compute a more precise value for the number of possible states?<\/p>\n<h2 id=\"how-many-possible-valid-states-for-the-standard-version\">How many possible, valid states for the standard version?<\/h2>\n<p>We can approximate the state space by using the &ldquo;ball and urn&rdquo; mental framework. We are trying to place pieces (balls) into board positions (urns). Each player has 16 urns (although some are shared) and all but two urns have a maximum capacity of one ball. The player&rsquo;s pieces are identical to each other but the urns are distinct.<\/p>\n<p>If we simplify the rules and state that all urns have a maximum capacity of one ball, then we can leverage the helpful <a href=\"https:\/\/www.johndcook.com\/TwelvefoldWay.pdf\">Twelvefold Way<\/a> table. (There are multiple versions of this table; I find Cook&rsquo;s one of the more readable on the web.) 
The entry for &ldquo;unlabeled&rdquo; balls, &ldquo;labeled&rdquo; urns, and a maximum capacity of one is equivalent to an <em>n choose k<\/em> or binomial coefficient calculation. Since each player places their pieces independently, the two-player estimate is the square of the single-player count.<\/p>\n<p>Table: \\(\\binom{n}{p}^2\\) values (one factor of \\(\\binom{n}{p}\\) per player) for \\(n=16\\) and \\(p\\) pieces<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(p\\) Pieces<\/th>\n<th style=\"text-align:right\">Approximate State Space Size<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">256<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">14,400<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td style=\"text-align:right\">313,600<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td style=\"text-align:right\">3,312,400<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td style=\"text-align:right\">19,079,424<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td style=\"text-align:right\">64,128,064<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td style=\"text-align:right\">130,873,600<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This is an approximation: it includes invalid states where the two players are contesting the same slot while also excluding cases where multiple pieces may be in the starting or ending pool, but as we shall see, this is a pretty good approximation. (Since combinations are symmetric, e.g. \\(\\binom{16}{2} = \\binom{16}{14}\\), the approximation breaks down as \\(p\\) increases.) Furthermore, these numbers are fairly small, so it seems practical to have a computer iterate (smartly) through the cases as 131 million is far more reasonable than 17 billion.<\/p>\n<p>I wrote a <a href=\"https:\/\/gitlab.com\/jeffrey_starr\/valid-ur-state\">program in Rust<\/a> that will generate and count all valid states. I use a recursive, depth-first search approach. Each player is modeled as a vector whose entries each represent the placement of a piece into a labeled urn. 
Since some positions allow multiple placements, we cannot generate states via a standard lexicographic combination algorithm, but do so via a slight modification of the algorithm that will generate duplicate 0 and 15 positions.<\/p>\n<p>Within the search, our base case is a solution being found. Due to the functional design, the two players&rsquo; states are valid on entry to the function, so a valid full solution just requires a check that all placements have been made. If so, we pass the solution to a Visitor type that abstracts away what to do for every solution. (The visitor either prints out the solution or increments counts depending on the program&rsquo;s mode.)<\/p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">search_games_dfs<\/span>(pieces: <span style=\"color:#66d9ef\">usize<\/span>, visitor: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">mut<\/span> ModalVisitor, p1state: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">mut<\/span> Vec<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">usize<\/span><span style=\"color:#f92672\">&gt;<\/span>, p2state: <span style=\"color:#66d9ef\">&amp;<\/span><span style=\"color:#a6e22e\">mut<\/span> Vec<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">usize<\/span><span style=\"color:#f92672\">&gt;<\/span>) {\n    <span style=\"color:#66d9ef\">if<\/span> p1state.len() <span style=\"color:#f92672\">==<\/span> pieces <span style=\"color:#f92672\">&amp;&amp;<\/span> p2state.len() <span style=\"color:#f92672\">==<\/span> pieces {\n        visitor.found(p1state, p2state)\n    } <span style=\"color:#66d9ef\">else<\/span> {\n        <span style=\"color:#75715e\">\/\/ states are ordered monotonically; if the last move is the 
starting or ending pool,\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#75715e\">\/\/ we can create a new state in the same pool. Otherwise, place in the next potentially\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#75715e\">\/\/ valid spot.\n<\/span><span style=\"color:#75715e\"><\/span>        <span style=\"color:#66d9ef\">let<\/span> min_i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">match<\/span> (<span style=\"color:#f92672\">*<\/span>p1state).last() {\n            Some(<span style=\"color:#ae81ff\">0<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>,\n            Some(<span style=\"color:#ae81ff\">15<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">15<\/span>,\n            Some(min) <span style=\"color:#f92672\">=&gt;<\/span> min <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>,\n            None <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>\n        };\n        <span style=\"color:#66d9ef\">let<\/span> min_j <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">match<\/span> (<span style=\"color:#f92672\">*<\/span>p2state).last() {\n            Some(<span style=\"color:#ae81ff\">0<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>,\n            Some(<span style=\"color:#ae81ff\">15<\/span>) <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">15<\/span>,\n            Some(min) <span style=\"color:#f92672\">=&gt;<\/span> min <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>,\n            None <span style=\"color:#f92672\">=&gt;<\/span> <span style=\"color:#ae81ff\">0<\/span>\n        };\n\n        <span style=\"color:#66d9ef\">for<\/span> i <span style=\"color:#66d9ef\">in<\/span> min_i<span 
style=\"color:#f92672\">..<\/span><span style=\"color:#ae81ff\">16<\/span> {\n            p1state.push(i);\n            <span style=\"color:#66d9ef\">for<\/span> j <span style=\"color:#66d9ef\">in<\/span> min_j<span style=\"color:#f92672\">..<\/span><span style=\"color:#ae81ff\">16<\/span> {\n                p2state.push(j);\n                <span style=\"color:#66d9ef\">if<\/span> is_valid(p1state, p2state) {\n                    search_games_dfs(pieces, visitor, p1state, p2state);\n                }\n                p2state.pop();\n            }\n            p1state.pop();\n        }\n    }\n}\n<\/code><\/pre><\/div><p>If we are not at the base case, we either initialize a placement (<code>min_i<\/code>, <code>min_j<\/code>) at the zero position (no former placements have been made), try to continue at 0 or 15 (the pools that can handle more than one assignment), or the next entry (since all other urns have a maximum capacity of one and we are searching the space in ascending order).<\/p>\n<p>Then, we attempt all possibilities for player one and player two, verify they are valid (<code>is_valid<\/code>), and if so recursively check the next depth. The <code>is_valid<\/code> code verifies that placements are distinct (when necessary) within a player&rsquo;s placements and distinct within the eight contended squares.<\/p>\n<blockquote>\n<p>NB: This is not the most efficient representation of the board. If we used the 14-bit\nrepresentation of the board, we could detect collisions by masking off the first four bits,\nuse the and-bitwise operator and test\nagainst zero. 
Furthermore, the representation does not allow modeling of more than one piece\nwithin a position, so the first check in the code can be eliminated entirely.<\/p>\n<\/blockquote>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-rust\" data-lang=\"rust\"><span style=\"color:#66d9ef\">fn<\/span> <span style=\"color:#a6e22e\">is_valid<\/span>(player1: <span style=\"color:#66d9ef\">&amp;<\/span>[<span style=\"color:#66d9ef\">usize<\/span>], player2: <span style=\"color:#66d9ef\">&amp;<\/span>[<span style=\"color:#66d9ef\">usize<\/span>]) -&gt; <span style=\"color:#66d9ef\">bool<\/span> {\n    <span style=\"color:#75715e\">\/\/ verify each individual player&#39;s pieces are unique or in the starting\/ending pools\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">let<\/span> <span style=\"color:#66d9ef\">mut<\/span> seen <span style=\"color:#f92672\">=<\/span> [<span style=\"color:#66d9ef\">false<\/span>; <span style=\"color:#ae81ff\">16<\/span>];\n\n    <span style=\"color:#75715e\">\/\/ The placement logic ensures that these constraints are respected, so this code could\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#75715e\">\/\/ be skipped when called from search_games_dfs. 
Skipping this check halves the run-time.\n<\/span><span style=\"color:#75715e\"><\/span>    <span style=\"color:#66d9ef\">for<\/span> player <span style=\"color:#66d9ef\">in<\/span> [player1, player2].iter() {\n        <span style=\"color:#66d9ef\">for<\/span> placement <span style=\"color:#66d9ef\">in<\/span> player.iter() {\n            <span style=\"color:#66d9ef\">match<\/span> <span style=\"color:#f92672\">*<\/span>placement {\n                <span style=\"color:#ae81ff\">0<\/span> <span style=\"color:#f92672\">=&gt;<\/span> (),\n                <span style=\"color:#ae81ff\">15<\/span> <span style=\"color:#f92672\">=&gt;<\/span> (),\n                place <span style=\"color:#66d9ef\">if<\/span> seen[place] <span style=\"color:#f92672\">=&gt;<\/span>\n                    <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">false<\/span>, <span style=\"color:#75715e\">\/\/ duplicate\n<\/span><span style=\"color:#75715e\"><\/span>                place <span style=\"color:#f92672\">=&gt;<\/span>\n                    seen[place] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">true<\/span>\n            }\n        }\n        seen.fill(<span style=\"color:#66d9ef\">false<\/span>);  <span style=\"color:#75715e\">\/\/ reset for next player\n<\/span><span style=\"color:#75715e\"><\/span>    }\n\n    <span style=\"color:#66d9ef\">for<\/span> player <span style=\"color:#66d9ef\">in<\/span> [player1, player2].iter() {\n        <span style=\"color:#66d9ef\">for<\/span> placement <span style=\"color:#66d9ef\">in<\/span> player.iter() {\n            <span style=\"color:#66d9ef\">match<\/span> <span style=\"color:#f92672\">*<\/span>placement {\n                place <span style=\"color:#66d9ef\">if<\/span> place <span style=\"color:#f92672\">&gt;=<\/span> <span style=\"color:#ae81ff\">5<\/span> <span style=\"color:#f92672\">&amp;&amp;<\/span> place <span style=\"color:#f92672\">&lt;=<\/span> <span style=\"color:#ae81ff\">12<\/span> 
<span style=\"color:#f92672\">=&gt;<\/span>\n                    <span style=\"color:#66d9ef\">if<\/span> seen[place] {\n                        <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">false<\/span>\n                    } <span style=\"color:#66d9ef\">else<\/span> {\n                        seen[place] <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">true<\/span>\n                    },\n                _ <span style=\"color:#f92672\">=&gt;<\/span> ()\n            }\n        }\n    }\n\n    <span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#66d9ef\">true<\/span>;\n}\n<\/code><\/pre><\/div><p>Table: Programmatic Count of Valid States with \\(p\\) pieces<\/p>\n<table>\n<thead>\n<tr>\n<th>\\(p\\) Pieces<\/th>\n<th style=\"text-align:right\">Computed State Space Size<\/th>\n<th style=\"text-align:right\">Branches<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td style=\"text-align:right\">248<\/td>\n<td style=\"text-align:right\">232<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td style=\"text-align:right\">13,112<\/td>\n<td style=\"text-align:right\">24,148<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td style=\"text-align:right\">264,304<\/td>\n<td style=\"text-align:right\">714,720<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td style=\"text-align:right\">2,606,947<\/td>\n<td style=\"text-align:right\">9,131,362<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td style=\"text-align:right\">14,680,840<\/td>\n<td style=\"text-align:right\">61,834,256<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td style=\"text-align:right\">53,212,388<\/td>\n<td style=\"text-align:right\">255,752,544<\/td>\n<\/tr>\n<tr>\n<td><em>7<\/em><\/td>\n<td style=\"text-align:right\">137,913,936<\/td>\n<td style=\"text-align:right\">726,864,992<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td style=\"text-align:right\">280,408,902<\/td>\n<td style=\"text-align:right\">1,572,665,124<\/td>\n<\/tr>\n<tr>\n<td>9<\/td>\n<td style=\"text-align:right\">482,210,816<\/td>\n<td 
style=\"text-align:right\">2,817,358,656<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td style=\"text-align:right\">740,777,984<\/td>\n<td style=\"text-align:right\">4,446,087,488<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Computations are relatively quick. Finding all 138 million solutions for p=7 took less than three minutes (with the first block of checks in <code>is_valid<\/code> skipped) on a 2017 &ldquo;Coffee Lake&rdquo; Intel i7.<\/p>\n<p>The average branching factor at \\(p=7\\) is 5.3; this means that on a given ply (move), a player is making a potential decision between 5 different options, although depending on the roll an option may not be possible (e.g. the move is blocked by another piece). The branching factor is computed as the sum of at-most-one move from the starting pool, all moves on the board, and no moves from pieces in the ending pool.<\/p>\n<p>Comparing our earlier approximations and the programmatic counts finds the deltas to be (proportionally) small:<\/p>\n<p><img src=\"game-of-ur-count.png\" alt=\"image\" title=\"Comparing Programmatic and Binomial State Space Counts\"><\/p>\n<p>Instead of leveraging a computer, can we count the number of positions using a closed form equation?<\/p>\n<h2 id=\"closed-form\">Closed form?<\/h2>\n<blockquote>\n<p>Whoever wants to go about generating all partitions not only immerses himself in immense labor,\nbut also must take pains to keep fully attentive, so as not to be grossly deceived.<\/p>\n<p>&mdash; Leonhard Euler, <em>De Partitione Numerorum<\/em> (1750)<\/p>\n<\/blockquote>\n<p>A <a href=\"https:\/\/distracted-jackson-c5763c.netlify.app\/math\/ur.html\">reader<\/a> has worked out the following closed form equation:<\/p>\n<p>$$S = \\sum_{p_1 = 0}^{\\min(p,m+n)}\\sum_{p_2 = 0}^{\\min(p,m+n)}\\sum_{j = \\max(0,p_1-m)}^{\\min(p_1,n)}  (p-p_1+1)(p-p_2+1) {n \\choose j} {m \\choose p_1-j} {m+n-(p_1-j) \\choose p_2}$$<\/p>\n<p>In the above notation, \\(p_i\\) are the number of pieces from player <em>i<\/em> on the 
board, there are <em>m<\/em> shared slots on the board and <em>n<\/em> unshared\/uncontested slots.<\/p>\n<p><del>Alas, I have been unable to find a closed-form solution.<\/del> As the Euler quote warns, although one can use the inclusion\/exclusion method to iterate toward the correct number, it is very easy to forget a case or include some duplicates. Graphically, here are the conflicting states for p=1 to 3 (each index refers to one of a single player&rsquo;s valid positions, in lexicographic order):<\/p>\n<p><img src=\"conflicts-p1.png\" alt=\"p1\" title=\"Conflicts for p=1\"><\/p>\n<p><img src=\"conflicts-p2.png\" alt=\"p2\" title=\"Conflicts for p=2\"><\/p>\n<p><img src=\"conflicts-p3.png\" alt=\"p3\" title=\"Conflicts for p=3\"><\/p>\n<p>As expected, conflicts are symmetric about the main diagonal. However, there are a number of sub-patterns that I&rsquo;ve had trouble modeling. <del>I may return to this problem (or some reader may solve it), but at this point a closed form solution is convenient but not necessary.<\/del><\/p>\n<h1 id=\"how-big-are-these-numbers\">How big are these numbers?<\/h1>\n<p>Not very big, but beyond trivial. Using log-10 notation, the state space size is 8 (or 9 if we round up), which is less than the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Game_complexity\">10 for Nine men&rsquo;s morris, 20 for Checkers, and 44 for Chess<\/a>.<\/p>\n<p>A different measurement for a game is the game tree size; this represents the total number of games that can be played. This differs from the state space because there may be multiple paths to a given state. The game tree size is technically infinite for this game (since on a ply the player may roll a zero or players may constantly return each other&rsquo;s pieces to the start). In practice, games tend to be quick and thus avoid becoming bogged down.<\/p>\n<p>We can approximate the game tree size by raising the average branching factor to the average number of plies in a game, or \\(b^{d}\\). 
For an individual ply, a player needs to make a single decision of which piece to move; that decision is made among all of the player&rsquo;s pieces on the board and at most one piece from the starting pool (since all pieces in the pool are identical). Through enumeration, we found the average branching factor to be 5.3. I don&rsquo;t have statistics for the average number of plies, but in uncontested play, you will need on average 8 plies to move a piece to the end (since the mode is 2) and with 7 pieces you could expect 56 plies. If we estimate 70 as the average number of plies per player, that would be 140 total plies. The log10 of \\(5.3^{140}\\) is about 101. In comparison, Checkers is 40 and Chess is 123.<\/p>\n<p>These numbers allow us to compare games against each other, but they do not explain what makes a game fun or engaging. For the ancients, this game had the benefit of low cost of entry (there are example boards etched in the floor), rapid play, and room for multiple strategies combined with an element of chance. The central squares force interaction between the players while the safe squares allow the readying of &ldquo;reserve&rdquo; troops to threaten the other player. This combination of factors, in my view, is what makes the game fun and makes the whole greater than the sum of its parts.<\/p>\n<h1 id=\"references\">References<\/h1>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Peterson, Jon. &ldquo;The Origins of Dice Notation.&rdquo; <a href=\"http:\/\/playingattheworld.blogspot.com\/2013\/08\/the-origins-of-dice-notation.html\">Playing at the World<\/a> (2013).&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>Finkel, Irving L. 
&ldquo;On the rules for the Royal Game of Ur.&rdquo; Ancient Board Games in Perspective (2007): 16-32.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>\n"},{"title":"Little Intuition","link":"https:\/\/ztoz.blog\/posts\/little-intuition\/","pubDate":"Fri, 08 Oct 2021 12:23:01 -0600","guid":"https:\/\/ztoz.blog\/posts\/little-intuition\/","description":"<p>This is the 60th anniversary of the proof of Little&rsquo;s Law, a widely useful rule for queuing systems that links throughput, response (cycle) time, and number of requests (work items). Unfortunately, although software engineers often operate queues or pipelines (modern computer systems are rife with queues, implicit and explicit), they aren&rsquo;t exposed to the mathematical foundations and often fail to build intuition around how queues act operationally. In this post, I aim to describe the Law, show applications within software systems, and build an intuition for its use.<\/p>\n<h1 id=\"the-equation\">The equation<\/h1>\n<p>Little&rsquo;s Law has two equivalent formulations: \\(L = \\lambda W\\) and \\(WIP = TH \\times CT\\). The first formulation is that the average number of items within the system (\\(L\\)) is equal to the average arrival rate of items (\\(\\lambda\\)) times the average waiting time (\\(W\\)) of an item. The second formulation is that the average number of work in progress (WIP) items is equal to the average throughput (TH) times the average cycle time (CT). The two formulations are equivalent, with the first focusing on &ldquo;inputs&rdquo; to the system and the second focusing on &ldquo;outputs&rdquo;.<\/p>\n<p>Importantly, the waiting time or cycle time includes the time spent processing in addition to the time in queue. Similarly, \\(L\\) and WIP include items sitting in queues as well as items being actively worked. 
All three values are expressed as averages; the law by itself won&rsquo;t provide insight into ranges of those values.<\/p>\n<h1 id=\"some-intuition\">Some intuition<\/h1>\n<p>First, logically, none of the three variables can be negative.<\/p>\n<p>Secondly, cycle time will have some minimal value, \\(CT_0\\), that represents the time to move an item through the pipeline with no schedule delay. However, cycle time can grow almost arbitrarily large if the amount of work exceeds the system&rsquo;s capacity.<\/p>\n<p>Thirdly, throughput will have some maximal value, \\(TH_\\Omega\\), that represents the maximum throughput manageable by the pipeline. (If you are modeling the system via the arrival rate, there might not be a maximum value although systems will need to reject requests at some point.)<\/p>\n<p>For the system to perform some work (for throughput to be positive), some work needs to be released into the system. If too much work is released, cycle time will grow and the system will bounce around its maximum throughput. 
However, it is not necessary (nor likely) for throughput to climb linearly and for cycle time to only increase once throughput is saturated.<\/p>\n<p>For instance, if a system has a minimal cycle time of 2 seconds and a maximum throughput of 4 units\/second, then an ideal WIP, TH, and CT may act as:<\/p>\n<table>\n<thead>\n<tr>\n<th>WIP<\/th>\n<th>TH<\/th>\n<th>CT<\/th>\n<th>Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td>0<\/td>\n<td>2<\/td>\n<td>\\(CT_0\\)<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>1\/2<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>1<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>3\/2<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>2<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>3<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>4<\/td>\n<td>2<\/td>\n<td>\\(TH_\\Omega\\) reached; end of flat \\(CT_0\\)<\/td>\n<\/tr>\n<tr>\n<td>16<\/td>\n<td>4<\/td>\n<td>4<\/td>\n<td>CT grows<\/td>\n<\/tr>\n<tr>\n<td>32<\/td>\n<td>4<\/td>\n<td>8<\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>However, a system could also behave like this:<\/p>\n<table>\n<thead>\n<tr>\n<th>WIP<\/th>\n<th>TH<\/th>\n<th>CT<\/th>\n<th>Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td>0<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>1\/2<\/td>\n<td>2<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>2\/3<\/td>\n<td>3<\/td>\n<td>CT grows, \\(TH_\\Omega\\) not yet reached<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>1<\/td>\n<td>3<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>4<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>6\/5<\/td>\n<td>5<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>4\/5<\/td>\n<td>10<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>16<\/td>\n<td>4\/3<\/td>\n<td>12<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>32<\/td>\n<td>2<\/td>\n<td>16<\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>A divergence from an ideal progression can indicate losses 
due to queuing overhead or failures but can also be attributed to natural inefficiencies and the cost of variability.<\/p>\n<p>Because these numbers are <em>averages<\/em>, outliers can greatly influence the values. The laws of small numbers and large numbers apply to observations. We plan to write more about this later, but the variability in a system (e.g. cycle times) greatly impacts the operator&rsquo;s ability to control the system within some service level objective.<\/p>\n<h1 id=\"applications\">Applications<\/h1>\n<p>Some applications that require solving for or controlling each of the three variables:<\/p>\n<ul>\n<li>L: If you need to estimate sizing requirements, like the size of the queue or how much disk will be required, you will be computing L via arrival rate and wait time.<\/li>\n<li>CT: If you want to control response time\/cycle time, you will estimate this value based on a count of accepted work and system throughput.<\/li>\n<li>TH: If you want to control throughput, you will be using work release strategies to improve batching and trying to improve cycle time.<\/li>\n<\/ul>\n<h1 id=\"further-reading\">Further Reading<\/h1>\n<p><a href=\"https:\/\/people.cs.umass.edu\/~emery\/classes\/cmpsci691st\/readings\/OS\/Littles-Law-50-Years-Later.pdf\">PDF Link<\/a> John D. C. Little. 2011. OR FORUM&mdash;Little&rsquo;s Law as Viewed on Its 50th Anniversary. Oper. Res. 59, 3 (May 2011), 536\u2013549. DOI:https:\/\/doi.org\/10.1287\/opre.1110.0940<\/p>\n<p><a href=\"http:\/\/ndl.ethernet.edu.et\/bitstream\/123456789\/25163\/1\/31.pdf\">PDF Link<\/a> Chhajed, Dilip &amp; Lowe, Timothy. (2008). Building Intuition: Insights From Basic Operations Management Models and Principles. Chapter &ldquo;Little&rsquo;s Law&rdquo; by Little, John &amp; Graves, Stephen. 
DOI:https:\/\/doi.org\/10.1007\/978-0-387-73699-0_5<\/p>\n"},{"title":"Do Not Erase","link":"https:\/\/ztoz.blog\/posts\/do-not-erase\/","pubDate":"Sat, 02 Oct 2021 16:36:03 -0600","guid":"https:\/\/ztoz.blog\/posts\/do-not-erase\/","description":"\n\n\n  <img class=\"fit-image\" src=\"https:\/\/ztoz.blog\/posts\/do-not-erase\/do-not-erase-cover.jpg\" alt=\"Cover of the Do Not Erase book\">\n  \n\n<p><a href=\"https:\/\/press.princeton.edu\/books\/hardcover\/9780691199221\/do-not-erase\">Do Not Erase: Mathematicians and their Chalkboards<\/a>\nby Jessica Wynne is a beautiful photography book. Each leaf contains a photograph of a mathematician&rsquo;s chalkboard and a short essay by that mathematician talking about their work, what is displayed, or something personal. Wynne&rsquo;s work only looks at blackboards or chalkboards, not whiteboards. The entries vary from the prosaic, such as diagrams left over from an office hours session on elementary calculus, to snapshots of current work, to results that were a life&rsquo;s work.<\/p>\n<p>One of the recurring themes in the mathematicians&rsquo; essays is the superiority of the chalkboard over a digital medium like PowerPoint. 
Many of the essays leave their author&rsquo;s rationale unexplored, but the reasons articulated include:<\/p>\n<ol>\n<li>The act of writing slows down the speaker, providing more time for the audience to understand the content (and, in a class, take notes)<\/li>\n<li>The speaker is able to use their full body to communicate (versus a voice and a static image)<\/li>\n<li>High-quality chalk provides better legibility than a marker and skilled users can create subtle artwork, but chalk isn&rsquo;t so versatile as to encourage adding unnecessary and distracting details like digital works<\/li>\n<\/ol>\n<p>For why a chalkboard is superior to a whiteboard:<\/p>\n<ol>\n<li>Whiteboard markers smell<\/li>\n<li>Whiteboard markers are always running out<\/li>\n<li>Markers aren&rsquo;t as legible and flexible as chalk<\/li>\n<\/ol>\n<p>(Some of the authors do note that markers have the advantage of leaving less dust about, as in the case where the mathematician moved a chalkboard into their bedroom in case there was a late night burst of insight.)<\/p>\n<p>As the production of the book extended over the pandemic, many of the mathematicians express sorrow over their inability to use their boards for their work. 
Although most of the mathematicians in the book are in fields sufficiently esoteric that book readers will not grasp what is being shown on the board, we can appreciate the medium and skill that permits the transmission of deep concepts in such a succinct manner and wish we could hear and see the mathematician explaining their work in concert with the pictures.<\/p>\n"},{"title":"Interviewing Questions","link":"https:\/\/ztoz.blog\/posts\/interviewing-questions\/","pubDate":"Mon, 23 Aug 2021 10:47:28 -0600","guid":"https:\/\/ztoz.blog\/posts\/interviewing-questions\/","description":"<p><a href=\"https:\/\/news.ycombinator.com\">Hacker News<\/a> likes to complain about interviewing, and, well, there is a lot to complain about.\nOne of the recent <a href=\"https:\/\/news.ycombinator.com\/item?id=28274310\">links<\/a> on the matter was from Kislay Verma\n<a href=\"https:\/\/kislayverma.com\/organizations\/competitive-programming-is-useless\/\">&ldquo;Competitive programming is useless&rdquo;<\/a>.\nKislay decries an over-emphasis on competitive coding questions informing interview performance &mdash; i.e., asking\nincreasingly obscure and trivial algorithmic and data structures questions rather than focusing on fundamentals and\nquestions to engage and measure the breadth of the candidate&rsquo;s experience and talent.<\/p>\n<p>When I went to college, ACM Programming Competitions were the primary form of competitive programming. In contrast\nto today&rsquo;s Hacker Rank and Leetcode, these were team events. Over a period of six hours, a team would share a single\ncomputer and seek to solve the most questions in the least amount of time. Questions were written in a narrative style,\nso contestants had to recognize what approach to take; these were not &ldquo;implement methods for a priority queue&rdquo; but solve\na problem where a priority queue is a convenient and appropriate data structure. 
Questions were graded objectively by\njudging software; they were purely about generating the right output (within some processing time limit) rather than about\ncode quality itself. In the lower competitions, there was usually one question that required graph or dynamic\nprogramming and the other questions were more straight-forward and (in my opinion) could have appeared as homework\nfor sophomores and juniors. The higher competitions usually required combining multiple algorithms and targeted\nthose who had mastered a senior-level algorithms course (<a href=\"https:\/\/icpc.global\/worldfinals\/problems\/2018%20ACM-ICPC%20World%20Finals\/icpc2018.pdf\">2018 example<\/a>).\nAs I recall, these were usually open-book, although you weren&rsquo;t allowed any digital sources.<\/p>\n<p>In my experience, these competitions actually encouraged good form. A knowledge of idioms and the standard library\nreduced the amount of time writing code due to muscle memory and re-use. Since the team was restricted to a single\ncomputer, some teams used pair programming or you might have to double up to debug a program, both of which encouraged\ngood variable naming and functional decomposition.<\/p>\n<p>That said, I&rsquo;m sympathetic to the notion that tricky algorithmic questions do not yield good signal on candidates.\nInstead, I sought signal on how they would handle everyday problems.<\/p>\n<p>At Qualtrics, I&rsquo;ve conducted over 300 interviews with candidates that ranged from interns to experienced (20+ years)\nhires. In the early days, I usually asked questions that required dynamic programming, but I eventually switched to questions\ndrawn from my day-to-day experiences. (The <a href=\"https:\/\/www.qualtrics.com\/eng\/how-we-interview-in-engineering-at-qualtrics\/\">Qualtrics interviewing philosophy<\/a>\nis to ask &ldquo;straight-forward&rdquo; questions. This does not mean easy questions, but rather questions that do not involve\nany tricks or obscure knowledge. 
The blog post references an example clock-hand question; this question was a typical\nstarter problem at the lower-level ACM competitions. Personally, I think both of the example questions provide too\nfew opportunities for signal within a software developer interview.)<\/p>\n<p>To provide good signal, a problem requires some algorithm applied to a non-trivial data structure or data object. I tried to target problems\nthat required a screen&rsquo;s worth of code in order to provide opportunities for the candidate to show how they would structure\na solution (functional decomposition). As an example, one of my preferred questions required modeling a color value in\nRGB space. Since this was an internal data object, candidates had a lot of freedom in their choice. Their choice provided\ngood signal on how they approached trade-offs between convenience, efficiency, and legibility.<\/p>\n<p>I ran the interviews as open-book as this was also a source of signal. A typical sign of a poor candidate was using\nexamples of code and attempting to copy-paste the code into something that worked; positive signal was a candidate\nthat used the official API documentation and understood it. One candidate provided very positive signal after first\nfollowing some code from StackOverflow, pausing, and then discussing why the code was bad and then changing their\napproach.<\/p>\n<p>I preferred problems where the candidate had to fully run their code because it provided signal in how they\ndebugged problems and dealt with compiler errors.<\/p>\n<p>Once candidates completed the exercise (I tried to push candidates to a successful completion), I asked them to\nperform a code review of their code, and then I asked about the scaling characteristics of the code. I think the\ncode review is an important part of the process as it 1) allowed candidates to acknowledge and suggest improvements\nto any bad code they might have written and 2) reveal signal about what they expect from good code. 
In some cases,\nI had to suggest a change in the code to obtain signal. Most candidates noted a lack of comments in their samples\nand thought naming could be improved (although rarely had any specific recommendations). This was weak signal. There\nwas better signal through discussions of idiomatic compliance, memory lifecycles, and alternative approaches to\ndata modeling and functional decomposition. Memorably, a few candidates provided excellent signal by describing how\nthe compiler would interpret various approaches or compared solutions in multiple languages. (The best candidates had\nfun with the problem.)<\/p>\n<p>My intent for asking about the scaling characteristics of their code was not to get a recital of Big-O (although many\ncandidates provided that data anyway without me prompting) but rather a discussion on what to do if the inputs were\nmillions or billions larger. I sought two different kinds of signal: 1) at what level of scale do candidates feel they\nneed to complicate the solution 2) what range of options do candidates consider for scaling? For 1), poor candidates\nwould reach for a distributed map-reduce solution when the number of inputs was in the hundreds of thousands\n(processing time per record was less than a millisecond). Better candidates would probe the use case (turning this coding exercise into a mini design exercise)\nor discuss the point where the gain from parallelization exceeded the required overhead. For 2), I found candidates\ncould rarely differentiate themselves in discussing technical options but the &ldquo;median&rdquo; candidate could suggest one\nappropriate strategy or technology.<\/p>\n<p>Ultimately, the pass-fail was based on a mixture of these various signals, rather than a single measure. For instance,\nI passed a number of candidates that failed to pass all tests within the interview time, if there was sufficient\npositive signal elsewhere. 
Good problems provide opportunities to gather a broad range of signal.<\/p>\n"},{"title":"Patterns of Progress","link":"https:\/\/ztoz.blog\/posts\/patterns-of-progress\/","pubDate":"Thu, 19 Aug 2021 18:04:24 -0600","guid":"https:\/\/ztoz.blog\/posts\/patterns-of-progress\/","description":"<p>Modern software systems involve many asynchronous tasks, but reporting status on tasks is often handled poorly.\nAn asynchronous task is one that processes in the background\nand allows the client or caller to work on something else in the meantime (&ldquo;non-blocking&rdquo;). Architecturally, asynchronous processing\nis often the right choice for operations and scaling reasons, but it does add complexity. Both the end-users and\nthe operating teams will need to understand the state of the task and, if the wrong mechanism is chosen, teams will spend their\ntime solving issues such as:<\/p>\n<ol>\n<li>Stuck tasks. A task is not progressing; this could be caused by a missing update\/write (the task progressed but\nthe update message was missed), the task was dropped from the queue, a task that is actually stuck in an infinite loop\nor deadlock, or a task&rsquo;s terminal state was not accurately modeled.<\/li>\n<li>Failed tasks. A task is in a failure state and needs some operator intervention. This is usually for tasks\nthat have exhausted their retry budget, but <em>must<\/em> be completed rather than abandoned. Universally, engineering teams\nwill need to care about failing tasks, but a poor progress design will amplify the operational pain and impede\nautomation improvements. End-users generate many complaints and escalations when they cannot differentiate between an\n&ldquo;in-progress&rdquo; task and a failed task, as every delay becomes a suspected failure.<\/li>\n<li>Orchestration limitations. 
If the client is unable to tell when an asynchronous task is complete or to separate\nsuccess from failure, then their ability to orchestrate follow-on tasks that depend on the result of the previous task\nwill be hampered by data quality and concurrency issues.<\/li>\n<li>Availability issues. The progress reporting mechanism may be subject to denial-of-service attacks where multiple\nclients can be impacted by a single bad client or data issue. If the processing systems cannot update status on tasks,\nthe updates may be lost.<\/li>\n<\/ol>\n<p>Many of these issues will only be seen once a system becomes popular, particularly orchestration and availability issues.<\/p>\n<p>NB: In this article, an end-user is a human while a client is software, operating at the behest of some end-user,\nthat has an interest in the task.<\/p>\n<h1 id=\"problem-statement\">Problem Statement<\/h1>\n<p>Put simply, something is happening out of sight from the client and the client needs some way to know when that\nsomething has completed. However, this definition can obscure the actual design space. If we ask these three questions:<\/p>\n<ol>\n<li>Why is this task being executed? What is the context of the call?<\/li>\n<li>What kind of updates does the client need?<\/li>\n<li>How complex is the client? How does the client receive updates?<\/li>\n<\/ol>\n<p>we can design a solution based on the use case and required data model, and use that to find a lower cost technical implementation.<\/p>\n<blockquote>\n<h3 id=\"example-generating-a-report\">Example: Generating a report<\/h3>\n<p>The end-user (an authorized web user) has requested a report of the past month of activity which will take several\nminutes to compile.\nThis is an ad hoc report specific to this user (versus a report production that many clients may be tracking). 
The\ndata comes from multiple data sources but a &lsquo;partial&rsquo; report is undesirable, so there are many ways the report\ngeneration may fail and a single data source failing will prevent the report&rsquo;s production. The end-user is using a web\napplication to retrieve the successful report, so the client can handle some protocol complexity but may not\nbe able to persist data reliably (user could delete stored data) nor be invoked from the server.<\/p>\n<\/blockquote>\n<blockquote>\n<h3 id=\"example-exporting-data-after-transforming-it\">Example: Exporting data after transforming it<\/h3>\n<p>The end-user is collecting data within an online spreadsheet table and wants to transform the data collected\nsince the last export and, once that is complete, export it daily (as input to a larger workflow). Internally,\ntransformations are handled by one service and exports by a separate service. Transformations and exports operate on\nthe same resource (the table). If the transformation fails, the export is unnecessary. The end-user is automating\nthis process using shell scripts and a simple scheduler system.<\/p>\n<\/blockquote>\n<h1 id=\"client-context-and-knowledge\">Client Context and Knowledge<\/h1>\n<h2 id=\"know-nothing-special-case\">Know Nothing (Special Case)<\/h2>\n<p>The client has no knowledge of the asynchronous task. This is usually indicative that the task is a &lsquo;side effect&rsquo;\nof some other workflow. For example, a log message may eventually be recorded in a store, but the client has no\nneed to know the state of the log because it will not impact their workflow.<\/p>\n<h2 id=\"task-will-eventually-process-special-case\">Task will eventually process (Special Case)<\/h2>\n<p>The client is told that the request will be eventually handled, but they have no visibility into its state. 
Email\nis an example; after being given confirmation that an email will eventually get to the recipients, the sender has\nno way to track the current state or ultimate delivery of their message<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote-ref\" role=\"doc-noteref\">1<\/a><\/sup>. If the task is modifying or creating\na resource, the clients may be able to determine completion by checking on the resource.<\/p>\n<h2 id=\"completed\">Completed<\/h2>\n<p>The client can tell if a task has completed, but has no further details. For a state machine, this is equivalent to\nknowing that a task is in a terminal state.<\/p>\n<p>If a non-terminal state is treated as &ldquo;complete&rdquo;, clients can see tasks switch back to incomplete, which is\ncounter to normal expectations. If the clients are using the complete status as a part of a work orchestration,\nthis can lead to processes becoming unexpectedly parallelized or data seeming inconsistent. If a task can be retried,\nthe complete status should only be set once all attempts have been exhausted.<\/p>\n<h2 id=\"success-or-failure\">Success or Failure<\/h2>\n<p>The client can tell if the task completed successfully or if it failed. For a state machine, there is generally\nonly a single successful terminal state but there may be many failure states. The client may be able to extract\ndetails about a failure (e.g. exceptions, status code).<\/p>\n<p>Failures can reveal security-sensitive information about an infrastructure, so systems will often have a two-tier\nreporting mechanism &mdash; a limited error message or opaque code (e.g. transaction id) for outsiders and more detailed\ndiagnostics for insiders\/engineering team.<\/p>\n<p>Similar to above, if a task can change from a failed state to a successful state, this can confuse clients and\nfoil orchestration. 
Conceptually, this model is for tracking &ldquo;permanent failure&rdquo;.<\/p>\n<h2 id=\"in-progress-details\">In-progress Details<\/h2>\n<p>The client has some view into the status of the task as it is being executed. This view could be an indicator of:<\/p>\n<ol>\n<li>Activity\/liveliness<\/li>\n<li>Work completed (e.g. percent complete)<\/li>\n<li>Work left (e.g. distance to destination)<\/li>\n<li>Estimated time to complete<\/li>\n<li>Individual steps completed<\/li>\n<\/ol>\n<p>or many other variations or combinations. For instance, when I was generating the SSL certificate for this site,\nthe first attempt failed. The user interface reported that a failure occurred, the time of the failure, and that\nthere would be an automatic retry. While the data reported was minimal, this was sufficient for the purposes of\nthe task.<\/p>\n<h2 id=\"simulated-in-progress-details-special-case\">Simulated In-progress Details (Special Case)<\/h2>\n<p>The client has some view into the status of the task, but the status is a simulation. For example, the server might\nbe projecting an estimated time to complete based on a model of processing times, rather than monitoring actual\nprogress. The server may report a mixture of real and simulated data; for example, a server may report the real\ncompletion status of the task but may add in a projected percent complete if the infrastructure to report a real\npercent complete is missing.<\/p>\n<h1 id=\"data-model-and-lifecycle\">Data Model and Lifecycle<\/h1>\n<p>A progress status may be modeled as its own resource (e.g. with a progress id) or as an attribute of a resource. If the\nprocess does not produce a resource or if a process may be applied multiple times to a resource (perhaps even\nconcurrently), then each task should have a unique (or unique enough) identifier. 
If the task is 1:1 to a resource,\nthen Occam&rsquo;s razor would suggest creating a new entity is unnecessary and instead use the resource identifier.<\/p>\n<p>If a progress id and resource are connected, but there isn&rsquo;t a way to look up that relationship, you are likely\nto run into operational visibility issues where end-users can only report one of the ids. Business reporting on the\nstate of your system will likely require some join between the two ids, so plan ahead.<\/p>\n<p>If you need to report in-progress details and the task is indeterminate upon creation (e.g. the task will be scanning\na number of files but the number of files is unknown), then you will want to structure your reporting for\n&ldquo;monotonic reporting&rdquo;. Clients expect progress to move forwards, stall, or stop, but not move\nbackwards. For example, in the scanning example, if you report status as the proportion of the number of files scanned\nover the number of total files, it is possible for the percentage to decline over time if finding files and scanning\nfiles are not sequential. Instead, represent the data to make the non-monotonicity clear, perhaps by\ndifferentiating a &ldquo;known, fixed&rdquo; denominator versus an &ldquo;in-progress, changing&rdquo; denominator or using an event history\napproach.<\/p>\n<p>The utility of progress data tends to decline rapidly with time. Based on your operational and client needs, plan to\ndelete old progress data periodically and automatically<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote-ref\" role=\"doc-noteref\">2<\/a><\/sup>. However, if the client can request data for a valid process\nthat simply has not been registered yet (due to eventual consistency or delays in progress history processing), there\nmay be an ambiguity between &ldquo;old and likely deleted data&rdquo; and &ldquo;very new data&rdquo; which can cause orchestration issues. 
A\npotential mitigation would be to encode the creation date within the progress id such that the server could return a\n410 Gone versus 404 Not Found status.<\/p>\n<p>Security-wise, authorizing access to a progress status can be tricky if it is not 1:1 with a resource. Additionally,\nif the client polls for status, the server could generate a significant number of authorization checks. Thus, from\na risk trade-off perspective, avoiding storing any client or sensitive data as part of the progress and skipping\nauthorization checks may be acceptable.<\/p>\n<h1 id=\"types-of-client-mechanisms\">Types of Client Mechanisms<\/h1>\n<h2 id=\"wait--synchronous-call-wrapper-special-case\">Wait \/ Synchronous Call Wrapper (Special Case)<\/h2>\n<p>The client makes a synchronous call which handles waiting on an asynchronous task. The client&rsquo;s call may need\nto be kept active through keep-alive or <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Status\/100\">HTTP 100 Continue<\/a>\nmessages. If the asynchronous portion of the call is both a\nsmall portion of the overall workflow and the asynchronous task has a well-bounded time to complete, this pattern\ncan simplify the client interface. This approach works best if the task and resource are 1:1, such as resource\ncreation or deletion.<\/p>\n<h2 id=\"poll\">Poll<\/h2>\n<p>Periodically, the client makes a call to get the current status of the task.<\/p>\n<p><em>Failure Modes<\/em><\/p>\n<p>Since there is a cost to every call, this mechanism can create denial-of-service attacks due to a large number of calls\nmade in a very short period as well as tying up network bandwidth. Server mitigations include early request rejections\n(potentially with a <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/HTTP\/Status\/429\">HTTP 429 Too Many Requests<\/a> status),\nlong polling (a request may stay open until some timeout is reached, allowing late updates to be fulfilled\nwithin a single request), and caching. 
On the client side, the client can switch from a periodic rate to one with\nrandomness or increase delays between calls to reduce server load.<\/p>\n<p>For the overall workflow, the delays between calls add latency to the overall processing time. For clients, this is\nan incentive to decrease the time between requests. For a server, the worst case would be a client &lsquo;busy wait&rsquo;-ing\nwhere calls are made once the previous call has completed.<\/p>\n<h2 id=\"callback\">Callback<\/h2>\n<p>On either completion or an update to status, the server makes a call to the client. Within the Observer pattern, this\nis equivalent to invoking the <code>update<\/code> function. Callbacks may be modeled as both 1:1 and 1:n to clients.<\/p>\n<p>If the clients do not have an addressable location (e.g. web browser), callbacks are not an option.<\/p>\n<p>Compared to polling, the advantage of this method is that, since the server controls the callback, the server can\nminimize wasted &ldquo;no new data&rdquo; calls to the client. The server can maintain an audit trail of clients that were\nsuccessfully notified, an important feature for some domains. Callbacks fit very naturally into an orchestration or\nworkflow management system, allowing efficient &ldquo;wait until this done&rdquo; steps, since the callback can be used to trigger\nthe next step in the flow.<\/p>\n<p><em>Failure Modes<\/em><\/p>\n<p>Security-wise, the callback can be unsafe, particularly if it is controlled fully by the client. Servers can be\ntricked into participating in denial of service attacks, invoking vulnerabilities, or divulging internal data.<\/p>\n<p>The callback from the server can fail, for reasons including a bad registration, network failures, or client\navailability issues. The onus is placed on the server for retries, and it is possible that the client will never\nsee the callback. 
If clients need a way to see missed callbacks or replay previous callbacks, callbacks may be combined\nwith event streams.<\/p>\n<h2 id=\"event-stream\">Event Stream<\/h2>\n<p>On a change to status, the server writes a new event to the event log. Clients read from the event log. The event log is not\nnecessarily persisted but usually is with technologies like Kafka. If the event log is\npersisted, clients may have the ability to replay or see historical data. The protocol could be\n<a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/Server-sent_events\">server-sent events<\/a> or may be custom.<\/p>\n<p>An event stream may be modeled per progress, per resource, or per client.<\/p>\n<p>Internally, a system may use an event stream (potentially to decouple async processing from the API layer) but\npresent the data via a different mechanism. For example, a background process may read the event stream and update a\nkey-value store to represent the latest &ldquo;snapshot&rdquo; of the data.<\/p>\n<p><em>Failure Modes<\/em><\/p>\n<p>As the most complex mechanism, an event stream can be expected to have more implementation-specific and operational issues than polling\nor callbacks. 
For instance, storage space is a function of the number of things being tracked multiplied by the\nnumber of updates, while straightforward implementations of polling and callbacks only require space proportional\nto the number of things being tracked.<\/p>\n<p>Operationally, however, maintaining a history allows replay of events, which can help mitigate server and client failure\nmodes, so the complexity may be worth the cost.<\/p>\n<section class=\"footnotes\" role=\"doc-endnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\" role=\"doc-endnote\">\n<p>Tracking pixels and other techniques are &ldquo;out of band&rdquo; mechanisms to see if an email is read.&#160;<a href=\"#fnref:1\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<li id=\"fn:2\" role=\"doc-endnote\">\n<p>However, historical data may be interesting for building prediction models or tracking trends. In this case, archive\nthe data by moving it to another data store.&#160;<a href=\"#fnref:2\" class=\"footnote-backref\" role=\"doc-backlink\">&#x21a9;&#xfe0e;<\/a><\/p>\n<\/li>\n<\/ol>\n<\/section>\n"},{"title":{},"link":"https:\/\/ztoz.blog\/archives\/","pubDate":"Thu, 19 Aug 2021 00:00:00 +0000","guid":"https:\/\/ztoz.blog\/archives\/","description":{}},{"title":"Prior Writings","link":"https:\/\/ztoz.blog\/posts\/prior-writings\/","pubDate":"Wed, 04 Aug 2021 15:43:07 -0600","guid":"https:\/\/ztoz.blog\/posts\/prior-writings\/","description":"<p>Between Honeywell and Qualtrics, I founded a company called \u2124\u2192\u2124 Technologies. The first thing I tried was a\nsteganography plug-in for Firefox. The plug-in attempted to hide some text using a user-provided corpus to compute\na statistical language model. I <a href=\"http:\/\/ztostartup.blogspot.com\/2013\/10\/mimicking-tweets.html\">documented<\/a> the early\nresearch in emulating tweets. 
Since that wasn&rsquo;t fruitful, I pivoted towards an automated statistical analysis service based on the idea\nof switching the problem from a high-skill process of selecting what processing to do to a low-skill process\nof searching through the results for something valuable. (I never found a solution to prevent this from being\np-hacking as a service.) When I found that integration tests were taking far longer than they would on a simple,\nsingle-computer batch job system, I developed a distrust of big data technologies that required\n<a href=\"http:\/\/ztostartup.blogspot.com\/2013\/12\/hadoop-processing-model.html\">high startup costs<\/a>.<\/p>\n<p>The analysis service project led to two side-efforts: a functional reimplementation of the\n<a href=\"https:\/\/github.com\/jeffreystarr\/SquarifiedTreemap\">squarified treemap visualization<\/a> and a Python library for\ninferring the <a href=\"https:\/\/github.com\/jeffreystarr\/dateinfer\">format of a date<\/a>. (Although I forgot about this library\nand failed to support it for seven years, it has seen a number of forks and improvements by others.)<\/p>\n<p>At Qualtrics, I wrote two articles for our engineering blog. The more popular article was about using\n<a href=\"https:\/\/www.qualtrics.com\/eng\/wargames-on-call-training\/\">wargames for training<\/a>. A wargame, in this sense, is\na simulation of one or more scenarios which requires the trainees to respond using real-world tools. For example,\nwe might cause high load on a box and have the trainees diagnose and kill the malignant process. I was often asked if\nwe still used this practice. The answer is yes: multiple teams did create artificial outages or degraded situations\nin our staging environments to train new on-call engineers. Over time, the training tended to become more\nspecialized. As the Qualtrics infrastructure moved from virtual machines to orchestration environments, we stopped\ntraining for certain failure modes. 
For instance, operators did not need to train on restarting processes as the\norchestrator handled that for them.<\/p>\n<p>The less popular article was a discussion of <a href=\"https:\/\/www.qualtrics.com\/eng\/characterizing-system-design-of-experiments\/\">design of experiments<\/a>,\na technique I learned and used at Honeywell. Design of Experiments is a way to find which parameters matter most\nfor a system. DoE is meant to be used by humans, not computers, so many features of the practice are meant to keep\nthe costs down. (DoE is often used in domains with destructive testing.) Those in the machine learning community\nwould see this as a type of hill-climbing algorithm.<\/p>\n<p>Although I was not the author, one of my co-workers wrote a detailed article on our <a href=\"https:\/\/www.qualtrics.com\/eng\/indexing-text-for-both-effective-search-and-accurate-analysis\/\">language processing and indexing pipeline<\/a>,\nwhich serves as the de facto public technical document on the subject. As such, it is worthy of another link.<\/p>\n"}]}}