
remote-sql-cursor

Use the SqlStorageCursor from your Durable Objects anywhere.

Usage:

npm i remote-sql-cursor

Warning

queryable-object is a simplified alternative that trades streaming for simplicity. It is exposed over RPC rather than HTTP, so it doesn't fit every use case, but it is quite useful in many situations!

Benefits

Use a nearly identical exec cursor API outside of the DO boundary.

import { exec } from "remote-sql-cursor";

// Example row shape for the `items` table
type Item = { name: string };

export default {
  async fetch(request, env) {
    const stub = env.ExampleObject.get(env.ExampleObject.idFromName("root"));
    // insert one
    await exec(stub, `INSERT INTO items (name) VALUES ('hello')`).toArray();
    // get 'em all
    let count = 0;
    for await (const row of exec<Item>(stub, `SELECT * FROM items`)) {
      console.log({ row });
      count++;
    }
    return new Response(`We've got ${count} of 'em`);
  },
};
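
For this to work, the Durable Object side needs the items table to exist, plus the library's streaming handler wired into the DO (see minimal-example.ts for the actual setup). Here is a minimal sketch of the DO side; the class name and schema are assumptions, and the table creation uses the standard SqlStorage API:

import { DurableObject } from "cloudflare:workers";

export class ExampleObject extends DurableObject {
  constructor(ctx, env) {
    super(ctx, env);
    // Create the table the client example queries (schema is an assumption)
    ctx.storage.sql.exec(
      `CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)`,
    );
  }
}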

Examples:

  • proxy.ts and proxy95.html use remote-sql-cursor from the browser (does not currently work in Safari)
  • minimal-example.ts shows usage directly from your worker
  • high-throughput-example.ts shows streaming ±180 MB at 8.7 MB/s

Please leave a comment and share

Try it

curl --no-buffer -X POST https://remote-sql-cursor.wilmake.com/query/stream \
  -H "Authorization: demo-key-123" \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT id, name FROM large_users"}' -s | \
  awk '/^{"row":/ { count++; printf "\rCount: %d", count; fflush() } END { print "" }'

Limitations

  • Locally, the dev server silently restarts while remote-sql-cursor is running queries, and it's not clear why: Reloading local server... is all the information that's available. In production it works fine, though!
  • My hypothesis was that TTFB for remote queries is lower for large queries (demo: streaming in 120k rows of a few columns of a table totaling 470 MB). That said, as you can see at https://remote-sql-cursor.wilmake.com/direct, the row-by-row streaming setup also significantly reduces overall speed. Direct is much faster because it doesn't have the overhead of streaming each row separately.
  • Although the data in the SQL table can be up to 10 GB, the result of a single SQL query cannot exceed the isolate's memory. If you try, you will receive {"error":"Durable Object's isolate exceeded its memory limit and was reset."}. It seems that, although SQLite can be configured to use less memory and stream results, this is normally not the case, and Cloudflare does not support it. One workaround is to page through large results, as sketched below.
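
One way around the memory limit is keyset pagination, so no single query result has to materialize the whole table. A workaround sketch, not part of the library: it assumes exec forwards positional bindings like the native SqlStorage.exec, and reuses the large_users table from the demo:

import { exec } from "remote-sql-cursor";

type UserRow = { id: number; name: string };

async function* allRows(stub, pageSize = 10000) {
  let lastId = 0;
  while (true) {
    let seen = 0;
    // Each page is a separate query, so each cursor's result stays bounded
    for await (const row of exec<UserRow>(
      stub,
      `SELECT id, name FROM large_users WHERE id > ? ORDER BY id LIMIT ?`,
      lastId,
      pageSize,
    )) {
      yield row;
      lastId = row.id;
      seen++;
    }
    if (seen < pageSize) break; // fewer rows than a full page: we're done
  }
}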

Potential improvements

If Cloudflare were to start supporting true SQLite streaming, as mentioned here, this library would possibly become even more useful. For now, the main benefit sits in the cursor interface, not so much in the streaming.

The problem

The Durable Object SQLite storage has an exec function. Its interface is:

interface SqlStorage {
  exec<T extends Record<string, SqlStorageValue>>(
    query: string,
    ...bindings: any[]
  ): SqlStorageCursor<T>;
  get databaseSize(): number;
  Cursor: typeof SqlStorageCursor;
  Statement: typeof SqlStorageStatement;
}
declare abstract class SqlStorageStatement {}
type SqlStorageValue = ArrayBuffer | string | number | null;
declare abstract class SqlStorageCursor<
  T extends Record<string, SqlStorageValue>,
> {
  next():
    | {
        done?: false;
        value: T;
      }
    | {
        done: true;
        value?: never;
      };
  toArray(): T[];
  one(): T;
  raw<U extends SqlStorageValue[]>(): IterableIterator<U>;
  columnNames: string[];
  get rowsRead(): number;
  get rowsWritten(): number;
  [Symbol.iterator](): IterableIterator<T>;
}

RPC does not allow passing this cursor across the boundary, since functions aren't serializable.
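
To illustrate (a sketch, not from the repo): a hypothetical RPC method that hands the cursor back to the caller fails, because the cursor's methods can't be serialized across the RPC boundary.

import { DurableObject } from "cloudflare:workers";

export class BrokenExample extends DurableObject {
  // Hypothetical RPC method: this does NOT work; the returned cursor
  // carries functions (next, toArray, ...) that aren't serializable
  query(sql) {
    return this.ctx.storage.sql.exec(sql);
  }
}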

I want to recreate the same SqlStorageCursor interface in any client by streaming the response through fetch via the Streams API, then capturing that stream in a SqlStorageCursor immediately, without awaiting the full result.
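
A minimal sketch of the idea (not necessarily the library's actual implementation): parse the {"row": ...} newline-delimited JSON framing shown in the curl demo above into an async generator. The /query/stream path matches the demo; the rest is an assumption.

async function* streamRows(stub, query) {
  // POST the query to the DO's streaming endpoint
  const response = await stub.fetch("http://do/query/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (!line) continue;
      yield JSON.parse(line).row; // one row object per line
    }
  }
}

Wrapping this generator in an object that also implements toArray, one, raw, columnNames, rowsRead, and rowsWritten gives the full SqlStorageCursor surface, and the generator only starts consuming the body once iterated, so nothing needs to be awaited up front.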

Ultimately, I should be able to call:

let count = 0;
for await (const row of exec(stub, `SELECT * FROM items`)) {
  // Streams, row by row!
  count++;
  console.log({ row });
}

Is this feasible?

Answer: yes.

I got a read speed of 8.7 MB/s. After trying batching, I saw the speed didn't really improve significantly, so this seems pretty reasonable for a Durable Object.

CHANGELOG

TODO

Remote Multi-Server Transactions

Making a remote transaction possible would be very useful https://letmeprompt.com/httpsdevelopersc-3mptgo0.

How transactions can be used: https://letmeprompt.com/httpsdevelopersc-vx7x1c0.

We MAY now be able to create a multi-DO transaction in DORM. See https://x.com/janwilmake/status/1926928095329587450 for potential feedback.

Also, the question of sending multiple queries to the DO in a single round trip is still unanswered, which makes things slow and potentially not ACID. Especially when working with many DBs, support for point-in-time recovery and transactions would be huge.

About

Streamable Objects - The 'exec' interface from your DO you're used to, now anywhere
