Native Protocol Buffers (proto3) support for PostgreSQL.
Store protobuf binary messages alongside the rest of your relational data. Supports:
- Schema-aware field extraction without JSONB conversions.
- Custom operators for nested field navigation (`->` and `#>`).
- Substantial storage savings over standard JSONB.
- GIN and standard indexing for fast retrieval.
In benchmarks comparing 100,000 serialized example.Order messages against equivalent JSONB structures and normalized native relational schemas (using benchmark.sh with static fixtures):
| Metric | Protobuf (pgproto) | JSONB (jsonb) | Native Relational (Normalized 1:N) | Win |
|---|---|---|---|---|
| Storage Size | 16 MB | 46 MB | 25 MB | ~35% smaller than Native, ~65% smaller than JSONB! |
| Single-Row Lookup Latency (Indexed) | 5.9 ms | 8.1 ms | 3.5 ms | Native is fastest for flat lookups, but pgproto is close! |
| Full Document Retrieval Latency | 5.9 ms | 8.1 ms | 33.1 ms | ~5x faster than Native JOINs for full object fetch! |
> **Note**
> pgproto combines the storage efficiency of binary compaction with the query flexibility of JSONB, without the overhead of heavy JOINs or text parsing!
> Benchmarks ran using un-optimized debug binaries on standard Linux environments.
Compile and install the extension (requires standard build-essential and postgresql-server-dev-*):

```sh
cd pgproto
make
make install   # may need sudo
```

See the docker-compose.yml if you want to deploy a pre-configured local sandbox.
Enable the extension (do this once in each database where you want to use it):

```sql
CREATE EXTENSION pgproto;
```

To understand what fields are in your binary blob, pgproto requires runtime schemas. You can load FileDescriptorSet binary blobs into the registry:
```sql
INSERT INTO pb_schemas (name, data) VALUES ('MySchema', '\x...');
```

Create a table with the custom protobuf type:
```sql
CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    data protobuf
);
```

Insert your serialized binary protobuf blobs:

```sql
INSERT INTO items (data) VALUES ('\x0a02082a');
```

Extract nested standard fields using operators:
```sql
-- Extract field id 1 (integer) from a nested structure,
-- navigating with a standard text-array path
SELECT data #> '{Outer, inner, id}'::text[] FROM items;
```

Navigating complex arrays and maps (using text-arrays for keys and indices):
```sql
-- Access map keys inside a nested structure
SELECT data #> '{Outer, tags, mykey}'::text[] FROM items;
```

You can use standard B-Tree indexing on field extraction results for fast lookups:
```sql
CREATE INDEX idx_pb_id ON items ((data #> '{Outer, inner, id}'::text[]));

-- Query will use an Index Scan instead of a sequential scan
EXPLAIN ANALYZE SELECT * FROM items WHERE (data #> '{Outer, inner, id}'::text[]) = 42;
```

Protobuf enums and oneof fields map naturally to standard extraction functions:
- Enums: Encoded as standard varints on the wire. Extract them using `pb_get_int32` or the shorthand `->` operators.
- Oneof: Since `oneof` fields are just regular fields with a semantic constraint, you can query their values normally.
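Because enums travel as plain base-128 varints, their wire encoding is easy to reason about. A minimal pure-Python sketch of varint encoding/decoding (illustrative only; `encode_varint`/`decode_varint` are hypothetical helpers, not part of pgproto's API):

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits of the value
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf: bytes, pos: int = 0):
    """Decode a varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:         # continuation bit clear: last byte
            return result, pos
        shift += 7

# An enum with numeric value 300 occupies two bytes on the wire:
print(encode_varint(300).hex())  # ac02
```

Values below 128 (the common case for enums) fit in a single byte, which is part of where the storage savings over JSONB come from.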
Protobuf's biggest strength is seamless forward/backward compatibility:

- Adding Fields: You can safely add new fields to your `.proto` definition. Old messages in the database without the field will return `NULL` or default values when read using the new schema.
- Deprecating Fields: Deprecated fields can still be read if they exist in the binary data. If you remove a field from the schema, the engine will safely skip it during traversal.
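The "skip what you don't recognize" behavior falls out of the wire format itself: every field carries a wire type that says how to step over its payload. A hedged pure-Python sketch of that traversal (`scan_fields` and `decode_varint` are hypothetical illustration helpers, not pgproto's actual C implementation):

```python
def decode_varint(buf: bytes, pos: int = 0):
    """Decode a base-128 varint; return (value, next position)."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def scan_fields(buf: bytes):
    """Yield (field_number, wire_type, payload) for each top-level field.
    Fields the caller does not recognize can simply be ignored: the wire
    type alone tells us how many bytes to step over."""
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        field, wtype = tag >> 3, tag & 0x07
        if wtype == 0:                         # varint
            payload, pos = decode_varint(buf, pos)
        elif wtype == 1:                       # fixed64
            payload, pos = buf[pos:pos + 8], pos + 8
        elif wtype == 2:                       # length-delimited
            length, pos = decode_varint(buf, pos)
            payload, pos = buf[pos:pos + length], pos + length
        elif wtype == 5:                       # fixed32
            payload, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wtype}")
        yield field, wtype, payload

# A message written by a newer schema: field 1 (known) plus field 15 (unknown).
blob = b"\x08\x2a" + b"\x7a\x02hi"
print([(f, w) for f, w, _ in scan_fields(blob)])  # [(1, 0), (15, 2)]
```

A reader using an older schema simply drops field 15 on the floor, which is why removing a field from the registry cannot corrupt traversal of old rows.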
To update a schema in the registry without breaking existing data:

```sql
-- Update using ON CONFLICT (re-registering is safe!)
INSERT INTO pb_schemas (name, data) VALUES ('MySchema', '\x...')
ON CONFLICT (name) DO UPDATE SET data = EXCLUDED.data;
```

Run the standard PostgreSQL regression tests to verify type I/O, operators, and GIN indexing:
```sh
make installcheck
```

We provide an isolated, ready-to-use testing sandbox with a pre-configured schema (order.proto) and sample records. This environment demonstrates advanced features like Maps, Nested Navigation, and Human-Readable JSON conversion.
To spin it up and run queries:

```sh
# 1. Build and start the container
docker-compose -f example/docker-compose.yml up -d --build

# 2. Run showcase queries
docker-compose -f example/docker-compose.yml exec db psql -U postgres -d pgproto_showcase -f /workspace/example/queries.sql
```

See example/README.md for more details.
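For reference, the `\x0a02082a` blob used in the earlier `INSERT INTO items` example can be reconstructed by hand: it is an outer message whose field 1 is a nested message whose field 1 is the varint 42. A pure-Python sketch (illustrative only; in practice you would serialize with your generated protobuf classes and pass the raw bytes as a bytea parameter — `encode_varint` and `tag` are hypothetical helpers):

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit
        else:
            out.append(byte)
            return bytes(out)

def tag(field: int, wire_type: int) -> bytes:
    """Field tag: (field_number << 3) | wire_type, varint-encoded."""
    return encode_varint((field << 3) | wire_type)

inner = tag(1, 0) + encode_varint(42)                   # inner message: field 1 = 42
outer = tag(1, 2) + encode_varint(len(inner)) + inner   # outer field 1, length-delimited
print("\\x" + outer.hex())  # \x0a02082a
```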
You can run both coverage capture and memory leak analysis directly inside your running Docker workspace.
Install lcov and valgrind inside the running container as root:

```sh
docker-compose -f example/docker-compose.yml exec -u root db apt-get update
docker-compose -f example/docker-compose.yml exec -u root db apt-get install -y lcov valgrind
```

Recompile the extension with profiling flags and capture data:
```sh
# Recompile inside container
docker-compose -f example/docker-compose.yml exec -u postgres db make clean
docker-compose -f example/docker-compose.yml exec -u postgres db make COPT="-O0 -fprofile-arcs -ftest-coverage"
docker-compose -f example/docker-compose.yml exec -u root db make install

# Run tests to generate trace data
docker-compose -f example/docker-compose.yml exec -u postgres db make installcheck

# Capture output (ignores negative hit counter overflows)
docker-compose -f example/docker-compose.yml exec -u postgres db lcov --capture --directory . --output-file coverage.info --ignore-errors negative,inconsistent
```

Run showcase queries through valgrind to verify memory safety:
```sh
docker-compose -f example/docker-compose.yml exec -u postgres db valgrind --leak-check=full --log-file=/workspace/valgrind.log psql -U postgres -d pgproto_showcase -f /workspace/example/valgrind_full.sql
```

Check valgrind.log for memory leak reports.
For historical design plans, caching mechanisms, and deeper architectural discussion, see DESIGN.md.