pgproto

Native Protocol Buffers (proto3) support for PostgreSQL.

Store your protobuf binary data with the rest of your data. Supports:

Schema-aware field extraction without JSONB conversions.
Custom operators for nested field navigation (-> and #>).
Substantial storage savings over standard JSONB.
GIN and standard indexing for fast retrieval.

📊 Performance Results

In benchmarks comparing 100,000 serialized example.Order messages against equivalent JSONB structures and normalized native relational schemas (using benchmark.sh with static fixtures):

Metric	Protobuf (`pgproto`)	JSONB (`jsonb`)	Native Relational (Normalized 1:N)	Win
Storage Size	16 MB	46 MB	25 MB	📊 ~35% smaller than Native, ~65% smaller than JSONB!
Single-Row Lookup Latency (Indexed)	5.9 ms	8.1 ms	3.5 ms	Native is fastest for flat lookups, but `pgproto` is close!
Full Document Retrieval Latency	5.9 ms	8.1 ms	33.1 ms	📈 ~5x faster than Native JOINs for full object fetch!

Note

pgproto combines the storage efficiency of binary compaction with the query flexibility of JSONB, without the overhead of heavy JOINs or text parsing!

Benchmarks ran using un-optimized debug binaries on standard Linux environments.

🛠️ Installation

Linux and Mac

Compile and install the extension (requires standard build-essential and postgresql-server-dev-*).

cd pgproto
make
make install # may need sudo

See the docker-compose.yml if you want to deploy a pre-configured local sandbox.

🏁 Getting Started

Enable the extension (do this once in each database where you want to use it):

CREATE EXTENSION pgproto;

1. Register Your Protobuf Schemas

To understand what fields are in your binary blob, pgproto requires runtime schemas. You can load FileDescriptorSet binary blobs into the registry:

INSERT INTO pb_schemas (name, data) VALUES ('MySchema', '\x...');

2. Create a Protobuf Table

Create a table with the custom protobuf type:

CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    data protobuf
);

3. Insert & Query Values

Insert your serialized binary protobuf blobs:

INSERT INTO items (data) VALUES ('\x0a02082a');

Extract nested standard fields using operators:

-- Extract field id 1 (integer) from nested structure
SELECT data #> '{Outer, inner, id}'::text[] FROM items;

🔍 Querying & Extraction

Extract values using standard PostgreSQL operators:

Nested Field Access

Navigate nested structures using standard text-array paths:

-- Access a nested field deep in protobuf hierarchy
SELECT data #> '{Outer, inner, id}'::text[] FROM items;

Map / Repeated Field Lookups

Navigating complex arrays and maps (using text-arrays for keys and indices):

-- Access map keys inside a nested structure
SELECT data #> '{Outer, tags, mykey}'::text[] FROM items;

🗃️ Indexing

B-Tree expression indexing

You can use standard B-Tree indexing on field extraction results for fast lookups:

CREATE INDEX idx_pb_id ON items ((data #> '{Outer, inner, id}'::text[]));

-- Query will use Index Scan instead of sequential scan
EXPLAIN ANALYZE SELECT * FROM items WHERE (data #> '{Outer, inner, id}'::text[]) = 42;

📚 Advanced Usage & Schema Evolution

Complex Types: Enums and `oneof`

Protobuf enums and oneof fields map naturally to standard extraction functions:

Enums: Encoded as standard varints on the wire. Extract them using pb_get_int32 or the shorthand -> operators.
Oneof: Since oneof fields are just regular fields with a semantic constraint, you can query their values normally.

Schema Evolution Handling

Protobuf’s biggest strength is seamless forward/backward compatibility:

Adding Fields: You can safely add new fields to your .proto definition. Old messages in the database without the field will return NULL or default values when read using the new schema.
Deprecating Fields: Deprecated fields can still be read if they exist in the binary data. If you remove a field from the schema, the engine will safely skip it during traversal.

To update a schema in the registry without breaking existing data:

-- Update using ON CONFLICT (re-registering is safe!)
INSERT INTO pb_schemas (name, data) VALUES ('MySchema', '\x...')
ON CONFLICT (name) DO UPDATE SET data = EXCLUDED.data;

🧪 Testing

🟢 Regression Tests (PostgreSQL `pg_regress`)

Run the standard PostgreSQL regression tests to verify type I/O, operators, and GIN indexing:

make installcheck

🛒 eCommerce Testbench Sandbox (Docker)

We provide an isolated, ready-to-use testing sandbox with a pre-configured schema (order.proto) and sample records. This environment demonstrates advanced features like Maps, Nested Navigation, and Human-Readable JSON conversion.

To spin it up and run queries:

# 1. Build and start the container
docker-compose -f example/docker-compose.yml up -d --build

# 2. Run showcase queries
docker-compose -f example/docker-compose.yml exec db psql -U postgres -d pgproto_showcase -f /workspace/example/queries.sql

See example/README.md for more details.

🐳 Running Coverage & Leaks in Docker (Recommended)

You can run both coverage capture and memory leak analysis directly inside your running Docker workspace.

1. 🏗️ Prerequisites (Install Tools)

Install lcov and valgrind inside the running container as root:

docker-compose -f example/docker-compose.yml exec -u root db apt-get update
docker-compose -f example/docker-compose.yml exec -u root db apt-get install -y lcov valgrind

2. 🧪 Coverage Run

Recompile the extension with profiling flags and capture data:

# Recompile inside container
docker-compose -f example/docker-compose.yml exec -u postgres db make clean
docker-compose -f example/docker-compose.yml exec -u postgres db make COPT="-O0 -fprofile-arcs -ftest-coverage"
docker-compose -f example/docker-compose.yml exec -u root db make install

# Run tests to generate trace data
docker-compose -f example/docker-compose.yml exec -u postgres db make installcheck

# Capture output (ignores negative hit counter overflows)
docker-compose -f example/docker-compose.yml exec -u postgres db lcov --capture --directory . --output-file coverage.info --ignore-errors negative,inconsistent

3. 🧠 Memory Leak Analysis

Run showcase queries through valgrind to verify memory safety:

docker-compose -f example/docker-compose.yml exec -u postgres db valgrind --leak-check=full --log-file=/workspace/valgrind.log psql -U postgres -d pgproto_showcase -f /workspace/example/valgrind_full.sql

Check valgrind.log for memory leaks reports!

🏗️ Technical Details

For historical design plans, caching mechanisms, and deeper architectural discussion, see DESIGN.md.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.github/workflows		.github/workflows
example		example
expected		expected
sql		sql
src		src
test_data_new		test_data_new
third_party		third_party
.gitignore		.gitignore
.gitmodules		.gitmodules
DESIGN.md		DESIGN.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
benchmark.sh		benchmark.sh
docker-compose.yml		docker-compose.yml
pgproto.control		pgproto.control

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pgproto

📊 Performance Results

🛠️ Installation

Linux and Mac

🏁 Getting Started

1. Register Your Protobuf Schemas

2. Create a Protobuf Table

3. Insert & Query Values

🔍 Querying & Extraction

Nested Field Access

Map / Repeated Field Lookups

🗃️ Indexing

B-Tree expression indexing

📚 Advanced Usage & Schema Evolution

Complex Types: Enums and `oneof`

Schema Evolution Handling

🧪 Testing

🟢 Regression Tests (PostgreSQL `pg_regress`)

🛒 eCommerce Testbench Sandbox (Docker)

🐳 Running Coverage & Leaks in Docker (Recommended)

1. 🏗️ Prerequisites (Install Tools)

2. 🧪 Coverage Run

3. 🧠 Memory Leak Analysis

🏗️ Technical Details

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

pgproto

📊 Performance Results

🛠️ Installation

Linux and Mac

🏁 Getting Started

1. Register Your Protobuf Schemas

2. Create a Protobuf Table

3. Insert & Query Values

🔍 Querying & Extraction

Nested Field Access

Map / Repeated Field Lookups

🗃️ Indexing

B-Tree expression indexing

📚 Advanced Usage & Schema Evolution

Complex Types: Enums and oneof

Schema Evolution Handling

🧪 Testing

🟢 Regression Tests (PostgreSQL pg_regress)

🛒 eCommerce Testbench Sandbox (Docker)

🐳 Running Coverage & Leaks in Docker (Recommended)

1. 🏗️ Prerequisites (Install Tools)

2. 🧪 Coverage Run

3. 🧠 Memory Leak Analysis

🏗️ Technical Details

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Complex Types: Enums and `oneof`

🟢 Regression Tests (PostgreSQL `pg_regress`)

Packages