-
Notifications
You must be signed in to change notification settings - Fork 0
Closed
questdb/questdb
#6739Labels
open sourceOpen source featuresOpen source features
Description
Summary
Bloom filters for both native and Parquet storage to efficiently search high-cardinality columns (order IDs, user IDs, transaction references) across unlimited time ranges without scanning all data.
The Problem
-- Find one order across 5 years of data
SELECT * FROM trades WHERE order_id = 'ORD-2024-ABC123';- High-cardinality columns can't use traditional indexes efficiently
- Without filtering, must scan all partitions
- Needle-in-haystack queries become prohibitively expensive
The Solution
Query: WHERE order_id = 'ORD-2024-ABC123'
Partition 2024-Q1: Bloom says "not here" → SKIP ✓
Partition 2024-Q2: Bloom says "not here" → SKIP ✓
Partition 2024-Q3: Bloom says "maybe here" → SCAN
Partition 2024-Q4: Bloom says "not here" → SKIP ✓
...
How Bloom Filters Work
| Property | Description |
|---|---|
| No false negatives | If filter says "not here", value is guaranteed absent |
| Rare false positives | Occasionally says "maybe" when value isn't present (tunable) |
| Compact | Minimal storage overhead per partition/row group |
| Fast | O(k) lookups where k is number of hash functions |
Features
- Native storage support — Bloom filter per partition for indexed columns
- Parquet support — Bloom filter per row group, compatible with Parquet spec
- Configurable — Tune false positive rate vs storage trade-off
- Automatic maintenance — Filters built during ingestion and compaction
Benefits
- Skip 99%+ of data — Only scan partitions that might contain the value
- Unlimited time range — Search across years without proportional cost
- High-cardinality friendly — Works where traditional indexes don't scale
Use Cases
- Find specific order/trade by ID across historical data
- Lookup user activity across all time
- Audit trails and compliance queries
- Any needle-in-haystack search pattern
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
open sourceOpen source featuresOpen source features
Type
Projects
Status
Done