bloom-sift

Bloom filter using 128-bit MurmurHash3 with Kirsch-Mitzenmacher double hashing.

Overview

A space-efficient probabilistic data structure for set membership testing. Uses a single 128-bit hash to derive all k hash values via double hashing:

h(i) = h1 + i * h2

Features:

Single hash call for all k values (fast)
Automatic optimal parameter calculation
Serialization for storage/transfer
Works in Node.js and browsers
Zero dependencies (except murmur-hash)

Installation

npm install bloom-sift

Quick Start

import { BloomSift } from 'bloom-sift';

const filter = new BloomSift({ capacity: 1000, errorRate: 0.01 });

filter.add('user:123');
filter.has('user:123');  // true
filter.has('user:456');  // false (probably)

API

Constructor

new BloomSift({ capacity: number, errorRate: number })

capacity - Expected number of items
errorRate - Desired false positive rate (0 < p < 1)

Methods

filter.add(item: string | Uint8Array): void
filter.has(item: string | Uint8Array): boolean
filter.clear(): void
filter.serialize(): SerializedBloomSift
BloomSift.deserialize(data: SerializedBloomSift): BloomSift

Properties

filter.size       // number of bits
filter.hashCount  // number of hash functions (k)
filter.count      // items added
filter.fillRatio  // saturation (0 to 1)

Utility

import { calculateOptimalParams } from 'bloom-sift';

const { size, hashCount } = calculateOptimalParams(1000, 0.01);
// { size: 9586, hashCount: 7 }

Usage Examples

Serialization

const data = filter.serialize();
localStorage.setItem('filter', JSON.stringify(data));

// Later...
const restored = BloomSift.deserialize(JSON.parse(localStorage.getItem('filter')));

Binary Data

filter.add(new Uint8Array([1, 2, 3]));
filter.has(new Uint8Array([1, 2, 3]));  // true

Reusing Filters

filter.add('item-1');
filter.clear();
filter.count;  // 0

How It Works

Given capacity n and error rate p:

Bit size: m = -n * ln(p) / (ln(2)²)
Hash count: k = (m/n) * ln(2)

The 128-bit MurmurHash3 output is split into two 64-bit values (h1, h2). Each of the k bit positions is computed as (h1 + i * h2) % m, providing the same theoretical guarantees as k independent hashes.

Performance

Tested on Apple M1 Pro:

Operation	Ops/sec
add	~220K
has	~250K

Run benchmarks:

npm run bench

TypeScript

Full TypeScript support with bundled type definitions. Exports:

BloomSift - Main class
BloomSiftOptions - Constructor options interface
SerializedBloomSift - Serialization format interface
calculateOptimalParams - Utility function

Limitations

No deletions - Use counting Bloom filters for that
Fixed size - Capacity must be set upfront
False positives - May report items as present when they're not (never false negatives)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
.github		.github
bench		bench
docs		docs
src		src
test		test
.gitignore		.gitignore
.nvmrc		.nvmrc
.prettierrc		.prettierrc
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
eslint.config.js		eslint.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
tsup.config.ts		tsup.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bloom-sift

Overview

Installation

Quick Start

API

Constructor

Methods

Properties

Utility

Usage Examples

Serialization

Binary Data

Reusing Filters

How It Works

Performance

TypeScript

Limitations

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

License

1mb-dev/bloom-sift

Folders and files

Latest commit

History

Repository files navigation

bloom-sift

Overview

Installation

Quick Start

API

Constructor

Methods

Properties

Utility

Usage Examples

Serialization

Binary Data

Reusing Filters

How It Works

Performance

TypeScript

Limitations

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages