Comparing changes
base repository: hallelx2/context8
base: v0.4.0
head repository: hallelx2/context8
compare: v1.0.0
- 7 commits
- 41 files changed
- 2 contributors
Commits on May 2, 2026
-
feat(storage): pluggable backends with SQLite default
Introduce the StorageBackend Protocol so the engine, ingest, browse, export, feedback, and MCP layers stop reaching into a vendor SDK directly. Two implementations live alongside the Protocol:

- SQLiteBackend (default): single file at ~/.context8/context8.db, three vec0 virtual tables for named dense vectors, FTS5 for native BM25 sparse search, JSON1 for tag arrays, WAL mode + 5s busy timeout. Zero infrastructure: stock pip install, no Docker, no daemon.
- ActianBackend (legacy): extracted from the old storage.py and preserved behind the new [actian] extra. Same external contract.

StorageService is a thin facade that resolves the backend from CONTEXT8_BACKEND (default "sqlite") and delegates every Protocol method. Existing imports (`from context8.storage import StorageService`) keep working unchanged.

Schema migrations are idempotent via apply_migrations(); a meta-table dim guard refuses to re-open a DB whose code_dim disagrees with the current CONTEXT8_USE_CODE_MODEL setting (vec0 dims are immutable).

config.py adds CONTEXT8_BACKEND, CONTEXT8_DB_PATH, USE_CODE_MODEL. pyproject.toml adds sqlite-vec, moves Actian into [actian] (with allow-direct-references for the GitHub wheel URL), registers the "actian" pytest marker, and bumps to 0.5.0.

tests/test_storage_sqlite.py covers schema migration idempotency, the dim guard, WAL pragma, vec0 round-trips, FTS5 special-character defanging, JSON1 tag filters, and scroll-cursor pagination (18 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
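The WAL pragma, busy timeout, and dim guard described in this commit can be illustrated with the standard-library `sqlite3` module. This is only a sketch under assumptions, not the project's code: the `open_db` function, the `meta` table layout, and the `code_dim` key name are hypothetical stand-ins, and the vec0 and FTS5 tables are left out entirely.

```python
import sqlite3


def open_db(path: str, code_dim: int) -> sqlite3.Connection:
    """Open the database, enable WAL, and enforce the stored vector dimension.

    Hypothetical helper: table and key names are assumptions for illustration.
    """
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # readers no longer block the writer
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s on a locked database

    # Idempotent migration step: safe to run on every start.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )

    row = conn.execute("SELECT value FROM meta WHERE key = 'code_dim'").fetchone()
    if row is None:
        # First open: record the dimension the vector tables are created with.
        conn.execute(
            "INSERT INTO meta (key, value) VALUES ('code_dim', ?)", (str(code_dim),)
        )
        conn.commit()
    elif int(row[0]) != code_dim:
        # Vector dimensions are fixed at table creation, so refuse to continue.
        conn.close()
        raise RuntimeError(
            f"database was created with code_dim={row[0]}, "
            f"but the current setting needs {code_dim}"
        )
    return conn
```

Reopening the same file with the same dimension succeeds, while a different dimension raises before any vector table is touched, which is the behaviour the dim guard is described as providing.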
Commit dbb7b2b
-
refactor(search): lift search primitives onto the backend Protocol
Drop every _require_actian()/av.* call from the search engine, browse, and export layers. They now talk to StorageService.search_dense(), search_sparse(), and scroll(), the same Protocol both backends implement, so the engine is fully backend-agnostic.

- search/engine.py: rewritten around the new Protocol. find_duplicate, find_duplicate_or_variant, and search_by_solution all collapse to thin wrappers over storage.search_dense("problem"|"solution", ...). _build_filter now returns a SearchFilter dataclass; QueryAnalyzer weights and AttributionTracker integration are unchanged.
- search/fusion.py (new): pure-Python Reciprocal Rank Fusion. Replaces av.reciprocal_rank_fusion. ~30 lines, k=60 by default, per-list weights for QueryAnalyzer-driven fusion tuning.
- search/attribution.py: tolerant of both ScoredHit (record_id) and the legacy Actian point shape (id), so the tracker stays neutral.
- browse.py / export.py: scroll() with a SearchFilter (JSON1 on SQLite, FilterBuilder on Actian). Cross-backend export/import round-trip verified: JSON has no vectors, import re-embeds.

tests/test_e2e_sqlite.py mirrors the legacy Actian e2e against SQLite: collection shape, hybrid search, filtered search, named-vector access, ablation, feedback loop, attribution, quality boost, dedup, browse, export-import (24 tests, no infrastructure required). tests/test_search_filter.py exercises SearchFilter → SQL fragment + JSON1 translation as pure Python (10 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
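For readers unfamiliar with Reciprocal Rank Fusion, the following is a minimal weighted sketch with k=60 as the default. It is not the contents of search/fusion.py: the function name, signature, and return shape here are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Hashable, Optional, Sequence


def reciprocal_rank_fusion(
    ranked_lists: Sequence[Sequence[Hashable]],
    weights: Optional[Sequence[float]] = None,
    k: int = 60,
) -> list:
    """Fuse several ranked ID lists into one list of (id, score), best first.

    Each list contributes weight / (k + rank) for every ID it contains,
    with rank starting at 1. Hypothetical signature, for illustration only.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    if len(weights) != len(ranked_lists):
        raise ValueError("need exactly one weight per ranked list")

    scores = defaultdict(float)
    for ids, weight in zip(ranked_lists, weights):
        for rank, record_id in enumerate(ids, start=1):
            scores[record_id] += weight / (k + rank)

    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_hits = ["a", "b", "c"]   # e.g. IDs from a dense-vector search
sparse_hits = ["b", "d", "a"]  # e.g. IDs from a BM25 search
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

With equal weights, "b" ranks first because it places near the top of both lists; raising one list's weight shifts the fused order toward that list.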
Commit 16b2e4f
-
feat(cli): backend-aware lifecycle, doctor, serve
start/stop print "no daemon needed" under SQLite (and only spin up the Actian container when CONTEXT8_BACKEND=actian). init runs schema migrations for SQLite or creates the Actian collection. doctor checks sqlite-vec, WAL mode, vec0 + FTS5 module presence, and runs a filtered-scroll probe, instead of poking the Actian client directly. stats shows "Database: SQLite (~/.context8/context8.db)" or the "Endpoint: host:50051" Actian line as appropriate.

- cli/ui.py: check_backend() picks the probe by backend; check_db_connection is preserved as an alias. New check_sqlite_vec() reports the extension version. Old check_actian_sdk() kept for the doctor's Actian-specific section.
- cli/commands/lifecycle.py: start/stop/init are dispatch + delegate.
- cli/commands/serve.py: _bootstrap() skips Docker for SQLite, delegates to ensure_running() for Actian. --no-bootstrap flag unchanged.
- cli/commands/ops.py: doctor splits into _doctor_sqlite() and _doctor_actian(). The old raw-client filter probe is gone, replaced with storage.scroll(SearchFilter(language="python"), 1), which works on both backends through the Protocol. browse/search/export/import all now route through check_backend() instead of the Actian-only check_db_connection() alias.
- cli/commands/bench.py: "All Actian features + quality ranker" copy reworded to "Hybrid + filter + ranker (full pipeline)". demo's FilterBuilder mention now reads "Metadata filters (SQL WHERE on SQLite, FilterBuilder on Actian)".
- docker.py: ensure_running, is_container_running, stop_container short-circuit to success when CONTEXT8_BACKEND != "actian", so legacy callers can't accidentally spin up Docker on a SQLite-default install.
- mcp/tools.py: get_services() drops the unconditional Docker bootstrap; that already happens in serve._bootstrap. _handle_stats reports backend / db_path / endpoint as appropriate.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
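The short-circuit described for docker.py can be sketched as follows. This is an assumption-laden illustration rather than the real module: the real ensure_running() presumably takes no callback, and the `start_container` parameter and `env` mapping are hypothetical additions that keep the example self-contained and testable without Docker.

```python
import os
from typing import Callable, Mapping, Optional


def active_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured backend name, defaulting to "sqlite"."""
    env = os.environ if env is None else env
    return env.get("CONTEXT8_BACKEND", "sqlite")


def ensure_running(
    start_container: Callable[[], bool],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Start the container only when the Actian backend is selected.

    Under any other backend there is no daemon to start, so report success
    without calling start_container at all.
    """
    if active_backend(env) != "actian":
        return True
    return start_container()
```

Returning success instead of raising means callers written before the backend split keep working unchanged on a SQLite-default install.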
Commit c28cda8
-
docs+tests: SQLite-first README, gate Actian e2e, fix arity bug
README rewritten around the new SQLite-first install (single command: pip install context8 && context8 init --seed && context8 add claude-code). The Actian path is preserved as an "Optional: Actian VectorAI DB backend" section with the hackathon-era install. The "Hackathon: Advanced Features Used" section becomes "Capabilities (and how each backend delivers them)": the same three capabilities (named vectors, hybrid fusion, filtered search) framed as backend-portable. Architecture diagram redrawn: pluggable Protocol fanning out to two concrete backends (SQLite vec0+FTS5 below, Actian gRPC container right). Tech-stack table promotes sqlite-vec and FTS5 to primary, demotes Actian to optional. Project-structure tree updated to reflect the new storage/ package and search/fusion.py. v0.5.0 changelog entry added.

CLAUDE.md fully rewritten: new project overview, structure, key design decisions (#1 is now "pluggable storage backend"), commands, plus distinct SQLite Backend Notes / Actian Backend Notes sections.

RESULTS.md kept as the hackathon submission narrative but reframed: the Actian-feature table is rewritten as a backend-portability table (SQLite delivers each capability via vec0/FTS5/SQL+JSON1; Actian delivers them via named vectors / sparse vectors / FilterBuilder). Benchmark section now has placeholders for both backends so the ablation can be run side-by-side.

All twelve docs/*.md design docs (CONCEPT, ARCHITECTURE, BOTTLENECKS, PLAN-01..08, Hackathon Demo Video — Script) prepended with a historical-artifact banner pointing readers to README.md / CLAUDE.md. The hackathon design narrative is preserved in place; it just stops being the canonical source for the current architecture.

tests/test_e2e.py:

- pytestmark adds pytest.mark.actian + a new skipif on CONTEXT8_BACKEND != "actian", so the Actian e2e suite skips cleanly under the default SQLite install (15 tests skip).
- Fixed pre-existing FeedbackService(storage, embeddings) arity mismatch on lines 281 and 300: production constructor takes storage only (mcp/tools.py:44 confirms). Drop the second arg.
- isolated_collection fixture now patches actian_backend.COLLECTION_NAME (the captured-at-import-time copy) rather than the package module attribute; the module's attribute is no longer load-time-bound to COLLECTION_NAME after the storage package split.

Verification: 127 passed, 15 actian e2e skipped, ruff clean, context8 init/doctor/stats/search/bench/export/import all green under SQLite.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
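The isolated_collection fix rests on a general Python behaviour: `from module import NAME` copies the binding at import time, so a test has to patch the module that actually reads the name. The sketch below reproduces that with stand-in modules; the module objects and the "context8" and "test_tmp" values are hypothetical, not taken from the repository.

```python
import types
from unittest import mock

# Stand-in for the module that defines the constant.
config = types.ModuleType("config")
config.COLLECTION_NAME = "context8"

# Stand-in for the backend module. This assignment is what
# `from config import COLLECTION_NAME` does at import time: it copies the binding.
actian_backend = types.ModuleType("actian_backend")
actian_backend.COLLECTION_NAME = config.COLLECTION_NAME


def collection_in_use() -> str:
    """Return the name the backend code would actually read."""
    return actian_backend.COLLECTION_NAME


# Patching the defining module does not reach the copy the backend holds.
with mock.patch.object(config, "COLLECTION_NAME", "test_tmp"):
    seen_when_patching_source = collection_in_use()

# Patching the backend's own copy is what changes its behaviour.
with mock.patch.object(actian_backend, "COLLECTION_NAME", "test_tmp"):
    seen_when_patching_copy = collection_in_use()
```

Only the second patch changes what the backend sees, which matches the commit's choice to patch actian_backend.COLLECTION_NAME directly.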
Commit 40b9339
-
chore: ruff format the new SQLite-backend code
`ruff format --check src/` is part of CI and was failing on nine files that were added or rewritten across the SQLite-backend rollout. This is purely whitespace and line-wrapping (no logic change); the test suite is identical before and after.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Commit 554b696
-
release: v1.0.0 — SQLite-first default, public API stable
The version jump from 0.x to 1.0 marks two things:

1. The default install is finally a one-command `pip install context8` with no non-PyPI wheels. v0.x always required the Actian GitHub wheel; that wart is gone.
2. The public surface (MCP tools, CLI command names, env-var conventions, and the StorageBackend Protocol) is now committed to under semver. Future backend additions (e.g. Postgres + pgvector for a hosted version) will be additive third Protocol implementations, not breaking changes.

Public surface guaranteed under semver going forward:

- MCP tools: context8_search, context8_log, context8_rate, context8_search_solutions, context8_stats.
- CLI commands: init, start, stop, doctor, stats, search, browse, add, remove, bench, demo, serve, export, import, import-github, mine.
- Env vars: CONTEXT8_BACKEND, CONTEXT8_DB_PATH, CONTEXT8_DB_HOST, CONTEXT8_DB_PORT, CONTEXT8_USE_CODE_MODEL, CONTEXT8_TEXT_MODEL, CONTEXT8_CODE_MODEL, CONTEXT8_RECENCY_HALF_LIFE_DAYS.
- Backend Protocol: StorageBackend + SearchFilter + ScoredHit in context8.storage.
- JSON export format: {format: "context8-export", version: 1} (backend-agnostic, used for cross-backend migration).

pyproject.toml: bump 0.5.0 → 1.0.0, classifier 3-Alpha → 4-Beta. src/context8/__init__.py: bump __version__. README: full v1.0.0 changelog entry rolling up everything since v0.4.0, plus a migration recipe for existing Actian users (export → upgrade → init → import re-embeds).

Tag and push to trigger CI + PyPI publish:

    git tag v1.0.0 && git push --tags

Verification: 127 passed, 15 actian e2e skipped, ruff clean, format clean, context8 --version reports 1.0.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
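Because the JSON export format is part of the committed surface, an importer can check the header before doing any work. The sketch below assumes only the two fields named in the commit message (`format` and `version`); the function name and the `records` key in the sample payload are hypothetical.

```python
import json

EXPORT_FORMAT = "context8-export"
SUPPORTED_VERSIONS = {1}


def check_export_header(raw: str) -> dict:
    """Parse an export file and reject anything that is not a known export.

    Hypothetical helper: only the format and version fields come from the
    commit message; everything else about the payload is left unchecked.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("export must be a JSON object")
    if doc.get("format") != EXPORT_FORMAT:
        raise ValueError(f"unexpected format: {doc.get('format')!r}")
    version = doc.get("version")
    # type() check keeps JSON true from passing as version 1.
    if type(version) is not int or version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported export version: {version!r}")
    return doc


sample = '{"format": "context8-export", "version": 1, "records": []}'
header = check_export_header(sample)
```

Failing fast on an unknown version keeps a future format change from being silently imported as version 1 data.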
Commit d68e245
-
fix(packaging): drop [actian] extra; PyPI bans URL-pinned deps
The v1.0.0 publish workflow failed with HTTP 400 from PyPI:

    Can't have direct dependency: actian-vectorai @
    https://github.com/hackmamba-io/actian-vectorAI-db-beta/raw/main/
    actian_vectorai-0.1.0b2-py3-none-any.whl ; extra == "actian"

PyPI rejects packages that declare URL-pinned dependencies in their published metadata, even when gated behind an extra (PEP 440's section on direct references says public index servers should not allow them). The Actian SDK isn't on PyPI, so the only valid options are: (a) leave it out of pyproject.toml entirely and document the URL install, or (b) wait until it lands on PyPI. v0.4.0 used (a); we revert to that.

Changes:

- pyproject.toml: remove [project.optional-dependencies] actian and the [tool.hatch.metadata] allow-direct-references block. [tool.uv.sources] is preserved as a convenience for `uv pip install` from source.
- README: install path becomes

      pip install context8 \
        "actian-vectorai @ https://github.com/.../actian_vectorai-...whl"

  with a note explaining why it isn't a single command.
- CLAUDE.md / RESULTS.md / config.py docstrings updated likewise.
- src/context8/storage/actian_backend.py: ACTIAN_INSTALL_HINT no longer suggests the (now non-existent) [actian] extra; points at the wheel URL directly. Module docstring updated.
- src/context8/storage/__init__.py: docstring updated.
- src/context8/cli/ui.py: _check_actian and check_actian_sdk error messages point at the wheel URL.
- tests/test_e2e.py: docstring install command updated.

Verification:

- `python -m build` succeeds; twine check PASSED on both .tar.gz and .whl.
- Wheel METADATA inspected: zero `Requires-Dist:` URL pins (the actian-vectorai URL appears only inside the embedded README long-description, which PyPI doesn't validate).
- ruff format --check / ruff check / pytest -q all clean (127 passed, 15 actian e2e skipped).

Tag v1.0.0 will be moved to this commit and re-pushed; v1.0.0 never made it to PyPI so no consumers are affected.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
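The METADATA inspection listed under Verification can be automated with the standard library alone. The sketch below is one possible check, not the procedure the author ran: the function name is hypothetical, and it treats any `Requires-Dist` entry with an `@` before its environment marker as a direct reference.

```python
from __future__ import annotations

import zipfile
from email.parser import Parser


def url_pinned_requirements(wheel) -> list[str]:
    """Return the Requires-Dist entries in a wheel that are direct references.

    `wheel` is a path or a binary file object. Only metadata headers are
    inspected, so URLs inside the long description are ignored.
    """
    with zipfile.ZipFile(wheel) as archive:
        metadata_name = next(
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        text = archive.read(metadata_name).decode("utf-8")

    # Wheel METADATA uses email-style headers followed by the description body.
    headers = Parser().parsestr(text)
    requirements = headers.get_all("Requires-Dist") or []

    # A direct reference has the form "name @ url", optionally followed by
    # "; marker". Ordinary version specifiers never contain "@".
    return [req for req in requirements if "@" in req.split(";", 1)[0]]
```

An empty result for the built wheel corresponds to the "zero `Requires-Dist:` URL pins" outcome reported above, so a check like this could run in CI before the publish step.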
Commit 5b97e4d
You can try running this command locally to see the comparison on your machine:
git diff v0.4.0...v1.0.0