feat: restructure as framework, add feedback loop + ranker + GitHub importer by pathfindermilan · Pull Request #1 · hallelx2/context8

pathfindermilan · 2026-04-17T21:12:12Z

Summary

Restructures Context8 from a flat module layout into a focused subpackage framework, and adds the four production capabilities that turn it from a hackathon demo into a credible submission.

What's new

Framework layout — embeddings/, search/, ingest/, benchmark/, mcp/, cli/commands/ (one responsibility per module)
GitHub Issues importer — context8 import-github vercel/next.js --label bug --max-issues 50
Agent feedback loop — context8_rate MCP tool; worked_ratio feeds the ranker
Per-strategy attribution — every result shows which named vector / sparse path surfaced it, at what rank
Quality ranker — final = retrieval × confidence × recency_decay × worked_ratio (configurable floors)
CLI — bench (Recall@K ablation across 5 configs), demo (4 scripted scenarios)
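The ranker formula above can be sketched roughly as follows. This is a minimal illustration, not the code in search/ranking.py: `RECENCY_HALF_LIFE_DAYS` is named in the config summary, but every numeric value, the other constant names, and the function names here are assumptions.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Assumed values for illustration only; the project's real defaults may differ.
RECENCY_HALF_LIFE_DAYS = 180.0
RECENCY_FLOOR = 0.5
CONFIDENCE_FLOOR = 0.5
FEEDBACK_FLOOR = 0.5
MIN_FEEDBACK_SAMPLES = 3


def recency_decay(last_seen: datetime, now: datetime) -> float:
    """Exponential half-life decay, clamped so old records are never zeroed out."""
    age_days = max((now - last_seen).total_seconds() / 86400.0, 0.0)
    return max(0.5 ** (age_days / RECENCY_HALF_LIFE_DAYS), RECENCY_FLOOR)


def feedback_factor(applied: int, worked: int) -> float:
    """worked_ratio with a floor; neutral (1.0) until enough ratings exist."""
    if applied < MIN_FEEDBACK_SAMPLES:
        return 1.0
    return max(worked / applied, FEEDBACK_FLOOR)


def final_score(retrieval: float, confidence: float, last_seen: datetime,
                applied: int, worked: int, now: datetime) -> float:
    """final = retrieval x confidence x recency_decay x worked_ratio."""
    conf = max(min(confidence, 1.0), CONFIDENCE_FLOOR)
    return retrieval * conf * recency_decay(last_seen, now) * feedback_factor(applied, worked)


now = datetime(2026, 4, 17, tzinfo=timezone.utc)
fresh = final_score(0.8, 1.0, now, applied=0, worked=0, now=now)
stale = final_score(0.8, 1.0, now - timedelta(days=180), applied=4, worked=1, now=now)
```

The floors keep any single weak signal from erasing a strong retrieval match, which is presumably why they are configurable.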

Bug fixes

StorageService.client now calls connect() — fixes 503 from the Actian sync wrapper
context8 doctor asserts hybrid / named / sparse / filter are actually live (no silent degradation)
docker-compose.yml uses fully-qualified docker.io/... image (Podman compatibility)

Test plan

79 unit tests pass (was 29) — added test_ranking, test_attribution, test_github_importer, test_models_extended, test_benchmark
tests/test_e2e.py covers hybrid retrieval, filter isolation, feedback persistence, quality boost (live DB, auto-skips when unreachable)
ruff check src/ tests/ clean
context8 --help shows all 12 commands; context8 doctor reports green on a live DB
Run context8 bench against live DB and paste numbers into RESULTS.md
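For readers unfamiliar with the metrics that `bench` reports, Recall@K and reciprocal rank (averaged into MRR) can be computed as below. This is a generic sketch of the standard definitions with hypothetical function names, not the benchmark runner's actual code.

```python
from __future__ import annotations


def recall_at_k(ranked_ids: list[str], relevant_ids: list[str], k: int) -> float:
    """Fraction of the relevant records that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked_ids: list[str], relevant_ids: list[str]) -> float:
    """1/rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for rank, record_id in enumerate(ranked_ids, start=1):
        if record_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], list[str]]]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```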

Summary by CodeRabbit

Release Notes

New Features
- Added GitHub issue ingestion with import-github command
- Added benchmarking and evaluation with bench and demo commands
- Added per-strategy attribution to search results
- Added feedback rating system (context8_rate) with worked/applied counters
- Added solution-based search capability
- Added quality ranking based on recency, confidence, and user feedback
Documentation
- Updated CLI documentation with new commands and capabilities
- Added comprehensive submission results documentation
Chores
- Reorganized CLI into modular command structure
- Refactored codebase into logical packages for search, ingestion, and benchmarking

…orter
- Reorganize into subpackages: embeddings/, search/, ingest/, benchmark/, mcp/, cli/commands/
- Add `context8 import-github`, `context8_rate` MCP tool, per-strategy attribution, and confidence/recency/worked-ratio quality ranker
- Wire previously-unused solution named vector into search
- Add `bench` (Recall@K ablation) and `demo` CLI commands
- Fix StorageService not calling connect() on the Actian sync wrapper
- Harden `context8 doctor` to verify hybrid/named/sparse are actually live

sourcery-ai

Sorry @pathfindermilan, your pull request is larger than the review limit of 150000 diff characters

coderabbitai · 2026-04-17T21:12:19Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae40c5cd-9b83-476c-9ebf-4b4a6683087f

📥 Commits

Reviewing files that changed from the base of the PR and between 46ed687 and 13a053c.

📒 Files selected for processing (49)

README.md
RESULTS.md
docker-compose.yml
pyproject.toml
src/context8/__init__.py
src/context8/__main__.py
src/context8/agents.py
src/context8/benchmark/__init__.py
src/context8/benchmark/ground_truth.py
src/context8/benchmark/runner.py
src/context8/cli.py
src/context8/cli/__init__.py
src/context8/cli/commands/__init__.py
src/context8/cli/commands/bench.py
src/context8/cli/commands/ingest.py
src/context8/cli/commands/integrations.py
src/context8/cli/commands/lifecycle.py
src/context8/cli/commands/ops.py
src/context8/cli/commands/serve.py
src/context8/cli/main.py
src/context8/cli/ui.py
src/context8/config.py
src/context8/embeddings/__init__.py
src/context8/embeddings/service.py
src/context8/embeddings/tokenizer.py
src/context8/feedback.py
src/context8/ingest/__init__.py
src/context8/ingest/github.py
src/context8/ingest/pipeline.py
src/context8/ingest/seed.py
src/context8/mcp/__init__.py
src/context8/mcp/server.py
src/context8/mcp/tools.py
src/context8/models.py
src/context8/search.py
src/context8/search/__init__.py
src/context8/search/analyzer.py
src/context8/search/attribution.py
src/context8/search/engine.py
src/context8/search/ranking.py
src/context8/storage.py
tests/test_agents.py
tests/test_attribution.py
tests/test_benchmark.py
tests/test_e2e.py
tests/test_embeddings.py
tests/test_github_importer.py
tests/test_models_extended.py
tests/test_ranking.py

📝 Walkthrough


Context8 is reorganized from a flat structure into modular packages: CLI commands are split into separate files under cli/commands/, search logic is refactored into search/ submodules with new ranking and attribution tracking, and ingest operations are consolidated into ingest/ with a GitHub importer. The PR adds a feedback rating loop, quality ranking with configurable boosts, per-strategy attribution in results, a GitHub issue ingestion pipeline, and benchmark/evaluation artifacts. It also expands test coverage significantly with e2e, benchmark, and component-specific test modules.

Changes

Cohort / File(s)	Summary
Documentation & Configuration `README.md`, `RESULTS.md`, `docker-compose.yml`	Added RESULTS.md for submission documentation, expanded README with feature descriptions and benchmark instructions, updated Docker image registry reference.
Version & Package Setup `src/context8/__init__.py`, `src/context8/__main__.py`	Bumped version 0.1.1 → 0.2.0, added `from __future__ import annotations`, simplified relative imports.
Lint Configuration `pyproject.toml`	Updated Ruff ignore rules: relocated E501 exemptions to reflect new seed/tools file locations.
CLI Core Reorganization `src/context8/cli.py` (removed), `src/context8/cli/__init__.py`, `src/context8/cli/main.py`, `src/context8/cli/ui.py`	Refactored monolithic CLI into modular package: removed 542-line cli.py, created main.py entrypoint with Click group, added ui.py helpers for Docker/SDK/DB checks, reorganized command handling.
CLI Commands - Lifecycle `src/context8/cli/commands/__init__.py`, `src/context8/cli/commands/lifecycle.py`	Created lifecycle command module: `start`, `stop`, `init` with Docker orchestration, database initialization, and optional seeding.
CLI Commands - Operations `src/context8/cli/commands/ops.py`	Added health/status commands: `stats` (knowledge base metrics), `doctor` (comprehensive health checks including vector/filter support), `search` (interactive retrieval with attribution display).
CLI Commands - Integration & Ingestion `src/context8/cli/commands/integrations.py`, `src/context8/cli/commands/ingest.py`	New MCP agent integration (`add`/`remove` commands) and GitHub issue ingestion (`import-github`) with label filtering, state control, and resolution handling.
CLI Commands - Benchmarking & Server `src/context8/cli/commands/bench.py`, `src/context8/cli/commands/serve.py`	Added `bench` command with configuration comparison and recall/MRR reporting, `demo` command with scripted scenarios, and `serve` entrypoint for MCP stdio server.
Agent Management `src/context8/agents.py`	Removed docstrings, refactored conditional logic to eliminate redundant else blocks, maintained JSON/agent integration functionality.
Config & Constants `src/context8/config.py`	Added ranking-related constants (RECENCY_HALF_LIFE_DAYS, boost floors, min feedback samples), new `continue_config_path()`, updated MCP server path to `context8.mcp.server`.
Models - Feedback & Attribution `src/context8/models.py`	Extended with `FeedbackStats` (applied/worked counters), updated `ResolutionRecord` with feedback field, added `Attribution`/`StrategyContribution` for ranking source tracking, expanded `SearchResult` with `raw_score` and `boost_factors`.
Embeddings Package Split `src/context8/embeddings/__init__.py`, `src/context8/embeddings/service.py`, `src/context8/embeddings/tokenizer.py`	Extracted sparse tokenization logic into new `BM25Tokenizer` class, delegated `embed_sparse()` to tokenizer, created embeddings package API surface.
Feedback & Rating Loop `src/context8/feedback.py`	New service for feedback persistence: `FeedbackService.rate()` increments applied/worked counters, updates record tags/timestamps, and re-embeds for storage.
Ingest Package Refactoring `src/context8/ingest/__init__.py`, `src/context8/ingest/seed.py`, `src/context8/ingest/pipeline.py`, `src/context8/ingest/github.py`	Reorganized seeding with UUID-based slug-to-ID mapping, created `IngestPipeline` for batch processing with stats tracking, implemented `GitHubIssueImporter` for authenticated fetching/conversion with label/state/resolution filters.
Search Engine Refactoring `src/context8/search.py` (removed), `src/context8/search/__init__.py`, `src/context8/search/engine.py`, `src/context8/search/analyzer.py`, `src/context8/search/attribution.py`, `src/context8/search/ranking.py`	Decomposed monolithic 295-line search.py into modular package: engine handles hybrid/dense/sparse fusion with quality boosting, analyzer determines strategy weights, attribution tracks per-strategy contributions, ranking applies confidence/recency/feedback boosts.
Storage Service Enhancements `src/context8/storage.py`	Added `update_record()` for feedback persistence, enhanced `get_collection_info()` to discover named/sparse vectors dynamically with metadata flags, ensured client connection initialization.
MCP Server Restructuring `src/context8/mcp/__init__.py`, `src/context8/mcp/server.py`, `src/context8/mcp/tools.py`	Moved server bootstrap to dedicated module, separated concerns: server.py handles stdio/MCP setup, tools.py provides tool list/dispatch, added feedback rating tool (`context8_rate`) and solution search tool (`context8_search_solutions`).
Benchmark & Evaluation `src/context8/benchmark/__init__.py`, `src/context8/benchmark/ground_truth.py`, `src/context8/benchmark/runner.py`	New evaluation framework: ground truth dataset with 50+ problem/solution pairs, configurable benchmark runner testing named/sparse/filter/quality-boost combinations with recall@K/MRR/latency metrics.
Test Suite - Core Models `tests/test_models_extended.py`	New comprehensive model tests: FeedbackStats serialization, Attribution selection logic, SearchResult defaults.
Test Suite - Search Components `tests/test_attribution.py`, `tests/test_ranking.py`, `tests/test_embeddings.py`	New attribution tracker tests validating multi-strategy contribution tracking, ranking tests verifying confidence/recency/feedback boost application, refactored tokenizer tests to use public BM25Tokenizer API.
Test Suite - Ingestion `tests/test_github_importer.py`	New GitHub importer tests: slug-to-ID determinism, error/code/language/framework detection, resolution inference, record construction.
Test Suite - Benchmark & E2E `tests/test_benchmark.py`, `tests/test_e2e.py`	New benchmark validation tests (ground truth completeness, metric correctness across scenarios), comprehensive e2e integration tests covering collection shape, hybrid/sparse/filtered/named-vector search, feedback loop, attribution, quality boosting.
Test Suite - Configuration `tests/test_agents.py`	Updated MCP server path assertion from `context8.server` to `context8.mcp.server`.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as CLI (import-github)
    participant GitHub as GitHub API
    participant Importer as GitHubIssueImporter
    participant Embeddings as EmbeddingService
    participant Pipeline as IngestPipeline
    participant Storage as StorageService
    
    User->>CLI: context8 import-github owner/repo --label bug
    CLI->>CLI: check_db_connection()
    CLI->>Importer: fetch(repo, labels, max_issues, state)
    Importer->>GitHub: GET /repos/owner/repo/issues?labels=bug
    GitHub-->>Importer: issues + comments
    Importer-->>CLI: FetchResult
    CLI->>Importer: to_records(FetchResult, require_resolution)
    Importer-->>CLI: list[ResolutionRecord]
    CLI->>Pipeline: ingest(records, skip_existing=True)
    Pipeline->>Storage: get_record(id) [check duplicates]
    Storage-->>Pipeline: existing or None
    Pipeline->>Embeddings: embed_record(problem, solution, code)
    Embeddings-->>Pipeline: vectors dict
    Pipeline->>Storage: store_record(record, vectors)
    Storage-->>Pipeline: record_id
    Pipeline-->>CLI: IngestStats(attempted=N, stored=M, duplicates=X)
    CLI->>User: Display summary table
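The duplicate check in the ingestion flow above depends on record IDs being deterministic. A minimal sketch of that idea follows, assuming IDs are derived from a slug with `uuid.uuid5` and using a plain dict in place of StorageService. The namespace URL and function names are hypothetical; only the `IngestStats` field names come from the diagram.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass

# Hypothetical namespace; any fixed UUID gives stable IDs across runs.
SKETCH_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/context8-sketch")


def slug_to_id(slug: str) -> str:
    """Same slug always maps to the same UUID, so re-imports can be detected."""
    return str(uuid.uuid5(SKETCH_NAMESPACE, slug))


@dataclass
class IngestStats:
    attempted: int = 0
    stored: int = 0
    duplicates: int = 0


def ingest(records: dict[str, str], store: dict[str, str], skip_existing: bool = True) -> IngestStats:
    """Store each record unless its deterministic ID is already present."""
    stats = IngestStats()
    for slug, body in records.items():
        stats.attempted += 1
        record_id = slug_to_id(slug)
        if skip_existing and record_id in store:
            stats.duplicates += 1
            continue
        store[record_id] = body
        stats.stored += 1
    return stats


store: dict[str, str] = {}
first = ingest({"vercel/next.js#1": "fix A", "vercel/next.js#2": "fix B"}, store)
second = ingest({"vercel/next.js#1": "fix A", "vercel/next.js#3": "fix C"}, store)
```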

sequenceDiagram
    participant User
    participant SearchEngine as SearchEngine
    participant Embeddings as EmbeddingService
    participant Attribution as AttributionTracker
    participant Ranking as QualityRanker
    participant CLI as CLI (search)
    
    User->>CLI: context8 search "error message"
    CLI->>SearchEngine: search(query, language=..., limit=5)
    SearchEngine->>Embeddings: embed_query("error message")
    Embeddings-->>SearchEngine: dense vector
    SearchEngine->>SearchEngine: _search_named("problem", vector, filter, 5)
    SearchEngine-->>SearchEngine: problem_results[]
    SearchEngine->>SearchEngine: _search_sparse(sparse_vector, filter, 5)
    SearchEngine-->>SearchEngine: sparse_results[]
    SearchEngine->>SearchEngine: RRF fusion(problem_results, sparse_results)
    SearchEngine-->>SearchEngine: fused_results[]
    SearchEngine->>Attribution: record(strategy="dense", results)
    SearchEngine->>Attribution: record(strategy="sparse", results)
    SearchEngine->>Ranking: boost(fused_results)
    Ranking->>Ranking: Apply confidence/recency/feedback factors
    Ranking-->>SearchEngine: boosted_results[]
    SearchEngine->>Attribution: build_for(record_id) for each result
    SearchEngine-->>CLI: list[SearchResult with attribution]
    CLI->>User: Display results with source tracking
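The fusion and attribution steps in the search flow above can be illustrated with weighted Reciprocal Rank Fusion. This is a sketch under assumptions: the constant `k = 60` is the conventional RRF default rather than a value confirmed by this PR, and the function name and result shape are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict

RRF_K = 60  # conventional RRF constant; assumed, not taken from the PR


def rrf_fuse(rankings: dict[str, list[str]],
             weights: dict[str, float] | None = None,
             k: int = RRF_K) -> list[dict]:
    """Fuse per-strategy rankings and remember which strategy surfaced each record."""
    weights = weights or {}
    scores: dict[str, float] = defaultdict(float)
    surfaced_by: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for strategy, ids in rankings.items():
        weight = weights.get(strategy, 1.0)
        for rank, record_id in enumerate(ids, start=1):
            scores[record_id] += weight / (k + rank)
            surfaced_by[record_id].append((strategy, rank))
    # Sort by descending score; break ties by ID so output is deterministic.
    ordered = sorted(scores, key=lambda rid: (-scores[rid], rid))
    return [{"id": rid, "score": scores[rid], "attribution": surfaced_by[rid]} for rid in ordered]


fused = rrf_fuse({"problem": ["a", "b", "c"], "sparse": ["b", "d"]})
```

A record found by both strategies (here `b`) accumulates two contributions, so it outranks records found by only one, and its attribution lists both paths with their ranks.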

sequenceDiagram
    participant User
    participant CLI as CLI (rate tool)
    participant MCP as MCP Handler
    participant FeedbackService as FeedbackService
    participant Storage as StorageService
    participant Embeddings as EmbeddingService
    
    User->>MCP: Call context8_rate(record_id, worked=true)
    MCP->>FeedbackService: rate(record_id, worked=true)
    FeedbackService->>Storage: get_record(record_id)
    Storage-->>FeedbackService: ResolutionRecord
    FeedbackService->>FeedbackService: Increment applied_count
    FeedbackService->>FeedbackService: Increment worked_count (if worked=true)
    FeedbackService->>FeedbackService: Update last_seen timestamp
    FeedbackService->>Embeddings: embed_record(updated record)
    Embeddings-->>FeedbackService: vectors dict
    FeedbackService->>Storage: update_record(record, vectors)
    Storage-->>FeedbackService: success
    FeedbackService-->>MCP: FeedbackOutcome(accepted=true, worked_ratio=0.5)
    MCP-->>User: "Feedback recorded: 1/2 worked"
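The counter arithmetic behind the "1/2 worked" message above is small enough to sketch directly. The class below only mirrors the applied/worked counters described in the walkthrough; the field and function names are assumptions, and persistence and re-embedding are left out.

```python
from dataclasses import dataclass


@dataclass
class FeedbackStats:
    applied_count: int = 0
    worked_count: int = 0

    @property
    def worked_ratio(self) -> float:
        """Share of applications that worked; 0.0 before any rating exists."""
        if self.applied_count == 0:
            return 0.0
        return self.worked_count / self.applied_count


def rate(stats: FeedbackStats, worked: bool) -> str:
    """Record one rating and return a human-readable summary."""
    stats.applied_count += 1
    if worked:
        stats.worked_count += 1
    return f"Feedback recorded: {stats.worked_count}/{stats.applied_count} worked"


stats = FeedbackStats()
rate(stats, worked=False)
message = rate(stats, worked=True)
```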

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

A rabbit hops through refactored ground,
Where modular packages abound!
🐰 New feedback loops keep records keen,
Rankings bloom with signals unseen,
GitHub seeds the knowledge base bright,
Attribution shines—a guiding light! ✨


pathfindermilan · 2026-04-17T21:13:09Z

docs: add ROADMAP — DO NOT MERGE; pluggable ingestion sources next

Flags branch as not-ready-to-merge and lists the highest-leverage
follow-ups, chiefly: refactor ingest/ into a pluggable sources/ registry
so users can context8 import <source> from agent transcripts, git logs,
shell history, Stack Overflow, URLs, JSONL, and markdown — not just GitHub.

sourcery-ai

Sorry @hallelx2, your pull request is larger than the review limit of 150000 diff characters

Copilot

Pull request overview

This PR restructures Context8 into a capability-oriented framework package layout and adds production features around ingestion, ranking, attribution, benchmarking, and MCP/CLI operations.

Changes:

Replaces the previous flat module layout (search.py, cli.py, etc.) with subpackages (search/, ingest/, benchmark/, mcp/, cli/commands/, embeddings/).
Adds GitHub Issues ingestion, an agent feedback loop, per-strategy attribution, and a quality re-ranker.
Adds a benchmark harness + expanded unit/e2e tests and updates docs/artifacts (RESULTS.md, README.md).

Reviewed changes

Copilot reviewed 49 out of 49 changed files in this pull request and generated 5 comments.

File	Description
tests/test_ranking.py	Adds unit tests for quality ranker factors and boosting behavior
tests/test_models_extended.py	Adds unit tests for feedback + attribution model behavior
tests/test_github_importer.py	Adds unit tests for GitHub importer parsing/detection helpers
tests/test_embeddings.py	Updates tests to target BM25Tokenizer + QueryAnalyzer in new layout
tests/test_e2e.py	Adds live-DB end-to-end coverage for hybrid/filter/feedback/boosting
tests/test_benchmark.py	Adds unit tests for benchmark math + ground-truth integrity
tests/test_attribution.py	Adds unit tests for attribution tracking logic
tests/test_agents.py	Updates MCP entrypoint assertion after module move
src/context8/storage.py	Connect-on-demand client + collection introspection + update_record
src/context8/search/ranking.py	Implements confidence/recency/feedback-based score boosting
src/context8/search/engine.py	New hybrid search engine with attribution + quality boost hooks
src/context8/search/attribution.py	Tracks per-strategy rank/score contributions for results
src/context8/search/analyzer.py	QueryAnalyzer extracted to its own module
src/context8/search/__init__.py	Exposes new search package surface
src/context8/search.py	Removes legacy monolithic search module
src/context8/models.py	Adds feedback stats + attribution + raw_score/boost_factors
src/context8/mcp/tools.py	Adds MCP tools for rating + solution-approach search + formatting
src/context8/mcp/server.py	New MCP server entrypoint wrapping tools module
src/context8/mcp/__init__.py	Exposes MCP app/run_server API
src/context8/ingest/seed.py	Adds deterministic seed slugs + routes seeding via ingest pipeline
src/context8/ingest/pipeline.py	Adds generic ingest pipeline + ingest stats
src/context8/ingest/github.py	Adds GitHub issue importer and extraction heuristics
src/context8/ingest/__init__.py	Exposes ingest package API (seed/pipeline/importer)
src/context8/feedback.py	Adds FeedbackService to persist agent success/failure ratings
src/context8/embeddings/tokenizer.py	Extracts BM25 tokenizer used for sparse vectors
src/context8/embeddings/service.py	Refactors embeddings to use BM25Tokenizer and new package layout
src/context8/embeddings/__init__.py	Exposes embeddings package surface
src/context8/config.py	Adds ranker tuning constants + updates MCP server command path
src/context8/cli/ui.py	Adds shared CLI helpers (docker compose selection, DB checks)
src/context8/cli/main.py	New Click CLI group wiring commands from `cli/commands/`
src/context8/cli/commands/serve.py	Adds `context8 serve` command to run MCP server
src/context8/cli/commands/ops.py	Adds/updates stats/doctor/search CLI operations
src/context8/cli/commands/lifecycle.py	Adds start/stop/init commands for DB lifecycle + seeding
src/context8/cli/commands/integrations.py	Adds agent integration commands (add/remove) with aliases
src/context8/cli/commands/ingest.py	Adds `import-github` CLI command to ingest GitHub issues
src/context8/cli/commands/bench.py	Adds `bench` and `demo` CLI commands
src/context8/cli/commands/__init__.py	Exports CLI commands for main registration
src/context8/cli/__init__.py	Exposes CLI main entrypoint
src/context8/cli.py	Removes legacy monolithic CLI module
src/context8/benchmark/runner.py	Adds benchmark runner with ablation configurations
src/context8/benchmark/ground_truth.py	Adds ground-truth query set for benchmark evaluation
src/context8/benchmark/__init__.py	Exposes benchmark package surface
src/context8/agents.py	Updates agent config writer to new MCP module command
src/context8/__main__.py	Updates module entry to run new CLI main
src/context8/__init__.py	Bumps version to 0.2.0
pyproject.toml	Updates ruff per-file ignores to new file paths
docker-compose.yml	Switches to fully-qualified docker.io image reference
RESULTS.md	Adds submission results template + reproduction steps
README.md	Updates docs for new capabilities, commands, and layout

Copilot · 2026-04-17T21:53:33Z

+            named_vectors = self._discover_named_vectors(info)
+            sparse_vectors = self._discover_sparse_vectors(info)
            return {
                "status": str(getattr(info, "status", "unknown")),
                "points": getattr(info, "points_count", 0),
-                "vectors": ["problem", "solution", "code_context"],
+                "vectors": named_vectors or ["problem", "solution", "code_context"],
+                "named_vector_count": len(named_vectors),
+                "sparse_vectors": sparse_vectors,
+                "sparse_supported": bool(sparse_vectors),
+                "hybrid_enabled": len(named_vectors) >= 2 and bool(sparse_vectors),


get_collection_info() returns a fallback list of vector names when discovery fails, but named_vector_count is still len(named_vectors) (0). Downstream checks like context8 doctor treat named_vector_count < 3 as a hard failure even though vectors reports the fallback 3. Consider making named_vector_count consistent with the vectors you return (or add a separate flag like named_vectors_discovered).
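One way to follow the second option in this comment is to leave the count as "what was actually discovered" and add an explicit flag. The helper below is a hypothetical, standalone sketch, not part of StorageService; `named_vectors_discovered` is the flag name floated in the comment.

```python
from __future__ import annotations

FALLBACK_VECTORS = ["problem", "solution", "code_context"]


def summarize_vectors(discovered_named: list[str], discovered_sparse: list[str]) -> dict:
    """Build the vector-related part of the collection info dict."""
    named = list(discovered_named)
    sparse = list(discovered_sparse)
    return {
        # Display value: falls back to the expected names when discovery fails.
        "vectors": named or list(FALLBACK_VECTORS),
        # Count and flag describe only what was really discovered.
        "named_vector_count": len(named),
        "named_vectors_discovered": bool(named),
        "sparse_vectors": sparse,
        "sparse_supported": bool(sparse),
        "hybrid_enabled": len(named) >= 2 and bool(sparse),
    }


failed_discovery = summarize_vectors([], [])
live_collection = summarize_vectors(["problem", "solution", "code_context"], ["keywords"])
```

A health check can then tell "discovery failed" apart from "collection has too few named vectors" instead of treating both as the same hard failure.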

Copilot · 2026-04-17T21:53:34Z

+    def __init__(
+        self,
+        storage: StorageService,
+        embeddings: EmbeddingService,
+        ranker: QualityRanker | None = None,
+        dense_weight: float = DEFAULT_DENSE_WEIGHT,
+        code_weight: float = DEFAULT_CODE_WEIGHT,
+        sparse_weight: float = DEFAULT_SPARSE_WEIGHT,
+    ):
+        self.storage = storage
+        self.embeddings = embeddings
+        self.ranker = ranker or QualityRanker()
+        self.dense_weight = dense_weight
+        self.code_weight = code_weight
+        self.sparse_weight = sparse_weight


SearchEngine accepts dense_weight/code_weight/sparse_weight and stores them, but they are not used when building fusion_weights (QueryAnalyzer weights are used directly). This makes the constructor parameters misleading and prevents config-level tuning. Either incorporate these weights into fusion_weights (e.g., multiply or use as defaults when QueryAnalyzer doesn't apply) or remove the parameters/fields.
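The "multiply" option mentioned here could look like the sketch below, where the constructor weights scale whatever the analyzer proposes per strategy. The strategy keys (`problem`, `code_context`, `sparse`) and the function name are assumptions for illustration; the PR does not show how fusion_weights is keyed.

```python
from __future__ import annotations


def build_fusion_weights(analyzer_weights: dict[str, float],
                         dense_weight: float = 1.0,
                         code_weight: float = 1.0,
                         sparse_weight: float = 1.0) -> dict[str, float]:
    """Scale per-query analyzer weights by the engine-level configured weights."""
    configured = {
        "problem": dense_weight,
        "code_context": code_weight,
        "sparse": sparse_weight,
    }
    # A strategy the analyzer does not mention keeps a neutral weight of 1.0.
    return {name: analyzer_weights.get(name, 1.0) * cfg for name, cfg in configured.items()}


weights = build_fusion_weights({"problem": 0.5, "sparse": 2.0}, dense_weight=2.0, sparse_weight=0.5)
```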

Copilot · 2026-04-17T21:53:34Z

+    table.add_row("Status", "[green]HEALTHY[/]")
+
+    if collection_info:
+        table.add_row("Vector spaces", ", ".join(collection_info.get("vectors", [])))
+        table.add_row("Status", collection_info.get("status", "unknown"))


The stats command adds a hard-coded "Status: HEALTHY" row and then (when collection_info is present) adds another "Status" row from the collection metadata. This produces duplicate/conflicting metrics in the output; consider renaming one (e.g., "DB health" vs "Collection status") or removing the hard-coded row.

Suggested change

-    table.add_row("Status", "[green]HEALTHY[/]")
-
-    if collection_info:
-        table.add_row("Vector spaces", ", ".join(collection_info.get("vectors", [])))
-        table.add_row("Status", collection_info.get("status", "unknown"))
+    table.add_row("DB health", "[green]HEALTHY[/]")
+
+    if collection_info:
+        table.add_row("Vector spaces", ", ".join(collection_info.get("vectors", [])))
+        table.add_row("Collection status", collection_info.get("status", "unknown"))

Copilot · 2026-04-17T21:53:34Z

+        for token, freq in sorted(term_freqs.items()):
+            idx = abs(hash(token)) % self.vocab_size
+            weight = freq / (freq + 1.0)


BM25Tokenizer.encode() derives sparse indices via Python's built-in hash(), which is salted per process (PYTHONHASHSEED). That makes stored sparse vectors and query-time sparse vectors inconsistent across restarts/processes, effectively breaking sparse retrieval and hybrid fusion.
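A process-independent replacement for `hash()` is a cryptographic digest reduced modulo the vocabulary size. The sketch below uses `hashlib.blake2b` from the standard library; the function names and the default `vocab_size` are assumptions, and the weighting line is copied from the snippet above.

```python
from __future__ import annotations

import hashlib


def stable_token_index(token: str, vocab_size: int) -> int:
    """Map a token to a bucket that is identical across processes and restarts."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % vocab_size


def encode(tokens: list[str], vocab_size: int = 30000) -> tuple[list[int], list[float]]:
    """Build a sparse (indices, values) pair with stable indices."""
    term_freqs: dict[str, int] = {}
    for token in tokens:
        term_freqs[token] = term_freqs.get(token, 0) + 1

    weights: dict[int, float] = {}
    for token, freq in sorted(term_freqs.items()):
        idx = stable_token_index(token, vocab_size)
        # Tokens that collide in the same bucket have their weights summed.
        weights[idx] = weights.get(idx, 0.0) + freq / (freq + 1.0)

    indices = sorted(weights)
    return indices, [weights[i] for i in indices]


indices, values = encode(["timeout", "timeout"], vocab_size=1000)
```

Changing the hash function changes every index, so vectors stored with the old scheme would need to be re-embedded.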

Copilot · 2026-04-17T21:53:34Z

    @property
    def sparse_supported(self) -> bool:
-        """Check if the collection supports sparse vectors."""
        if self._sparse_supported is None:
            try:
                self.client.collections.get_info(COLLECTION_NAME)
-                self._sparse_supported = False  # Safe default
+                self._sparse_supported = False
            except Exception:
                self._sparse_supported = False
        return self._sparse_supported


StorageService.sparse_supported always resolves to False when _sparse_supported is None (even if the existing collection supports sparse vectors). This will disable sparse search paths on fresh processes that connect to a pre-existing hybrid collection (e.g., when initialize() returns False because the collection already exists). Consider introspecting collection info and setting _sparse_supported based on discovered sparse vector config instead of hard-coding False.
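The introspection suggested here might be shaped like the following sketch. It assumes a callable that returns a collection info dict with a `sparse_vectors` key, as in the `get_collection_info()` diff earlier in this review. The class itself is hypothetical and is decoupled from the real client so it can run standalone.

```python
from __future__ import annotations

from typing import Callable


class SparseSupportProbe:
    """Lazily determines sparse support from collection metadata."""

    def __init__(self, fetch_info: Callable[[], dict]):
        self._fetch_info = fetch_info
        self._sparse_supported: bool | None = None

    @property
    def sparse_supported(self) -> bool:
        if self._sparse_supported is None:
            try:
                info = self._fetch_info()
            except Exception:
                # Report False but do not cache, so a later call can retry.
                return False
            self._sparse_supported = bool(info.get("sparse_vectors"))
        return self._sparse_supported


hybrid = SparseSupportProbe(lambda: {"sparse_vectors": ["keywords"]})
dense_only = SparseSupportProbe(lambda: {"sparse_vectors": []})


def unreachable() -> dict:
    raise ConnectionError("database not reachable")


offline = SparseSupportProbe(unreachable)
```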

README rewritten around the new SQLite-first install (single command: pip install context8 && context8 init --seed && context8 add claude-code). The Actian path is preserved as an "Optional: Actian VectorAI DB backend" section with the hackathon-era install. The "Hackathon: Advanced Features Used" section becomes "Capabilities (and how each backend delivers them)": the same three capabilities (named vectors, hybrid fusion, filtered search) framed as backend-portable. Architecture diagram redrawn: pluggable Protocol fanning out to two concrete backends (SQLite vec0+FTS5 below, Actian gRPC container right). Tech-stack table promotes sqlite-vec and FTS5 to primary, demotes Actian to optional. Project-structure tree updated to reflect the new storage/ package and search/fusion.py. v0.5.0 changelog entry added.

CLAUDE.md fully rewritten: new project overview, structure, key design decisions (#1 is now "pluggable storage backend"), commands, plus distinct SQLite Backend Notes / Actian Backend Notes sections.

RESULTS.md kept as the hackathon submission narrative but reframed: the Actian-feature table is rewritten as a backend-portability table (SQLite delivers each capability via vec0/FTS5/SQL+JSON1; Actian delivers them via named vectors / sparse vectors / FilterBuilder). Benchmark section now has placeholders for both backends so the ablation can be run side-by-side.

All twelve docs/*.md design docs (CONCEPT, ARCHITECTURE, BOTTLENECKS, PLAN-01..08, Hackathon Demo Video — Script) prepended with a historical-artifact banner pointing readers to README.md / CLAUDE.md. The hackathon design narrative is preserved in place; it just stops being the canonical source for the current architecture.

tests/test_e2e.py:
- pytestmark adds pytest.mark.actian + a new skipif on CONTEXT8_BACKEND != "actian", so the Actian e2e suite skips cleanly under the default SQLite install (15 tests skip).
- Fixed pre-existing FeedbackService(storage, embeddings) arity mismatch on lines 281 and 300: production constructor takes storage only (mcp/tools.py:44 confirms). Drop the second arg.
- isolated_collection fixture now patches actian_backend.COLLECTION_NAME (the captured-at-import-time copy) rather than the package module attribute; the module's attribute is no longer load-time-bound to COLLECTION_NAME after the storage package split.

Verification: 127 passed, 15 actian e2e skipped, ruff clean, context8 init/doctor/stats/search/bench/export/import all green under SQLite.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
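The fixture change described above comes down to a general Python rule: `from module import NAME` creates a separate binding in the importing module, so a patch must target the module that actually reads the name. The sketch below reproduces this with two throwaway modules. The module names `config` and `backend` are stand-ins, not the project's real layout.

```python
import types
from unittest import mock

# Stand-in for the module that defines the constant.
config = types.ModuleType("config")
config.COLLECTION_NAME = "prod"

# Stand-in for a backend that did `from config import COLLECTION_NAME`:
# it holds its own binding, captured at import time.
backend = types.ModuleType("backend")
backend.COLLECTION_NAME = config.COLLECTION_NAME
exec("def collection():\n    return COLLECTION_NAME\n", backend.__dict__)

# Patching the defining module does not affect the backend's copy.
with mock.patch.object(config, "COLLECTION_NAME", "test"):
    seen_when_patching_config = backend.collection()

# Patching the backend's own binding does.
with mock.patch.object(backend, "COLLECTION_NAME", "test"):
    seen_when_patching_backend = backend.collection()

seen_after_patches = backend.collection()
```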

pathfindermilan requested a review from hallelx2 April 17, 2026 21:12

pathfindermilan assigned hallelx2 Apr 17, 2026

sourcery-ai Bot reviewed Apr 17, 2026


style: ruff format pass

13a053c

hallelx2 marked this pull request as ready for review April 17, 2026 21:50

Copilot AI review requested due to automatic review settings April 17, 2026 21:50

sourcery-ai Bot reviewed Apr 17, 2026


hallelx2 merged commit 79519cf into main Apr 17, 2026
6 of 7 checks passed

Copilot started reviewing on behalf of hallelx2 April 17, 2026 21:50

Copilot AI reviewed Apr 17, 2026


pathfindermilan deleted the feat/framework-restructure branch April 25, 2026 14:42
