Overview
Create OPTIONAL end-to-end tests that validate the entire vector search workflow against a real Ollama embeddings service.
Priority: LOW (optional enhancement)
Estimated Time: Week 4
Depends On: #528 (Phase 1)
Context
Note: Phases 1 and 2 already provide comprehensive integration testing with Testcontainers and real ArcadeDB. Phase 3 adds OPTIONAL E2E tests with real Ollama for complete system validation.
These tests are:
- ⚠️ Slow (~60s per test due to Ollama model loading)
- ⚠️ Large (requires pulling ~400MB Ollama image)
- ⚠️ Optional (most testing covered by Phase 1+2)
When to Use
Use E2E tests with real Ollama when you need to:
- Validate actual semantic similarity (not fake embeddings)
- Verify real embedding dimensions from Ollama
- Test full system integration before production release
- Catch model-specific issues
Tasks
3.1 Create VectorSearchE2ETest
File: src/test/java/it/robfrank/linklift/integration/VectorSearchE2ETest.java
Setup:
- Use ArcadeDbContainer from Phase 1
- Add GenericContainer for Ollama service
- Pull nomic-embed-text model in @BeforeAll
- Create REAL OllamaEmbeddingAdapter (not FakeEmbeddingGenerator)
- Wire up real services: BackfillEmbeddingsService, SearchContentService
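As a rough illustration of what "real" means in the setup above, the sketch below builds the kind of HTTP request an embedding adapter might send to the Ollama container. It assumes Ollama's `/api/embeddings` endpoint with `model` and `prompt` fields and the default container port 11434. `OllamaRequests` and its methods are hypothetical names for this example and are not taken from `OllamaEmbeddingAdapter`, whose actual implementation may differ. In a Testcontainers test, the host and port would typically come from the container's `getHost()` and `getMappedPort(11434)`.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper for illustration only; not part of the project.
final class OllamaRequests {

    // Ollama's default API port inside the container. Testcontainers maps it
    // to a random host port, so tests pass the mapped port in explicitly.
    static final int OLLAMA_PORT = 11434;

    // Builds (but does not send) a POST to Ollama's embeddings endpoint.
    static HttpRequest embeddingRequest(String host, int mappedPort, String model, String text) {
        String body = "{\"model\":\"" + escape(model) + "\",\"prompt\":\"" + escape(text) + "\"}";
        return HttpRequest.newBuilder()
            .uri(URI.create("http://" + host + ":" + mappedPort + "/api/embeddings"))
            // Generous timeout: the first call may wait for the model to load.
            .timeout(Duration.ofSeconds(120))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    // Minimal JSON string escaping (backslash and double quote only);
    // a real adapter would use a JSON library instead.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        HttpRequest req = embeddingRequest("localhost", 32768, "nomic-embed-text", "hello");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

The long timeout reflects the slow first request noted in this issue; the exact value is a guess and should be tuned against real runs.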
Test Cases:
- Test: endToEnd_realEmbeddings_shouldFindSimilarContent
  - Save 3 content items (2 AI-related, 1 cooking-related)
  - Run backfill with REAL Ollama embeddings
  - Search for "artificial intelligence and AI"
  - Verify AI content returned, not cooking content
  - Goal: Validate actual semantic similarity
- Test: endToEnd_realEmbeddings_shouldValidateDimensions
  - Save content
  - Run backfill with REAL Ollama
  - Retrieve content and verify embedding has correct dimensions
  - Goal: Catch dimension mismatches with real model
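The two checks these test cases rely on can be sketched in plain Java: cosine similarity, which is one common way to compare embeddings, and a dimension check. `EmbeddingChecks` is a hypothetical helper, not project code. The value 768 is the assumed output size of nomic-embed-text and should be confirmed against the model that is actually pulled. The tiny vectors in `main` are made-up stand-ins for real embeddings.

```java
// Hypothetical helper for illustration only; not part of the project.
final class EmbeddingChecks {

    // Assumed embedding size for nomic-embed-text; verify against the real model.
    static final int EXPECTED_DIMENSIONS = 768;

    // Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // True when the embedding exists and has exactly the expected length.
    static boolean hasExpectedDimensions(float[] embedding, int expected) {
        return embedding != null && embedding.length == expected;
    }

    public static void main(String[] args) {
        // Toy 3-dimensional vectors standing in for real embeddings.
        float[] query = {1f, 0f, 0f};
        float[] aiDoc = {0.9f, 0.1f, 0f};
        float[] cookingDoc = {0f, 0.2f, 0.9f};
        boolean aiRanksHigher =
            cosineSimilarity(query, aiDoc) > cosineSimilarity(query, cookingDoc);
        System.out.println("AI document ranks above cooking document: " + aiRanksHigher);
    }
}
```

In the real tests the ranking would come from SearchContentService rather than a hand-rolled comparison; the sketch only shows what "AI content returned, not cooking content" means numerically.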
3.2 Configure for Optional Execution
Make tests optional using Maven profile:
<profile>
  <id>e2e-tests</id>
  <properties>
    <skipE2ETests>false</skipE2ETests>
  </properties>
</profile>
Default: Skip E2E tests in normal builds
CI: Run E2E tests on main branch only (not PRs)
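One way the tests themselves could honor the `skipE2ETests` property is a small gate that treats a missing property as "skip". This is a sketch under the assumption that the Maven property is forwarded to the test JVM as a system property (for example through Surefire's `systemPropertyVariables`). `E2EGate` is a hypothetical name, not existing project code.

```java
// Hypothetical helper for illustration only; not part of the project.
final class E2EGate {

    static final String SKIP_PROPERTY = "skipE2ETests";

    // E2E tests run only when the property is explicitly set to "false".
    // An absent property defaults to "true", so normal builds skip them.
    static boolean e2eEnabled() {
        return !Boolean.parseBoolean(System.getProperty(SKIP_PROPERTY, "true"));
    }

    public static void main(String[] args) {
        System.out.println("E2E tests enabled: " + e2eEnabled());
    }
}
```

In a JUnit 5 test class, a call such as `Assumptions.assumeTrue(E2EGate.e2eEnabled())` before container startup would mark the tests as skipped rather than failed, and avoids pulling the Ollama image when the profile is not active.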
3.3 Documentation
- Document Docker requirements
- Document how to run E2E tests: mvn test -Pe2e-tests
- Add troubleshooting guide for Ollama container issues
- Document expected execution time (~2-3 minutes)
3.4 Validation
- Both E2E tests pass with real Ollama
- Tests skip by default (don't slow down normal development)
- CI runs E2E tests on main branch
- Documentation is clear
Trade-offs
Pros:
- ✅ Full system validation with real embeddings
- ✅ Catches dimension mismatches with actual model
- ✅ Validates real semantic similarity
Cons:
- ⚠️ Very slow (~60s per test)
- ⚠️ Requires ~400MB Ollama image download
- ⚠️ Can be flaky (model download, container startup)
Success Criteria
- 2 E2E tests created and passing
- Tests use REAL Ollama embeddings
- Tests are optional (Maven profile)
- Documentation complete
- CI configured to run on main branch only
Related
- Depends on: #528 (Phase 1: Setup Testcontainers Infrastructure for Vector Search Tests)
- See TEST_REFACTORING_PLAN.md, Phase 3, for examples
- Optional: can be implemented after Phases 2 and 4 are complete