Phase 3: Optional E2E Tests with Real Ollama (LOW PRIORITY) #530

@robfrank

Overview

Create OPTIONAL end-to-end tests that validate the entire vector search workflow against a real Ollama embedding service.

Priority: LOW (optional enhancement)
Estimated Time: Week 4
Depends On: #528 (Phase 1)

Context

Note: Phases 1 and 2 already provide comprehensive integration testing with Testcontainers and real ArcadeDB. Phase 3 adds OPTIONAL E2E tests with real Ollama for complete system validation.

These tests are:

  • ⚠️ Slow (~60s per test due to Ollama model loading)
  • ⚠️ Large (requires pulling ~400MB Ollama image)
  • ⚠️ Optional (most testing covered by Phase 1+2)

When to Use

Use E2E tests with real Ollama when you need to:

  • Validate actual semantic similarity (not fake embeddings)
  • Verify real embedding dimensions from Ollama
  • Test full system integration before production release
  • Catch model-specific issues

Tasks

3.1 Create VectorSearchE2ETest

File: src/test/java/it/robfrank/linklift/integration/VectorSearchE2ETest.java

Setup:

  • Use ArcadeDbContainer from Phase 1
  • Add GenericContainer for Ollama service
  • Pull nomic-embed-text model in @BeforeAll
  • Create REAL OllamaEmbeddingAdapter (not FakeEmbeddingGenerator)
  • Wire up real services: BackfillEmbeddingsService, SearchContentService
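To make the "real adapter" step concrete, here is a minimal sketch of the HTTP request such an adapter might send to the Ollama container. It assumes Ollama's `/api/embeddings` endpoint, which takes a `model` and a `prompt`; the project's OllamaEmbeddingAdapter may use a different endpoint or client. `OllamaRequestSketch` and its methods are hypothetical names. In a test, the base URL would come from the container's mapped host and port (Ollama listens on 11434 inside the container).

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper for illustration; not the project's OllamaEmbeddingAdapter.
final class OllamaRequestSketch {

    // Escapes characters that would break a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c == '\n') {
                sb.append("\\n");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the JSON body, assuming the endpoint expects "model" and "prompt".
    static String embeddingBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model) + "\",\"prompt\":\"" + jsonEscape(prompt) + "\"}";
    }

    // Builds (but does not send) the POST request for one embedding.
    static HttpRequest embeddingRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/embeddings"))
            // Generous timeout: the first call may include model loading.
            .timeout(Duration.ofSeconds(120))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(embeddingBody(model, prompt)))
            .build();
    }
}
```

The request would be sent with `java.net.http.HttpClient` and the response parsed for its embedding array. Both steps are omitted here because they depend on the JSON library the project uses.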

Test Cases:

  1. Test: endToEnd_realEmbeddings_shouldFindSimilarContent

    • Save 3 content items (2 AI-related, 1 cooking-related)
    • Run backfill with REAL Ollama embeddings
    • Search for "artificial intelligence and AI"
    • Verify AI content returned, not cooking content
    • Goal: Validate actual semantic similarity
  2. Test: endToEnd_realEmbeddings_shouldValidateDimensions

    • Save content
    • Run backfill with REAL Ollama
    • Retrieve content and verify the embedding has the expected number of dimensions
    • Goal: Catch dimension mismatches with real model
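Both test cases above reduce to two checks on embedding vectors: a query ranks related items above unrelated ones, and each vector has the expected length. The sketch below shows those checks in isolation. `EmbeddingChecks` is a hypothetical helper, and the expected dimension is passed in as a parameter rather than hard-coded (768 is commonly cited for nomic-embed-text, but treat that as an assumption to confirm against the model you pull).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the project.
final class EmbeddingChecks {

    // Cosine similarity of two equal-length, non-zero vectors.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Fails fast when a stored embedding does not match the expected dimension.
    static void requireDimension(float[] embedding, int expected) {
        if (embedding == null || embedding.length != expected) {
            int actual = embedding == null ? -1 : embedding.length;
            throw new IllegalStateException("expected " + expected + " dimensions, got " + actual);
        }
    }

    // Returns item ids ordered from most to least similar to the query.
    static List<String> rank(float[] query, Map<String, float[]> items) {
        List<String> ids = new ArrayList<>(items.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> cosine(query, items.get(id))).reversed());
        return ids;
    }
}
```

In the real E2E test the ranking would come from the search service and ArcadeDB's vector index rather than from `rank`. A helper like this is mainly useful for diagnosing a failure by comparing the raw vectors directly.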

3.2 Configure for Optional Execution

Make the tests optional using a Maven profile:

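<profile>
    <id>e2e-tests</id>
    <!-- Assumes the top-level <properties> default skipE2ETests to true,
         so E2E tests only run when this profile is activated. -->
    <properties>
        <skipE2ETests>false</skipE2ETests>
    </properties>
</profile>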

Default: Skip E2E tests in normal builds
CI: Run E2E tests on main branch only (not PRs)
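On the Java side, one possible way to honour this flag is a small gate that treats the tests as skipped unless `skipE2ETests` is explicitly `false`. This is a sketch under an assumption the issue does not state: that the build forwards the Maven property to the test JVM as a system property (for example through Surefire's `systemPropertyVariables`). `E2eGate` is a hypothetical name.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of the project.
final class E2eGate {

    static final String SKIP_PROPERTY = "skipE2ETests";

    // E2E tests run only when the property is present and set to "false";
    // a missing or malformed value keeps them skipped.
    static boolean shouldRun(Properties props) {
        String value = props.getProperty(SKIP_PROPERTY);
        return value != null && value.trim().equalsIgnoreCase("false");
    }

    // Convenience overload that reads the JVM's system properties.
    static boolean shouldRun() {
        return shouldRun(System.getProperties());
    }
}
```

In a JUnit 5 test class, the result could be passed to `Assumptions.assumeTrue(E2eGate.shouldRun(), "E2E tests disabled")` in a `@BeforeAll` method, so the tests are reported as skipped rather than failed. Defaulting to "skip" keeps normal builds fast even if the property is never set.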

3.3 Documentation

  • Document Docker requirements
  • Document how to run E2E tests: mvn test -Pe2e-tests
  • Add troubleshooting guide for Ollama container issues
  • Document expected execution time (~2-3 minutes)

3.4 Validation

  • Both E2E tests pass with real Ollama
  • Tests skip by default (don't slow down normal development)
  • CI runs E2E tests on main branch
  • Documentation is clear

Trade-offs

Pros:

  • ✅ Full system validation with real embeddings
  • ✅ Catches dimension mismatches with actual model
  • ✅ Validates real semantic similarity

Cons:

  • ⚠️ Very slow (~60s per test)
  • ⚠️ Requires ~400MB Ollama image download
  • ⚠️ Can be flaky (model download, container startup)
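One way to reduce the flakiness listed above is to wrap slow or unreliable setup steps, such as the model pull or the first embedding call, in a bounded retry. The helper below is a generic sketch. `RetrySketch` is a hypothetical name, and a real test might rely on Testcontainers wait strategies instead.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not part of the project.
final class RetrySketch {

    // Calls the action up to maxAttempts times, sleeping for the given delay
    // after each failure, and rethrows the last failure if every attempt fails.
    static <T> T retry(int maxAttempts, Duration delay, Callable<T> action) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay.toMillis());
                }
            }
        }
        throw last;
    }
}
```

A fixed attempt count keeps the worst-case runtime predictable, which matters for tests that are already slow.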

Success Criteria

  • 2 E2E tests created and passing
  • Tests use REAL Ollama embeddings
  • Tests are optional (Maven profile)
  • Documentation complete
  • CI configured to run on main branch only
