Skip to content

Workflow for extracting context data from voice notes and passing them into Pipecone vector database for upserting

Notifications You must be signed in to change notification settings

danielrosehill/N8N-Voice-Note-Context-Pipeline-Workflow

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

N8N Voice Note Context Pipeline Workflow

alt text

A personal RAG (Retrieval-Augmented Generation) pipeline built with N8N that processes voice notes into structured context data for AI workloads.

Overview

This workflow transforms raw voice note transcripts into contextually-rich, timestamped data suitable for vector database storage and AI retrieval. The pipeline extracts specific personal context from speech-to-text transcripts, reformats them from first-person to third-person perspective, and prepares them for embedding in vector databases.

Workflow Architecture

1. Context Extraction Agent

  • Input: Raw voice note transcripts (potentially containing transcription errors)
  • Processing: AI agent extracts meaningful personal context data
  • Output: Structured facts reformatted from first-person ("I like pizza") to third-person ("Daniel likes pizza")
  • System Prompt: Defines extraction rules and formatting guidelines (see components/context-extraction-agent/system-prompt.md)

alt text

alt text

2. Binary Conversion with Timestamp Injection

  • Processing: Converts extracted context to binary format
  • Enhancement: Injects creation timestamp for temporal contextualization
  • Purpose: Enables AI workloads to understand when context data was created

3. Vector Database Embedding

  • Default Target: Milvus vector database
  • Flexibility: Can be substituted with any vector database (Pinecone, Weaviate, etc.)
  • Purpose: Enables semantic search and retrieval for RAG applications

alt text

alt text

Key Features

  • Speech-to-Text Error Handling: Agent infers intended meaning from imperfect transcriptions
  • Personal Context Extraction: Filters out casual musings, focuses on significant facts
  • Perspective Transformation: Converts first-person references to user's name
  • Temporal Context: Timestamps enable time-aware AI responses
  • Modular Design: Vector database component can be easily swapped

Example Data Flow

Input (Raw Voice Note)

{
  "title": "Bold flavors and favorite foods",
  "transcript": "What kind of foods do I enjoy? Well, a former co-worker who was originally from Iran said Dana enjoys foods with very strong flavors...",
  "timestamp": "2025-08-15T11:28:21+00:00"
}

Output (Extracted Context)

{
  "output": "FOOD PREFERENCES\nDaniel enjoys foods with very strong and bold flavors.\nDaniel enjoys Indian, Nepalese, Ethiopian, and Mexican cuisines, roughly in that order of preference.\nDaniel enjoys falafel and shawarma in Israel..."
}

Repository Structure

├── components/
│   └── context-extraction-agent/
│       └── system-prompt.md          # AI agent instructions
├── payloads/
│   ├── raw-note.json                 # Example input data
│   └── extracted-context.json        # Example output data
├── screenshots/                      # Workflow visualization
└── README.md

Use Cases

  • Personal Knowledge Management: Build a searchable database of personal preferences and experiences
  • AI Assistant Enhancement: Provide context-aware responses based on personal history
  • Memory Augmentation: Create a queryable external memory system
  • Content Personalization: Enable AI to reference specific user preferences and experiences

Customization

Vector Database

Replace the Milvus node with your preferred vector database:

  • Pinecone
  • Weaviate
  • Chroma
  • Qdrant
  • OpenSearch

Context Categories

Modify the system prompt to extract different types of context:

  • Professional experiences
  • Technical preferences
  • Health information
  • Travel history
  • Learning goals

Getting Started

  1. Import the N8N workflow
  2. Configure your chosen vector database connection
  3. Set up the AI agent with your preferred LLM provider
  4. Customize the system prompt for your specific context needs
  5. Test with sample voice note data

Dependencies

  • N8N workflow automation platform
  • LLM provider (OpenAI, Anthropic, etc.) for context extraction
  • Vector database (Milvus by default)
  • Speech-to-text service (for voice note transcription)

This workflow provides the foundation for building sophisticated personal RAG systems that can understand and utilize personal context from voice notes and other unstructured data sources.

About

Workflow for extracting context data from voice notes and passing them into Pipecone vector database for upserting

Topics

Resources

Stars

Watchers

Forks