cocoindex-io/meeting-notes-knowledge-graph

Meeting Notes Knowledge Graph that Auto Updates

Step By Step Tutorial

Meeting notes capture decisions, action items, participant information, and the relationships between people and tasks. Yet most organizations treat them as static documents that can only be searched with basic text search.

With a knowledge graph, you can run queries like: "Who attended meetings where the topic was 'budget planning'?" or "What tasks did Sarah get assigned across all meetings?"

This example shows how to build a meeting knowledge graph from Google Drive Markdown notes using LLM extraction and Neo4j, with automatic continuous updates.

If you find this example useful, please give CocoIndex a star on GitHub to support the project, and stay tuned for more updates. Thank you so much 🥥🤗.

What this builds

The pipeline defines:

  • Meeting nodes: one per meeting section, keyed by source note file and meeting time
  • Person nodes: people who organized or attended meetings
  • Task nodes: tasks decided in meetings
  • Relationships:
    • ATTENDED Person → Meeting (includes the organizer, who is flagged as such when the relationship is collected in the flow)
    • DECIDED Meeting → Task
    • ASSIGNED_TO Person → Task
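
As a rough illustration of the schema above, the sketch below models one extracted meeting with plain dataclasses and flattens it into relationship rows. The field names (`note_file`, `assigned_to`, `is_organizer`, and so on) and the `to_relationships` helper are assumptions made for this example, not the project's actual definitions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    name: str


@dataclass
class Task:
    description: str
    assigned_to: list[Person] = field(default_factory=list)


@dataclass
class Meeting:
    note_file: str  # source note file (assumed to be part of the Meeting key)
    time: date      # meeting time (assumed to be part of the Meeting key)
    note: str
    organizer: Person
    participants: list[Person] = field(default_factory=list)
    tasks: list[Task] = field(default_factory=list)


def to_relationships(meeting: Meeting) -> list[tuple]:
    """Flatten one meeting into (relationship, source, target, properties) rows."""
    key = (meeting.note_file, meeting.time.isoformat())
    # The organizer also attends; the flag here is a hypothetical property name.
    rows = [("ATTENDED", meeting.organizer.name, key, {"is_organizer": True})]
    for person in meeting.participants:
        rows.append(("ATTENDED", person.name, key, {"is_organizer": False}))
    for task in meeting.tasks:
        rows.append(("DECIDED", key, task.description, {}))
        for person in task.assigned_to:
            rows.append(("ASSIGNED_TO", person.name, task.description, {}))
    return rows
```

A meeting with one organizer, one participant, and one task assigned to one person would yield four rows: two ATTENDED, one DECIDED, and one ASSIGNED_TO.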

The source is Google Drive folders shared with a service account. The flow watches for recent changes and keeps the graph up to date.

How it works

  1. Ingest files from Google Drive (service account + root folder IDs)
  2. Split each note by Markdown headings into meeting sections
  3. Use an LLM to extract a structured Meeting object: time, note, organizer, participants, and tasks (with assignees)
  4. Collect nodes and relationships in memory
  5. Export to Neo4j:
    • Nodes: Meeting (explicit export), Person and Task (declared with primary keys)
    • Relationships: ATTENDED, DECIDED, ASSIGNED_TO
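
To make step 2 concrete, here is a minimal standalone sketch of splitting a note into meeting sections at Markdown headings. It assumes every heading line (`#` through `######`) starts a new meeting and that text before the first heading can be dropped. The function name `split_by_headings` is hypothetical, and the real flow may rely on a built-in CocoIndex function instead.

```python
import re

# A Markdown ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def split_by_headings(markdown: str) -> list[dict]:
    """Return one {"title", "text"} dict per heading-delimited section."""
    sections = []
    current = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A heading opens a new section.
            current = {"title": match.group(1).strip(), "text": ""}
            sections.append(current)
        elif current is not None:
            # Body lines belong to the most recent heading.
            current["text"] += line + "\n"
    for section in sections:
        section["text"] = section["text"].strip()
    return sections
```

This sketch does not treat `#` lines inside fenced code blocks specially, so notes that embed code would need a more careful parser.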

Prerequisites

Environment

Set the following environment variables:

export OPENAI_API_KEY=sk-...
export GOOGLE_SERVICE_ACCOUNT_CREDENTIAL=/absolute/path/to/service_account.json
export GOOGLE_DRIVE_ROOT_FOLDER_IDS=folderId1,folderId2

Notes:

  • GOOGLE_DRIVE_ROOT_FOLDER_IDS accepts a comma-separated list of folder IDs
  • The flow polls recent changes and refreshes periodically
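
For illustration, the Google Drive variables could be read and validated along these lines. This is a sketch only: `load_drive_config` is a hypothetical helper, and the project's own code may parse these values differently.

```python
import os


def load_drive_config(env=None) -> dict:
    """Read the Google Drive settings from environment variables."""
    env = os.environ if env is None else env
    credential_path = env["GOOGLE_SERVICE_ACCOUNT_CREDENTIAL"]
    raw_ids = env["GOOGLE_DRIVE_ROOT_FOLDER_IDS"]
    # Split the comma-separated list, tolerating spaces and empty entries.
    folder_ids = [part.strip() for part in raw_ids.split(",") if part.strip()]
    if not folder_ids:
        raise ValueError("GOOGLE_DRIVE_ROOT_FOLDER_IDS must list at least one folder ID")
    return {"credential_path": credential_path, "root_folder_ids": folder_ids}
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.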

Run

Build/update the graph

Install dependencies:

pip install -e .

Update the index (run the flow once to build/update the graph):

cocoindex update main

Browse the knowledge graph

Open Neo4j Browser at http://localhost:7474.

Sample Cypher queries:

// All relationships
MATCH p=()-->() RETURN p

// Who attended which meetings (including organizer)
MATCH (p:Person)-[:ATTENDED]->(m:Meeting)
RETURN p, m

// Tasks decided in meetings
MATCH (m:Meeting)-[:DECIDED]->(t:Task)
RETURN m, t

// Task assignments
MATCH (p:Person)-[:ASSIGNED_TO]->(t:Task)
RETURN p, t
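
The same kind of query can be run from Python. The sketch below answers "What tasks did Sarah get assigned across all meetings?" with a parameterized query. It assumes Person nodes carry a `name` property and Task nodes a `description` property, which may not match the exported schema, and `tasks_assigned_to` is a hypothetical helper. The `session` argument is expected to behave like a session from the official `neo4j` Python driver.

```python
# Parameterized Cypher; the property names `name` and `description` are assumptions.
TASKS_FOR_PERSON = """
MATCH (p:Person {name: $name})-[:ASSIGNED_TO]->(t:Task)
RETURN t.description AS task
"""


def tasks_assigned_to(session, name: str) -> list[str]:
    """Return the descriptions of all tasks assigned to the named person."""
    result = session.run(TASKS_FOR_PERSON, name=name)
    return [record["task"] for record in result]


# Example usage (requires `pip install neo4j` and a running Neo4j instance;
# the URI and credentials below are placeholders):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))
#   with driver.session() as session:
#       print(tasks_assigned_to(session, "Sarah"))
#   driver.close()
```

Passing the name as a query parameter rather than formatting it into the string avoids quoting problems and Cypher injection.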

CocoInsight

CocoInsight (currently in free beta) helps you troubleshoot index generation and trace the data lineage of the pipeline. It connects to your local CocoIndex server and retains none of your pipeline data.

Start CocoInsight:

cocoindex server -ci main

Then open the UI at https://cocoindex.io/cocoinsight.
