As part of #82, we need some form of data storage for parsed reference content, particularly the chunked document text. Since #83, we no longer store the chunked reference text. We'll need to add this back in, with or without embeddings.
We need to be able to query the chunked text in order to retrieve the most likely related chunks for a given chat question (regardless of whether we are using embeddings or something like BM25).
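To make the retrieval requirement concrete, here is a minimal sketch of BM25 ranking over chunk text in plain Python. It is only an illustration, not a proposed implementation: `tokenize` and `bm25_rank` are hypothetical names, the whitespace tokenizer is deliberately naive, and `k1`/`b` are just commonly used defaults.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_rank(query, chunks, k1=1.5, b=0.75):
    """Return (score, chunk) pairs sorted from most to least relevant."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scored = []
    for chunk, doc in zip(chunks, docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, chunk))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Whatever storage we pick only has to hand back the chunk texts (and optionally embeddings) for a ranking step like this, or do the equivalent ranking itself.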
Some options and notes:
- Structured text file on disk (e.g. JSON) - Because the sidecar is not a long-running process, this file will need to be read from disk on each chat call. That doesn't feel great, though it might actually perform fine.
- LanceDB - We used LanceDB in an initial prototype and it worked well. It was easy to get started, performed well, and had nice integration with the Python data ecosystem. However, LanceDB currently does not have delete or update capabilities for individual records. It is also a fairly new project in very active development, so its API will likely evolve quickly. It is focused on vector and full-text search, and does have a TypeScript client.
- SQLite + VSS - I can't say much about VSS as I'm not familiar with it, but SQLite itself would be plenty sufficient for our needs.
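As a rough sketch of what the SQLite option could look like without VSS, the snippet below uses only Python's built-in `sqlite3` module. The `chunks` table, its columns, and the helper names (`open_store`, `replace_reference`, `load_chunks`) are all assumptions made up for illustration, not an agreed schema.

```python
import sqlite3


def open_store(path):
    # Hypothetical schema: one row per chunk, tagged with its source reference.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " id INTEGER PRIMARY KEY,"
        " reference_id TEXT NOT NULL,"
        " text TEXT NOT NULL)"
    )
    return conn


def replace_reference(conn, reference_id, chunk_texts):
    # Re-ingesting a reference deletes its old chunks and inserts the new
    # ones inside a single transaction.
    with conn:
        conn.execute("DELETE FROM chunks WHERE reference_id = ?", (reference_id,))
        conn.executemany(
            "INSERT INTO chunks (reference_id, text) VALUES (?, ?)",
            [(reference_id, text) for text in chunk_texts],
        )


def load_chunks(conn):
    # Return every stored chunk as (reference_id, text), in insertion order.
    rows = conn.execute("SELECT reference_id, text FROM chunks ORDER BY id")
    return rows.fetchall()
```

Since the sidecar is short-lived, each chat call would open the database file, read the chunks it needs, and exit. Per-reference delete and update are plain SQL here, which covers the gap noted for LanceDB above.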