feat: Add source reference metadata to chat response by gjreda · Pull Request #241 · refstudio/refstudio

gjreda · 2023-07-02T19:40:23Z

fixes #143

Some examples below.

Note that in the first example, the LLM appropriately cites multiple sources. I was surprised that it combined them nicely without being given an example or told how to do so.

$ poetry run python main.py chat --text "What are some types of technical debt in machine learning?" | jq
[
  {
    "index": 0,
    "text": "Some types of technical debt in machine learning include erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, system-level anti-patterns, configuration issues, and code complexity issues.",
    "citations": [
      {
        "source_filename": "Machine Learning - The High-Interest Credit Card of Technical Debt.pdf",
        "page_num": 1
      },
      {
        "source_filename": "Hidden Technical Debt in Machine Learning Systems.pdf",
        "page_num": 1
      }
    ]
  }
]

$ poetry run python main.py chat --text "What are hidden feedback loops?" | jq
[
  {
    "index": 0,
    "text": "Hidden feedback loops are systems that influence each other indirectly through the world, often resulting in changes in behavior in reaction to changes in one system. These loops may exist between completely disjoint systems.",
    "citations": [
      {
        "source_filename": "Hidden Technical Debt in Machine Learning Systems.pdf",
        "page_num": 4
      },
      {
        "source_filename": "Machine Learning - The High-Interest Credit Card of Technical Debt.pdf",
        "page_num": 2
      }
    ]
  }
]

$ poetry run python main.py chat --text "What can you tell me about Chicago?" | jq
[
  {
    "index": 0,
    "text": "I am unable to answer the question with the provided sources.",
    "citations": []
  }
]
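For anyone consuming this output downstream, here is a minimal sketch of loading the JSON shown above into typed objects. The field names (`index`, `text`, `citations`, `source_filename`, `page_num`) come from the examples; the class and function names (`Citation`, `ChatChoice`, `parse_chat_output`) are hypothetical and not part of this PR.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Citation:
    # Field names mirror the JSON keys in the examples above.
    source_filename: str
    page_num: int


@dataclass
class ChatChoice:
    index: int
    text: str
    citations: list[Citation] = field(default_factory=list)


def parse_chat_output(raw: str) -> list[ChatChoice]:
    """Parse the JSON array printed by the chat command (hypothetical helper)."""
    choices = []
    for item in json.loads(raw):
        citations = [Citation(**c) for c in item.get("citations", [])]
        choices.append(
            ChatChoice(index=item["index"], text=item["text"], citations=citations)
        )
    return choices


if __name__ == "__main__":
    sample = json.dumps(
        [
            {
                "index": 0,
                "text": "Hidden feedback loops are systems that influence each other.",
                "citations": [
                    {
                        "source_filename": "Hidden Technical Debt in Machine Learning Systems.pdf",
                        "page_num": 4,
                    }
                ],
            }
        ]
    )
    for choice in parse_chat_output(sample):
        for citation in choice.citations:
            print(f"{citation.source_filename} (p{citation.page_num})")
```

An empty `citations` list, as in the Chicago example, simply yields a `ChatChoice` with no citations.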

codecov · 2023-07-02T19:42:54Z

Codecov Report

Merging #241 (f1d8d4a) into main (2f9144b) will decrease coverage by 0.87%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #241      +/-   ##
==========================================
- Coverage   77.30%   76.44%   -0.87%     
==========================================
  Files         111       99      -12     
  Lines        6170     5518     -652     
  Branches      576      576              
==========================================
- Hits         4770     4218     -552     
+ Misses       1384     1284     -100     
  Partials       16       16

see 12 files with indirect coverage changes

sehyod

Nice! This greatly improves the chat experience!

cguedes

This is a great step forward working with the chat, but my expectation was that we would get a structured data reply with the reference(s) in a field outside of the text. Can we have it @gjreda?

shauryr · 2023-07-06T13:27:01Z

This looks great! The ability to cite multiple sources is very useful as well.
Re: structured citations @cguedes we might want to look at langchain for this - https://python.langchain.com/docs/use_cases/question_answering/#adding-in-sources

I also found a tutorial demonstrating similar functionality - https://github.com/gkamradt/langchain-tutorials/blob/cc89c780e6484bdc3b15a967c1be2a5bfa4934dc/data_generation/Custom%20Files%20Question%20%26%20Answer.ipynb#L9

gjreda · 2023-07-07T16:07:39Z

> This is a great step forward working with the chat, but my expectation was that we would get a structured data reply with the reference(s) in a field outside of the text. Can we have it @gjreda?

@cguedes I updated this to include structured citations. See the updated top comment on this PR for examples.

sehyod · 2023-07-07T16:24:12Z

I'm sorry about the back-and-forth regarding this feature, but I'm a bit concerned about this update, which feels like a regression to me.

The reply we were getting before for the question about hidden feedback loops was the following:
Hidden feedback loops in machine learning refer to situations where two systems indirectly influence each other through the world. One example of this is when two systems independently determine different aspects of a web page, such as selecting products to show and selecting related reviews. Improving one system may lead to changes in behavior in the other, as users react to the changes by clicking more or less on the other components (Hidden Technical Debt in Machine Learning Systems.pdf, p4). These hidden feedback loops can exist between completely disjoint systems and pose a statistical challenge for ML researchers to investigate (Machine Learning - The High-Interest Credit Card of Technical Debt.pdf, p3).

The new reply is:
Hidden feedback loops are systems that influence each other indirectly through the world, often resulting in changes in behavior in reaction to changes in one system. These loops may exist between completely disjoint systems.

Although we now get citations with a proper schema, I think the reply contains far less detail. Another issue is that we don't know which part of the answer corresponds to which citation.
Maybe we should roll back to the previous solution and parse the content between parentheses with a regex to determine which references are being used?
What are your thoughts on that?
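To make the regex idea concrete, here is a rough sketch that assumes inline citations always look like `(<filename>.pdf, p<page>)`, as in the earlier reply quoted above. The pattern and the `extract_citations` helper are illustrative assumptions, not code from this PR, and the LLM may not always follow that format.

```python
import re

# Assumed citation shape: "(Some File Name.pdf, p4)".
# Filenames containing parentheses would not match this pattern.
CITATION_RE = re.compile(r"\(([^()]+\.pdf),\s*p(\d+)\)")


def extract_citations(text: str) -> list[dict]:
    """Return each inline citation with its character offset in the reply."""
    citations = []
    for match in CITATION_RE.finditer(text):
        citations.append(
            {
                "source_filename": match.group(1).strip(),
                "page_num": int(match.group(2)),
                # The offset lets a caller map a citation back to the
                # sentence that precedes it.
                "start": match.start(),
            }
        )
    return citations


if __name__ == "__main__":
    reply = (
        "Improving one system may lead to changes in behavior in the other "
        "(Hidden Technical Debt in Machine Learning Systems.pdf, p4). "
        "These hidden feedback loops can exist between completely disjoint systems "
        "(Machine Learning - The High-Interest Credit Card of Technical Debt.pdf, p3)."
    )
    for citation in extract_citations(reply):
        print(citation["source_filename"], citation["page_num"])
```

Keeping the match offset is one way to address the "which part of the answer corresponds to which citation" concern, since each citation can be tied to the text immediately before it.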

cguedes · 2023-07-07T16:30:21Z

I'm OK with reverting to the other solution and regex-parsing the references for now, then working on the improved solution in a different PR.

gjreda · 2023-07-07T16:57:41Z

Yes, requiring JSON output means less available token space in the response, and it also means eating up a lot more of the input context space due to requiring more directions in the prompt. It took a lot of trial and error to have a prompt that resulted in JSON output and also left enough of the input context space for the relevant text chunks.

I don't love the idea of reworking this again and trying to regex out a somewhat non-deterministic result from the LLM, but I can give it a shot if we think it's worth the time.

I think the best approach for now would be to move forward with how this PR previously was (with inline citations) and then I can add a follow-up PR to regex out the citations if we think it's valuable.

What do y'all think about that? @cguedes @sehyod
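The context-space trade-off described in this comment can be sketched as simple budget arithmetic. Everything below is an assumption for illustration: the `chunk_token_budget` helper is hypothetical, and the token counts are made-up placeholders, not measurements from this PR.

```python
def chunk_token_budget(
    context_window: int,
    instruction_tokens: int,
    question_tokens: int,
    reserved_response_tokens: int,
) -> int:
    """Tokens left for retrieved text chunks after everything else is accounted for."""
    remaining = (
        context_window
        - instruction_tokens
        - question_tokens
        - reserved_response_tokens
    )
    return max(remaining, 0)


if __name__ == "__main__":
    # Placeholder numbers only: a JSON-output prompt is assumed to need more
    # instructions and a larger response reservation than an inline-citation prompt.
    inline = chunk_token_budget(
        context_window=4096,
        instruction_tokens=150,
        question_tokens=20,
        reserved_response_tokens=400,
    )
    structured = chunk_token_budget(
        context_window=4096,
        instruction_tokens=450,
        question_tokens=20,
        reserved_response_tokens=700,
    )
    print(f"inline citations: {inline} tokens left for chunks")
    print(f"JSON output: {structured} tokens left for chunks")
```

Under these assumed numbers, the extra prompt directions and response overhead come directly out of the space available for source chunks, which matches the loss of detail noted earlier in the thread.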

cguedes · 2023-07-07T18:16:45Z

@gjreda we see your point. Ok. Push a revert to this branch and let me know when it's ready for a new review.

This reverts commit cbad0ff.

gjreda · 2023-07-07T20:20:12Z

@cguedes Sounds good. This should be good to go now, and I'll work on a separate regex PR.

gjreda added 7 commits July 2, 2023 10:54

Update chat to rely on and env for ref storage path

26ba7fc

Add cli param so n_choices argument can be used

2dcc9e1

Add function to chunk_reference and add page number to Chunk metadata

d6f41ce

Merge branch 'main' into 143-add-source-reference-metadata-to-chat-response

c6f7b78

Add reference citations to LLM responses

d5f9534

Add and fix tests

d53127b

linter

6ef10f0

gjreda added 4 commits July 5, 2023 10:40

Merge branch 'main' into 143-add-source-reference-metadata-to-chat-response

1f1d38d

fix tests

0f781c1

fix test

f2ef7ab

fix test

cbad0ff

gjreda requested review from cguedes, sehyod and shauryr July 5, 2023 18:44

gjreda marked this pull request as ready for review July 5, 2023 18:44

sehyod previously approved these changes Jul 6, 2023

cguedes previously approved these changes Jul 6, 2023

gjreda dismissed stale reviews from cguedes and sehyod via 2f97c4e July 7, 2023 16:06

gjreda requested review from cguedes and sehyod July 7, 2023 16:07

gjreda added 2 commits July 7, 2023 14:53

Revert "fix test"

bfd2063

This reverts commit cbad0ff.

Merge branch 'main' into 143-add-source-reference-metadata-to-chat-response

f1d8d4a

gjreda force-pushed the 143-add-source-reference-metadata-to-chat-response branch from 2f97c4e to f1d8d4a Compare July 7, 2023 20:12

fix

1764095

cguedes approved these changes Jul 10, 2023

cguedes merged commit b1f221d into main Jul 10, 2023

cguedes deleted the 143-add-source-reference-metadata-to-chat-response branch July 10, 2023 08:59

sehyod mentioned this pull request Jul 10, 2023

Follow-up on improved chat interaction #270

Open

cguedes merged 14 commits into main from 143-add-source-reference-metadata-to-chat-response
