So what I'm gonna be doing today is walking through signing up for an account,
creating a project, and then connecting them both to N8n so you guys can follow
every step of the way. But real quick, Postgres is an open source relational
database management system, and you can use extensions like pgvector with it if
you want vector similarity search. In this case, we're just gonna be using Postgres as
the memory for our agent. And then Supabase is a backend as a service that's
built on top of Postgres. In today's example, we're gonna be using that as
the vector database. But I don't want to waste any time. Here we are in n8n, and
what we're going to do here for our agent is give it memory with Postgres
and access to a vector database in Supabase. So for memory, I'm going to click
on this plus and click on Postgres chat memory, and then we'll set up this
credential. And then over here, we want to click on the plus for tool. We'll grab
a Supabase vector store node. And then this is where we'll hook up our Supabase
credential. So whenever we need to connect to these third-party services, what we
have to do is come into the node, go to our credential, and then we want to create
a new one. And then we have all this stuff to configure like our host, our
username, our password, our port, all this kind of stuff. So we have to hop into
Supabase first, create an account, create a new project, and then we'll be able to
access all this information to plug in. So here we are in Supabase. I'm going to
be creating a new account, like I said, just so we can walk through all of this
step-by-step for you guys. So first thing you want to do is sign up for a new
account. So I just got my confirmation email. So I'm going to go ahead and confirm.
Once you do that, it's going to have you create a new organization. And then within
that, we create a new project. So I'm just going to leave everything as is for now.
It's going to be personal. It's going to be free and I'll hit create organization.
And then from here, we are creating a new project. So I'm going to leave everything
once again as is. This is the organization we're creating the project in. Here's the
project name, and then you need to create a password, and you're going to have to
remember this password to connect to our Supabase database later. So I've entered
my password. I'm going to copy this because like I said, you want to save this so
you can enter it later. And then we'll click create new project. This is going to
be launching our project, and this may take a few minutes, so we just have to be
patient here. As you can see, we're on the screen that says setting up
project. So we pretty much are just going to wait until our project's been set up.
So while this is happening, we can see that there's already some stuff that may
look a little confusing. We've got project API keys with a service role secret. We
have configuration with a different URL and some sort of JWT secret. So I'm going
to show you guys how to access what you need and plug it into the right
places in n8n. But as you can see, we got taken to a different screen, and the project
is still being launched, so we're just going to wait for it to complete. Now
everything just got set up, and we're good to connect to n8n. What you want to
do is typically come down to project settings and click on database. And
this is where everything would be to connect, but it says the connection string has
moved. As you can see, there's a little button up here called connect. So we'll
click on this, and now this is where we're grabbing the information that we need for
Postgres. So this is where it gets a little confusing because there's a lot of
stuff that we need for Postgres. We need a host, a username, our password
from earlier when we set up the project, and then a port. So all we're looking for
are those four things, but we need to find them in here. So what I'm gonna do is
change the type to PostgreSQL. And then I'm gonna go down to the transaction
pooler, and this is where we're gonna find the things that we need. The first thing
that we're looking for is the host, which, if you set it up just like me, is going
to come after the -h flag: something like aws-0-[your-region].pooler.supabase.com.
So we're going to grab that, copy it, and then we're going to paste it into the
host section right there. So that's what it should look like
for host. Now we have a database and a username to set up. So if we go back into
that Supabase page, we can see we have a -d and a -U flag. The database is going to
stay as postgres, but for the user, we're going to grab everything after the -U,
which is going to be postgres. followed by your project's unique characters. So I'm
going to paste that in here under the user. And for the password, this is where you're going to
paste in the password that you used to set up your Supabase project. And then
finally at the bottom, we're looking for a port, which is 5432 by default. But
in this case, we're going to grab the port from the transaction pooler right here,
which follows the lowercase -p flag. So we have 6543. I'll copy that, paste that
into here as the port, and then we'll hit save, and we'll see if we get connection
tested successfully. There we go. We got green, and then I'm just going to rename
this so I can keep it organized. So there we go, we've connected to Postgres as our
chat memory. We can see that it is going to be using the connected chat trigger
node; that's where it gets the session key it uses to store this information. And
it's going to be storing it in a table in Supabase called n8n_chat_histories.
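If you want to peek at what that looks like on the database side, a quick query like this works; the table and column names shown are the defaults n8n uses, so verify them against your own project.

```sql
-- Inspect the chat memory table n8n creates (default names assumed).
select session_id, message
from n8n_chat_histories
order by id;
```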
So real quick, I'm going to talk to the agent. I'm just going to disconnect the
Supabase so we don't get any errors. So now when I send off, hello AI agent, it's
going to respond to us with something like, hey, how can I help you today? Hello,
how can I assist you? And now you can see that there were two things stored in our
Postgres chat memory. So we'll switch over to Supabase, and now we're going to
come up here on the left and go to the table editor. We can see we have a new table
that we just created called n8n_chat_histories, and then we have two messages in
here. So the first one, as you can see, was a human type and the content was hello
AI agent, which is what we said to the AI agent. And then the second one was type
AI, and this is the AI's response to us. So it said, hello, how can I assist you
today? So this is where all of your chats are going to be stored based on the
session ID. And just once again, this session ID is coming from the connected chat
trigger node. So it's just coming from this node right here. As you can see,
there's the session ID that matches the one in our chat memory table. And that is
how it keeps each unique chat conversation separate. Cool. Now
that we have Postgres chat memory set up, let's hook up our Supabase vector store.
So we're going to drag it in and then now we need to go up here and connect our
credentials. So I'm going to create a new credential and we can see that we need
two things, a host and a service role secret. And the host is not going to be the
same one as the host that we used to set up our Postgres. So let's hop into
Supabase and grab this information. So back in Supabase, we're going to go down
to the settings, we're gonna click on data API, and then we have our project URL,
and then we have our service role secret. So this is all we're using. For the URL,
we're gonna copy this, go back to n8n, and then we'll paste this in as our
host. As you can see, it's supposed to be https:// and then your Supabase project
URL; so we'll paste that in, and you can see ours ends in .supabase.co. Also keep in
mind, this is because I launched an organization and a project in Supabase's
cloud. If you were to self-host this, it would be a little different, because you'd
have to access your localhost. And then of course we need our service role secret.
So back in Supabase, I'm going to reveal, copy, and then paste it into n8n.
So let me do that real quick. And as you can see, I got that huge token, just
paste it in. So what I'm going to do now is save it. Hopefully it goes green.
There we go. We have connection tested successfully. And then once again, just
going to rename this. The next step from here would be to create our Supabase
vector store within the platform that we can actually push documents into. So
you're going to click on docs right here, go to the quick start for setting up
your vector store, and then all you have to do is copy this command. So in the
top right, copy this script, come back into Supabase, and on the left-hand side
go to the SQL Editor. You'll paste that command in here, you don't change anything
at all, and you'll just hit run. And then you should see down here success, no rows
returned. And then in the table editor, we'll have a new table over here called
documents. So when we're actually vectorizing our data, it's gonna go into this table.
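For reference, the script you're pasting looks roughly like this. It's a sketch of the quickstart from the n8n docs, so prefer their current version; the 1536 dimension count assumes OpenAI embeddings.

```sql
-- Enable pgvector and create the table the Supabase vector store node uses.
create extension if not exists vector;

create table documents (
  id bigserial primary key,
  content text,          -- the chunk text
  metadata jsonb,        -- source info (file ID, blob type, ...)
  embedding vector(1536) -- 1536 dimensions for OpenAI embeddings
);

-- Similarity search function the vector store node calls at query time.
create function match_documents (
  query_embedding vector(1536),
  match_count int default null,
  filter jsonb default '{}'
) returns table (id bigint, content text, metadata jsonb, similarity float)
language plpgsql
as $$
begin
  return query
  select documents.id, documents.content, documents.metadata,
         1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```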
Okay, so I'm just gonna do a real quick example of putting a Google Doc into our Supabase vector database, just
to show you guys that everything's connected the way it should be and working as it
should. So I'm gonna grab a Google Drive node right here. I'm gonna click download
file. I'm going to select a file to download, which in this case, I'm just going to
grab body shop services, terms and conditions, and then hit test step. And we'll
see the binary data, which is a doc file over here. And now we have that
information. And what we want to do with it is add it to the Supabase vector
store. So I'm going to type in Supabase.
We'll see vector store. The operation is going to be add documents to vector
store. And then we have to choose the right credential because we have to choose
the table to put it in. So this is in this case, we already made a table. As you
can see in our super base, it's called documents. So back in here, I'm gonna choose
the credential I just made. I'm going to choose insert documents and I'm gonna
choose the table to insert it to, not the N8N chat histories. We wanna insert this
to documents because this one is set up for vectorization. From there, I have to
choose our document loader as well as our embeddings. So I'm not really going to
dive into exactly what this all means right now. If you're kind of confused and
you're wanting a deeper dive on RAG and building agents, definitely check out my
paid community. We've got different deep dive topics about all this kind of stuff,
but I'm just gonna set this up real quick so we can see the actual example. I'm
just choosing the binary data to load in here. I'm choosing the embedding and I'm
choosing our text splitter, which is going to be recursive. And so now all I have
to do here is hit run. It's going to be taking that binary data of that body shop
file and splitting it up. And as you can see, there's three items. So if we go back
into our Supabase vector store and we hit refresh, we now see three items in our
vector database, and we have the different content and all of this information here,
like the standard oil change and the synthetic oil change, coming from the body shop
document that I have right here, which we put in there just to validate the RAG
setup. And we know that this is a vector store rather than a relational table, because
we can see we have our vector embedding over here, which is all the dimensions.
And then we have our metadata. So we have stuff like the source and the blob type,
all this kind of stuff. And this is where we could also go ahead and add more
metadata if we wanted to.
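If you'd rather check this from the SQL editor than the table view, a quick illustrative query does the same thing:

```sql
-- Peek at the stored chunks: a short content preview plus metadata.
select id, left(content, 60) as content_preview, metadata
from documents
limit 5;
```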
Anyways, now that we have vectors in our documents table, we can hook up the actual agent to the correct table. So in here,
what I'm going to call this is body shop, and for the description, I'm going to
say, use this to get information about the body shop. And then for the table name, we
have to choose the correct table, of course. So we know that we just put all this
into something called documents. So I'm going to choose documents. And finally,
we just have to choose our embeddings, of course, so that it can embed the query
and pull stuff back accurately. And that's pretty much it. We have our AI agent
set up. So let's go ahead and do a test and see what we get back. So I'm going to
go ahead and say, what brake services are offered at the body shop. It's going to
update the Postgres memory, so now we'll be able to see that query. It hit the
Supabase vector store in order to retrieve that information and then generate an
augmented answer for us. And now we have: the body shop offers the
following brake services, 120 per axle for replacement, 150 per axle for rotor
replacement, and then full brake inspection is 30 bucks. So if we click back into
our document, we can see that that's exactly what it just pulled. And then if we go
into our vector database within Supabase, we could find that information in here,
but then we can also click on n8n_chat_histories and we can see we have two more
chats. So the first one was human, which is what we said: what brake services are
offered at the body shop. And then the second one was AI content, which is: the
body shop offers the following brake services, blah, blah, blah. And this is
exactly what it just responded to us with within n8n down here, as you can see.
And so keep in mind this AI agent has zero prompting. We didn't even open up the
system message. All that's in here is you are a helpful assistant. But if you are
setting this up, what you want to do is, you know, explain its role and you want to
tell it, you know, you have access to a vector database. It is called X. It has
information about X, Y, and Z. And you should use it when a client asks about X,
Y, and Z. Anyways, that's gonna be it for this one. Supabase and Postgres are super,
super powerful tools to use to connect up as a database for your agents, whether
it's gonna be relational or vector databases. And you've got lots of options with
self-hosting and some good options for security and scalability there. So anyways,
hope this one was helpful. If you learned something new, please give it a like. It
definitely helps me out a lot and appreciate you guys as always. So thanks, see you
in the next one.
Most RAG tutorials on YouTube, you know, your classic chat-with-your-PDF-documents demos,
are really not showing you how to build something you can actually use for real.
You'd have to change a lot, and I mean a lot, to really make it production ready.
And honestly, it might work well at first, but you're going to run into issues as
your knowledge base grows, your documents get updated, and users on your app ask
questions that you didn't expect or test. Now I'm not promising the perfect
solution by any means, but I definitely have something that you're going to want to
check out today, because I'm going to show you how to build a robust RAG AI agent
with no code using n8n and Supabase. n8n is an incredible and cost-effective
workflow automation tool similar to make.com and Zapier. And then Supabase, in my
mind, is the absolute best database platform, and it also supports vectors for
RAG using pgvector. And that's what we're really gonna be leveraging today to make
something that's way more immediately usable for RAG than a lot of the things
you've seen on YouTube at this point. It is so, so easy to combine n8n and
Supabase to create a production-ready, cost-effective, no-code, and simple-to-understand
RAG AI agent where you can use literally any of your documents as a
knowledge base. We'll start by quickly showing how powerful this RAG agent is, and
then I'll walk you step by step through setting this up yourself, so you can do it
in less than 15 minutes, including setting up Supabase and creating this n8n
workflow for your RAG agent. So let's go ahead and dive right in. All right, so
here is the full N8n workflow for this RAG AI agent, which I'll dive into in a
little bit. But right now I have to show you how cool and easy to use this thing
is. And so at this point, I have absolutely nothing in my RAG knowledge base. And I
can even go over to Supabase and show you that this documents table that manages
my knowledge base has nothing in it at this point. And so now what I can do is I
can go into the chat widget in the bottom middle here in my n8n workflow, and I
can ask a question to my agent that it shouldn't have the answer to right now
because I don't have any documents for its knowledge. And so I can ask something
like, what are the action items from the 8/25 meeting? And this is something
that we'll make it able to answer in a little bit here, but right now I want to
show that it doesn't have this answer at this point. And sure enough, it can't find
the answer. So now let's go ahead and create a document in its knowledge base so
that it can actually answer this question. And so I'm going to take some fake,
silly meeting notes that I made for 8/25. I'm going to copy these, then I'm going
to go into the folder that I'm using for the knowledge base for this agent. So this
is all hooked into n8n. I'll click on new Google Doc to make this as my file here,
and then I'll paste in everything. And then for the title of this document,
we'll just say 8/25 meeting notes. So now there's going to be an n8n workflow that
triggers to add this into the RAG knowledge base. And so I'll come back once that
is done, in like 30 seconds to a minute. So just a minute later, and this n8n
workflow has executed. I'll show more what these executions look like later and how
they work. But if I go into the step here where I insert into my knowledge base,
sure enough, the page content here is exactly what I created in this Google doc
just now. And I can also go over to Supabase and see that our documents table that
manages our knowledge base also has a record now for this meeting notes file. We've
got the metadata as well, including the Google Drive file ID, and then also the
embedding vector, which we'll actually use for RAG. And so now I can go back to my
chat in N8n. So I'll go back to the editor, go to chat. And then this is our last
request here where it didn't have the information in the Knowledge Base. This is
just from a prior test where it was working, obviously. But now I can just copy
this question and ask it again. And this time it's going to have the answer because
this document is in its knowledge base, as I showed both in Supabase and the n8n
workflow execution. And sure enough, there we go. We got the correct answer. This
is exactly what those silly action items are that we have in our Google doc right
here. So super easy to use. This thing works really, really well. It handles any
updates to Google Drive files as well. And so now I'm going to dive in and show you
step by step how to actually build this thing. All right, so I've already fully
built out this n8n workflow for my RAG AI agent because I want this to be a really
smooth walkthrough, but I will go through this step by step for you so that you
have a really clear understanding of how this works and you can implement it
yourself. I'll even have a link to a GitHub repository in the description of this
video where you can download this workflow yourself and bring it into your own n8n
instance very, very easily. All you have to do is download the JSON file for
this workflow, then go to the three dots here in the top right of your instance
and click on import from file. So you'll bring in that JSON file. And then
the only thing you have to do is put in your own credentials for things like the
Google Drive and Supabase. And then any customizations you might want, like your
folder for your knowledge base and Google Drive. And then you're good to go. It
just takes a couple of minutes. You'll have this thing fully set up so you can
steal it from me very, very easily. All right, so with that, we can actually dive
into this n8n workflow and see exactly how it works. So the first thing that
I wanna do, before we even get into all the actions here in the workflow, is
to get you set up with Supabase, because this is going to be used both for our
chat memory and our vector database. And that's the beauty of Supabase: we can
use it for both. And so what you're going to need to do is go to supabase.com
and then just sign in with your GitHub account. You'll have a free tier, which, let
me go to the pricing really quick and just show you the free tier is incredible.
Like that's literally all you need to get started. There's no reason to go to the
$25-a-month plan until you're really starting to scale your AI app.
So once you sign in and create a password, you're going to be brought to a
dashboard page that looks like this. And so there's some credentials that we're
going to be needing for some of the steps in N8n. So let me point those out really
quick and I'll reference those later. So you go to your project settings in the
bottom left here and then go to the database section of the configuration. Now here
you're going to have everything that you need for your Postgres connection for
your chat memory. So you have your host, database name, port, user and password. So
that's for Postgres. And then what you're going to need for the vector database
side of things is the API connection. So you go down to the API tab and then here
you have your URL. This is custom to you. And then you can also reveal and copy
your service role secret as well. So both of these, the database and the API
credentials, we'll need later. So anyway, back over to n8n, what we're going to do here is I'm going
to show you step by step executing this workflow and walking through every single
node. And so the first trigger that we have here, for our NAN workflow is when a
chat message is received. So if I go to the plus icon and just search for chat,
this is the trigger that we are using right here. And when you add this, it gives
you this chat widget in the bottom middle where I can ask a question like, what are
the 8/25 action items? The same thing that I asked in the demo there. And so when
you have this trigger, it gives you this chat window, which makes it so easy to
test and debug your AI agents and iterate on them. As you're developing it, you
don't even have to go outside of your n8n workflow or deploy anything to be able to
test things out. So there we go. We got the answer from the AI agent. This is the
perfect answer just based on the document that I've got here in my knowledge base.
And so with that, you can even embed this on your website if you want. So you don't
have to write any code. You can just click on more info and it literally gives you
the code to embed this on your own website. If you want a chat widget for this
agent. Really, really nice. And so now that I've executed it, I've got green check
marks for all of the nodes that were executed. So I can click into one and actually
see the output from my chat trigger. So I've got the session ID, which is needed
for the chat memory, basically just my user session, and then also the input to
the AI agent. And then if I go to my AI agent, I can see the input on the left
side, which is the chat input from the trigger. And then also the output, which in
this case is the action items from the 8/25 meeting, just like I asked for. So I
can take the chat input and just drag it in here like I did. And that's how I get
the text, basically the request to the AI. And then I've got a system message
as well. So that's how I set up the agent. And then there's also a couple of things
to hook into it, namely the chat model, which I'm using GPT. In this case, you can
use something like anthropic as well. And then I've got my chat memory, which is
using Postgres. And then I've got my tools for rag, which I'm using the super base
vector store in this case. So let me go over each one really, really quickly here.
For the chat model, I'm just using GPT-4o mini; you can use any GPT model
with this node. You can see the response that I got on the right-hand side here.
And then
for credentials, all you have to do is just feed in your OpenAI API key. It is so
easy, and Supabase is going to be just as easy. Google Drive is going to be pretty
easy as well. There's documentation that n8n gives you for setting up
credentials for anything. So it is so easy to walk through that and get that set up
as well. And then for the chat memory, this is where we really get into something
that is going to be better than what you see in a lot of N8n tutorials on YouTube.
Because if I click on the memory options here, you can see that one of them is a
window buffer memory. And this means that it's going to store the chat memory
just locally on your n8n instance, which puts load on the server that
you're using to host n8n. A lot of tutorials go with this because it's the easiest,
but using Postgres with Supabase is the way to really make it scalable. And that's
what makes it production ready. And so setting up the credentials, you just use the
host, database, user, and password that I showed how to get in Supabase. And then
for the table name, it can be whatever you want, and n8n will actually create
this for you if you don't have it created already. So you don't even have to create
the table in Supabase for this to work with n8n, which just makes it so, so
easy. And then for RAG, we're using this retrieve documents tool. There's a lot of
different tools here, but I'm just using the vector store tool and then connecting
Supabase as my vector store. And so for the credentials for this one, you just
need the host and service role secret, which I also showed how to get. And then
closing out of this here, you just have to select the table name and the options
for the query name. And to set up these two things, N8n has documentation for this
as well. So I'll click on the docs here and then scroll down to where it says a
quick start for setting up your vector store. And this gives you all of the SQL
code that you need to run, so you don't have to code anything to get your
Supabase set up for RAG. So you just copy everything that you see here, all of
this SQL code, go back over to Supabase and go to the SQL editor on the left side,
and then just paste everything in there. So it'll create your extension, basically
adding pgvector to your Supabase account, creating the table for documents, and
then also the function to match for RAG. And so going to my table editor, this is
what creates this documents table that you saw in the demo, which has the metadata,
content, and embeddings for all the documents that I have in my knowledge base. And
then as an aside, you can also see the table that n8n creates automatically for the
chat histories. So there you have it. That's basically everything in Supabase
that we need for this workflow here. And then going down to the other half of this
workflow, a bit of a simpler piece, we just have the workflow for adding files to
our knowledge base when they are created or updated in my Google Drive. And so
right here, I'm not going to go over creating the Google credentials, but again,
there's really, really nice documentation that N8n gives for doing this. But once
you have the credential set up, you can select how often you poll for files that
are created or updated, and do it in a specific folder. So in my case, I'm using
this meeting notes folder as basically the section of my Google Drive for RAG. And
so then this is for when files are created and I have a similar trigger for when
files are updated. So I can actually run one of these as a test event, fetch test
event right here, and I can get the last file that was created so that it can go
through my workflow one step at a time and watch all the inputs and outputs for my
nodes. So we can go on to the next step in the workflow, which is basically going
to extract the file ID for the rest of my workflow. So I have all the inputs from
my previous step, the trigger. I want to get the ID and pass that in as an output,
which is my file ID. So now I'm going to use that in later steps to pull this file
from Google Drive and extract the content from it and then put it in my knowledge
base. Now, the next step we have here with Supabase, this is really, really
important: I want to delete all of the old vectors for this file that already exist
in my database because I don't want any duplicates. This is another big thing that
a lot of n8n tutorials miss out on with RAG is they won't delete the old records.
And so every time a document updates and gets reinserted into the vector database,
it is going to be a duplicate of what is already there, because Supabase and all
of the other vector database options in n8n don't do upserts. They
won't update a vector if it already exists, because the file's ID isn't used as
the ID of the vector in the database. And basically what that means is that
it's just going to keep adding duplicates. If you update a file and update a file
and keep reinserting it, that knowledge is going to be duplicated and that is bad.
And so what we're doing here is, if I go to Supabase, when
we insert another vector for the 8/25 meeting notes after it gets updated, we're going
to delete this one first and then insert a new vector, with an ID of 30, that has the
updated contents of the file. This is really important. So basically, any vector
that has the file ID of the current Google Drive file we are updating as
part of its metadata, we're going to delete.
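In SQL terms, the cleanup amounts to something like the following; the metadata key name file_id is an assumption for illustration, so match whatever key your insert step actually writes.

```sql
-- Remove every chunk belonging to the Google Drive file being re-ingested,
-- so the fresh insert can't create duplicates.
delete from documents
where metadata->>'file_id' = '<google-drive-file-id>';
```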
So going to the database here, you can see that we have the file ID stored in the metadata. And this corresponds to
exactly the same file ID of the Google Drive file. Very, very important step. So
we're going to run this, and then we can see here, if I go over to Supabase, sure
enough, this record is now gone. So the vector database is cleared, ready to have
the new record inserted for the updated version of this file, so we have no
duplicates. And so going back to the canvas here, the next step
is to actually download the file locally. And so that way we can extract the text
content from it. So I'm just getting the file ID from the previous step and then
converting any docs to text and Google Sheets to CSV so that I can extract the raw
text. So that is what I'm going to do in this step right here. And now we have the
raw text, which is the meeting notes, exactly what you see right here in the
Google Doc. And now the very last step is to insert it into the Supabase vector
database. So the credentials are the same as what you've already set up. And we're
going to do pretty much all the same parameters as before using the documents table
and then the match documents query. So when I run this, sure enough, I get an
output with the page content that matches exactly with the data that we put into
this step. And it doesn't have to do any chunking here because the file is small
enough. I've got a couple of things attached to this here, like OpenAI embeddings,
and then also the default document loader, which is just going to use a recursive
character text splitter. So I'm not going to get into chunking too much, but this
is just a nice, simple default that makes this vector
database work quite well. So that is literally everything for this workflow. We now
have a fully functioning, honestly pretty production-ready RAG AI agent. You might
want to enhance this in some ways, to add better semantic search or maybe
keyword search; there's a lot of ways you can always extend a RAG application. But
this is a really, really solid start. So I'm going to go into the chat here and ask
another basic question, like who are the attendees for the 8/25 meeting, again having
it use RAG to go to my meeting notes documents. And here we go. Lucy, Bob, Tom, Jim
and Stacey. Perfect. That is exactly the right answer. And that is using the new
vector that I added after replacing the old one when I was going through this
example just now. So that is everything for getting started with a production
ready RAG AI agent built with no code. n8n and Supabase is definitely one of my
favorite combinations in the AI world. So I'm definitely going to be making more
content on that in the future. If you have any questions at all with anything in
this workflow, or if you want me to implement something similar to this or to build
on top of it later, Let me know in the comments, I always love feedback, and I hope
that you're looking forward to more content with n8n and Supabase. If you are,
I'd really appreciate a like and a subscribe, and with that, I will see you in the
next video.
Hey, Daniel here. If you're using vector stores to ground your AI agents in your
own data, then you may have noticed that you don't always get accurate results. And
one of the reasons for this is that while a vector search is great for handling
natural language queries, it can sometimes struggle when it's dealing with specific
names or terms, maybe acronyms or codes that are in the knowledge base. In this
video, I'm going to show you how you can fix this by implementing hybrid search.
And I'll demonstrate this using both Supabase and Pinecone. To explain the concept of
hybrid RAG, let's first look at what vector search is. So vector search is great
when it comes to capturing the semantic intent and the meaning of a user's query.
So let's take a quick example. Let's say we have an AI agent that's deployed on an
e-commerce store, and we have a customer on the front end who asks this agent a
question, can I see the various blue cotton t-shirts? That query goes to the agent
who goes deep into the knowledge base to fetch similar products. And not only will
it retrieve a blue t-shirt, it may also retrieve other t-shirts that are cotton,
for example. So what's happening behind the scenes then is that query is sent into
the agent and the actual query is converted into a vector. And in this case, it's a
dense vector. So this is a numerical representation of the query itself. That is
sent into the knowledge base and the knowledge base already contains all of the
information about all of the products. And then what comes out of that are these
similar results. And then based on that query, let's say it's show me lightweight
cotton t-shirts, it will pick up products of a similar meaning. So it'll pick up
maybe comfy summer tops, breathable shirts for hot weather, soft tees for everyday
wear. And that's how you can end up with quite a broad selection of results for a
query like that. And this is the real strength of semantic search or vector search.
Where it falls down, though, is on questions like this. If someone asked, show me blue t-
shirts that are medium sized, you may get back results that are not medium but are
blue, for example. Or if you ask the agent to show you a blue t-shirt with this
specific product code, the agent would again fetch a lot of results that are t-
shirts, that are blue, and it may or may not include the product with that actual
code. And this is an example of how semantic search can be a little too broad in
certain scenarios. So if we compare this then to the more traditional keyword
search or full-text search: here we're matching exact words or partial words, and
it's great for precision. So back to my example, if the customer is asking for blue
cotton t-shirts, that will go to the agent and if it was using keyword search
instead of vector search, it'll provide products that have the specific words blue,
cotton and t-shirt either in the title or the description. And there's a number of
ways that this can be done behind the scenes. But one of the most common ways is
for that query to actually be converted into a sparse vector, which is then sent
into the vector store and similar results are returned. So the strengths of this
approach, if you were to ask, do you have a black cotton t-shirt? It could return a
product like this, the Apex 25 black cotton t-shirt, because these are exact match
terms within the title or within the description. But the weaknesses then are the
opposite of semantic search. It has no understanding of meaning: t-shirt doesn't
equal tee, for example, unless it was explicitly added via synonyms. So while keyword
search has a lot of strengths in getting the exact match back, it really isn't that
flexible. So then this is where hybrid search kicks in because you have the best of
both worlds. You have the precision of keyword search, as well as the semantic
understanding of vector search. And within this approach, you're actually merging
the results from both systems. So the customer looking for the blue cotton T-shirt
is going to receive a result set that has the blue cotton t-shirt at the very top,
but then it also has other similar t-shirts that they might be interested in. And
what happens behind the scenes then is that that query, when it goes to the AI
agent, is actually converted into a dense vector embedding for the semantic search,
as well as a sparse vector for the keyword search. And two different results sets
are created with those two different algorithms. And a key aspect of this
hybrid result set is that the results are then ranked, because you end up with scores from
both systems. So different weightings are given to the scores so that the hybrid
result set can actually reflect the strongest results from both data sets.
So if we were to look at a typical AI agent within N8n, there is no option for
hybrid search within the tools. You have various vector stores, for example, but
all of these are simply semantic search. You don't have this full text lexical
keyword search. So there is a little bit of custom work you'll need to do to get
these systems up and running. But if you are struggling with the accuracy of your
RAG agent when it comes to retrieving data from your knowledge base, it's
definitely worth testing out hybrid search. In this video, I'll be going through
this implementation of Supabase's hybrid search. And this is the resulting n8n
workflow, which is pretty simple. Most of the work actually is on the Supabase
side. And this is the end result in Supabase. This is my documents table. And not
only do we have our embedding column for our dense vector embeddings, we also have
this tsvector column for our full-text search. And interestingly, if we copy one
of these out and have a look at it in a notepad, here we can see the various
keywords within the chunk or partial words within the chunk, as well as their
positions within the chunk. So this is used within the keyword search algorithm to
actually find these exact matches.
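To make that concrete, here's what Postgres produces for a short piece of text; this is standard to_tsvector behavior, with each word stemmed and stored alongside its position.

```sql
-- Each word is stemmed and recorded with its position(s) in the text.
select to_tsvector('english', 'Engine intake air temperature limits');
-- Result: 'air':3 'engin':1 'intak':2 'limit':5 'temperatur':4
```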
The Pinecone hybrid search implementation is a little bit more fleshed out on the n8n side. But as a result, there's actually less
to do on the Pinecone side. So here we have our hybrid search ingestion flow. We
have our AI agent that's able to hit our knowledge base, but that just triggers
this search query flow that you see here. And on the Pinecone side, then you can
see our full knowledge base. And if we click into one of these chunks, you can see
the metadata under values. This is the dense embeddings, and then sparse values
represents the full-text search data. Before I jump into the implementation, if
you'd like to get access to these workflows so you can test them out within your
own AI agent, then check out the link in the description to our community, the AI
Automators, where you can join hundreds of fellow automators all looking to
leverage AI to automate their businesses. Here we'll be implementing hybrid search
on Supabase. Supabase has full documentation on how to actually get this up and
running. So if you come into this page, I'll link it in the description below. And
the first thing you need to do is copy out this query to create the actual
documents table. And what you'll see is it not only has an embeddings column for
the vectors, it also has a full text search column. So when you inject data into
this table, it'll populate out this column. So come into Supabase, I'll create a
new project for this. And when you land in your dashboard, go to your database and
then click on extensions, because the first thing you need to do is enable the
vector extension. And from here then go to SQL editor, copy in the query to create
the documents table, and then we'll paste it in here. And the one thing you need to
do is you need to change the number of dimensions for the actual vector embedding
column. And this depends on which embedding model that you're going to use. So I'll
be using OpenAI's text-embedding-3-small model, which has 1536
dimensions. So we'll click run here, and then you get a success, no rows returned.
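The table itself, adjusted for 1536 dimensions, looks roughly like this; it's a sketch of the query from Supabase's hybrid search guide, where the fts column is generated automatically from the content.

```sql
-- Documents table for hybrid search: dense embeddings plus a generated
-- tsvector column for full-text search.
create table documents (
  id bigint primary key generated always as identity,
  content text,
  fts tsvector generated always as (to_tsvector('english', content)) stored,
  embedding vector(1536)
);
```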
And then you can verify that was created by going to database. And you can see that
there. And if you click on this icon, it brings you into the table view where you
can see the data, but obviously we don't actually have any data. Okay, so back to
the SQL editor then, and let's copy in the next snippet. So here we'll be creating
indexes for both the full text search column and the vector search column. So this
is to make sure that we get really fast results back from this table. And again,
we'll get success. And to verify that, if you go to Database, Indexes, you can then
see those indexes created.
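The two indexes are along these lines; the exact operator class comes from the guide, so treat this as a sketch.

```sql
-- GIN index for the full-text search column, HNSW for the vector column.
create index on documents using gin (fts);
create index on documents using hnsw (embedding vector_ip_ops);
```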
And now let's grab the next snippet, which is to create the hybrid search database function. What this function will do is it's
going to carry out a vector search and a full-text search. And then both of those
results are going to be fused together using this reciprocal rank fusion process.
So copy that out and drop it in here. And again, just at the very top, you need to
specify the number of dimensions in the embedding model. And we'll leave everything
else as is. And we'll click Run, and that has returned success. And to verify that,
go to Database and then Functions, and this is the hybrid search database function
that was just created.
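For reference, here's a condensed sketch of that function, adapted from Supabase's hybrid search guide with the dimensions set to 1536; the guide's current version may differ in details, so prefer it over this.

```sql
-- Hybrid search: run a full-text query and a vector query, then fuse the
-- two rankings with reciprocal rank fusion:
--   score = full_text_weight / (rrf_k + fts_rank)
--         + semantic_weight  / (rrf_k + semantic_rank)
create or replace function hybrid_search(
  query_text text,
  query_embedding vector(1536),
  match_count int,
  full_text_weight float = 1,
  semantic_weight float = 1,
  rrf_k int = 50
)
returns setof documents
language sql
as $$
with full_text as (
  select id,
         row_number() over (
           order by ts_rank_cd(fts, websearch_to_tsquery(query_text)) desc
         ) as rank_ix
  from documents
  where fts @@ websearch_to_tsquery(query_text)
  limit least(match_count, 30) * 2
),
semantic as (
  select id,
         row_number() over (order by embedding <#> query_embedding) as rank_ix
  from documents
  limit least(match_count, 30) * 2
)
select documents.*
from full_text
full outer join semantic on full_text.id = semantic.id
join documents on coalesce(full_text.id, semantic.id) = documents.id
order by
  coalesce(1.0 / (rrf_k + full_text.rank_ix), 0.0) * full_text_weight +
  coalesce(1.0 / (rrf_k + semantic.rank_ix), 0.0) * semantic_weight desc
limit least(match_count, 30)
$$;
```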
So at this point now, we have the database table and the function created to actually carry out this hybrid search. We just now need a
means to actually trigger this from n8n. If you go on the left-hand side here
and click Edge Functions, we'll create this edge function using the sample code
they provided. And then on this page, click that you want to create your first edge
function via the editor. You grab your code and drop it in here. Now there are a
few changes we need to make and it's around the embedding model again. So we need
to change
our dimensions to 1536, and then we need to change the model name to
text-embedding-3-small. And what's happening here is that this function is actually going to
go to OpenAI and it's going to take the user's query and it'll actually generate
the embeddings for it here and pass it to the search function. So we won't need to
do this on the n8n side when we're actually triggering it. But then as a result,
we'll need to provide our Supabase project with an OpenAI API key that it can
actually use. So then at the bottom, let's give our function a name. Okay, and that
has succeeded. And now we have our endpoint URL. And further down, it gives an
example of how we can actually invoke this function, but just be aware that the
data that it suggests that you pass is incorrect. It's not actually based on the
code in the function, so you will need to make some changes here. What to do is
click on code and just copy out the name of this OpenAI key, and then come into
secrets, because you need to add a new secret, which is that API key. Then go to
OpenAI to create a new key. So I've copied that out and now I can
paste it in there and click save. And now the edge function will have access to
this key when it's actually triggered so it can generate the embeddings. So back
into functions, and let's copy out this example curl that it provides. And then if
we come into n8n, for our first step, let's add a chat trigger. And
then after that, click on plus, and we'll add an HTTP request. And we'll import this
curl. And you'll see that that actually populates out the method and the URL, as
well as the API key for Supabase. And then further down, you can see these
body parameters. And this is where you need to make some changes. Where it says
request JSON, you can see that it's looking for a constant, which is query. So if
we just copy query out and bring it in there, so our query is going to be, and this
is what we're going to get from our chat trigger. So if we click execute previous
nodes, you can now see the chat input message there. So we can drag that in then to
our query, because this is what we're going to send into our vector store. We'll
click hello. And that's gone to SuperBase's edge function now. We're getting no
output returned, which is probably correct because there's nothing in the database.
So let's load up some data into this vector store now so that we can actually test
this properly. So let's make some space here on the canvas. And if you click on the
plus, let's add a manual trigger for the moment. Within our RAG Masterclass, we
built out a full data ingestion pipeline using Google Drive and web scraping for
Supabase's vector store, as you can see here on screen. So check out the
Masterclass for that. And this workflow is also available for download in our
community. And from here, type in Supabase, and we're going to use the add
documents to vector store action. So click on that. And from here then you'll need
to create a credential. So on the dropdown, click Create New Credential, and you'll
need to enter your host and a service role secret. And you can paste that in and
click Save. And you'll get a green connection message, which means it is able to
connect and you're all done. You can choose the table, which is Documents, and you
now need to add two more elements. So if you click on Embedding, we need to add our
embedding model. So this is OpenAI. And again, you need to create a connection if
you don't have it where you can drop in that API key you created previously. And
from there, then you can choose Text Embedding 3 Small. And this defaults to 1536
dimensions, so you don't even need to set it. And it's crucial that the embedding
model that you use when you're uploading documents is the same as when you're
querying the VectorStore. And then for document, we're just going to use a default
data loader. And we'll just load all data that's incoming into the SuperBase node.
So you can leave it as is. I generally use a recursive character text splitter, so
I can leave it at the chunk size of a thousand. So if we click test workflow, okay,
we've hit an error: could not find the metadata column of documents. And
the reason for that is the actual example code that we used to create the table
doesn't have that metadata field, so we just need to add it manually. So back
in Supabase, if we go to our database and open it up, you'll see this
plus where you can add another column to the table. So if you click on that, you
just need to type in metadata, and then the type of the column is jsonb,
binary JSON data. Then if you test it again, that column now exists and it worked.
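If you'd rather make that change in the SQL editor, the equivalent is a one-liner; documents here is the table created from the guide's snippet earlier.

```sql
-- Add the metadata column the n8n Supabase node expects.
alter table documents add column metadata jsonb;
```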
So we now have our Hello World content in our vector store. You can see the dense
embeddings that we have here. And then this is the full-text search field, which is
the tsvector, and this is what the full-text search actually uses when it's trying
to figure out the most relevant results in the database. Okay, so let's load in
a document. And the example document I use quite regularly in my RAG videos is this
Formula One technical regulation document, which is 180 pages long. So we'll just
copy out that URL, because this is hosted on the FIA's website. And if we just come
into here and click plus, let's just download this document. But for the moment,
just to demonstrate hybrid search, we'll just use this one document. So there's the
request node, we'll paste that in, and if you click test step, we're getting the
binary. Back into our default data loader, and we'll just actually process the
binary as opposed to the JSON, and let's trigger this workflow. And now you'll see
that it's processing through this PDF, and it has just finished, and we have 683
chunks. And now we go back into our hybrid search table and refresh. We now have
that 180-odd-page document upserted to our vector store, and it's embedded, as
you can see there, but we also now have these full-text search fields populated as
well. If you copy one of them out and bring it into a notepad just so that you can
see the format of this, this is what we're getting. So it's not a vector the way
the embedding is. It's a set of words or partial words as you can see there with
different weights and positions. So now that we have data here, let's come back to
our chat message and let's pass in a query that will appear in the document. So
maybe engine intake air. So just drop that in here, and we're getting 10 items. And
that is because, if you look at the edge function and go into the code, you can see
that we have match count set to 10 there. So if we change that to say 20 and deploy
the updates, we're now getting 20 items back. And if we have a quick look at the
data that we're getting, you can see that we're retrieving the content of the
chunk, as well as the full text search data and the actual embedding itself, which
we don't really need. I think what would be interesting though is to understand
where this chunk ranked in both the vector search and the full text search. So
we'll add that in a couple of minutes. But maybe what we'll do now is we'll bring
this into an AI agent. So let's disconnect the chat message from the actual request
and let's add our AI agent. And then when it comes to a tool, the way you would
have done this before is you would click on plus and you would add in Supabase.
We can't actually do that with hybrid search, because the database function that
we created has specific parameters that are required that we can't actually pass
using this node. So instead of this traditional node, if you delete that, we can
use just a direct HTTP request to our edge function. So we essentially need to hook
this up. But to do that, just click on plus HTTP request tool, and then you can
just copy out the parameters that you've set here. So bring in the URL, bring in
the header auth. And then when it comes to the body, query is the name, but when it
comes to value, just press this button so that the actual AI agent can define what
it's sending into the Supabase edge function. Before we test this, let's just do a
quick tidy up. So we need to give this tool a name, and then for the AI agent
itself, let's set a system message, only generate an answer based on results from
the connected knowledge base. And now let's ask a question. What are the rules for
the Engine Air Intake? All right, and there you go. We're getting quite a
comprehensive answer back. We're still getting all of the vectors and the full-text
search data, so we need to strip all of that out, because all of that's going to the
agent and it's just not required for it to actually generate an answer. So if you
go back into Supabase and now go to the database function, which is your hybrid
search function, and then with the three dots on the right, click on edit function
with assistant, and then you can vibe code the changes with the AI assistant in
Supabase. Update this function to output the content and metadata columns in the
documents table. And actually, let's ask it to do something else. Also, can you
provide the rankings from the vector and keyword searches? And that way then we can
see which chunks actually performed better in which search before they were
actually fused together in the main list. Yeah, so it's getting the full text rank
and the semantic rank. So we'll click run query. Okay, we're getting an error,
which is there's a mismatch in the return type, which you can actually see there.
What I'll do is I'll just delete this, because it doesn't seem like it's able to
change the return type via this AI assistant. So I'll just delete the function, and
now if we run it, it should create it successfully, which it has done. Now if I
refresh, we now have our new hybrid search function with the correct return type.
So we've removed the actual
embeddings being returned, but we've added in this full text rank and semantic
rank. So let's now test it out back in n8n. What are the rules for the
engine air intake? Yeah, that was a lot faster. We're getting a decent answer back, and now
if we double-click into it, perfect. So we're getting our metadata, we're getting
the content, and we're getting the rankings from the full text search and the
semantic search. Now let's test it out with some terms from the actual document.
So there's one here which is metal matrix composites, MMCs. And in theory, full-text
search should work much better for an example like that, whereas semantic
search might totally miss the point. It might pick up things semantically related to
metal, such as iron or steel, whereas full-text search is going to look for
that specific term. And after triggering it, the first chunk was first in both
the full-text rank and the semantic rank, whereas the second item
returned actually finished third in both rankings. Item number three in the list
actually finished fourth in the vector search and sixth in the full text search. So
you kind of get the idea here that these are different search algorithms. And
then what determines the order of the chunks that we get back from the system is
actually a fusion of both of these rankings. And these types of search terms are
great examples of where hybrid search and full-text search works very well. So if
someone was searching for this specific term, ISO 16220, we drop it in here. What
is ISO 16220? And we get back a variety of chunks. But if you look at the very
first one, you can see that the semantic ranking of this chunk wasn't even in the
top 30 results, whereas it was the top result for the full-text search. And if you look at
it there, there it is plain to see in the text. So that's a perfect example of how
vector search will produce a really bad answer for that type of query. Whereas full
text search nailed it. Whereas if you ask a kind of general, vague question that
doesn't have any direct matches in the text, like what is the impact of wind on an F1
car, the actual regulations probably don't have it in that language. They talk
about airflow, they talk about aerodynamics. So yeah, the first 10 results that
we're getting back are simply just reflecting the vector search. So it's not that
the full text search isn't running in this query. It's just that the relative
scores that it's getting back are quite low, so when the two result sets are
fused together, they fall below the threshold that the semantic results are
setting. But then this is where re-ranking kicks in, because this fusion of the two
result sets isn't bulletproof. So by actually putting all of these results into
a re-ranking model like Cohere, you can get a really accurate ordering of these
results and then provide the top subset of those results into an AI agent to
generate the answer. Within our RAG Masterclass on this channel, I've built out
that re-ranking system that you can see here. So this is using Cohere Rerank 3.5,
and it is this hybrid search system with Supabase. So if you'd like to learn more about this,
then check out my Masterclass at the link in the card above. So that's how you can
set up hybrid RAG in n8n using Supabase for both vector search and full-text
search. On to our Pinecone implementation. I won't go through this build step by
step. I'll just show you how it's actually set up. And here we're following the
instructions that are listed in this hybrid search page. Specifically, we're using
a single hybrid index, as opposed to having separate dense and sparse indexes.
This will make more sense when I go through it. So to get up and running, you need
to create an index, give it a name, and then come into custom settings. And you
need to choose dense as the vector type. You need to put in your number of
dimensions. I have 1024 set, because I'm using the multilingual-e5-large model,
and that has 1024 dimensions, as you can see there. And when it comes to metric,
you must choose dot product, as Pinecone only supports hybrid search in a single
index if you use the dot product metric. And now you should have a blank index and
you'll be able to access the host for that index here. So to talk through the
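For reference, here's a rough API equivalent of those console clicks; the index name, cloud, and region are placeholders, and the request schema should be verified against Pinecone's create-index docs:

```javascript
// Create a serverless dense index that also accepts sparse values per vector.
const res = await fetch("https://api.pinecone.io/indexes", {
  method: "POST",
  headers: {
    "Api-Key": process.env.PINECONE_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "hybrid-demo",   // placeholder index name
    vector_type: "dense",  // the vector type chosen in the console
    dimension: 1024,       // multilingual-e5-large output size
    metric: "dotproduct",  // the only metric that allows single-index hybrid
    spec: { serverless: { cloud: "aws", region: "us-east-1" } },
  }),
});
console.log((await res.json()).host); // this host is what the later nodes call
```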
So, to talk through the ingestion flow: I have a manual trigger here for the moment. Next, I have a Set
fields node. And here I'm setting the index host. So this is essentially this URL.
So that's dropped in there. I've also specified a namespace, which I've just set as N8N. And then what I'm doing is downloading this file. So this is
just a test file that I've used as part of this build. It's this F1 technical
regulations document, which is quite large. You can see that this outputs a binary
file. So then in my next node, this just extracts all of the data from this PDF, as
you can see there. And if we scroll to the bottom, we have a field called text that
is all of the text in this document. So this is a machine readable document. If
this was a scanned PDF, you would need to use the likes of Mistral OCR to actually
extract the contents; again, I go through that in my RAG Masterclass on this channel. Next up, we need to do our own chunking of this document. And while there are text splitters within N8n, none of them are standalone nodes; you need to bring in your own chunking logic. For this, I vibe-coded a script with ChatGPT: I just asked it to generate a JavaScript chunking script where I can specify the chunk size and chunk overlap. And here I've specified that I want the text to be what I'm getting from the PDF, which is here, so I just drag that in like that. That's basically what this code node does. There was a little bit of back and forth as I tested it out, but nothing too complicated.
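For reference, here's a minimal sketch of the kind of chunking script that Code node runs; the chunk size and overlap values are illustrative, not the ones from the video:

```javascript
// n8n Code node sketch: fixed-size chunking with overlap.
const chunkSize = 1000;   // characters per chunk (illustrative)
const chunkOverlap = 200; // characters shared between neighboring chunks

const text = $input.first().json.text; // the extracted PDF text

const chunks = [];
for (let start = 0; start < text.length; start += chunkSize - chunkOverlap) {
  chunks.push(text.slice(start, start + chunkSize));
}

// One n8n item per chunk, so the loop node can batch them downstream.
return chunks.map((chunk) => ({ json: { chunk } }));
```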
So at this point, if I trigger that, you can see we've ended up with 1,152 chunks. This is a 180-page document, so that makes sense. Then I'm looping through these in batches of 96; the reason it's 96 is that the dense embedding model I'm using has a maximum batch size of 96. And you can see those
96 items in this batch have then flowed through to this next node. And what I'm
doing then is simply aggregating them all into an array, so that I can go back to a single item. And there we have it. If we jump into this now, here we can
see we have 96 items, and here we have one item. Okay, so next up, I'm converting the format, because what I need here is an inputs array that has the text as an item within an object; this is what's required by the Pinecone API. This is just an array of strings, whereas now we have an array of objects, each with a text parameter that's a string.
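That conversion is a one-liner in a Code node; a sketch, assuming the incoming item holds the chunk strings in a chunks field (the field name is illustrative):

```javascript
// n8n Code node sketch: wrap each chunk string in a { text: ... } object,
// which is the inputs shape Pinecone's embed endpoint expects.
const chunks = $input.first().json.chunks; // e.g. ["chunk one", "chunk two"]
return [{ json: { inputs: chunks.map((text) => ({ text })) } }];
```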
And now we can actually trigger the embedding of these chunks. I'm using Pinecone's API to embed them; that's the api.pinecone.io/embed endpoint. I'm passing
my header auth, which is the X API key, and then onto the body. And here is what
I'm sending. The model here is multilingual-e5-large, and I'm passing the parameter input_type: passage. That's pretty important, because when you're actually inferring, you need to pass input_type: query; I got stuck on that for a while. And then the inputs are what we set up in the previous node, and you can see them there on the right.
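Here's roughly what that HTTP Request node amounts to at ingestion time (the endpoint and Api-Key header follow Pinecone's inference API; the sample input is illustrative):

```javascript
// Embed a batch of chunks with Pinecone's hosted dense model.
const res = await fetch("https://api.pinecone.io/embed", {
  method: "POST",
  headers: {
    "Api-Key": process.env.PINECONE_API_KEY, // the header auth from the video
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "multilingual-e5-large",
    parameters: { input_type: "passage" }, // must be "query" at inference time
    inputs: [{ text: "chunk text here" }], // the array built in the previous node
  }),
});
const { data } = await res.json(); // data[i].values holds each dense vector
```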
So now we do the same for our sparse embedding. The only difference is that we're now passing a different model name: pinecone-sparse-english-v0. So we'll just press play on that, and that'll generate our sparse
embeddings. And you can see they're a bit different. So we have sparse values and
sparse indices. And this is the format that Pinecone requires when you're actually inserting the data. Then we have a code node to build out this vector array: the Pinecone endpoint to upsert the vectors requires a specific format, and that's the format I build in this node. I'm generating a unique ID for each vector, loading in the arrays of dense embeddings, sparse embeddings, and chunks, and then simply looping through all of that and building out a new vector array that we can send to Pinecone. All of that is then returned off the back of this node. So if I click test step here, you can see that this is the exact output that it requires.
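A sketch of what that code node does, assuming the dense results, sparse results, and chunks have been merged onto one item (the input field names are illustrative; the output field names follow Pinecone's upsert schema):

```javascript
// n8n Code node sketch: assemble Pinecone's upsert format.
const { chunks, dense, sparse } = $input.first().json;

const vectors = chunks.map((text, i) => ({
  id: `chunk-${Date.now()}-${i}`,      // stand-in for the unique ID per vector
  values: dense[i].values,             // dense embedding for this chunk
  sparseValues: {
    indices: sparse[i].sparse_indices, // from pinecone-sparse-english-v0
    values: sparse[i].sparse_values,
  },
  metadata: { text },                  // stored so queries can return the text
}));

return [{ json: { vectors } }];
```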
Then finally, on to the Upsert Vectors node: we're hitting this Vectors Upsert endpoint, and the body is simply that JSON output we're getting from the previous node.
And now, if I come in here and click Test Workflow, it downloads that 180-page document and then goes through this loop: it generates embeddings, upserts the vectors, generates new embeddings, upserts them, and so on. This is essentially what happens with the default data loader and chunking strategy in a standard AI agent, except here we've built it out manually. It's pretty fast with these batches of 96 items. Okay, and we're done.
Now if we come into Pinecone and refresh, you can jump into any single chunk, and you can see we have the text of the chunk as well as the vector embeddings
and the sparse values for the keyword search. So then onto the actual inference
phase, we have our AI agent. As I mentioned, we can't use the standard Pinecone vector store, because it doesn't support hybrid search. So instead, we need to call an N8n workflow. And what we're doing is calling this exact same workflow, which means we can add this node here. So essentially, the agent is going to trigger this, which is going to trigger this. Within that tool, we're specifying query as a workflow input, and we're getting the agent to define that automatically; that's done by just pressing this button here, to let the model define that parameter. And then over here, if you look at this When Executed by Another Workflow trigger, we've had to add that query as an input field for this actual flow. Great, so we now have the query from the AI agent showing up here. So
then we can do the same thing. We're just going to set some variables, which is our
index host and our namespace. And it's the same thing again. We're just generating
the dense embeddings and the sparse embeddings. But the key difference is that, when we generate them, we need to change the input type to query, as opposed to passage.
So now, if I click generate on the dense embeddings, that has gone to Pinecone: it sent in that query and generated this dense vector embedding that we can then query Pinecone with. And it's the same thing with the sparse embedding: we get back a sparse representation of that query, and then we can query Pinecone. So this is hitting the query endpoint again, passing header auth, and the structure of the query is all documented in Pinecone's API documentation. We're passing the namespace, the dense vector, the sparse vector with its indices and values, and the topK, which is 10, since we want 10 results back. We don't need the vectors back, but we do need the metadata back, because that's the text.
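Put together, the query request looks roughly like this (the host and embedding variables are placeholders; the field names follow Pinecone's query API):

```javascript
// Hybrid query: dense + sparse vectors against the same index.
const INDEX_HOST = "my-index-abc123.svc.pinecone.io"; // placeholder host
const denseQueryEmbedding = [/* 1024 floats from the dense embed call */];
const sparseQuery = { indices: [/* ints */], values: [/* floats */] };

const res = await fetch(`https://${INDEX_HOST}/query`, {
  method: "POST",
  headers: {
    "Api-Key": process.env.PINECONE_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    namespace: "N8N",
    vector: denseQueryEmbedding, // embedded with input_type "query"
    sparseVector: sparseQuery,   // from the sparse embed call
    topK: 10,                    // we want 10 results back
    includeValues: false,        // no need to return the vectors themselves
    includeMetadata: true,       // but we do need the metadata: that's the text
  }),
});
const { matches } = await res.json(); // each match: { id, score, metadata }
```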
And this is what it looks like when we get the output: we're getting our chunks, and we're getting the score as well. So this is the hybrid result set, and that score is effectively the combined score across both systems. Then all of these chunks are sent back into the AI agent to actually generate its output. Okay, so let's test it out: can you explain the plank assembly
rules? Okay, and there's our result: 11 different rules. And if we jump into the tool call itself, you can see the chunks that were provided to actually generate that text. If we search for, say, plank assembly, or just plank, you can see there are a lot of exact matches on plank. This is the real benefit of hybrid search: if you search for very specific terms, you're going to get many more results back containing that term than you would with semantic search alone. I
hope you found this video useful. Make sure to like and subscribe to our channel
for more. And thanks for watching and I'll see you in the next one.