Knowledge base uploads WordPress post but can’t use it
-
I have imported a page from the WordPress site (and deleted and re-imported it several times while trying to get MxChat to “see” its text), but that data apparently isn’t available. MxChat can’t answer very simple questions about it; it responds with “I don’t have enough information …”
The page I need help with: [log in to see the link]
-
There are two easy and good ways to see what is going on. When you are logged in as an admin and viewing the frontend of the website, on the left side you’ll see a tab that says “MxChat Debugger”. Click this and a panel will slide open. Once it is open, enter test questions into your chatbot and the panel will show you all of the documents retrieved and their similarity scores. If nothing is matching, you need to lower your similarity threshold.
You can also find these very same scores in your chat transcript section by clicking the “sources” link in the transcript. This will show you the scores and which documents were matched for every user query, to help you understand and tune your chatbot!
Thanks,
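To make the scoring concrete, here is a minimal sketch of the kind of check the debugger is reporting: each stored chunk is scored against the query by cosine similarity, and only chunks at or above the threshold slider are handed to the AI. The vectors, names, and threshold value here are invented for illustration; they are not MxChat’s actual internals.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings standing in for what an embedding model would return.
query_embedding = [0.9, 0.1, 0.3]
chunks = {
    "dependency-stack-post": [0.8, 0.2, 0.4],
    "unrelated-bio-page":    [0.1, 0.9, 0.0],
}

threshold = 0.40  # i.e. the 40% slider
retrieved = {
    name: round(cosine_similarity(query_embedding, vec), 2)
    for name, vec in chunks.items()
}
# Only chunks at or above the threshold reach the AI.
matched = {name: s for name, s in retrieved.items() if s >= threshold}
print(retrieved)  # every candidate with its score
print(matched)    # only these are handed to the AI
```

If a post you expect never appears in `matched`, either its score is below the slider or its content never made it into the database at all, which is exactly what the panel lets you distinguish.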
Maxwell
Maxwell,
Thanks for your prompt reply. I think the behavior cannot be ascribed to the similarity threshold setting alone. It has been at 40% for many hours. Yesterday, when I posted the request for support, it answered this question: “How does the dependency stack work?” with: “I don’t have enough information in my knowledge …”. Placing the slider anywhere from 20% to 80% gave the same response.
It seemed at that time that, although I had deleted and re-added this post (/a-tr-dependency-stack-ver-4/) to the Knowledge Base, it could not access the tokens (?) for that post.
Just now, Contextual Awareness is on, Citation Links is on, Frontend Debugger is on.
And I asked “How does the dependency stack work?”; its answer was appropriate.
Then I asked “Who is Jackson Pemberton?” and the answer was (showing links as supplied):
___________________________-
Dependency stack: Hierarchy of Capability Dependency Statements ranking operator’s natural right by precedence.
Works by scoring moral violation at damage level in hierarchy, numeric comparison for judgment in AIMM.
Ver. 4 post.
Jackson Pemberton: Author of the post, Temporal Rights discoverer.
Bio.
___________________________
The second response was appropriate for the previous question, although it sounds like it is trying mightily to be terse.
I have saved the chat transcripts generated since installing the plugin. I also have MxChat installed on https://covenantsofthefather.com and have had no issues.
The website is here if you want to do a live test: [ redundant link deleted ]
I suspect the Behavior prompt. (I have found no guidance on how these expressions operate on responses, so I have been forced to guess.) Here is what it is:
______________________________________
You are an AI Chatbot assistant for this website. Your main goal is to assist visitors with questions and provide helpful information. Here are your key guidelines:
# Response Style – CRITICALLY IMPORTANT
– MAXIMUM LENGTH: 1-5 short sentences per response
– Ultra-concise: Get straight to the answer with no filler
– No introductions like “Sure!” or “I’d be happy to help”
– No phrases like “based on my knowledge” or “according to information”
– No explanatory text before giving the answer
– No summaries or repetition
– Hyperlink all URLs
– Respond in user’s language
– Minor chit chat or conversation is okay, but try to keep it broadly focused on Temporal Rights, unification, and ethics.
# Knowledge Base Requirements – PREVENT HALLUCINATIONS
– ONLY answer questions using information explicitly provided in OFFICIAL KNOWLEDGE DATABASE CONTENT sections marked with ===== delimiters
– If required information is NOT in the knowledge database: “I don’t have enough information in my knowledge base to answer that question accurately.”
– NEVER invent or hallucinate URLs, links, product specs, procedures, dates, statistics, names, contacts, or company information
– When knowledge base information is unclear or contradictory, acknowledge the limitation rather than guessing
– Better to admit insufficient information than provide inaccurate answers
____________________________________
There is only one way to see whether the AI has access to the post. You said you checked the debugger and that it isn’t related to the similarity score, but you didn’t mention whether the post you expect to be retrieved was above the similarity threshold and given to the AI according to the debugger.
You need to open the panel, enter your test query, and it will show you EXACTLY which posts are being retrieved and given to the AI. A few things could be happening:
– Other posts are ranking above this one for that user query. The debug panel shows this directly: it lists the top 10 ranked records and marks which ones were given to the AI for a response.
– If the post is being retrieved and given to the AI but the AI is still not answering from it, the content may not actually be in your knowledge database. Check your database and make sure you actually see the information in there; in a few instances the importer has trouble grabbing the content from the page, so even though you submitted it, some content could be omitted.
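The ranking step described above can be sketched in a few lines: rank every scored chunk, display the top ten in the panel, and pass only the top few that clear the threshold to the AI. The function name, the top-k limit of 3, and all the scores below are assumptions for illustration, not the plugin’s actual code.

```python
def select_for_ai(scored_chunks, threshold=0.40, top_k=3):
    """scored_chunks: list of (name, score) pairs, score in 0..1."""
    ranked = sorted(scored_chunks, key=lambda c: c[1], reverse=True)
    shown_in_panel = ranked[:10]  # what the debug panel displays
    # Only records at or above the threshold are given to the AI,
    # capped at top_k entries.
    given_to_ai = [c for c in ranked if c[1] >= threshold][:top_k]
    return shown_in_panel, given_to_ai

# Hypothetical scores like the ones reported later in this thread.
panel, context = select_for_ai([
    ("dependency-stack-ver-4", 0.67),
    ("some-other-post",        0.54),
    ("bio-page",               0.21),
])
print(panel)    # all ranked records the panel would show
print(context)  # only the entries that cleared the 40% threshold
```

Under these toy numbers, `bio-page` at 21% would appear in the panel but never reach the AI, which matches the “I don’t have enough information” symptom.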
Can you please open the debug panel, enter your test query, and share an image showing the similarity results, so we can see whether the page you expect to be retrieved is even being retrieved?
Also, another note: you should clear the chat session after every test with the debug panel, and/or use a private tab for a new session each time. Users often get mixed up because they do extensive testing all in one chat session, which will not really work. After each test query, clear the chat session or start a new private session.
Hi again @jacksonpemberton,
I also just did a quick test on the page you linked, and it seemed to answer fine.
Question: What is the Capability Dependency Statement for number 7?
Chatbot answer: Level 7: An operator can consume without learning, but it cannot learn without consuming. Full TR Dependency Stack
Maybe I’m not understanding exactly what your question is?
OK, I closed the Safari browser window, went back to the website, cleared the chat session, and asked “How does the dependency stack work?” The answer was the best yet. However, the weird thing was that this response (white typeface on black background, positioned on the left) was immediately followed by what looks like a long, long prompt (black typeface on white background, positioned on the right). This 9,236-character, AI-generated “prompt” shows in the debugger panel as a “user prompt” and begins with “What categories of knowledge do Temporal Rights unify? What categories of knowledge do Temporal Rights unify? what is an operator? Temporal Rights unify physics, morality, metaphysics, and AI alignment. An operator is any self-existent object having capabilities and …”
This was clearly fabricated by the AI, but it is all appropriate text. It might be a copy from the session log? This “prompt” is then followed by a second response (white typeface on black background, positioned on the left) that also answers my prompt, but this time in different words.
The debugger shows 67% 3/3 chunks for AI context (from the correct source/post) and 54% 5/5 chunks for AI context from another post where the word “dependency” does not even occur. It shows 3 other sources not used.
I forgot to mention that I followed the cited prompts/responses with the question “who is jackson pemberton” and got this 2 part response:
“Dependency stack works by …” followed by “Jackson Pemberton …”, a terse but correct response. The debugger shows “0 entries used for AI context” and cites 4 WordPress posts not used (they have no data on that prompt).
I have had the debugger window open throughout our discussion, but have been so flummoxed by the overall behavior that I haven’t confirmed that for you. Just now I opened a new browser, cleared the chat session (in the debug window), and asked “Who is Jackson Pemberton?” and got the “I don’t have enough info …” answer. Meanwhile, the knowledge base indicates that it has my bio WordPress page in it.
Similarity threshold is 40%. The response was based on retrieving 4 posts; the highest match was 21%. Apparently it can’t see the knowledge base. On other days it has used that page and responded with a great summary. So the behavior is erratic. Do I need to delete the plugin and start over?
I wonder if this is relevant?
I am using Grok 4 directly in a different window on the same Safari browser to answer a variety of questions – some related to my posts and some entirely different. Grok 4 is the API I have specified in the plugin. Is it possible that there is some weird connection?
I had no problems with the plugin on the first website I installed it on. It was beautiful; I loved the way it operated. But on this site it is bizarre.
A few things:
The debugger shows 67% 3/3 chunks for AI context (from the correct source/post) and 54% 5/5 chunks for AI context from another post where the word “dependency” does not even occur. It shows 3 other sources not used.
I think you would find it very helpful to review this page I made: How does RAG work – RAG systems (which MxChat, and any chatbot with knowledge-base support, use) do not find and grab content based on keyword matches; retrieval is based purely on semantic meaning. It is normal and expected behavior for content to show up that doesn’t contain a keyword such as “dependency”, because the system is not looking for keywords.
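The point above can be shown with toy numbers: the keyword overlap between a query and a chunk can be zero while the embedding similarity is still high, because embeddings encode meaning rather than words. The embedding vectors below are invented for illustration; a real system would get much higher-dimensional vectors from an embedding model.

```python
import math

query = "how does the dependency stack work"
chunk = "a hierarchy ranking an operator's capabilities by precedence"

# Keyword view: no words in common at all.
shared_keywords = set(query.split()) & set(chunk.split())

# Semantic view: pretend embeddings placed close together because the
# two texts are about the same concept (values are made up).
query_vec = [0.7, 0.6, 0.1]
chunk_vec = [0.6, 0.7, 0.2]
dot = sum(a * b for a, b in zip(query_vec, chunk_vec))
similarity = dot / (
    math.sqrt(sum(a * a for a in query_vec))
    * math.sqrt(sum(b * b for b in chunk_vec))
)

print(shared_keywords)       # set() -- zero keyword overlap
print(round(similarity, 2))  # high cosine similarity despite that
```

This is why a post with no occurrence of “dependency” can still score 54%: the embedding model judged its meaning to be in the same neighborhood as the query.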
Similarity threshold is 40%. The response was based on retrieving 4 posts; the highest match was 21%. Apparently it can’t see the knowledge base.
You can lower your similarity threshold down to 20%, and in an upcoming update I will adjust this to go down to 0 and also add a new setting to control how many documents are given to the AI, ranging anywhere from 3 to 10. But if your query is showing 21% and your threshold is at 40%, your only option is to lower the threshold to 20%, and/or delete all the content and re-import it with a smaller chunk setting.
The chat model you are using will not impact this score at all, but the embedding model you are using will very heavily impact your scores. Some models score content much higher and some much lower. Some embedding models are much more precise in their scoring and some are broader. Some models handle larger chunks better and some handle smaller chunks better. All of this comes down to many variables, such as the language you are using, the chunk size, the model, capitalization, and your score threshold.
OK, I think I see how all this works – sort of. My entire website is only about 80,000 words. It is highly philosophical; each post is roughly 2,000 words, and there is nothing else going on like shopping, just text. I am using Grok 4.1 and TE3 Large.
Are those good choices? What chunking do you recommend?