feat: Improve node search with fuzzy matching and ranking #5370

Merged
HenryHengZJ merged 2 commits into FlowiseAI:main from HavardHomb:feature/Fuzzy-Search
Oct 30, 2025

Conversation

Contributor

@HavardHomb HavardHomb commented Oct 25, 2025

Problem:
When searching for nodes in the Agent and chat canvases, matching is overly strict. This PR adds fuzzy search so that minor spelling mistakes are tolerated, and results are sorted by how good the match is.

Two Types of Matching

  1. Exact Match (Best)
  • If your search term appears exactly anywhere in the node name, you get a high score (1000+ points)
  • Example: Searching "chat" finds "chatopenai", "conversationchat"
  • Bonuses:
    • At the very start: +200 points → "chatopenai" scores highest
    • At word start: +100 points → "openchat" gets bonus
    • Further away = fewer points → penalized by position
  2. Character-by-Character Match (Fuzzy)
  • All letters from your search must appear in order in the target
  • They don't have to be consecutive
  • Example: Searching "llm" matches "Language Learning Model"
  • Bonuses:
    • Letters close together: more points
    • Match at start: +20 points
    • Match after space/dash: +15 points (word boundaries)
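
The two tiers described above can be sketched roughly as follows. This is an illustrative approximation, not the code merged in AddNodes.jsx: the function name `fuzzyScoreSketch`, the per-character base value, the adjacency bonus, and the exact penalty formulas are assumptions; only the 1000/200/100 and 20/15 constants come from the description. Gap and length penalties for the fuzzy tier are left out to keep the sketch short.

```javascript
// Hypothetical two-tier scorer. Constants 1000/200/100/20/15 follow the PR
// description; everything else (penalty formulas, per-character base of 10,
// adjacency bonus of 5) is assumed for illustration.
function fuzzyScoreSketch(searchTerm, text) {
    const q = (searchTerm ?? '').trim().toLowerCase()
    const t = (text ?? '').toLowerCase()
    if (!q || !t) return 0

    // A position counts as a word boundary if it follows a space or dash.
    const isBoundary = (i) => i > 0 && (t[i - 1] === ' ' || t[i - 1] === '-')

    // Tier 1: the whole query appears as a contiguous substring.
    const idx = t.indexOf(q)
    if (idx !== -1) {
        let score = 1000
        if (idx === 0) score += 200
        else if (isBoundary(idx)) score += 100
        score -= idx // assumed: one point per character of offset
        score -= t.length - q.length // assumed: longer targets rank lower
        return Math.max(1000, score) // keep tier 1 above tier 2
    }

    // Tier 2: every query character appears in order (a subsequence).
    let score = 0
    let qi = 0
    let prev = -1
    for (let ti = 0; ti < t.length && qi < q.length; ti++) {
        if (t[ti] !== q[qi]) continue
        score += 10
        if (ti === 0) score += 20
        else if (isBoundary(ti)) score += 15
        if (prev !== -1 && ti === prev + 1) score += 5 // letters close together
        prev = ti
        qi++
    }
    if (qi < q.length) return 0 // not all letters were found in order
    return Math.min(999, score)
}
```

Note that a pure in-order match like this tolerates omitted letters (for example "chtopn" against "chatopenai") but not transposed ones; whether the merged implementation handles transpositions is not stated in this thread.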

How Nodes Are Ranked

Each node is scored on three fields:

  1. Name (e.g., "chatOpenAI")
  2. Label (e.g., "Chat OpenAI")
  3. Category (e.g., "Chat Models") - counts half as much

The highest score from these three wins.
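
A minimal sketch of that aggregation, modelled on the scoreAndSortNodes shape quoted in the review comments further down. The names `rankNodes` and `simpleScore` are hypothetical; `simpleScore` is a deliberately crude stand-in scorer so the example runs on its own, and the sample nodes are made up.

```javascript
// Hypothetical stand-in scorer: 2 for a prefix match, 1 for any other
// substring match, 0 otherwise. The real scorer is more elaborate.
const simpleScore = (query, text) => {
    const q = (query ?? '').trim().toLowerCase()
    const t = (text ?? '').toLowerCase()
    if (!q || !t.includes(q)) return 0
    return t.startsWith(q) ? 2 : 1
}

// Score each node on name, label and category (category at half weight),
// keep the best of the three, drop non-matches, and sort best first.
function rankNodes(candidateNodes, searchValue, scoreFn = simpleScore) {
    const q = (searchValue ?? '').trim()
    if (!q) return candidateNodes // empty search: return input unchanged

    return candidateNodes
        .map((nd) => ({
            node: nd,
            score: Math.max(
                scoreFn(q, nd?.name),
                scoreFn(q, nd?.label),
                scoreFn(q, nd?.category) * 0.5
            )
        }))
        .filter((entry) => entry.score > 0)
        .sort((a, b) => b.score - a.score)
        .map((entry) => entry.node)
}

// Made-up sample data for illustration only.
const sampleNodes = [
    { name: 'openAIEmbeddings', label: 'OpenAI Embeddings', category: 'Embeddings' },
    { name: 'bufferMemory', label: 'Buffer Memory', category: 'Chat Memory' },
    { name: 'chatOpenAI', label: 'Chat OpenAI', category: 'Chat Models' }
]
const ranked = rankNodes(sampleNodes, 'chat')
```

Taking the maximum rather than the sum means a node is not rewarded merely for repeating the term in several fields, and the 0.5 weight keeps a category-only hit below a direct name or label hit of the same quality.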

Penalties

  • Longer words with same match: fewer points
  • Gaps between matched letters: fewer points

Real Examples

Searching "conv":

  • "ConversationChain" → 1200 points (exact match at start)
  • "ConverseAPI" → 1180 points (exact match at start, but longer)
  • "RecursiveCharacterTextSplitter" → 45 points (fuzzy match, far apart)

Searching "llm":

  • "LLM Chain" → 1200 points (exact match at start)
  • "LanguageLearningModel" → 65 points (fuzzy, close together)
  • "Call Language Model" → 40 points (fuzzy, more gaps)

Summary by CodeRabbit

  • New Features

    • Implemented fuzzy search for node discovery with partial and flexible matching.
    • Search results are now scored and ranked by relevance across node attributes (name, label, category).
    • Replaced simple substring filtering with relevance-based retrieval across both Agent and non-Agent canvases.
  • Chores

    • Public API prop renamed: onFlowGenerat → onFlowGenerated (update integrations accordingly).

@HenryHengZJ
Contributor

@coderabbitai review

coderabbitai bot commented Oct 29, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Oct 29, 2025

Walkthrough

Added fuzzy search to AddNodes: implemented fuzzyScore and scoreAndSortNodes, replaced substring filtering with similarity scoring in both Agent and non-Agent flows, updated filterSearch to use the new scoring, and renamed prop onFlowGenerat → onFlowGenerated.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| AddNodes: fuzzy search & API fix (packages/ui/src/views/canvas/AddNodes.jsx) | Implemented fuzzyScore(searchTerm, text) and scoreAndSortNodes(nodes, searchValue) to compute similarity across name, label, and category; replaced substring-based filtering with scoring in Agent and non-Agent branches; updated filterSearch flow to call new scoring; renamed prop onFlowGenerat → onFlowGenerated. |

Sequence Diagram

sequenceDiagram
    participant User
    participant AddNodes
    participant getSearchedNodes
    participant scoreAndSortNodes
    participant fuzzyScore
    participant UI

    User->>AddNodes: type search term (filterSearch)
    AddNodes->>getSearchedNodes: request with searchValue, nodes

    rect rgb(200,220,255)
    Note over getSearchedNodes: agent/non-agent path chooses node subset
    getSearchedNodes->>scoreAndSortNodes: nodes subset, searchValue
    end

    rect rgb(220,200,255)
    Note over scoreAndSortNodes: score & sort execution
    loop for each node field (name,label,category)
        scoreAndSortNodes->>fuzzyScore: searchTerm, fieldValue
        fuzzyScore-->>scoreAndSortNodes: score
    end
    scoreAndSortNodes->>scoreAndSortNodes: keep score>0, sort desc
    end

    scoreAndSortNodes-->>getSearchedNodes: scored nodes
    getSearchedNodes-->>AddNodes: results
    AddNodes->>UI: render grouped results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to focus:
    • fuzzyScore correctness, edge cases, and performance (long inputs, repeated calls).
    • Proper application of scores across name/label/category and aggregation weights.
    • Verify the prop rename onFlowGenerated across the codebase to avoid breakages.
    • Ensure Agent vs non-Agent code paths still produce expected grouping and debounce behavior.

Poem

🐰
My whiskers sniff a fuzzy phrase,
I score and sort through node-filled maze.
A typo hopped and then was gone,
Now matching hops from dusk to dawn. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "Improve node search with fuzzy matching and ranking" directly and accurately summarizes the main change in the pull request. The raw summary confirms that the primary work involves adding fuzzy search capability through new functions like fuzzyScore() and scoreAndSortNodes(), replacing substring-based filtering with fuzzy-based retrieval, and applying this logic across both Agent Canvas and non-Agent Canvas paths. The title is specific and clear, identifying both what is being changed (node search) and how it's being improved (fuzzy matching and ranking), without vague terminology. The secondary API change (onFlowGenerat to onFlowGenerated) is a minor property fix that doesn't diminish the title's accuracy regarding the main feature. |
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b88d33d and f4adc97.

📒 Files selected for processing (1)
  • packages/ui/src/views/canvas/AddNodes.jsx (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
packages/ui/src/views/canvas/AddNodes.jsx (1)
packages/ui/src/views/canvas/index.jsx (2)
  • nodes (94-94)
  • isAgentCanvas (74-74)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build (ubuntu-latest, 18.15.0)
  • GitHub Check: build
🔇 Additional comments (4)
packages/ui/src/views/canvas/AddNodes.jsx (4)

119-193: Excellent fix! Past critical issues resolved.

The critical issues flagged in the previous review have been properly addressed:

  • Null/trim guards now prevent errors on undefined inputs
  • Gap penalty correctly calculates the span between first and last matched characters

The fuzzy scoring logic is well-implemented with appropriate bonuses for exact matches, word boundaries, consecutive matches, and penalties for gaps and length differences.
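
To make the span-based gap concrete, here is a small sketch of one way to compute it. The helper name `matchSpanGap` and the greedy leftmost matching strategy are assumptions; the thread does not show how the merged code picks match positions.

```javascript
// Hypothetical helper: returns how many unmatched characters lie between the
// first and last matched character, or -1 if the query is not a subsequence
// of the text. Matching is greedy, taking the leftmost occurrence each time.
function matchSpanGap(query, text) {
    const q = (query ?? '').trim().toLowerCase()
    const t = (text ?? '').toLowerCase()
    if (!q) return -1

    let qi = 0
    let first = -1
    let last = -1
    for (let ti = 0; ti < t.length && qi < q.length; ti++) {
        if (t[ti] !== q[qi]) continue
        if (first === -1) first = ti
        last = ti
        qi++
    }
    if (qi < q.length) return -1

    // Span covered by the match minus the characters that actually matched.
    return last - first + 1 - q.length
}
```

With this definition, "llm" against "LLM Chain" has a gap of 0, while "llm" against "Call Language Model" has a gap of 10, which is consistent with the second ranking lower in the examples from the PR description.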


195-217: LGTM! Clean scoring and sorting logic.

The function correctly:

  • Returns unsorted nodes when search is empty
  • Scores each node across name, label, and category (with category weighted at 0.5)
  • Filters and sorts by highest score

219-233: LGTM! Proper integration of fuzzy search.

Both agent canvas and non-agent canvas paths correctly integrate the new scoreAndSortNodes function, replacing the previous substring-based filtering with the fuzzy matching logic.


73-73: Good catch! Typo fixed.

Prop name corrected from onFlowGenerat to onFlowGenerated with consistent usage throughout the file (lines 73, 406, 759).


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
packages/ui/src/views/canvas/AddNodes.jsx (4)

190-213: Guard inputs, trim once, and avoid nodes shadowing in scoreAndSortNodes.

Pre-trim the query, short‑circuit on empty, and rename the param to avoid confusion with state nodes.

-    const scoreAndSortNodes = (nodes, searchValue) => {
-        // Return all nodes unsorted if search is empty
-        if (!searchValue || searchValue.trim() === '') {
-            return nodes
-        }
+    const scoreAndSortNodes = (candidateNodes, searchValue) => {
+        const q = (searchValue ?? '').trim()
+        if (!q) return candidateNodes
@@
-        const nodesWithScores = nodes.map((nd) => {
-            const nameScore = fuzzyScore(searchValue, nd.name)
-            const labelScore = fuzzyScore(searchValue, nd.label)
-            const categoryScore = fuzzyScore(searchValue, nd.category) * 0.5 // Lower weight for category
+        const nodesWithScores = candidateNodes.map((nd) => {
+            const nameScore = fuzzyScore(q, nd?.name)
+            const labelScore = fuzzyScore(q, nd?.label)
+            const categoryScore = fuzzyScore(q, nd?.category) * 0.5 // Lower weight for category
             const maxScore = Math.max(nameScore, labelScore, categoryScore)
 
             return { node: nd, score: maxScore }
         })

214-219: Avoid shadowing state nodes in getSearchedNodes (agent canvas branch).

Rename local nodes for clarity.

-        if (isAgentCanvas) {
-            const nodes = nodesData.filter((nd) => !blacklistCategoriesForAgentCanvas.includes(nd.category))
-            nodes.push(...addException())
-            return scoreAndSortNodes(nodes, value)
-        }
+        if (isAgentCanvas) {
+            const filteredNodes = nodesData.filter((nd) => !blacklistCategoriesForAgentCanvas.includes(nd.category))
+            filteredNodes.push(...addException())
+            return scoreAndSortNodes(filteredNodes, value)
+        }

220-228: Same rename for non‑agent branch; minor readability.

-        let nodes = nodesData.filter((nd) => nd.category !== 'Multi Agents' && nd.category !== 'Sequential Agents')
+        let filteredNodes = nodesData.filter((nd) => nd.category !== 'Multi Agents' && nd.category !== 'Sequential Agents')
@@
-            nodes = nodes.filter((nd) => !nodeNames.includes(nd.name))
+            filteredNodes = filteredNodes.filter((nd) => !nodeNames.includes(nd.name))
         }
-
-        return scoreAndSortNodes(nodes, value)
+        return scoreAndSortNodes(filteredNodes, value)

230-242: Debounce should clear pending timers to avoid stale updates.

Using bare setTimeout can apply older results after newer keystrokes and leaks on unmount. Replace with a cancellable debounce.

-    const filterSearch = (value, newTabValue) => {
-        setSearchValue(value)
-        setTimeout(() => {
+    const debounceRef = useRef()
+    const filterSearch = (value, newTabValue) => {
+        setSearchValue(value)
+        if (debounceRef.current) clearTimeout(debounceRef.current)
+        debounceRef.current = setTimeout(() => {
             if (value) {
                 const returnData = getSearchedNodes(value)
                 groupByCategory(returnData, newTabValue ?? tabValue, true)
                 scrollTop()
             } else if (value === '') {
                 groupByCategory(nodesData, newTabValue ?? tabValue)
                 scrollTop()
             }
-        }, 500)
+        }, 500)
     }

Optionally, add useEffect(() => () => debounceRef.current && clearTimeout(debounceRef.current), []) for cleanup.
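
Outside React, the same cancel-then-reschedule idea can be sketched as a small standalone helper. The name `createDebouncer` and the injectable `timers` parameter are hypothetical; the injection exists only so the behaviour can be exercised without real delays.

```javascript
// Hypothetical cancellable debouncer. `timers` defaults to thin wrappers
// around the global timer functions and can be swapped for fakes in tests.
function createDebouncer(
    delayMs,
    timers = {
        setTimeout: (fn, ms) => setTimeout(fn, ms),
        clearTimeout: (id) => clearTimeout(id)
    }
) {
    let pending = null

    return {
        // Schedule `callback`, dropping any call that has not fired yet.
        run(callback) {
            if (pending !== null) timers.clearTimeout(pending)
            pending = timers.setTimeout(() => {
                pending = null
                callback()
            }, delayMs)
        },
        // Drop the pending call, e.g. when the component unmounts.
        cancel() {
            if (pending !== null) timers.clearTimeout(pending)
            pending = null
        }
    }
}
```

In a component, an instance like this would typically be held in a ref so it survives re-renders, with `cancel()` called from an effect cleanup, which matches the intent of the suggestion above.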

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f94d61 and b88d33d.

📒 Files selected for processing (1)
  • packages/ui/src/views/canvas/AddNodes.jsx (2 hunks)
🔇 Additional comments (1)
packages/ui/src/views/canvas/AddNodes.jsx (1)

399-402: Guard onFlowGenerated before calling it.

The prop is optional in PropTypes (line 754: PropTypes.func without .isRequired), but called unconditionally at line 401, which will throw if omitted. Apply the guard:

-        onFlowGenerated()
+        if (typeof onFlowGenerated === 'function') onFlowGenerated()

Note: The old prop name onFlowGenerat is not found anywhere in the codebase, so the back-compat alias suggestion is unnecessary. The rename has been fully completed. Consider marking the prop as required in PropTypes if it's truly mandatory.

Contributor

@HenryHengZJ HenryHengZJ left a comment

thank you!

@HenryHengZJ HenryHengZJ merged commit 9751598 into FlowiseAI:main Oct 30, 2025
3 checks passed
2 participants