Feat/Windows Search Index file search#54431

Open
markguo2016 wants to merge 7 commits into openclaw:main from markguo2016:fix/windows-search-plugin-loader
Conversation

@markguo2016

PR: windows-search (Windows Search Index file search)

  • Plugin id: windows-search
  • Tool name: windows_file_search
  • Add plugin config file extensions/windows-search/openclaw.plugin.json
  • Add extensions/windows-search/package.json for plugin dependencies and metadata
  • Implement windows_file_search tool in extensions/windows-search/index.ts to query the native Windows Search Index (Search.CollatorDSO) for fast filename searches

Summary

  • Problem: directory traversal is slow and scales poorly on large disks for filename search.
  • Why it matters: Windows already maintains an index; using it makes searches near-instant for Windows users.
  • What changed: add a Windows-only tool (windows_file_search) backed by the Windows Search Index, with common filters and user-friendly failure messages.
  • What did NOT change (scope boundary): no cross-platform filesystem search; no file contents search; only indexed metadata/paths.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #N/A
  • Related #N/A
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

N/A

Regression Test Plan (if applicable)

N/A

User-visible / Behavior Changes

  • New tool: windows_file_search
  • Supports filters: scope, extension, days_ago, limit
  • scope uses SCOPE = 'file:.../' and normalizes paths:
    • C:\Users\me\Documents → file:C:/Users/me/Documents/
    • \\server\share\dir → file://server/share/dir/
  • Returns clear errors when Windows Search service (WSearch) is not running or the ADO/COM query fails
  • UTF-8 output handling to avoid garbled non-ASCII paths on zh-CN systems
  • Bilingual output support (auto/zh/en)

Tool Parameters

  • query (string, required): filename keyword (matches System.FileName LIKE '%query%')
  • limit (number, optional, default 20): max results, clamped to [1, 1000]
  • extension (string, optional): txt or .txt (normalized to .txt)
  • scope (string, optional): folder path or file: scope; normalized to file:.../
  • days_ago (number, optional): modified within last N days, clamped to [0, 3650] (0 disables the filter)
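
The clamping and normalization rules listed above could be sketched roughly as follows. This is an illustration only, not the code in `index.ts`: the names `RawParams`, `NormalizedParams`, `clamp`, and `normalizeParams` are hypothetical, and only the ranges and defaults are taken from the parameter list.

```typescript
// Hypothetical shapes; field names mirror the documented tool parameters.
interface RawParams {
  query: string;
  limit?: number;
  extension?: string;
  days_ago?: number;
}

interface NormalizedParams {
  query: string;
  limit: number;
  extension?: string;
  daysAgo: number; // 0 means "no time filter"
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function normalizeExtension(ext?: string): string | undefined {
  const trimmed = (ext ?? "").trim();
  if (trimmed === "") return undefined;
  // Accept both "txt" and ".txt"; always return the dotted form.
  return trimmed.startsWith(".") ? trimmed : `.${trimmed}`;
}

function normalizeParams(raw: RawParams): NormalizedParams {
  // Fall back to the documented defaults when a value is missing or not finite.
  const limit = Number.isFinite(raw.limit) ? Math.trunc(raw.limit as number) : 20;
  const daysAgo = Number.isFinite(raw.days_ago) ? Math.trunc(raw.days_ago as number) : 0;
  return {
    query: raw.query,
    limit: clamp(limit, 1, 1000),
    extension: normalizeExtension(raw.extension),
    daysAgo: clamp(daysAgo, 0, 3650),
  };
}
```

With this sketch, `{"query":"report","limit":5000,"extension":"pdf"}` would resolve to a limit of 1000 and an extension of `.pdf`, with the time filter disabled.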

Security Impact (required)

  • New permissions/capabilities? (Yes/No) Yes (adds a new search tool)
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) Yes (new tool entrypoint)
  • Data access scope changed? (Yes/No) Yes (can enumerate indexed file paths/metadata)
  • If any Yes, explain risk + mitigation:
    • Risk: tool can reveal file paths that are present in the Windows index.
    • Mitigation: Windows-only; explicit user invocation; result limiting; restrictable scope (scope) and time filter (days_ago); does not read file contents.

Repro + Verification

Environment

  • OS: Windows (requires Windows Search / WSearch)
  • Runtime/container: Node >= 22.16.0
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): N/A

Steps

  1. Ensure Windows Search is enabled and running: open services.msc, start “Windows Search” (service name: WSearch).
  2. Ensure the target folder is indexed: open “Indexing Options” and include the folder (or wait for indexing).
  3. Run example tool calls:
    • {"query":"report"}
    • {"query":"report","scope":"C:\\Users\\me\\Documents","extension":"pdf","days_ago":7,"limit":50}

Expected

  • Returns up to limit matching file results quickly, honoring scope/extension/days_ago filters.
  • On missing/disabled Windows Search, returns a clear error message explaining how to enable/start WSearch.

Actual

  • Pending reviewer verification on a Windows machine with indexing enabled.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: N/A (not executed in this write-up)
  • Edge cases checked: N/A
  • What you did not verify: actual search results across different index states and UNC scopes

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: remove/disable the windows-search plugin or revert the extensions/windows-search changes
  • Files/config to restore:
    • extensions/windows-search/index.ts
    • extensions/windows-search/openclaw.plugin.json
    • extensions/windows-search/package.json
  • Known bad symptoms reviewers should watch for: tool errors about ADO/COM, WSearch not running, or empty results due to missing indexing coverage

Risks and Mitigations

  • Risk: behavior depends on local Windows indexing coverage/state and can appear to “miss” files.
    • Mitigation: document indexing prerequisites and expose scope to guide users toward indexed locations

@openclaw-barnacle openclaw-barnacle bot added the docs (Improvements or additions to documentation) and size: M labels on Mar 25, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba6d07688e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +277 to +280
const filePrefix = raw.match(/^file:(\/\/\/|\/\/)?/i);
let rest = raw;
if (filePrefix) {
rest = raw.slice(filePrefix[0].length);

P2: Preserve UNC host when normalizing file:// scopes

normalizeWindowsSearchScope strips the file:// prefix and then falls through to the generic path branch for URI-style UNC scopes, so file://server/share/dir becomes file:server/share/dir/ instead of file://server/share/dir/. Because the SQL filter uses SCOPE = '<normalized>', this malformed scope will not match indexed UNC paths and the tool can return no results for valid file: inputs on network shares.

@greptile-apps
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR adds a new windows-search plugin that exposes a windows_file_search tool backed by the Windows Search Index (via an ADODB COM query executed through an inline PowerShell script). The plugin is Windows-only, self-contained, and follows the established extension conventions.

Key changes:

  • New plugin in extensions/windows-search/ with platform guard, input sanitisation, and user-friendly error messages for missing/stopped WSearch service.
  • Config baseline and bundled-plugin-metadata generated files updated to register the plugin.

Issues found:

  • P1 – Broken ESCAPE clause: The SQL LIKE ESCAPE clause is given '\\' (two backslashes) because PowerShell double-quoted strings pass \\ literally. The Windows Search OLE DB provider expects a single-character escape argument; this makes the wildcard escaping for _, %, [, and ] in search queries non-functional.
  • P2 – Empty query enumerates all files: query is declared as Type.String() with no minLength, allowing an empty string that generates LIKE '%%' and matches every indexed file.

Confidence Score: 3/5

  • Safe to merge once the ESCAPE clause bug is fixed; the empty-query issue is a lower-priority hardening item.
  • The P1 ESCAPE clause bug means that any search query containing _, %, [, or ] will silently treat those characters as SQL LIKE wildcards instead of literals, producing incorrect (over-broad) results. The fix is a one-character change. The P2 empty-query issue is secondary. All other aspects (platform guard, input clamping, SQL injection escaping logic, error handling, UTF-8 encoding, plugin registration) look correct.
  • extensions/windows-search/index.ts — line 128 (ESCAPE clause) and line 20 (query minLength)
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/windows-search/index.ts
Line: 128

Comment:
**ESCAPE clause uses an invalid two-character escape character**

In a PowerShell double-quoted string, `\\` is two literal backslash characters (backslash is not an escape character in PowerShell). The SQL therefore receives `ESCAPE '\\'`, which is the two-character string `\\`. The SQL `LIKE … ESCAPE` clause requires a **single** character — passing two makes the escape character definition undefined/invalid depending on the Windows Search OLE DB provider's behavior.

When the escape character is not correctly defined, `_`, `%`, `[`, and `]` that appear in the search query will be interpreted as LIKE wildcards instead of literal characters. For example, searching for `100%_report` would return far broader results than expected and the escaping work done by `$queryLike.Replace(…)` would be silently bypassed.

The fix is to use a single backslash: `ESCAPE '\'`. In a PowerShell double-quoted string, a bare `\` is already literal — no further escaping is needed.

```suggestion
    $whereParts += "System.FileName LIKE '%$queryLike%' ESCAPE '\'"
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/windows-search/index.ts
Line: 20-21

Comment:
**Empty query silently matches all indexed files**

`Type.String()` imposes no minimum length, so the framework will accept an empty string for `query`. After trimming and slicing, an empty query produces `$queryLike = ""`, and the generated SQL becomes:

```sql
System.FileName LIKE '%%' ESCAPE '\'
```

This matches every file in the index (or within the specified scope). Combined with the default `limit` of 20 this may be tolerable, but it is undocumented behaviour that could surprise callers and can enumerate a lot of path metadata if `limit` is raised.

Adding `minLength: 1` makes the schema self-documenting and lets the framework reject empty strings before any PowerShell is spawned:

```suggestion
          query: Type.String({ description: "Keyword to search for in filenames.", minLength: 1 }),
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(windows-search): escape LIKE and nor..." | Re-trigger Greptile

$rs = New-Object -ComObject ADODB.Recordset
$con.Open("Provider=Search.CollatorDSO;Extended Properties='Application=Windows';")
$whereParts = @()
$whereParts += "System.FileName LIKE '%$queryLike%' ESCAPE '\\'"

P1 ESCAPE clause uses an invalid two-character escape character

In a PowerShell double-quoted string, \\ is two literal backslash characters (backslash is not an escape character in PowerShell). The SQL therefore receives ESCAPE '\\', which is the two-character string \\. The SQL LIKE … ESCAPE clause requires a single character — passing two makes the escape character definition undefined/invalid depending on the Windows Search OLE DB provider's behavior.

When the escape character is not correctly defined, _, %, [, and ] that appear in the search query will be interpreted as LIKE wildcards instead of literal characters. For example, searching for 100%_report would return far broader results than expected and the escaping work done by $queryLike.Replace(…) would be silently bypassed.

The fix is to use a single backslash: ESCAPE '\'. In a PowerShell double-quoted string, a bare \ is already literal — no further escaping is needed.

Suggested change
- $whereParts += "System.FileName LIKE '%$queryLike%' ESCAPE '\\'"
+ $whereParts += "System.FileName LIKE '%$queryLike%' ESCAPE '\'"
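
For illustration, the same escaping could be done on the TypeScript side before the value reaches the PowerShell script. The sketch below is not the PR's code: `LIKE_ESCAPE_CHAR`, `escapeLikeLiteral`, and `fileNamePredicate` are hypothetical names, and it assumes the Windows Search provider honors a one-character `ESCAPE` argument as the comment above describes, which this thread does not verify.

```typescript
// A single backslash. The JS literal "\\" is one character at runtime.
const LIKE_ESCAPE_CHAR = "\\";

// Escape the characters that LIKE treats specially (plus the escape character
// itself), then double single quotes so the value is safe inside a
// single-quoted SQL string literal.
function escapeLikeLiteral(value: string): string {
  return value
    .replace(/[\\%_\[\]]/g, (ch) => LIKE_ESCAPE_CHAR + ch)
    .replace(/'/g, "''");
}

// Build the filename predicate with a one-character ESCAPE argument.
function fileNamePredicate(query: string): string {
  return `System.FileName LIKE '%${escapeLikeLiteral(query)}%' ESCAPE '${LIKE_ESCAPE_CHAR}'`;
}
```

Keeping the escape character in one constant makes it harder for the `Replace` step and the `ESCAPE` clause to drift apart, which is the failure mode the P1 comment describes.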

@markguo2016 markguo2016 force-pushed the fix/windows-search-plugin-loader branch from 25c1126 to be3af41 on March 25, 2026 12:00

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be3af41329

Comment on lines +67 to +68
const trimmedQuery = String(query ?? "").trim();
const boundedQuery = trimmedQuery.slice(0, MAX_QUERY_LEN);

P2: Reject blank queries before issuing index search

When query is empty or whitespace, trimmedQuery becomes "" and the SQL predicate becomes System.FileName LIKE '%%', which matches essentially every indexed file. In that case the tool can return unrelated paths (up to limit) and unintentionally broaden file enumeration instead of doing a keyword search. Add an explicit post-trim guard that returns a validation error when the query is blank.
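
A minimal sketch of such a post-trim guard, assuming the tool can return a structured validation error. The `QueryCheck` type, the `validateQuery` name, and the value of `MAX_QUERY_LEN` are hypothetical; the thread shows the constant's name but not its value.

```typescript
// Assumed value for illustration only; the real constant is not shown in this thread.
const MAX_QUERY_LEN = 256;

type QueryCheck =
  | { ok: true; query: string }
  | { ok: false; error: string };

function validateQuery(query: unknown): QueryCheck {
  const trimmed = String(query ?? "").trim();
  if (trimmed.length === 0) {
    // Refuse before any SQL is built, so LIKE '%%' is never issued.
    return { ok: false, error: "query must contain at least one non-whitespace character" };
  }
  return { ok: true, query: trimmed.slice(0, MAX_QUERY_LEN) };
}
```

A runtime check like this also covers whitespace-only input, which a schema-level `minLength: 1` alone would let through.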

@markguo2016 markguo2016 force-pushed the fix/windows-search-plugin-loader branch from be3af41 to 2bdf2bc on March 25, 2026 12:10

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2bdf2bc0f5

Comment on lines +308 to +310
const forward = rest.replace(/\\/g, "/");
const cleaned = forward.replace(/^\/+/, "").replace(/\/+$/g, "") + "/";
return `file:${cleaned}`;

P2: Preserve authority for // UNC scope inputs

Handle forward-slash UNC scopes (//server/share/...) as network authorities instead of generic paths. The UNC matcher only recognizes backslash form, so //... falls through to the fallback branch and is normalized to file:server/share/.../ (missing //). Because the SQL filter uses exact SCOPE = '<normalized>', this malformed scope will not match indexed UNC items (file://server/...) and the tool returns empty results for valid network-share scopes entered with forward slashes.
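
One way to cover the scope forms raised in this review (drive paths, backslash UNC, forward-slash UNC, and `file:` URIs) is to strip the scheme first and then decide on the authority in one place. The function below is a hypothetical sketch, not the PR's `normalizeWindowsSearchScope`. The expected outputs follow the examples in the PR description, and treating `file://C:/...` as a local drive path is an assumption made here.

```typescript
function normalizeScope(input: string): string {
  // Drop an optional "file:" scheme but keep the slashes that follow it,
  // since they tell us whether an authority (UNC host) is present.
  const withoutScheme = input.trim().replace(/^file:/i, "");
  const forward = withoutScheme.replace(/\\/g, "/");

  // Exactly two leading slashes followed by a host name means UNC.
  // A drive letter in the host position (e.g. "//C:/...") is treated as local.
  const unc = /^\/\/(?![A-Za-z]:)([^/]+)(?:\/(.*))?$/.exec(forward);
  if (unc) {
    const host = unc[1];
    const path = (unc[2] ?? "").replace(/^\/+|\/+$/g, "");
    return path ? `file://${host}/${path}/` : `file://${host}/`;
  }

  // Local path: no authority, exactly one trailing slash.
  const local = forward.replace(/^\/+|\/+$/g, "");
  return `file:${local}/`;
}
```

Because the SQL filter compares `SCOPE` against the normalized string exactly, funnelling every input form through a single authority check avoids the `file:server/share/...` output described in this comment.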

- Add plugin config file `openclaw.plugin.json`
- Add `package.json` to define plugin dependencies and metadata
- Implement core search logic in `index.ts` to quickly locate files via the Windows Search Index
- Support search parameters such as filename keywords, extension filters, directory scopes, and time ranges
- Handle Windows Search service availability and query errors
- Provide standardized error responses and a consistent JSON format for search results
…xtensions/windows-search workspace link to openclaw)
@markguo2016 markguo2016 force-pushed the fix/windows-search-plugin-loader branch from 2bdf2bc to 5441c34 on March 25, 2026 14:21

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5441c34e0c

const effective = {
limit: safeLimit,
extension: normalizedExtension || undefined,

P2: Use validated extension in effective response

If extension fails extPattern, the SQL filter is dropped because validatedExtension is empty, but effective.extension is still populated from normalizedExtension. That means the tool reports an extension filter that was not actually applied, so callers can trust filtered results when the query was really broadened (for example with ".tar.gz" or other invalid extension strings). Use validatedExtension (or return a validation error) so the reported effective filters match the executed search.
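
A rough sketch of the first option, where the reported filters are derived from the same validated value that drives the SQL. `EXT_PATTERN` is a stand-in, since the real `extPattern` is not shown in this thread, and `EffectiveFilters` and `buildEffective` are hypothetical names.

```typescript
// Stand-in pattern (assumption): a dot followed by 1-16 alphanumerics,
// so ".tar.gz" is rejected as in the example above.
const EXT_PATTERN = /^\.[A-Za-z0-9]{1,16}$/;

interface EffectiveFilters {
  limit: number;
  extension?: string;
}

function buildEffective(safeLimit: number, normalizedExtension: string): EffectiveFilters {
  // Validate once and reuse the result for both the SQL filter and the report.
  const validatedExtension = EXT_PATTERN.test(normalizedExtension) ? normalizedExtension : "";
  return {
    limit: safeLimit,
    // Report only what was actually applied.
    extension: validatedExtension || undefined,
  };
}
```

Under this sketch, `buildEffective(20, ".tar.gz")` reports no extension filter, which matches the search that was actually run. Returning a validation error instead would be stricter and arguably clearer for callers.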

Labels

docs Improvements or additions to documentation size: M
