feat(web-search): set Exa as default provider via auto-detection (EXA_API_KEY activates Exa)#35062
feat(web-search): set Exa as default provider via auto-detection (EXA_API_KEY activates Exa)#35062louiswalsh wants to merge 1 commit intoopenclaw:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3c0475f7ac
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| suppressToolErrorWarnings: z.boolean().optional(), | ||
| lightContext: z.boolean().optional(), | ||
| }) |
There was a problem hiding this comment.
Restore
heartbeat.lightContext in runtime config schema
HeartbeatSchema is strict, so removing lightContext here turns heartbeat.lightContext into an unknown key and causes config validation to fail for existing installations that already use this flag. The runtime still reads this option (for example, heartbeat execution checks heartbeat?.lightContext to choose lightweight bootstrap context), so this change introduces a backwards-incompatible break where valid prior configs are now rejected at load time.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR adds Exa as the 6th Critical issue:
Minor issue: The Zod schema for Confidence Score: 2/5
Last reviewed commit: 3c0475f |
| prompt: z.string().optional(), | ||
| ackMaxChars: z.number().int().nonnegative().optional(), | ||
| suppressToolErrorWarnings: z.boolean().optional(), | ||
| lightContext: z.boolean().optional(), | ||
| }) |
There was a problem hiding this comment.
lightContext: z.boolean().optional() was removed from HeartbeatSchema, but the field is still in use throughout the codebase:
src/infra/heartbeat-runner.ts:747readsheartbeat?.lightContext === truesrc/config/types.agent-defaults.ts:248still declareslightContext?: boolean- Tests still verify this behavior
Because HeartbeatSchema uses .strict(), any user with heartbeat.lightContext: true in their config file will now receive a Zod validation error at startup. This is a breaking change.
| prompt: z.string().optional(), | |
| ackMaxChars: z.number().int().nonnegative().optional(), | |
| suppressToolErrorWarnings: z.boolean().optional(), | |
| lightContext: z.boolean().optional(), | |
| }) | |
| ackMaxChars: z.number().int().nonnegative().optional(), | |
| suppressToolErrorWarnings: z.boolean().optional(), | |
| lightContext: z.boolean().optional(), | |
| }) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/zod-schema.agent-runtime.ts
Line: 33-36
Comment:
`lightContext: z.boolean().optional()` was removed from `HeartbeatSchema`, but the field is still in use throughout the codebase:
- `src/infra/heartbeat-runner.ts:747` reads `heartbeat?.lightContext === true`
- `src/config/types.agent-defaults.ts:248` still declares `lightContext?: boolean`
- Tests still verify this behavior
Because `HeartbeatSchema` uses `.strict()`, any user with `heartbeat.lightContext: true` in their config file will now receive a Zod validation error at startup. This is a breaking change.
```suggestion
ackMaxChars: z.number().int().nonnegative().optional(),
suppressToolErrorWarnings: z.boolean().optional(),
lightContext: z.boolean().optional(),
})
```
How can I resolve this? If you propose a fix, please make it concise.| exa: z | ||
| .object({ | ||
| apiKey: SecretInputSchema.optional().register(sensitive), | ||
| numResults: z.number().int().positive().optional(), | ||
| highlightsMaxChars: z.number().int().positive().optional(), | ||
| }) | ||
| .strict() | ||
| .optional(), |
There was a problem hiding this comment.
The Zod schema for exa.numResults accepts any positive integer, but at runtime resolveSearchCount() (called in createWebSearchTool) silently clamps the value to MAX_SEARCH_COUNT (10). A user who sets exa.numResults: 25 will pass config validation but only receive 10 results — with no warning.
Consider adding .max(10) to the schema to catch this at config-parse time, or add a comment in types.tools.ts documenting the 1–10 cap so users understand the constraint.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/zod-schema.agent-runtime.ts
Line: 309-316
Comment:
The Zod schema for `exa.numResults` accepts any positive integer, but at runtime `resolveSearchCount()` (called in `createWebSearchTool`) silently clamps the value to `MAX_SEARCH_COUNT` (10). A user who sets `exa.numResults: 25` will pass config validation but only receive 10 results — with no warning.
Consider adding `.max(10)` to the schema to catch this at config-parse time, or add a comment in `types.tools.ts` documenting the 1–10 cap so users understand the constraint.
How can I resolve this? If you propose a fix, please make it concise.3c0475f to
3081754
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30817541c5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/agents/tools/web-search.ts
Outdated
| url, // Keep raw for tool chaining | ||
| description: description ? wrapWebContent(description, "web_search") : "", | ||
| published: entry.publishedDate || undefined, | ||
| author: entry.author || undefined, |
There was a problem hiding this comment.
Wrap Exa author metadata as untrusted external content
runExaSearch marks the payload as externalContent.wrapped: true, but author is returned raw while title/description are wrapped. Exa author values come from web page metadata and can contain arbitrary text, so attacker-controlled pages can inject unwrapped instructions into web_search output despite the wrapper contract. This weakens the prompt-injection boundary for any workflow that consumes Exa search results.
Useful? React with 👍 / 👎.
src/agents/tools/web-search.ts
Outdated
| function resolveExaApiKey(exa?: ExaSearchConfig): string | undefined { | ||
| const fromConfig = normalizeApiKey(exa?.apiKey); | ||
| if (fromConfig) { |
There was a problem hiding this comment.
Resolve SecretInput refs when reading Exa API key
tools.web.search.exa.apiKey is validated as SecretInputSchema, but resolveExaApiKey only normalizes plain strings. If a user configures this field as a SecretRef object (which validation allows), it is treated as empty and the code falls back to env vars, producing a false missing_exa_api_key error when EXA_API_KEY is unset. This is a runtime mismatch between accepted config shape and key resolution.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5075f15bca
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| : undefined; | ||
| const paths = [ | ||
| "apiKey", | ||
| "exa.apiKey", |
There was a problem hiding this comment.
Include exa in selected-provider gating for web search secrets
Adding exa.apiKey to the resolved-path list without also recognizing provider: "exa" in selectedProvider causes explicit Exa configs to be treated like auto-detect mode (selectedProvider === undefined). In that state, secret refs for other providers (for example grok.apiKey) are incorrectly marked active and resolved; if those refs are unresolved/missing, prepareSecretsRuntimeSnapshot will throw and block startup even though those providers are inactive for this config.
Useful? React with 👍 / 👎.
Adds Exa (exa.ai) as a bundled web search provider using the plugin architecture. Uses highlights-based content retrieval matching the Hermes agent call signature, with x-exa-integration: openclaw header. - New extensions/exa/ directory with plugin config, entry point, and search provider implementation - autoDetectOrder: 5 (first in auto-detect chain when EXA_API_KEY set) - Supports date_after/date_before filtering via ISO-8601 date ranges - Updates bundled plugin IDs, contract registry, and all related tests Co-Authored-By: Louis Walsh <[email protected]>
6e8e5b0 to
8937b55
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8937b55240
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| register(api) { | ||
| api.registerWebSearchProvider(createExaWebSearchProvider()); |
There was a problem hiding this comment.
Wire Exa into the live
web_search tool path
Registering the Exa plugin here does not make it reachable in production yet. src/agents/openclaw-tools.ts:87-90 still exposes createWebSearchTool(), and that implementation only resolves brave|perplexity|grok|gemini|kimi (src/agents/tools/web-search.ts:261-268 and 1337-1376). In other words, an agent session will keep using the old hard-coded provider stack even when EXA_API_KEY is set, so the feature added by this commit never activates.
Useful? React with 👍 / 👎.
| id: "exa", | ||
| label: "Exa Search", | ||
| hint: "AI-native semantic search · date filters · highlights", | ||
| envVars: ["EXA_API_KEY"], |
There was a problem hiding this comment.
Add
exa to the runtime web-search config schema
This introduces a new provider id, but the validated config surface still rejects tools.web.search.provider: "exa". src/config/zod-schema.agent-runtime.ts:261-269 and src/config/types.tools.ts:433-475 only allow the pre-existing five providers, so users cannot explicitly pin Exa in config or have config-writing flows persist that selection. The only way to use Exa is to leave provider unset and rely on auto-detection, which is a regression for anyone who already has another search provider configured.
Useful? React with 👍 / 👎.
|
Good call raising this follow-up. This is related, but I am not treating it as the same change. #52617 landed the Exa bundled provider itself. This PR is about a different policy question: whether Exa should become the default auto-detect choice when I'm leaving this separate for now because that selection policy diverges from the landed scope. If you want this folded differently, tell me the intended auto-detect rule and I can reassess it quickly. |
Summary
EXA_API_KEYis sufficient to activate Exa (no explicitproviderconfig required). Exa is first in the auto-detect priority chain; when the key is absent, the chain falls through to Brave.web_fetch, no breaking changes to existing configs.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
"exa"option fortools.web.search.providertools.web.search.exa.apiKey,tools.web.search.exa.numResults(default: 5),tools.web.search.exa.highlightsMaxChars(default: 8000)EXA_API_KEY— when set with no explicitproviderconfig, Exa is selected automatically"Search the web using Exa search. Returns structured results with titles, URLs, and highlights."EXA_API_KEY: no behavior change — chain falls through to Brave as beforeSecurity Impact (required)
EXA_API_KEYenv var /exa.apiKeyconfig, follows same pattern as existing providershttps://api.exa.ai/searchwithx-exa-integration: "openclaw"headerEXA_API_KEYis present. Uses existingwithTrustedWebSearchEndpointwrapper with timeout and abort support.Repro + Verification
Environment
{ tools: { web: { search: { exa: { apiKey: "exa-..." } } } } }(no explicitproviderneeded)Steps
EXA_API_KEYin environment (no other config change required)web_searchtool with a queryExpected
numResults(default 5) with highlights limited tohighlightsMaxChars(default 8000)Actual
Evidence
20 tests passing in
config.web-search-provider.test.ts, 47 inweb-search.test.ts.Human Verification (required)
EXA_API_KEYset, Exa priority over Brave when both keys present, fallback to Brave whenEXA_API_KEYabsentprovideroverride still respectedCompatibility / Migration
EXA_API_KEYenv var andtools.web.search.exaconfig blockFailure Recovery (if this breaks)
EXA_API_KEYor setprovider: "brave"in config — falls back to BraveEXA_API_KEYis invalid, the tool returns an API error message (same pattern as other providers)Risks and Mitigations
EXA_API_KEYis unset, there is zero impact on existing usersEXA_API_KEYfor other purposesEXA_API_KEY) and only read by the web_search tool resolver