fix: Correct web search filter logic to consistently block filtered domains #19670

Merged
tjbck merged 2 commits into open-webui:dev from kjpoccia:web-search-block on Dec 2, 2025

@kjpoccia
Contributor

@kjpoccia kjpoccia commented Dec 1, 2025

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.

Before submitting, make sure you've checked the following:

  • Target branch: Verify that the pull request targets the dev branch. Not targeting the dev branch will lead to immediate closure of the PR.
  • Description: Provide a concise description of the changes made in this pull request down below.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: If necessary, update relevant documentation Open WebUI Docs like environment variables, the tutorials, or other documentation sources.
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Take this as an opportunity to make screenshots of the feature/fix and include it in the PR description.
  • Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review AND manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR.
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

🔒 Web search filtering now correctly blocks results when any resolved hostname or IP address matches a blocked domain, preventing blocked sites from appearing due to permissive hostname resolution.

Description

Related issue: #19669

This PR fixes an issue in the web search filtering logic where blocked domains (prefixed with !) were still allowed if any resolved hostname or IP address passed the filter check.

Web search filtering currently behaves incorrectly when a domain resolves to multiple hostnames or IP addresses.
In get_filtered_results, each search result domain is resolved to:

  • the original domain (e.g. www.accuweather.com)
  • plus one or more IPv4 / IPv6 addresses

Each value is then passed individually to is_string_allowed; if any one of those values is allowed, the entire search result is included.

This behavior contradicts the expected semantics of a block rule: if any resolved hostname or IP is blocked, the entire result should be excluded.
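As a rough illustration of the resolution step described above, the sketch below builds the list of values for one result URL: the domain itself followed by every address the resolver returns. The function name `resolve_hostnames` and the injectable `resolver` parameter are assumptions made for this example, not names taken from the Open WebUI code; only `socket.getaddrinfo` and `urllib.parse.urlparse` are real standard-library calls.

```python
import socket
from urllib.parse import urlparse


def resolve_hostnames(url, resolver=socket.getaddrinfo):
    """Return [domain, ip, ip, ...] for a result URL (hypothetical helper)."""
    domain = urlparse(url).hostname
    if not domain:
        return []
    try:
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IPv4 or IPv6 address as a string.
        infos = resolver(domain, None)
    except socket.gaierror:
        infos = []  # resolution failed: fall back to the domain alone
    return [domain] + [info[4][0] for info in infos]


# A fake resolver keeps the example deterministic and offline.
def fake_resolver(host, port):
    return [
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("172.217.4.46", 0)),
        (socket.AF_INET, socket.SOCK_DGRAM, 17, "", ("172.217.4.46", 0)),
    ]


hostnames = resolve_hostnames("https://www.youtube.com/watch", fake_resolver)
print(hostnames)
```

`getaddrinfo` returns one entry per socket type, so the same address can appear more than once in the resulting list.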

[Attachment: 2025-12-01 15 46 43]

The issue is caused by this logic:

if any(is_string_allowed(hostname, filter_list) for hostname in hostnames):
    filtered_results.append(result)
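To see why `any(...)` lets a blocked result through, here is a minimal, self-contained reproduction using values from the debug log in this description. `is_value_allowed` is a hypothetical per-string check written for this example, and its suffix matching via `str.endswith` is an assumption about how entries are compared.

```python
def is_value_allowed(value, filter_list):
    """Hypothetical check that looks at one string at a time."""
    # Entries prefixed with "!" are block rules; the rest are allow rules.
    block_list = [f[1:].strip() for f in filter_list if f.startswith("!")]
    allow_list = [f.strip() for f in filter_list if not f.startswith("!")]
    if allow_list and not any(value.endswith(a) for a in allow_list):
        return False
    return not any(value.endswith(b) for b in block_list)


hostnames = ["www.youtube.com", "172.217.4.46", "142.250.191.206"]
filter_list = ["!www.accuweather.com", "!www.youtube.com"]

# The domain is rejected, but each IP address passes on its own.
per_value = [is_value_allowed(h, filter_list) for h in hostnames]

# any() accepts the result as soon as one value passes.
buggy_decision = any(per_value)
print(per_value, buggy_decision)
```

The domain fails the check, yet the result is kept because the IP addresses are not listed in the block list.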

Changed

  • is_string_allowed now evaluates a sequence of resolved hostnames/IPs together to ensure block rules are applied consistently to an entire search result.
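A minimal sketch of what evaluating the whole sequence together could look like is given below. It is not the merged code: the name `is_allowed_sketch`, the suffix matching with `str.endswith`, and the rule that one matching allow entry is enough are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Union


def is_allowed_sketch(
    value: Union[str, Sequence[str]],
    filter_list: Optional[Sequence[str]] = None,
) -> bool:
    """Decide once for a whole group of hostnames/IPs (illustrative only)."""
    if not filter_list:
        return True  # no filter configured: everything passes

    # Accept a single string or a sequence of resolved values.
    values = [value] if isinstance(value, str) else list(value)

    # Entries prefixed with "!" are block rules; the rest are allow rules.
    block_list = [f[1:].strip() for f in filter_list if f.startswith("!")]
    allow_list = [f.strip() for f in filter_list if not f.startswith("!")]

    # With an allow list, at least one value must match an allowed entry.
    if allow_list and not any(v.endswith(a) for v in values for a in allow_list):
        return False

    # A single blocked value rejects the whole result.
    if any(v.endswith(b) for v in values for b in block_list):
        return False

    return True


hostnames = ["www.youtube.com", "172.217.4.46", "142.250.191.206"]
filter_list = ["!www.accuweather.com", "!www.youtube.com"]
print(is_allowed_sketch(hostnames, filter_list))
```

Because the decision is made once per result, a blocked domain can no longer be rescued by its unlisted IP addresses.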

Additional Information

The logs show the issue clearly with the resolved hostnames logged in get_filtered_results. The hostnames are all associated with a single result, and because the resolved IP addresses are not explicitly blocked, the YouTube result is incorrectly allowed through.

DEBUG: hostnames in get_filtered_results: ['www.youtube.com', '172.217.4.46', '172.217.4.46', '172.217.4.46', '142.250.191.206', '142.250.191.206', '142.250.191.206', <ip list truncated for brevity>]

DEBUG: is_string_allowed string: www.youtube.com, allow_list: [], block_list: ['www.accuweather.com', 'www.youtube.com']

DEBUG: is_string_allowed string: 172.217.4.46, allow_list: [], block_list: ['www.accuweather.com', 'www.youtube.com']

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.

@pr-validator-bot

👋 Welcome and Thank You for Contributing!

We appreciate you taking the time to submit a pull request to Open WebUI!

⚠️ Important: Testing Requirements

We've recently seen an increase in PRs that have significant issues:

  • PRs that don't actually fix the bug they claim to fix
  • PRs that don't implement the feature they describe
  • PRs that break existing functionality
  • PRs that are clearly AI-generated without proper testing being done by the author
  • PRs that simply don't work as intended

These untested PRs consume significant time from maintainers and volunteer contributors who review and test PRs in their free time.
Time that could be spent testing other PRs or improving Open WebUI in other ways.

Before marking your PR as "Ready for Review":

Please explicitly confirm:

  1. ✅ You have personally tested ALL changes in this PR
  2. How you tested it (specific steps you took to verify it works)
  3. Visual evidence where applicable (screenshots or videos showing the feature/fix working) - if applicable to your specific PR

If you're not certain your PR works exactly as intended, please leave it in DRAFT mode until you've thoroughly tested it.

Thank you for helping us maintain quality and respecting the time of our community! 🙏

@kjpoccia
Contributor Author

kjpoccia commented Dec 1, 2025

Here are some videos of my tests. Note the logs confirm the results were blocked/allowed as appropriate (and the search didn't just happen to return results that fit the filtering).

Testing of allow list only: [video: 2025-12-01 17 03 29]

Testing of block list only: [video: 2025-12-01 17 04 24]

Testing of allow + block: [video: 2025-12-01 17 13 06]

@tjbck
Contributor

tjbck commented Dec 2, 2025

Thanks!

@tjbck tjbck merged commit 6e53167 into open-webui:dev Dec 2, 2025
puffinjiang pushed a commit to puffinjiang/open-webui that referenced this pull request Dec 9, 2025
lentiann added a commit to ZalaziumGmbh/anox that referenced this pull request Dec 9, 2025
* fix/adjust web search to properly block domains (open-webui#19670)

@fluxik

fluxik commented Dec 10, 2025

With Open WebUI v0.6.41, domain filtering is not working for me. I saved the changes on the Web Search page and restarted open-webui multiple times. After a reload the changes persist, but the domain list is not actually used during filtering.

Open WebUI settings:
[image]

Edited backend/open_webui/utils/misc.py to debug the filter list:
[image]

Output in console during web search:
[DEBUG] FILTER LIST = ['!169.254.169.254', '!metadata.google.internal', '!metadata.azure.com', '!fd00:ec2::254', '!100.100.100.200']

@Classic298
Collaborator

@fluxik that setting in the admin panel is only for the web search, not for web fetching.

@fluxik

fluxik commented Dec 11, 2025

@Classic298 OK, I have also configured the env param "WEB_FETCH_FILTER_LIST", but there is still no filtering.

I figured out why: the search_ollama_cloud function in backend/open_webui/retrieval/web/ollama.py doesn't call get_filtered_results(results, filter_list).
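The gap described here can be pictured with a small standalone sketch in which a search provider passes its raw results through a domain filter before returning them. Everything in it is hypothetical: `filter_by_domain` is a simplified stand-in for `get_filtered_results` (no DNS resolution, suffix matching assumed), `search_provider_sketch` and its `fetch` parameter are invented for the example, and results are assumed to be dicts with a `"url"` key.

```python
from urllib.parse import urlparse


def filter_by_domain(results, filter_list):
    """Simplified, hypothetical stand-in for get_filtered_results."""
    if not filter_list:
        return results
    # Entries prefixed with "!" are block rules; the rest are allow rules.
    block_list = [f[1:].strip() for f in filter_list if f.startswith("!")]
    allow_list = [f.strip() for f in filter_list if not f.startswith("!")]
    kept = []
    for result in results:
        host = urlparse(result["url"]).hostname or ""
        if allow_list and not any(host.endswith(a) for a in allow_list):
            continue  # not on the allow list
        if any(host.endswith(b) for b in block_list):
            continue  # explicitly blocked
        kept.append(result)
    return kept


def search_provider_sketch(query, filter_list=None, fetch=None):
    """Hypothetical provider: fetch raw results, then apply the filter."""
    raw_results = fetch(query) if fetch else []
    return filter_by_domain(raw_results, filter_list)


# Offline stand-in for a real search backend.
def fake_fetch(query):
    return [
        {"url": "http://169.254.169.254/latest/meta-data"},
        {"url": "https://example.com/article"},
    ]


results = search_provider_sketch("test", ["!169.254.169.254"], fake_fetch)
print(results)
```

Without the filtering call in the provider, both results would be returned regardless of the configured list.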

@Classic298
Collaborator

@fluxik PR welcome!
