fix: skip GitHub PR review diff tables from translation#1175
fix: skip GitHub PR review diff tables from translation#1175mengxi-ream merged 1 commit intomainfrom
Conversation
🦋 Changeset detectedLatest commit: 4452c8d The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Documentation Updates 2 document(s) were updated by changes in this PR: Auto Translation FeaturesView Changes@@ -47,7 +47,7 @@
Mathematical formulas and notation are preserved by fully excluding all MathML elements from translation. All 32 MathML Core tags (such as `math`, `mi`, `mo`, `mfrac`, `msub`, `msup`, `mrow`, etc.) are recognized and excluded, ensuring mathematical expressions remain intact and unaltered.
## Improved Handling of Academic Content
-Special handling is included for academic sites and code listings. For example, code blocks on arXiv.org (`.ltx_listing`) and Reddit-specific elements (such as `faceplate-screen-reader-content > *`, `reddit-header-large *`, `shreddit-comment-action-row > *`, and `shreddit-post-flair`) are excluded from translation. These custom selectors are centrally managed for maintainability.
+Special handling is included for academic sites and code listings. For example, code blocks on arXiv.org (`.ltx_listing`), GitHub PR review diff tables (`table.diff-table`), and Reddit-specific elements (such as `faceplate-screen-reader-content > *`, `reddit-header-large *`, `shreddit-comment-action-row > *`, and `shreddit-post-flair`) are excluded from translation. These custom selectors are centrally managed for maintainability.
## Enabling Auto Translation for URL Patterns
Auto translation is triggered for web pages whose URLs match patterns specified in the `autoTranslatePatterns` array within the `translate.page` configuration. **Domain pattern matching is precise and respects TLD boundaries:**Translation Toggle Logic and Content DetectionView Changes@@ -35,7 +35,10 @@
- `shreddit-comment-action-row > *` - Comment action buttons
- `shreddit-post-flair` - Post flair tags (to prevent misalignment during translation)
-These site-specific exclusions ensure that UI elements, navigation, and metadata are not translated, preventing visual misalignment and maintaining the intended user experience on each platform.
+For github.com, the following elements are excluded:
+- `table.diff-table` - PR review diff tables (to prevent code diffs in review comments from being translated)
+
+These site-specific exclusions ensure that UI elements, navigation, metadata, and code snippets are not translated, preventing visual misalignment and maintaining the intended user experience on each platform.
**Exclusion of Structural Elements in Main Content Mode:**
In main content mode (when `config.translate.page.range` is not set to `'all'`), top-level structural elements such as `<header>`, `<footer>`, `<nav>`, and `<aside>` are excluded from translation to avoid translating site-wide navigation and structural elements. However, when these same elements appear inside content containers like `<article>` or `<main>`, they are included in translation, as they are likely to be content-specific (such as article titles within a `<header>` inside an `<article>`).
@@ -117,7 +120,7 @@
Additional unit tests verify that elements with the `sr-only` or `visually-hidden` classes, elements with `aria-hidden="true"`, and similar hidden elements are excluded from translation and text extraction. This ensures that only visible content is processed, preventing hidden or accessibility-only content from being translated or included in extracted text. Tests also verify that `isDontWalkIntoButTranslateAsChildElement` correctly excludes `sr-only` and `visually-hidden` elements (returning false), while `isDontWalkIntoAndDontTranslateAsChildElement` correctly identifies these elements for complete exclusion.
**Site-Specific Exclusion Tests:**
-Unit tests verify that site-specific elements in the `CUSTOM_DONT_WALK_INTO_ELEMENT_SELECTOR_MAP` are correctly excluded. For www.reddit.com, tests confirm that `shreddit-post-flair` elements are identified by `isCustomDontWalkIntoElement` and excluded by `isDontWalkIntoAndDontTranslateAsChildElement`, preventing flair tag misalignment during Reddit translations.
+Unit tests verify that site-specific elements in the `CUSTOM_DONT_WALK_INTO_ELEMENT_SELECTOR_MAP` are correctly excluded. For www.reddit.com, tests confirm that `shreddit-post-flair` elements are identified by `isCustomDontWalkIntoElement` and excluded by `isDontWalkIntoAndDontTranslateAsChildElement`, preventing flair tag misalignment during Reddit translations. For github.com, tests confirm that `table.diff-table` elements are correctly identified and excluded, ensuring that PR review code snippets are not translated and maintaining the integrity of code diffs displayed in review comments.
**Content Container Context Tests:**
Unit tests verify that structural elements like `<header>`, `<footer>`, and `<nav>` are correctly excluded or included based on their context. The tests confirm that: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4452c8d5ab
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| "header *", | ||
| "#repository-container-header *", | ||
| "[class*=\"OverviewContent-module__Box_1--\"] *", | ||
| "table.diff-table", // https://github.com/mengxi-ream/read-frog/issues/1174 |
There was a problem hiding this comment.
Match diff-table descendants in no-walk selector
The new GitHub rule only matches the <table> node itself, so descendants like <td> are still considered walkable when translation starts from inside the table (for example via removeOrShowNodeTranslation → walkAndLabelElement, which checks the current element rather than ancestor selectors). In that flow, review diff snippet text can still be translated, so the fix is incomplete unless the selector also covers descendants (e.g. table.diff-table *) or ancestor checks are applied consistently.
Useful? React with 👍 / 👎.
Type of Changes
Description
Skip GitHub PR review code snippets rendered as
table.diff-tableduring page translation.This fixes the case where review/comment code blocks on GitHub PR pages were being extracted as normal page text and translated.
Closes #1174.
How Has This Been Tested?
Tested locally with:
pnpm exec vitest run src/utils/host/dom/__tests__/custom-dont-walk.test.ts src/utils/host/__tests__/translate.integration.test.tsxpnpm exec eslint src/utils/constants/dom-rules.ts src/utils/host/dom/__tests__/custom-dont-walk.test.ts src/utils/host/__tests__/translate.integration.test.tsxChecklist
Additional Information
The changeset file is included in the branch as:
.changeset/github-pr-comment-diff-table.mdSummary by cubic
Skip translating GitHub PR review diff tables so code snippets in review comments stay intact. Fixes #1174.
table.diff-tableon GitHub PR pages, blocking descendants.Written for commit 4452c8d. Summary will update on new commits.