Skip to content

Conversation

@allilevine
Copy link
Contributor

@allilevine allilevine commented Jul 21, 2025

Submission Review Guidelines:

Changes proposed in this Pull Request:

How to test the changes in this Pull Request:

  • Check that tests pass.

Testing that has already taken place:

Changelog entry

  • Automatically create a changelog entry from the details below.
  • This Pull Request does not require a changelog entry. (Comment required below)
Changelog Entry Details

Significance

  • Patch
  • Minor
  • Major

Type

  • Fix - Fixes an existing bug
  • Add - Adds functionality
  • Update - Update existing functionality
  • Dev - Development related task
  • Tweak - A minor adjustment to the codebase
  • Performance - Address performance issues
  • Enhancement - Improvement to existing functionality

Message

Changelog Entry Comment

Comment

Integrated the convert class from the external HTML to Text library and removed the dependency from the Email Editor package.

@allilevine allilevine self-assigned this Jul 21, 2025
@github-actions github-actions bot added the focus: monorepo infrastructure Issues and PRs related to monorepo tooling. label Jul 21, 2025
@allilevine allilevine force-pushed the wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor branch 2 times, most recently from 63f793b to ebeb589 Compare July 21, 2025 19:50
// Check all options are valid.
foreach ( $options as $key => $value ) {
if ( ! in_array( $key, array_keys( static::default_options() ), true ) ) {
// phpcs:ignore WordPress.Security.EscapeOutput.ExceptionNotEscaped
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that this package needs to work standalone without WordPress core functions available, so using htmlspecialchars() instead of esc_html(). Let me know if that's an incorrect assumption.

}

if ( ! empty( $options['ignore_errors'] ) ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring snake case rule because these are PHP's built-in property names from the DOM classes.

* @return bool True if this appears to be an Office document.
*/
public static function is_office_document( string $html ): bool {
return strpos( $html, 'urn:schemas-microsoft-com:office' ) !== false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using strpos instead of str_contains here (and str_starts_with elsewhere in the file) because of the PHP 7.4 compatability requirement.

@@ -0,0 +1,622 @@
<?php
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allilevine allilevine requested a review from Copilot July 21, 2025 20:54
@github-actions
Copy link
Contributor

github-actions bot commented Jul 21, 2025

Testing Guidelines

Hi @prettyboymp @jeherve @pavel-mailpoet @woocommerce/ballade,

Apart from reviewing the code changes, please make sure to review the testing instructions (Guide) and verify that relevant tests (E2E, Unit, Integration, etc.) have been added or updated as needed.

Reminder: PR reviewers are required to document testing performed. This includes:

  • 🖼️ Screenshots or screen recordings.
  • 📝 List of functionality tested / steps followed.
  • 🌐 Site details (environment attributes such as hosting type, plugins, theme, store size, store age, and relevant settings).
  • 🔍 Any analysis performed, such as assessing potential impacts on environment attributes and other plugins, conducting performance profiling, or using LLM/AI-based analysis.

⚠️ Within the testing details you provide, please ensure that no sensitive information (such as API keys, passwords, user data, etc.) is included in this public issue.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR integrates the HTML to Text converter directly into the WooCommerce email editor by forking the soundasleep/html2text library's convert functionality. The main purpose is to remove an external dependency while maintaining the same HTML to text conversion capabilities.

  • Removes the soundasleep/html2text dependency from composer.json
  • Adds custom Html2Text and Html2Text_Exception classes with forked functionality
  • Updates the renderer import to use the new internal class

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

File Description
packages/php/email-editor/composer.json Removes the soundasleep/html2text dependency
packages/php/email-editor/src/Engine/Renderer/class-renderer.php Updates import to use new internal Html2Text class
packages/php/email-editor/src/Engine/Renderer/class-html2text.php Adds new Html2Text class with forked conversion functionality
packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php Adds custom exception class for HTML conversion errors

@allilevine allilevine marked this pull request as ready for review July 21, 2025 21:14
@allilevine allilevine requested a review from a team as a code owner July 21, 2025 21:14
@allilevine allilevine requested review from prettyboymp and removed request for a team July 21, 2025 21:14
@allilevine
Copy link
Contributor Author

@woocommerce/ballade 👋 Is there a way to test this integration? Also, should there be a changelog entry?

@allilevine allilevine requested a review from a team July 21, 2025 21:16
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 21, 2025

Warning

Rate limit exceeded

@jeherve has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 40 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 6a66b9b and fc30ed5.

📒 Files selected for processing (1)
  • packages/php/email-editor/tasks/build-package.sh (1 hunks)
📝 Walkthrough

"""

Walkthrough

The changes remove the external "soundasleep/html2text" dependency and introduce an internal HTML-to-text conversion implementation. A new exception class and a comprehensive Html2Text class are added. The renderer is updated to use the new internal class. Extensive new tests for the converter and renderer integration are added. A license, security policy, and changelog entries are also included.

Changes

File(s) Change Summary
packages/php/email-editor/composer.json Removed "soundasleep/html2text" dependency from Composer requirements.
packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php Added new Html2Text_Exception class for handling HTML-to-text conversion errors.
packages/php/email-editor/src/Engine/Renderer/class-html2text.php Added new Html2Text class implementing internal HTML-to-text conversion logic.
packages/php/email-editor/src/Engine/Renderer/class-renderer.php Removed import of external Soundasleep\Html2Text class; no other changes.
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php Added multiple integration tests for HTML-to-text conversion covering diverse HTML elements and edge cases.
packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php Added comprehensive unit tests for Html2Text class covering conversion logic, options, edge cases, and errors.
packages/php/email-editor/changelog/59859-wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor Added changelog entry documenting the integration of the HTML-to-text converter into the Email Editor package.
packages/php/email-editor/changelog/59859-wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor-2 Added changelog entry with licensing and security policy information.
packages/php/email-editor/SECURITY.md Added security policy outlining vulnerability reporting and researcher guidelines.
packages/php/email-editor/license.txt Added GPL v2 license text and third-party MIT license notice for included code.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Html2Text
    participant DOMDocument
    participant Html2Text_Exception

    Caller->>Html2Text: convert(html, options)
    Html2Text->>Html2Text: fix_newlines(html)
    Html2Text->>Html2Text: is_office_document(html)
    Html2Text->>DOMDocument: loadHTML(html)
    alt loadHTML fails
        Html2Text->>Html2Text_Exception: throw exception
    else loadHTML succeeds
        Html2Text->>Html2Text: recursively process DOM nodes
        Html2Text->>Html2Text: process_whitespace_newlines(text)
        Html2Text-->>Caller: return plain text
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~45 minutes
"""

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@allilevine allilevine requested a review from jeherve July 21, 2025 21:17
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (2)

53-61: Remove unused variable in option validation loop.

The $value variable in the foreach loop is not used and can be removed.

-		foreach ( $options as $key => $value ) {
+		foreach ( $options as $key ) {

340-628: Consider refactoring this method for better maintainability.

The iterate_over_node method is 288 lines long, making it difficult to maintain and test. Consider extracting the node type handling into separate methods.

For example, you could extract methods like:

  • handle_text_node()
  • handle_heading_node()
  • handle_link_node()
  • handle_list_node()

This would improve readability and make the code easier to unit test.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ae7114 and b0d3f99.

⛔ Files ignored due to path filters (1)
  • packages/php/email-editor/composer.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • packages/php/email-editor/composer.json (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php
  • packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php
  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php
  • packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php
  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
🧠 Learnings (4)
📓 Common learnings
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59632
File: packages/js/email-editor/src/layouts/flex-email.tsx:116-122
Timestamp: 2025-07-14T10:41:46.200Z
Learning: In WooCommerce projects, formatting suggestions should respect the project's Prettier configuration and linting rules. Changes that would break the lint job should be avoided, even if they appear to improve readability.
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.
packages/php/email-editor/src/Engine/Renderer/class-renderer.php (4)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: triple0t
PR: #59186
File: packages/js/email-editor/src/store/initial-state.ts:9-10
Timestamp: 2025-06-26T12:13:32.062Z
Learning: In WooCommerce email editor store initialization (packages/js/email-editor/src/store/initial-state.ts), the current_post_id and current_post_type from window.WooCommerceEmailEditor are required parameters that should cause explicit errors if missing, rather than using fallback values or optional chaining. The design preference is to fail fast when critical initialization data is unavailable.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: gigitux
PR: #58846
File: plugins/woocommerce/client/blocks/tests/e2e/tests/all-products/all-products.block_theme.spec.ts:41-52
Timestamp: 2025-06-16T09:20:22.981Z
Learning: In WooCommerce E2E tests, the database is reset to the initial state for each test, so there's no need to manually restore global template changes (like clearing the header template) as the test infrastructure handles cleanup automatically.

packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php (9)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: jorgeatorres
PR: #59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in plugins/woocommerce/changelog/ directory and are processed during releases, not at the repository root level.

Learnt from: opr
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-20T17:38:16.565Z
Learning: WooCommerce legacy JavaScript files in plugins/woocommerce/client/legacy/js/ must use older JavaScript syntax and cannot use modern features like optional chaining (?.) due to browser compatibility requirements. Explicit null checking with && operators should be used instead.

Learnt from: samueljseay
PR: #58716
File: plugins/woocommerce/client/blocks/assets/js/blocks/mini-cart/iapi-frontend.ts:83-101
Timestamp: 2025-06-17T07:07:53.443Z
Learning: In WooCommerce blocks, when porting existing code patterns that have known issues (like parseInt truncating decimal money values), maintain consistency with existing implementation rather than making isolated fixes. The preference is for systematic refactoring approaches (like broader Dinero adoption) over piecemeal changes.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: Aljullu
PR: #58809
File: plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php:222-225
Timestamp: 2025-06-13T17:11:13.732Z
Learning: In plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php, the team intentionally relies on toggling the disabled CSS class (via data-wp-class--disabled) instead of binding the disabled attribute, to mirror the behavior of the classic WooCommerce template.

Learnt from: ralucaStan
PR: #58782
File: plugins/woocommerce/client/blocks/assets/js/base/utils/render-frontend.tsx:0-0
Timestamp: 2025-06-16T16:12:12.148Z
Learning: For WooCommerce checkout blocks, lazy loading was removed in favor of direct imports to prevent sequential "popping" effects during component loading. This approach prioritizes user experience over code splitting, with minimal bundle size impact and improved performance (1.7s to 1.1s speed score improvement). The checkout flow benefits from having all components load together rather than incrementally.

Learnt from: prettyboymp
PR: #59048
File: .github/workflows/cherry-pick-milestoned-prs.yml:60-83
Timestamp: 2025-06-26T12:45:40.709Z
Learning: WooCommerce uses WordPress versioning conventions where minor versions in X.Y.Z format are constrained to 0-9 (Y cannot exceed 9). This means version increment logic should reset minor to 0 and increment major when minor reaches 9, rather than allowing two-digit minor versions like 9.10 or 9.11.

Learnt from: mikejolley
PR: #57961
File: plugins/woocommerce/includes/class-wc-session-handler.php:302-333
Timestamp: 2025-06-19T11:58:57.484Z
Learning: In WooCommerce's session handler, the cart merging behavior was revised to always merge guest and user carts when both contain items, rather than having the guest cart take precedence. The migrate_cart_data() method in class-wc-session-handler.php implements this by using array_merge() to combine both carts when neither is empty.

packages/php/email-editor/src/Engine/Renderer/class-html2text.php (12)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: samueljseay
PR: #59142
File: plugins/woocommerce/src/Blocks/BlockTypes/MiniCart.php:594-602
Timestamp: 2025-06-25T06:51:41.381Z
Learning: WooCommerce avoids using DOMDocument for performance reasons as a general rule of thumb in their codebase.

Learnt from: vladolaru
PR: #58784
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/HelioPay.php:65-68
Timestamp: 2025-06-18T10:14:44.690Z
Learning: In PHP namespaced code, leading backslashes () before constant names explicitly reference global namespace constants. This is correct practice and should not be flagged as an error when used in namespaced classes to reference global constants like plugin-defined constants.

Learnt from: vladolaru
PR: #59486
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/WooPayments/WooPaymentsService.php:1544-1544
Timestamp: 2025-07-08T11:18:07.871Z
Learning: WooCommerce has polyfills for newer PHP functions (like str_starts_with() from PHP 8.0+), so these functions can be safely used even though WooCommerce supports PHP 7.4+. No need to suggest PHP 7.4 compatible alternatives when polyfills are available.

Learnt from: jorgeatorres
PR: #59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in plugins/woocommerce/changelog/ directory and are processed during releases, not at the repository root level.

Learnt from: opr
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-20T17:38:16.565Z
Learning: WooCommerce legacy JavaScript files in plugins/woocommerce/client/legacy/js/ must use older JavaScript syntax and cannot use modern features like optional chaining (?.) due to browser compatibility requirements. Explicit null checking with && operators should be used instead.

Learnt from: samueljseay
PR: #58716
File: plugins/woocommerce/client/blocks/assets/js/blocks/mini-cart/iapi-frontend.ts:83-101
Timestamp: 2025-06-17T07:07:53.443Z
Learning: In WooCommerce blocks, when porting existing code patterns that have known issues (like parseInt truncating decimal money values), maintain consistency with existing implementation rather than making isolated fixes. The preference is for systematic refactoring approaches (like broader Dinero adoption) over piecemeal changes.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: Aljullu
PR: #58809
File: plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php:222-225
Timestamp: 2025-06-13T17:11:13.732Z
Learning: In plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php, the team intentionally relies on toggling the disabled CSS class (via data-wp-class--disabled) instead of binding the disabled attribute, to mirror the behavior of the classic WooCommerce template.

Learnt from: ralucaStan
PR: #58782
File: plugins/woocommerce/client/blocks/assets/js/base/utils/render-frontend.tsx:0-0
Timestamp: 2025-06-16T16:12:12.148Z
Learning: For WooCommerce checkout blocks, lazy loading was removed in favor of direct imports to prevent sequential "popping" effects during component loading. This approach prioritizes user experience over code splitting, with minimal bundle size impact and improved performance (1.7s to 1.1s speed score improvement). The checkout flow benefits from having all components load together rather than incrementally.

Learnt from: prettyboymp
PR: #59048
File: .github/workflows/cherry-pick-milestoned-prs.yml:60-83
Timestamp: 2025-06-26T12:45:40.709Z
Learning: WooCommerce uses WordPress versioning conventions where minor versions in X.Y.Z format are constrained to 0-9 (Y cannot exceed 9). This means version increment logic should reset minor to 0 and increment major when minor reaches 9, rather than allowing two-digit minor versions like 9.10 or 9.11.

Learnt from: mikejolley
PR: #57961
File: plugins/woocommerce/includes/class-wc-session-handler.php:302-333
Timestamp: 2025-06-19T11:58:57.484Z
Learning: In WooCommerce's session handler, the cart merging behavior was revised to always merge guest and user carts when both contain items, rather than having the guest cart take precedence. The migrate_cart_data() method in class-wc-session-handler.php implements this by using array_merge() to combine both carts when neither is empty.

🪛 PHPMD (2.15.0)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php

53-53: Avoid unused local variables such as '$value'. (Unused Code Rules)

(UnusedLocalVariable)

🔇 Additional comments (5)
packages/php/email-editor/composer.json (1)

19-22: LGTM! Dependency removal aligns with PR objectives.

The removal of soundasleep/html2text dependency is appropriate since the functionality is now implemented internally.

packages/php/email-editor/src/Engine/Renderer/class-renderer.php (1)

15-15: Import statement correctly updated to use internal implementation.

Good job maintaining the same API interface, which made this a simple drop-in replacement.

packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php (1)

15-42: Well-structured exception class with proper encapsulation.

The implementation follows best practices with private property and public getter method. Good documentation and type hints throughout.

packages/php/email-editor/src/Engine/Renderer/class-html2text.php (2)

182-182: Good use of PHP 7.4 compatible functions.

I see you're using strpos instead of str_contains for PHP 7.4 compatibility. This is consistent throughout the file.


269-276: Excellent security improvement in error handling.

The HTML content is properly truncated in logs and a generic error message is thrown to prevent sensitive data exposure.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (4)

73-76: Consider using PHP_VERSION_ID for cleaner version comparison.

The current version check works but could be simplified.

-		if ( PHP_MAJOR_VERSION * 10 + PHP_MINOR_VERSION < 81 && mb_detect_encoding( $html, 'UTF-8', true ) ) {
+		if ( PHP_VERSION_ID < 80100 && mb_detect_encoding( $html, 'UTF-8', true ) ) {

225-225: Maintain consistency with PHP version checks.

For consistency with the earlier suggestion, consider using PHP_VERSION_ID here as well.

-		if ( PHP_MAJOR_VERSION * 10 + PHP_MINOR_VERSION >= 81 ) {
+		if ( PHP_VERSION_ID >= 80100 ) {

350-628: Consider refactoring this method to improve maintainability.

While the implementation is correct, this method spans 278 lines and handles many responsibilities. Consider extracting some logic into smaller helper methods, such as:

  • Link processing (lines 534-589)
  • Text node processing (lines 351-373)
  • Start/end whitespace handling for different elements

This would improve readability and make the code easier to test and maintain.


205-206: Note: DOMDocument usage may have performance implications.

Based on WooCommerce's general practices, DOMDocument is typically avoided for performance reasons. However, since this is replacing an external library that likely used similar DOM parsing, and email content is usually limited in size, this implementation should be acceptable.

Consider monitoring performance in production and potentially implementing a simpler regex-based approach if performance issues arise with large email content.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0d3f99 and d347702.

📒 Files selected for processing (1)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
🧠 Learnings (2)
📓 Common learnings
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Provide clear, step-by-step instructions for how to test the changes in the PR description.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59632
File: packages/js/email-editor/src/layouts/flex-email.tsx:116-122
Timestamp: 2025-07-14T10:41:46.200Z
Learning: In WooCommerce projects, formatting suggestions should respect the project's Prettier configuration and linting rules. Changes that would break the lint job should be avoided, even if they appear to improve readability.
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (12)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: samueljseay
PR: #59142
File: plugins/woocommerce/src/Blocks/BlockTypes/MiniCart.php:594-602
Timestamp: 2025-06-25T06:51:41.381Z
Learning: WooCommerce avoids using DOMDocument for performance reasons as a general rule of thumb in their codebase.

Learnt from: vladolaru
PR: #58784
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/HelioPay.php:65-68
Timestamp: 2025-06-18T10:14:44.690Z
Learning: In PHP namespaced code, leading backslashes () before constant names explicitly reference global namespace constants. This is correct practice and should not be flagged as an error when used in namespaced classes to reference global constants like plugin-defined constants.

Learnt from: vladolaru
PR: #59486
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/WooPayments/WooPaymentsService.php:1544-1544
Timestamp: 2025-07-08T11:18:07.871Z
Learning: WooCommerce has polyfills for newer PHP functions (like str_starts_with() from PHP 8.0+), so these functions can be safely used even though WooCommerce supports PHP 7.4+. No need to suggest PHP 7.4 compatible alternatives when polyfills are available.

Learnt from: jorgeatorres
PR: #59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in plugins/woocommerce/changelog/ directory and are processed during releases, not at the repository root level.

Learnt from: opr
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-20T17:38:16.565Z
Learning: WooCommerce legacy JavaScript files in plugins/woocommerce/client/legacy/js/ must use older JavaScript syntax and cannot use modern features like optional chaining (?.) due to browser compatibility requirements. Explicit null checking with && operators should be used instead.

Learnt from: samueljseay
PR: #58716
File: plugins/woocommerce/client/blocks/assets/js/blocks/mini-cart/iapi-frontend.ts:83-101
Timestamp: 2025-06-17T07:07:53.443Z
Learning: In WooCommerce blocks, when porting existing code patterns that have known issues (like parseInt truncating decimal money values), maintain consistency with existing implementation rather than making isolated fixes. The preference is for systematic refactoring approaches (like broader Dinero adoption) over piecemeal changes.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: Aljullu
PR: #58809
File: plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php:222-225
Timestamp: 2025-06-13T17:11:13.732Z
Learning: In plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php, the team intentionally relies on toggling the disabled CSS class (via data-wp-class--disabled) instead of binding the disabled attribute, to mirror the behavior of the classic WooCommerce template.

Learnt from: ralucaStan
PR: #58782
File: plugins/woocommerce/client/blocks/assets/js/base/utils/render-frontend.tsx:0-0
Timestamp: 2025-06-16T16:12:12.148Z
Learning: For WooCommerce checkout blocks, lazy loading was removed in favor of direct imports to prevent sequential "popping" effects during component loading. This approach prioritizes user experience over code splitting, with minimal bundle size impact and improved performance (1.7s to 1.1s speed score improvement). The checkout flow benefits from having all components load together rather than incrementally.

Learnt from: prettyboymp
PR: #59048
File: .github/workflows/cherry-pick-milestoned-prs.yml:60-83
Timestamp: 2025-06-26T12:45:40.709Z
Learning: WooCommerce uses WordPress versioning conventions where minor versions in X.Y.Z format are constrained to 0-9 (Y cannot exceed 9). This means version increment logic should reset minor to 0 and increment major when minor reaches 9, rather than allowing two-digit minor versions like 9.10 or 9.11.

Learnt from: mikejolley
PR: #57961
File: plugins/woocommerce/includes/class-wc-session-handler.php:302-333
Timestamp: 2025-06-19T11:58:57.484Z
Learning: In WooCommerce's session handler, the cart merging behavior was revised to always merge guest and user carts when both contain items, rather than having the guest cart take precedence. The migrate_cart_data() method in class-wc-session-handler.php implements this by using array_merge() to combine both carts when neither is empty.

🔇 Additional comments (2)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (2)

27-33: LGTM!

The default options are well-defined and properly typed.


1-20: Html2Text appropriately uses core PHP escaping

I’ve confirmed that Html2Text is a standalone utility with no dependencies on WordPress functions (no esc_html, esc_attr or wp_* calls in this class), and is invoked purely via Html2Text::convert() in the email-editor renderer. Relying on htmlspecialchars() here is intentional—this class is decoupled from WP and must work in any PHP context. No changes needed.

Comment on lines +20 to +629
class Html2Text {

/**
* Default options for HTML to text conversion
*
* @return array<string, bool|string> Default options array.
*/
public static function default_options(): array {
return array(
'ignore_errors' => false,
'drop_links' => false,
'char_set' => 'auto',
);
}

/**
* Converts HTML into plain text format
*
* @param string $html The input HTML.
* @param boolean|array<string, bool|string> $options Conversion options.
* @return string The HTML converted to text.
* @throws Html2Text_Exception|\InvalidArgumentException If the HTML could not be loaded or invalid options are provided.
*/
public static function convert( string $html, $options = array() ): string {

if ( false === $options || true === $options ) {
// Using old style (< 1.0) of passing in options.
$options = array( 'ignore_errors' => $options );
}

$options = array_merge( static::default_options(), $options );

// Check all options are valid.
foreach ( array_keys( $options ) as $key ) {
if ( ! in_array( $key, array_keys( static::default_options() ), true ) ) {
// Log invalid option for debugging purposes without exposing in exception.
// phpcs:ignore WordPress.PHP.DevelopmentFunctions.error_log_error_log -- Security: Logging sensitive data separately from user-facing exception messages.
error_log( 'Html2Text: Invalid option provided: ' . htmlspecialchars( (string) $key, ENT_QUOTES, 'UTF-8' ) . '. Valid options are: ' . htmlspecialchars( implode( ',', array_keys( static::default_options() ) ), ENT_QUOTES, 'UTF-8' ) );
// Throw generic error message to avoid exposing user input.
throw new \InvalidArgumentException( 'Invalid option provided for html2text conversion.' );
}
}

$is_office_document = self::is_office_document( $html );

if ( $is_office_document ) {
// Remove office namespace.
$html = str_replace( array( '<o:p>', '</o:p>' ), '', $html );
}

$html = self::fix_newlines( $html );

// Use mb_convert_encoding for legacy versions of php.
if ( PHP_MAJOR_VERSION * 10 + PHP_MINOR_VERSION < 81 && mb_detect_encoding( $html, 'UTF-8', true ) ) {
$converted = mb_convert_encoding( $html, 'HTML-ENTITIES', 'UTF-8' );
$html = false !== $converted ? $converted : $html;
}

// Ensure $html is always a string before passing to get_document.
if ( ! is_string( $html ) ) {
$html = (string) $html;
}

$doc = self::get_document( $html, $options );

$output = self::iterate_over_node( $doc, null, false, $is_office_document, $options );

// Process output for whitespace/newlines.
$output = self::process_whitespace_newlines( $output );

return $output;
}

/**
* Unify newlines
*
* Converts \r\n to \n, and \r to \n. This means that all newlines
* (Unix, Windows, Mac) all become \ns.
*
* @param string $text Text with any number of \r, \r\n and \n combinations.
* @return string The fixed text.
*/
public static function fix_newlines( string $text ): string {
// Replace \r\n to \n.
$text = str_replace( "\r\n", "\n", $text );
// Remove \rs.
$text = str_replace( "\r", "\n", $text );

return $text;
}

/**
* Get non-breaking space character codes
*
* @return array<string> Array of nbsp codes.
*/
public static function nbsp_codes(): array {
return array(
"\xc2\xa0",
"\u00a0",
);
}

/**
* Get zero-width non-joiner character codes
*
* @return array<string> Array of zwnj codes.
*/
public static function zwnj_codes(): array {
return array(
"\xe2\x80\x8c",
"\u200c",
);
}

/**
* Remove leading or trailing spaces and excess empty lines from provided multiline text
*
* @param string $text Multiline text with any number of leading or trailing spaces or excess lines.
* @return string The fixed text.
*/
public static function process_whitespace_newlines( string $text ): string {

// Remove excess spaces around tabs.
$result = preg_replace( '/ *\t */im', "\t", $text );
$text = null !== $result ? $result : $text;

// Remove leading whitespace.
$text = ltrim( $text );

// Remove leading spaces on each line.
$result = preg_replace( "/\n[ \t]*/im", "\n", $text );
$text = null !== $result ? $result : $text;

// Convert non-breaking spaces to regular spaces to prevent output issues,
// do it here so they do NOT get removed with other leading spaces, as they
// are sometimes used for indentation.
$text = self::render_text( $text );

// Remove trailing whitespace.
$text = rtrim( $text );

// Remove trailing spaces on each line.
$result = preg_replace( "/[ \t]*\n/im", "\n", $text );
$text = null !== $result ? $result : $text;

// Unarmor pre blocks.
$text = self::fix_newlines( $text );

// Remove unnecessary empty lines.
$result = preg_replace( "/\n\n\n*/im", "\n\n", $text );

return null !== $result ? $result : $text;
}

/**
* Can we guess that this HTML is generated by Microsoft Office?
*
* @param string $html The HTML content.
* @return bool True if this appears to be an Office document.
*/
public static function is_office_document( string $html ): bool {
return strpos( $html, 'urn:schemas-microsoft-com:office' ) !== false;
}

/**
* Check if text is whitespace
*
* @param string $text The text to check.
* @return bool True if the text is whitespace.
*/
public static function is_whitespace( string $text ): bool {
return 0 === strlen( trim( self::render_text( $text ), "\n\r\t " ) );
}

/**
* Parse HTML into a DOMDocument
*
* @param string $html The input HTML.
* @param array<string, bool|string> $options Parsing options.
* @return \DOMDocument The parsed document tree.
* @throws Html2Text_Exception If the HTML could not be loaded.
*/
private static function get_document( string $html, array $options ): \DOMDocument {

$doc = new \DOMDocument();

$html = trim( $html );

if ( ! $html ) {
// DOMDocument doesn't support empty value and throws an error.
// Return empty document instead.
return $doc;
}

if ( '<' !== $html[0] ) {
// If HTML does not begin with a tag, we put a body tag around it.
// If we do not do this, PHP will insert a paragraph tag around
// the first block of text for some reason which can mess up
// the newlines. See pre.html test for an example.
$html = '<body>' . $html . '</body>';
}

$header = '';
// Use char sets for modern versions of php.
if ( PHP_MAJOR_VERSION * 10 + PHP_MINOR_VERSION >= 81 ) {
// Use specified char_set, or auto detect if not set.
$char_set = ! empty( $options['char_set'] ) && is_string( $options['char_set'] ) ? $options['char_set'] : 'auto';
if ( 'auto' === $char_set ) {
$detected = mb_detect_encoding( $html );
$char_set = false !== $detected ? $detected : 'UTF-8';
} elseif ( strpos( $char_set, ',' ) !== false ) {
$encoding_list = explode( ',', $char_set );
$encoding_list = array_map( 'trim', $encoding_list );
$encoding_list = array_filter(
$encoding_list,
function ( $encoding ) {
return ! empty( $encoding );
}
);
if ( ! empty( $encoding_list ) ) {
// Ensure we have a proper list with consecutive integer keys.
$encoding_list = array_values( $encoding_list );
mb_detect_order( $encoding_list );
$detected = mb_detect_encoding( $html );
$char_set = false !== $detected ? $detected : 'UTF-8';
}
}
// Turn off error detection for Windows-1252 legacy html.
if ( strpos( $char_set, '1252' ) !== false ) {
$options['ignore_errors'] = true;
}
$header = '<?xml version="1.0" encoding="' . $char_set . '">';
}

if ( ! empty( $options['ignore_errors'] ) ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$doc->strictErrorChecking = false;
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$doc->recover = true;
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$doc->xmlStandalone = true;
$old_internal_errors = libxml_use_internal_errors( true );
$load_result = $doc->loadHTML( $header . $html, LIBXML_NOWARNING | LIBXML_NOERROR | LIBXML_NONET | LIBXML_PARSEHUGE );
libxml_use_internal_errors( $old_internal_errors );
} else {
$load_result = $doc->loadHTML( $header . $html );
}

if ( ! $load_result ) {
// Log truncated HTML content for debugging purposes (limit to 500 chars to prevent log bloat).
$html_preview = strlen( $html ) > 500 ? substr( $html, 0, 500 ) . '...[truncated]' : $html;
// phpcs:ignore WordPress.PHP.DevelopmentFunctions.error_log_error_log -- Security: Logging sensitive data separately from user-facing exception messages.
error_log( 'Html2Text: Failed to load HTML content: ' . htmlspecialchars( $html_preview, ENT_QUOTES, 'UTF-8' ) );
// Throw a generic error message to avoid exposing sensitive data.
throw new Html2Text_Exception( 'Could not load HTML - the content may be malformed.' );
}

return $doc;
}

/**
* Replace any special characters with simple text versions
*
* This prevents output issues:
* - Convert non-breaking spaces to regular spaces; and
* - Convert zero-width non-joiners to '' (nothing).
*
* This is to match our goal of rendering documents as they would be rendered
* by a browser.
*
* @param string $text The text to process.
* @return string The processed text.
*/
private static function render_text( string $text ): string {
$text = str_replace( self::nbsp_codes(), ' ', $text );
$text = str_replace( self::zwnj_codes(), '', $text );
return $text;
}

/**
* Get the next child name
*
* @param \DOMNode|null $node The node to check.
* @return string|null The next child name.
*/
private static function next_child_name( ?\DOMNode $node ): ?string {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
if ( null === $node || null === $node->nextSibling ) {
return null;
}

// Get the next child.
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$next_node = $node->nextSibling;
while ( null !== $next_node ) {
if ( $next_node instanceof \DOMText ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
if ( ! self::is_whitespace( $next_node->wholeText ) ) {
break;
}
}

if ( $next_node instanceof \DOMElement ) {
break;
}

// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$next_node = $next_node->nextSibling;
}

$next_name = null;
if ( $next_node instanceof \DOMElement || $next_node instanceof \DOMText ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$next_name = strtolower( $next_node->nodeName );
}

return $next_name;
}

/**
* Iterate over a DOM node and convert to text
*
* @param \DOMNode $node The DOM node.
* @param string|null $prev_name Previous node name.
* @param bool $in_pre Whether we're in a pre block.
* @param bool $is_office_document Whether this is an Office document.
* @param array<string, bool|string> $options Conversion options.
* @return string The converted text.
*/
private static function iterate_over_node( \DOMNode $node, ?string $prev_name, bool $in_pre, bool $is_office_document, array $options ): string {
if ( $node instanceof \DOMText ) {
// Replace whitespace characters with a space (equivalent to \s).
if ( $in_pre ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$text = "\n" . trim( self::render_text( $node->wholeText ), "\n\r\t " ) . "\n";

// Remove trailing whitespace only.
$result = preg_replace( "/[ \t]*\n/im", "\n", $text );
$text = null !== $result ? $result : $text;

// Armor newlines with \r.
return str_replace( "\n", "\r", $text );
}
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$text = self::render_text( $node->wholeText );
$result = preg_replace( "/[\\t\\n\\f\\r ]+/im", ' ', $text );
$text = null !== $result ? $result : $text;

if ( ! self::is_whitespace( $text ) && ( 'p' === $prev_name || 'div' === $prev_name ) ) {
return "\n" . $text;
}
return $text;
}

if ( $node instanceof \DOMDocumentType || $node instanceof \DOMProcessingInstruction ) {
// Ignore.
return '';
}

// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$name = strtolower( $node->nodeName );
$next_name = self::next_child_name( $node );

// Start whitespace.
switch ( $name ) {
case 'hr':
$prefix = '';
if ( null !== $prev_name ) {
$prefix = "\n";
}
return $prefix . "---------------------------------------------------------------\n";

case 'style':
case 'head':
case 'title':
case 'meta':
case 'script':
// Ignore these tags.
return '';

case 'h1':
case 'h2':
case 'h3':
case 'h4':
case 'h5':
case 'h6':
case 'ol':
case 'ul':
case 'pre':
// Add two newlines.
$output = "\n\n";
break;

case 'td':
case 'th':
// Add tab char to separate table fields.
$output = "\t";
break;

case 'p':
// Microsoft exchange emails often include HTML which, when passed through
// html2text, results in lots of double line returns everywhere.
//
// To fix this, for any p element with a className of `MsoNormal` (the standard
// classname in any Microsoft export or outlook for a paragraph that behaves
// like a line return) we skip the first line returns and set the name to br.
if ( $is_office_document && $node instanceof \DOMElement && 'MsoNormal' === $node->getAttribute( 'class' ) ) {
$output = '';
$name = 'br';
break;
}

// Add two lines.
$output = "\n\n";
break;

case 'tr':
// Add one line.
$output = "\n";
break;

case 'div':
$output = '';
if ( null !== $prev_name ) {
// Add one line.
$output .= "\n";
}
break;

case 'li':
$output = '- ';
break;

default:
// Print out contents of unknown tags.
$output = '';
break;
}

// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
if ( $node->childNodes->length > 0 ) {

// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$n = $node->childNodes->item( 0 );
$previous_sibling_names = array();
$previous_sibling_name = null;
$parts = array();
$trailing_whitespace = 0;

while ( null !== $n ) {

$text = self::iterate_over_node( $n, $previous_sibling_name, $in_pre || 'pre' === $name, $is_office_document, $options );

// Pass current node name to next child, as previousSibling does not appear to get populated.
if ( $n instanceof \DOMDocumentType
|| $n instanceof \DOMProcessingInstruction
|| ( $n instanceof \DOMText && self::is_whitespace( $text ) ) ) {
// Keep current previousSiblingName, these are invisible.
++$trailing_whitespace;
} else {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$previous_sibling_name = strtolower( $n->nodeName );
$previous_sibling_names[] = $previous_sibling_name;
$trailing_whitespace = 0;
}

$node->removeChild( $n );
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
$n = $node->childNodes->item( 0 );

$parts[] = $text;
}

// Remove trailing whitespace, important for the br check below.
while ( $trailing_whitespace-- > 0 ) {
array_pop( $parts );
}

// Suppress last br tag inside a node list if follows text.
$last_name = array_pop( $previous_sibling_names );
if ( 'br' === $last_name ) {
$last_name = array_pop( $previous_sibling_names );
if ( '#text' === $last_name ) {
array_pop( $parts );
}
}

$output .= implode( '', $parts );
}

// End whitespace.
switch ( $name ) {
case 'h1':
case 'h2':
case 'h3':
case 'h4':
case 'h5':
case 'h6':
case 'pre':
case 'p':
// Add two lines.
$output .= "\n\n";
break;

case 'br':
// Add one line.
$output .= "\n";
break;

case 'div':
break;

case 'a':
// Links are returned in [text](link) format.
$href = $node instanceof \DOMElement ? $node->getAttribute( 'href' ) : '';

$output = trim( $output );

// Remove double [[ ]] s from linking images.
if ( '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
$output = substr( $output, 1, strlen( $output ) - 2 );

// For linking images, the title of the <a> overrides the title of the <img>.
if ( $node instanceof \DOMElement && $node->getAttribute( 'title' ) ) {
$output = $node->getAttribute( 'title' );
}
}

// If there is no link text, but a title attr.
if ( ! $output && $node instanceof \DOMElement && $node->getAttribute( 'title' ) ) {
$output = $node->getAttribute( 'title' );
}

if ( ! $href ) {
// It doesn't link anywhere.
if ( $node instanceof \DOMElement && $node->getAttribute( 'name' ) ) {
if ( $options['drop_links'] ) {
$output = "$output";
} else {
$output = "[$output]";
}
}
} elseif ( $href === $output || "mailto:$output" === $href || "http://$output" === $href || "https://$output" === $href ) {
// Link to the same address: just use link.
$output = "$output";
} elseif ( $output ) {
// Replace it.
if ( $options['drop_links'] ) {
$output = "$output";
} else {
$output = "[$output]($href)";
}
} else {
// Empty string.
$output = "$href";
}

// Does the next node require additional whitespace?
switch ( $next_name ) {
case 'h1':
case 'h2':
case 'h3':
case 'h4':
case 'h5':
case 'h6':
$output .= "\n";
break;
}
break;

case 'img':
if ( $node instanceof \DOMElement && $node->getAttribute( 'title' ) ) {
$output = '[' . $node->getAttribute( 'title' ) . ']';
} elseif ( $node instanceof \DOMElement && $node->getAttribute( 'alt' ) ) {
$output = '[' . $node->getAttribute( 'alt' ) . ']';
} else {
$output = '';
}
break;

case 'li':
$output .= "\n";
break;

case 'blockquote':
// Process quoted text for whitespace/newlines.
$output = self::process_whitespace_newlines( $output );

// Add leading newline.
$output = "\n" . $output;

// Prepend '> ' at the beginning of all lines.
$result = preg_replace( "/\n/im", "\n> ", $output );
$output = null !== $result ? $result : $output;

// Replace leading '> >' with '>>'.
$result = preg_replace( "/\n> >/im", "\n>>", $output );
$output = null !== $result ? $result : $output;

// Add another leading newline and trailing newlines.
$output = "\n" . $output . "\n\n";
break;
default:
// Do nothing.
}

return $output;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add unit tests for the Html2Text class.

This implementation handles many edge cases and complex scenarios (Office documents, various HTML elements, encoding issues, etc.). Please add comprehensive unit tests to ensure the conversion behaves correctly for:

  • Basic HTML elements (p, div, br, etc.)
  • Links with various formats
  • Images with alt/title attributes
  • Blockquotes and nested elements
  • Office document handling
  • Character encoding edge cases
  • Error scenarios (malformed HTML, invalid options)

Would you like me to help generate a comprehensive test suite for this class?

🤖 Prompt for AI Agents
In packages/php/email-editor/src/Engine/Renderer/class-html2text.php around
lines 20 to 629, the Html2Text class lacks unit tests covering its complex HTML
to text conversion logic. To fix this, create a comprehensive test suite that
verifies correct conversion for basic HTML elements like p, div, and br; tests
link formatting including mailto and http variations; checks image alt and title
attribute handling; covers blockquotes and nested elements; validates Office
document specific processing; tests character encoding edge cases; and ensures
proper error handling for malformed HTML and invalid options. Use PHPUnit or a
similar framework to implement these tests.

@allilevine allilevine added the package: @woocommerce/email-editor issues related to the @woocommerce/email-editor package label Jul 21, 2025
Copy link
Contributor

@jeherve jeherve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth bringing some of the tests from soundasleep/html2text's tests/Html2TextTest.php alongside with the class, to ensure there were no regressions?

Maybe we could also extend the tests in packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php to check for HTML conversion? We do have a few existing tests that ensure things are working (like testItRendersPostWithinAssociatedTemplate), but maybe there could be an additional one with a data provider that would check for a few usual tags, like divs, links, and maybe special entities?

Copy link
Contributor

@pavel-mailpoet pavel-mailpoet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @allilevine thank you for preparing the pull request.

Apart from what Jeremy wrote, I think it would be good to include a changelog in this pull request. I saw your comment, but I still think we should add it. Otherwise, when we publish a new version of the library without a changelog, people wouldn’t know what the difference is between the versions.

@costasovo
Copy link
Contributor

costasovo commented Jul 22, 2025

👋 Is there a way to test this integration?

@allilevine Woo itself doesn't use the text version from the editor package. There are a couple of test cases in the integration tests that check the text version.

It is also possible to check in this place by logging $rendered_email_data['text'] when a transaction email is being sent (e.g. new order created).
This needs the block editor feature to be activated in WooCommerce > Settings > Advanced > Features

@allilevine allilevine force-pushed the wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor branch from d347702 to 2cd38ad Compare July 22, 2025 13:40
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (1)

20-629: Add comprehensive unit tests for the Html2Text class.

This implementation handles many complex scenarios that require thorough testing. Based on the retrieved learnings, unit tests are expected for new code. Please add tests covering:

  • Basic HTML element conversion (p, div, br, h1-h6)
  • Link formatting with various href patterns
  • Image alt/title text handling
  • List and blockquote formatting
  • Microsoft Office document handling
  • Character encoding edge cases
  • Error scenarios (malformed HTML, invalid options)
  • Whitespace and newline processing

Would you like me to generate a comprehensive PHPUnit test suite for this class?

🧹 Nitpick comments (2)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (2)

78-81: Remove unnecessary type checking for string parameter.

The $html parameter is already declared with a string type hint, making this type check redundant. PHP will enforce the type at the function boundary.

-		// Ensure $html is always a string before passing to get_document.
-		if ( ! is_string( $html ) ) {
-			$html = (string) $html;
-		}

487-490: Potential issue with DOM manipulation during iteration.

Removing child nodes while iterating ($node->removeChild($n)) modifies the DOM structure during traversal. This approach works because you're always getting item(0) after removal, but it's mutating the original DOM unnecessarily.

Consider iterating without removing nodes:

-			$node->removeChild( $n );
-			// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
-			$n = $node->childNodes->item( 0 );
+			// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
+			$n = $n->nextSibling;
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d347702 and 2cd38ad.

⛔ Files ignored due to path filters (1)
  • packages/php/email-editor/composer.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • packages/php/email-editor/composer.json (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php (1 hunks)
  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php (1 hunks)
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
🧠 Learnings (2)
📓 Common learnings
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Provide clear, step-by-step instructions for how to test the changes in the PR description.
Learnt from: triple0t
PR: woocommerce/woocommerce#59186
File: packages/js/email-editor/src/store/initial-state.ts:9-10
Timestamp: 2025-06-26T12:13:32.062Z
Learning: In WooCommerce email editor store initialization (packages/js/email-editor/src/store/initial-state.ts), the current_post_id and current_post_type from window.WooCommerceEmailEditor are required parameters that should cause explicit errors if missing, rather than using fallback values or optional chaining. The design preference is to fail fast when critical initialization data is unavailable.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59632
File: packages/js/email-editor/src/layouts/flex-email.tsx:116-122
Timestamp: 2025-07-14T10:41:46.200Z
Learning: In WooCommerce projects, formatting suggestions should respect the project's Prettier configuration and linting rules. Changes that would break the lint job should be avoided, even if they appear to improve readability.
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (13)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: samueljseay
PR: #59142
File: plugins/woocommerce/src/Blocks/BlockTypes/MiniCart.php:594-602
Timestamp: 2025-06-25T06:51:41.381Z
Learning: WooCommerce avoids using DOMDocument for performance reasons as a general rule of thumb in their codebase.

Learnt from: vladolaru
PR: #58784
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/HelioPay.php:65-68
Timestamp: 2025-06-18T10:14:44.690Z
Learning: In PHP namespaced code, leading backslashes () before constant names explicitly reference global namespace constants. This is correct practice and should not be flagged as an error when used in namespaced classes to reference global constants like plugin-defined constants.

Learnt from: vladolaru
PR: #59486
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/WooPayments/WooPaymentsService.php:1544-1544
Timestamp: 2025-07-08T11:18:07.871Z
Learning: WooCommerce has polyfills for newer PHP functions (like str_starts_with() from PHP 8.0+), so these functions can be safely used even though WooCommerce supports PHP 7.4+. No need to suggest PHP 7.4 compatible alternatives when polyfills are available.

Learnt from: jorgeatorres
PR: #59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in plugins/woocommerce/changelog/ directory and are processed during releases, not at the repository root level.

Learnt from: opr
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-20T17:38:16.565Z
Learning: WooCommerce legacy JavaScript files in plugins/woocommerce/client/legacy/js/ must use older JavaScript syntax and cannot use modern features like optional chaining (?.) due to browser compatibility requirements. Explicit null checking with && operators should be used instead.

Learnt from: samueljseay
PR: #58716
File: plugins/woocommerce/client/blocks/assets/js/blocks/mini-cart/iapi-frontend.ts:83-101
Timestamp: 2025-06-17T07:07:53.443Z
Learning: In WooCommerce blocks, when porting existing code patterns that have known issues (like parseInt truncating decimal money values), maintain consistency with existing implementation rather than making isolated fixes. The preference is for systematic refactoring approaches (like broader Dinero adoption) over piecemeal changes.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: Aljullu
PR: #58809
File: plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php:222-225
Timestamp: 2025-06-13T17:11:13.732Z
Learning: In plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php, the team intentionally relies on toggling the disabled CSS class (via data-wp-class--disabled) instead of binding the disabled attribute, to mirror the behavior of the classic WooCommerce template.

Learnt from: ralucaStan
PR: #58782
File: plugins/woocommerce/client/blocks/assets/js/base/utils/render-frontend.tsx:0-0
Timestamp: 2025-06-16T16:12:12.148Z
Learning: For WooCommerce checkout blocks, lazy loading was removed in favor of direct imports to prevent sequential "popping" effects during component loading. This approach prioritizes user experience over code splitting, with minimal bundle size impact and improved performance (1.7s to 1.1s speed score improvement). The checkout flow benefits from having all components load together rather than incrementally.

Learnt from: prettyboymp
PR: #59048
File: .github/workflows/cherry-pick-milestoned-prs.yml:60-83
Timestamp: 2025-06-26T12:45:40.709Z
Learning: WooCommerce uses WordPress versioning conventions where minor versions in X.Y.Z format are constrained to 0-9 (Y cannot exceed 9). This means version increment logic should reset minor to 0 and increment major when minor reaches 9, rather than allowing two-digit minor versions like 9.10 or 9.11.

Learnt from: mikejolley
PR: #57961
File: plugins/woocommerce/includes/class-wc-session-handler.php:302-333
Timestamp: 2025-06-19T11:58:57.484Z
Learning: In WooCommerce's session handler, the cart merging behavior was revised to always merge guest and user carts when both contain items, rather than having the guest cart take precedence. The migrate_cart_data() method in class-wc-session-handler.php implements this by using array_merge() to combine both carts when neither is empty.

Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/code-quality.mdc:0-0
Timestamp: 2025-07-18T14:54:36.232Z
Learning: Applies to **/*.{php,js,jsx,ts,tsx} : Ensure code has unit or E2E tests where applicable

✅ Files skipped from review due to trivial changes (2)
  • packages/php/email-editor/composer.json
  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/php/email-editor/src/Engine/Renderer/class-html2text-exception.php
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/src/Engine/Renderer/class-html2text.php
🧠 Learnings (2)
📓 Common learnings
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Provide clear, step-by-step instructions for how to test the changes in the PR description.
Learnt from: triple0t
PR: woocommerce/woocommerce#59186
File: packages/js/email-editor/src/store/initial-state.ts:9-10
Timestamp: 2025-06-26T12:13:32.062Z
Learning: In WooCommerce email editor store initialization (packages/js/email-editor/src/store/initial-state.ts), the current_post_id and current_post_type from window.WooCommerceEmailEditor are required parameters that should cause explicit errors if missing, rather than using fallback values or optional chaining. The design preference is to fail fast when critical initialization data is unavailable.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59632
File: packages/js/email-editor/src/layouts/flex-email.tsx:116-122
Timestamp: 2025-07-14T10:41:46.200Z
Learning: In WooCommerce projects, formatting suggestions should respect the project's Prettier configuration and linting rules. Changes that would break the lint job should be avoided, even if they appear to improve readability.
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (13)

Learnt from: lysyjan
PR: #59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.

Learnt from: samueljseay
PR: #59142
File: plugins/woocommerce/src/Blocks/BlockTypes/MiniCart.php:594-602
Timestamp: 2025-06-25T06:51:41.381Z
Learning: WooCommerce avoids using DOMDocument for performance reasons as a general rule of thumb in their codebase.

Learnt from: vladolaru
PR: #58784
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/HelioPay.php:65-68
Timestamp: 2025-06-18T10:14:44.690Z
Learning: In PHP namespaced code, leading backslashes () before constant names explicitly reference global namespace constants. This is correct practice and should not be flagged as an error when used in namespaced classes to reference global constants like plugin-defined constants.

Learnt from: vladolaru
PR: #59486
File: plugins/woocommerce/src/Internal/Admin/Settings/PaymentsProviders/WooPayments/WooPaymentsService.php:1544-1544
Timestamp: 2025-07-08T11:18:07.871Z
Learning: WooCommerce has polyfills for newer PHP functions (like str_starts_with() from PHP 8.0+), so these functions can be safely used even though WooCommerce supports PHP 7.4+. No need to suggest PHP 7.4 compatible alternatives when polyfills are available.

Learnt from: jorgeatorres
PR: #59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in plugins/woocommerce/changelog/ directory and are processed during releases, not at the repository root level.

Learnt from: opr
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-20T17:38:16.565Z
Learning: WooCommerce legacy JavaScript files in plugins/woocommerce/client/legacy/js/ must use older JavaScript syntax and cannot use modern features like optional chaining (?.) due to browser compatibility requirements. Explicit null checking with && operators should be used instead.

Learnt from: samueljseay
PR: #58716
File: plugins/woocommerce/client/blocks/assets/js/blocks/mini-cart/iapi-frontend.ts:83-101
Timestamp: 2025-06-17T07:07:53.443Z
Learning: In WooCommerce blocks, when porting existing code patterns that have known issues (like parseInt truncating decimal money values), maintain consistency with existing implementation rather than making isolated fixes. The preference is for systematic refactoring approaches (like broader Dinero adoption) over piecemeal changes.

Learnt from: NeosinneR
PR: woocommerce/woocommerce#0
File: :0-0
Timestamp: 2025-06-26T14:25:08.421Z
Learning: In WooCommerce transactional emails, product images have historically had display issues due to lazy loading attributes being applied, which email clients cannot process since they don't execute JavaScript. This issue existed in both old and new email templates, but became more visible with the new email template system. The fix involves preventing lazy loading on attachment images specifically during email generation by adding skip classes and data attributes.

Learnt from: Aljullu
PR: #58809
File: plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php:222-225
Timestamp: 2025-06-13T17:11:13.732Z
Learning: In plugins/woocommerce/src/Blocks/BlockTypes/ProductButton.php, the team intentionally relies on toggling the disabled CSS class (via data-wp-class--disabled) instead of binding the disabled attribute, to mirror the behavior of the classic WooCommerce template.

Learnt from: ralucaStan
PR: #58782
File: plugins/woocommerce/client/blocks/assets/js/base/utils/render-frontend.tsx:0-0
Timestamp: 2025-06-16T16:12:12.148Z
Learning: For WooCommerce checkout blocks, lazy loading was removed in favor of direct imports to prevent sequential "popping" effects during component loading. This approach prioritizes user experience over code splitting, with minimal bundle size impact and improved performance (1.7s to 1.1s speed score improvement). The checkout flow benefits from having all components load together rather than incrementally.

Learnt from: prettyboymp
PR: #59048
File: .github/workflows/cherry-pick-milestoned-prs.yml:60-83
Timestamp: 2025-06-26T12:45:40.709Z
Learning: WooCommerce uses WordPress versioning conventions where minor versions in X.Y.Z format are constrained to 0-9 (Y cannot exceed 9). This means version increment logic should reset minor to 0 and increment major when minor reaches 9, rather than allowing two-digit minor versions like 9.10 or 9.11.

Learnt from: mikejolley
PR: #57961
File: plugins/woocommerce/includes/class-wc-session-handler.php:302-333
Timestamp: 2025-06-19T11:58:57.484Z
Learning: In WooCommerce's session handler, the cart merging behavior was revised to always merge guest and user carts when both contain items, rather than having the guest cart take precedence. The migrate_cart_data() method in class-wc-session-handler.php implements this by using array_merge() to combine both carts when neither is empty.

Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/code-quality.mdc:0-0
Timestamp: 2025-07-18T14:54:36.232Z
Learning: Applies to **/*.{php,js,jsx,ts,tsx} : Ensure code has unit or E2E tests where applicable

🔇 Additional comments (1)
packages/php/email-editor/src/Engine/Renderer/class-html2text.php (1)

203-279: Well-implemented HTML parsing with proper security considerations.

The method correctly handles:

  • Empty HTML edge cases
  • Character encoding detection for different PHP versions
  • Security through LIBXML_NONET flag to prevent external entity resolution
  • Sensitive data protection by truncating logged HTML content

Comment on lines +540 to +541
if ( '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
$output = substr( $output, 1, strlen( $output ) - 2 );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add bounds checking for string operations.

The substr operations assume the string has specific characters at the boundaries without explicit length checks.

-				if ( '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
+				if ( strlen( $output ) >= 2 && '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if ( '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
$output = substr( $output, 1, strlen( $output ) - 2 );
if ( strlen( $output ) >= 2 && '[' === substr( $output, 0, 1 ) && ']' === substr( $output, -1 ) ) {
$output = substr( $output, 1, strlen( $output ) - 2 );
🤖 Prompt for AI Agents
In packages/php/email-editor/src/Engine/Renderer/class-html2text.php around
lines 540 to 541, the code uses substr to check and trim the first and last
characters of $output without verifying the string length. Add explicit length
checks to ensure $output is at least 2 characters long before performing these
substr operations to avoid potential errors or unexpected behavior with shorter
strings.

@prettyboymp
Copy link
Contributor

Do we need to include any original copyright lines and a copy of the MIT license to include this?

@jeherve
Copy link
Contributor

jeherve commented Jul 22, 2025

Do we need to include any original copyright lines and a copy of the MIT license to include this?

That's an excellent point, and a good question.

Internal reference to figure that out: p2y3YZ-9Ry-p2

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php (2)

287-292: Use regex assertions to avoid brittle new-line expectations

Some cases (heading_tags, list_items, ordered_list) rely on exact \n placement or the - bullet.
If the converter later switches to PHP_EOL, * or numbered lists, these tests will fail even though the output is still correct.
Favour assertMatchesRegularExpression() with a tolerant pattern instead.


431-463: Silence static-analysis false-positive & clarify intent

empty( $case_data[1] ) is flagged as always truthy/falsy.
Replace with a more explicit check:

$expected = $case_data[1];
if ( $expected === '' ) {
    $this->assertSame( '', trim( $rendered['text'] ), "Failed edge case: {$case_name}" );
} else {
    $this->assertStringContainsString( $expected, $rendered['text'], "Failed edge case: {$case_name}" );
}

This removes the offset warning and improves readability.

packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php (2)

65-72: Brittle expectation for <hr> conversion

The test hard-codes a 63-character dash line. The converter may legitimately choose a different length without breaking semantics.
Assert on the presence of at least three consecutive dashes instead, e.g.:

$this->assertMatchesRegularExpression( '/Before\s+\-{3,}\s+After/', $result );

153-158: Tie exception expectation to concrete implementation

You expect \InvalidArgumentException, but Html2Text already ships its own Html2Text_Exception.
Targeting the library-specific exception prevents false positives if internal validation is refactored:

$this->expectException( Html2Text_Exception::class );
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cd38ad and dda69fa.

📒 Files selected for processing (3)
  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php (0 hunks)
  • packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php (1 hunks)
  • packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php (1 hunks)
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php
  • packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php
  • packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php
🧠 Learnings (1)
📓 Common learnings
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Applies to .github/PULL_REQUEST_TEMPLATE.md : Preserve required markdown sections in the PR description, especially those used by automation (e.g., changelog entry details and comments). Do not remove or alter them.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Provide clear, step-by-step instructions for how to test the changes in the PR description.
🪛 GitHub Check: PHPStan: PHP 8.4 - @woocommerce/email-editor-config [unit]
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php

[failure] 457-457:
Offset 1 on array{'…'|'<!-- wp:paragraph -…', ''} in empty() always exists and is always falsy.


[failure] 452-452:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.


[failure] 404-404:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.

🪛 GitHub Check: PHPStan: PHP 7.4 - @woocommerce/email-editor-config [unit]
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php

[failure] 457-457:
Offset 1 on array{'…'|'<!-- wp:paragraph -…', ''} in empty() always exists and is always falsy.


[failure] 452-452:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.


[failure] 404-404:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.

💤 Files with no reviewable changes (1)
  • packages/php/email-editor/src/Engine/Renderer/class-renderer.php
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{php,js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/code-quality.mdc)

**/*.{php,js,jsx,ts,tsx}: Guard against unexpected inputs
Sanitize and validate any potentially dangerous inputs
Ensure code is backwards compatible
Write code that is readable and intuitive
Ensure code has unit or E2E tests where applicable

Files:

  • packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php
  • packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php
**/*.{php,js,ts,jsx,tsx}

⚙️ CodeRabbit Configuration File

**/*.{php,js,ts,jsx,tsx}: Don't trust that extension developers will follow the best practices, make sure the code:

  • Guards against unexpected inputs.
  • Sanitizes and validates any potentially dangerous inputs.
  • Is backwards compatible.
  • Is readable and intuitive.
  • Has unit or E2E tests where applicable.

Files:

  • packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php
  • packages/php/email-editor/tests/unit/Engine/Renderer/Html2Text_Test.php
🧠 Learnings (1)
📓 Common learnings
Learnt from: jorgeatorres
PR: woocommerce/woocommerce#59675
File: .github/workflows/release-bump-as-requirement.yml:48-65
Timestamp: 2025-07-15T15:39:21.856Z
Learning: In WooCommerce core repository, changelog entries for all PRs live in `plugins/woocommerce/changelog/` directory and are processed during releases, not at the repository root level.
Learnt from: lysyjan
PR: woocommerce/woocommerce#59070
File: packages/php/email-editor/src/Integrations/Core/class-initializer.php:103-141
Timestamp: 2025-06-23T16:55:58.246Z
Learning: The core/spacer block in the WooCommerce email editor intentionally does not have a dedicated renderer class and is handled by the fallback renderer instead of having an explicit render_email_callback assigned in the switch statement.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Applies to .github/PULL_REQUEST_TEMPLATE.md : Preserve required markdown sections in the PR description, especially those used by automation (e.g., changelog entry details and comments). Do not remove or alter them.
Learnt from: CR
PR: woocommerce/woocommerce#0
File: .cursor/rules/generate-pr-description.mdc:0-0
Timestamp: 2025-07-21T05:22:46.417Z
Learning: Provide clear, step-by-step instructions for how to test the changes in the PR description.
🪛 GitHub Check: PHPStan: PHP 8.4 - @woocommerce/email-editor-config [unit]
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php

[failure] 457-457:
Offset 1 on array{'…'|'<!-- wp:paragraph -…', ''} in empty() always exists and is always falsy.


[failure] 452-452:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.


[failure] 404-404:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.

🪛 GitHub Check: PHPStan: PHP 7.4 - @woocommerce/email-editor-config [unit]
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php

[failure] 457-457:
Offset 1 on array{'…'|'<!-- wp:paragraph -…', ''} in empty() always exists and is always falsy.


[failure] 452-452:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.


[failure] 404-404:
Parameter #1 $post of function get_post expects int|WP_Post|null, int|WP_Error given.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: PHP: 8.1 WP: latest [WP latest] - @woocommerce/email-editor-config [unit:php]
🔇 Additional comments (1)
packages/php/email-editor/tests/integration/Engine/Renderer/Renderer_Test.php (1)

270-279: Guard against factory failures before get_post()

WP_UnitTest_Factory_For_Post::create() can return WP_Error on failure.
Add an early assertNotInstanceOf( \WP_Error::class, $email_post_id ) (or skip the get_post() call when it fails) to satisfy PHPStan and avoid a fatal in unlikely edge-cases.

⛔ Skipped due to learnings
Learnt from: triple0t
PR: woocommerce/woocommerce#59186
File: packages/js/email-editor/src/store/initial-state.ts:9-10
Timestamp: 2025-06-26T12:13:32.062Z
Learning: In WooCommerce email editor store initialization (packages/js/email-editor/src/store/initial-state.ts), the current_post_id and current_post_type from window.WooCommerceEmailEditor are required parameters that should cause explicit errors if missing, rather than using fallback values or optional chaining. The design preference is to fail fast when critical initialization data is unavailable.

allilevine and others added 10 commits July 23, 2025 09:26
# This is the 1st commit message:

Email Editor: Integrate the convert class from the HTML to Text library and remove the dependency.

# This is the commit message #2:

Fix linting errors.

# This is the commit message #3:

Fix more linting errors.

# This is the commit message #4:

Fix package linting errors.

# This is the commit message #5:

Fix additional package linting errors.
…ndency

- Internalized soundasleep/html2text functionality
- Added Html2Text and Html2Text_Exception classes
- Updated composer.json to remove dependency
- Fixed all PHPStan and WordPress coding standards violations
- Ensured PHP 7.4+ compatibility and standalone package support
@allilevine allilevine force-pushed the wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor branch from ba8bd92 to 85d9264 Compare July 23, 2025 13:26
@jeherve jeherve dismissed pavel-mailpoet’s stale review July 24, 2025 08:11

Changes have been addressed: we now have tests and a changelog entry.

@jeherve jeherve requested a review from a team July 24, 2025 08:48
Copy link
Contributor

@jeherve jeherve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and the tests look good. I'll approve, but I'll let someone more familiar with the package chime in too :)

Copy link
Contributor

@pavel-mailpoet pavel-mailpoet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for your contribution, tests are passing, so let's merge it @allilevine

@allilevine allilevine merged commit 83728fb into trunk Jul 24, 2025
19 checks passed
@allilevine allilevine deleted the wooplug-4979-integrate-the-html-to-text-library-into-the-email-editor branch July 24, 2025 13:10
@github-actions github-actions bot added this to the 10.2.0 milestone Jul 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

focus: monorepo infrastructure Issues and PRs related to monorepo tooling. package: @woocommerce/email-editor issues related to the @woocommerce/email-editor package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants