Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 3, 2025

File headers (Plugin Name, Theme Name, Description, etc.) are extracted using regex but lack line number references, making it impossible to print warnings with location information for invalid/missing header strings.

Changes

FileDataExtractor

  • Added PREG_OFFSET_CAPTURE flag to capture byte offsets during regex matching
  • Calculate line numbers from offsets: substr_count($text, "\n", 0, $offset) + 1
  • Always returns ['value' => string, 'line' => int] format
  • Simplified API with no optional parameters

MakePotCommand

  • Updated get_main_file_data() to always receive line numbers
  • Modified all main_file_data accesses to handle array structure with value and line keys
  • Enhanced addReference() calls to include line numbers

IterableCodeExtractor

  • Added line number support for Template Name, Pattern Title, and Pattern Description extraction

Tests

  • Added unit tests for line number extraction
  • Updated behat expectations: #: foo-plugin.php#: foo-plugin.php:5

Example

// Before: warnings lack location context
Warning: Issue with plugin description

// After: precise line references
Warning: Issue with plugin description (foo-plugin.php:5)

File references in POT files now include line numbers:

- #: foo-plugin.php
+ #: foo-plugin.php:5
Original prompt

This section details on the original issue you should resolve

<issue_title>Add line number references to extracted file header data</issue_title>
<issue_description>This is something I noticed while working on #61 / #64.

When printing warnings for missing / invalid strings I want to print the warning together with the location of where that string is.

For file headers that's currently not possible because they're being extracted using regex. This is mainly because I borrowed that code from https://develop.svn.wordpress.org/trunk/tools/i18n/makepot.php, which has been doing that for years.

However, since we already leverage a powerful gettext extraction library, I wonder if we could add a new extractor just for these file headers.

retrieve_main_file_data(), get_file_data() get_file_headers(), get_file_data_from_string(), and _cleanup_header_comment() would be affected/obsoleted by such a change.

I'm not sure how feasible it is, but it would make for a more robust solution that is more similar to the other extractors and easier to test than some old regex from WordPress core.

Also, we could make sure to get the filename / line references that way.

I guess it's the same argument as regex vs. using a DOM parser to parse some HTML in PHP.


I just saw that the Template Name: extraction already resides in \WP_CLI\I18n\IterableCodeExtractor.

We should probably consolidate this somehow.</issue_description>

<agent_instructions>Perhaps you can find a way to achieve this using PREG_OFFSET_CAPTURE or, alternatively, a more complex solution with a dedicated parser or so.

If you write unit tests, use composer phpunit to run them.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@swissspidy Another benefit:

For WordPress core we wouldn't need to extract Hello Dolly separately, but simultaneously and with the correct location info.

We'd still need some way to detect whether the project is a plugin, theme, core, or something else.</comment_new>
<comment_new>@swissspidy
Follow-up: #376</comment_new>
<comment_new>@swissspidy
See #377 (comment) for how this could maybe be achieved with regex.</comment_new>


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI changed the title [WIP] Add line number references to extracted file header data Add line number references to file header extraction Nov 3, 2025
Copilot AI requested a review from swissspidy November 3, 2025 14:09
@codecov
Copy link

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 88.23529% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/IterableCodeExtractor.php 64.28% 5 Missing ⚠️
src/MakePotCommand.php 96.15% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@swissspidy swissspidy marked this pull request as ready for review November 3, 2025 17:37
@swissspidy swissspidy requested a review from a team as a code owner November 3, 2025 17:37
@swissspidy
Copy link
Member

@copilot I don't think $with_line_nums is ever false, simplify the implementation to always collect line numbers

Remove the optional $with_line_nums parameter as suggested in review.
The function now always returns the array format with 'value' and 'line' keys,
simplifying the API and reducing conditional logic.

Co-authored-by: swissspidy <[email protected]>
Copy link
Contributor Author

Copilot AI commented Dec 11, 2025

@copilot I don't think $with_line_nums is ever false, simplify the implementation to always collect line numbers

Simplified in 61019dd. Removed the $with_line_nums parameter - the functions now always return the array format with value and line keys.

@swissspidy swissspidy added this to the 2.6.7 milestone Dec 11, 2025
@swissspidy swissspidy merged commit 4055e85 into main Dec 11, 2025
57 of 112 checks passed
@swissspidy swissspidy deleted the copilot/fix-1570774-111592377-9a5322aa-fa5a-4159-9cbc-71d1f6d9f486 branch December 11, 2025 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add line number references to extracted file header data

2 participants