Skip to content

IRI: Simplify parse_iri()#890

Merged
jtojnar merged 1 commit intomasterfrom
wip/jtojnar/regex-null
Sep 30, 2024
Merged

IRI: Simplify parse_iri()#890
jtojnar merged 1 commit intomasterfrom
wip/jtojnar/regex-null

Conversation

@jtojnar
Copy link
Member

@jtojnar jtojnar commented Sep 29, 2024

When a regex contains a capture group within another capture group followed by ?,
and the outer group is not matched, the resulting matched text for both groups will be an empty string. (1)
Or, if there are no further matched capture groups, the indices for the groups will be simply omitted from the $matches array. (2)

We want missing components of the URI to be null so we had a conditionals for the aforementioned matches.

As of PHP 7.2, we can just use PREG_UNMATCHED_AS_NULL to get nulls in the case (1):
https://www.php.net/manual/en/function.preg-match.php#refsect1-function.preg-match-parameters

If we raised minimum PHP version to 7.4, we would not even need the isset to handle (2) because from that version onward, the flag also disables the omission of tail unmatched groups:
https://www.php.net/manual/en/migration74.incompatible.php#migration74.incompatible.pcre

Switch the optional outer capture groups (1, 3, 6 and 8) to anonymous ones now that we are no longer using them for anything.

This is follow up to #878 and should fix the type issue in #857

@jtojnar
Copy link
Member Author

jtojnar commented Sep 29, 2024

Ugh, I forgot we are still targetting 7.2, not 7.4.

@jtojnar jtojnar force-pushed the wip/jtojnar/regex-null branch from 90ac9cf to 9a3ca60 Compare September 29, 2024 16:20
@jtojnar jtojnar changed the title Simplify IRI::parse_iri IRI: Simplify parse_iri() Sep 29, 2024
@jtojnar jtojnar requested a review from Art4 September 29, 2024 16:21
@jtojnar jtojnar mentioned this pull request Sep 30, 2024
@Art4 Art4 added this to the 1.9.0 milestone Sep 30, 2024
Copy link
Contributor

@Art4 Art4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

When a regex contains a capture group within another capture group followed by `?`,
and the outer group is not matched, the resulting matched text for both groups will be an empty string. (1)
Or, if there are no further matched capture groups, the indices for the groups will be simply omitted from the `$matches` array. (2)

We want missing components of the URI to be `null` so we had a conditionals for the aforementioned matches.

As of PHP 7.2, we can just use `PREG_UNMATCHED_AS_NULL` to get `null`s in the case (1):
https://www.php.net/manual/en/function.preg-match.php#refsect1-function.preg-match-parameters

If we raised minimum PHP version to 7.4, we would not even need the `isset` to handle (2) because from that version onward, the flag also disables the omission of tail unmatched groups:
https://www.php.net/manual/en/migration74.incompatible.php#migration74.incompatible.pcre

Switch the optional outer capture groups (1, 3, 6 and 8) to anonymous ones now that we are no longer using them for anything.
@jtojnar jtojnar force-pushed the wip/jtojnar/regex-null branch from 9a3ca60 to 176abc4 Compare September 30, 2024 18:52
@jtojnar jtojnar merged commit 176abc4 into master Sep 30, 2024
@jtojnar jtojnar deleted the wip/jtojnar/regex-null branch September 30, 2024 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants