Skip to content

Fix syntax-based URI normalization (fixes #262, alternative to #263)#265

Merged
hartwork merged 5 commits intomasterfrom
issue-262-fix-normalization
Sep 28, 2025
Merged

Fix syntax-based URI normalization (fixes #262, alternative to #263)#265
hartwork merged 5 commits intomasterfrom
issue-262-fix-normalization

Conversation

@hartwork
Copy link
Copy Markdown
Member

Fixes #262, alternative to #263

CC @TimWolla

@hartwork hartwork added this to the 0.9.10 milestone Sep 27, 2025
@hartwork hartwork added the bug Something isn't working label Sep 27, 2025
Copy link
Copy Markdown
Contributor

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I don't currently have a compiler ready and thus can't test this right now (will try to do tomorrow). But some thoughts based on looking at the diff and tests and having previously looked into this:

The // issue is only relevant when there is no authority. It might make sense to add a test case for http://example.com/.//foo and also to allow normalization in that case (I believe it should be http://example//foo). This would also alleviate my "concerns" (for lack a better word) that I raised in the issue: For a majority of URLs that users encounter (namely those with an authority), the path will be fully normalized.

@hartwork
Copy link
Copy Markdown
Member Author

[..] can't test this right now (will try to do tomorrow).

@TimWolla thank you!

The // issue is only relevant when there is no authority. It might make sense to add a test case for http://example.com/.//foo and also to allow normalization in that case (I believe it should be http://example//foo).

Interesting point, let me look into that case now…

Comment thread ChangeLog Outdated
- "/.//path1/path2"
- ".//path1/path2"
The fix is to not remove that dot segment.
Thanks to Tim Düsterhus for the report!
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be appropriate to also thank @nyamsprod, who found and raised the issue for PHP. I just relayed the bug report after verifying it is not a bug in the PHP implementation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TimWolla added 👍

@nyamsprod thanks for php/php-src#19897 ! 👍

@hartwork hartwork force-pushed the issue-262-fix-normalization branch from 3c8972e to baf5b8f Compare September 27, 2025 21:02
@hartwork
Copy link
Copy Markdown
Member Author

[..] can't test this right now (will try to do tomorrow).

@TimWolla thank you!

The // issue is only relevant when there is no authority. It might make sense to add a test case for http://example.com/.//foo and also to allow normalization in that case (I believe it should be http://example//foo).

Interesting point, let me look into that case now…

@TimWolla spot on, spot on!

I should note that current use of argument relative to internal function RemoveDotSegmentsEx leaves some open questions to investigate, in particular when the caller wants URI_TRUE and when URI_FALSE and why.

@hartwork hartwork requested a review from TimWolla September 27, 2025 21:07
@TimWolla
Copy link
Copy Markdown
Contributor

I should note that current use of argument relative to internal function RemoveDotSegmentsEx leaves some open questions to investigate, in particular when the caller wants URI_TRUE and when URI_FALSE and why.

It also appears this is not actually tested anywhere. I have replaced the if (relative) by if (1) and no tests failed. Consider adding the following test case (as a separate PR probably):

	EXPECT_TRUE(testNormalizeSyntaxHelper(
			L"scheme:./path1:/path2",
			L"scheme:path1:/path2",
			URI_NORMALIZE_PATH));

@hartwork
Copy link
Copy Markdown
Member Author

I should note that current use of argument relative to internal function RemoveDotSegmentsEx leaves some open questions to investigate, in particular when the caller wants URI_TRUE and when URI_FALSE and why.

It also appears this is not actually tested anywhere. I have replaced the if (relative) by if (1) and no tests failed.

@TimWolla I came to a similar conclusion, yes.

Consider adding the following test case (as a separate PR probably):

	EXPECT_TRUE(testNormalizeSyntaxHelper(
			L"scheme:./path1:/path2",
			L"scheme:path1:/path2",
			URI_NORMALIZE_PATH));

Adding that and one more with the next push…

@hartwork hartwork force-pushed the issue-262-fix-normalization branch from baf5b8f to e922ee1 Compare September 28, 2025 14:26
@hartwork hartwork requested a review from TimWolla September 28, 2025 14:26
@hartwork hartwork merged commit 08df3b2 into master Sep 28, 2025
8 checks passed
@hartwork hartwork deleted the issue-262-fix-normalization branch September 28, 2025 15:02
TimWolla added a commit to TimWolla/php-src that referenced this pull request Sep 29, 2025
TimWolla added a commit to php/php-src that referenced this pull request Oct 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

scheme:/.//foo/bar is incorrectly normalized to scheme://foo/bar

2 participants