Skip to content

Conversation

@andrcuns
Copy link
Contributor

What are you trying to accomplish?

This prevents URI::DEFAULT_PARSER.regexp[:ABS_URI] returning nil if uri gem is updated to version beyond 1. Because gemspec does not define a range for uri gem, if a project uses dependabot-core and has a more recent version of uri, nil will be returned because DEFAULT_PARSER has changed starting from version 1 and does not define :ABS_URI key in regexp hash. This problem would probably also happen at some point if some transitive dependency updates the uri gem.

By using explicit parser, it should preserve the original behavior regardless of version used.

Anything you want to highlight for special attention from reviewers?

How will you know you've accomplished your goal?

I compared the outputs on older and newer versions or uri and tests should validate this change as well.

Checklist

  • I have run the complete test suite to ensure all tests and linters pass.
  • I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
  • I have written clear and descriptive commit messages.
  • I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
  • I have ensured that the code is well-documented and easy to understand.

@andrcuns andrcuns requested a review from a team as a code owner April 29, 2025 17:08
@github-actions github-actions bot added L: php:composer Issues and code for Composer L: python L: python:uv labels Apr 29, 2025
@sachin-sandhu
Copy link
Contributor

Hi @andrcuns , thank you for contributing to dependabot. We are looking into the PR.

Thanks !

@sachin-sandhu sachin-sandhu moved this from Ready to Scoping in Dependabot Apr 30, 2025
@thavaahariharangit
Copy link
Contributor

@andrcuns Can we have a RSpec test for this changes pls.

@thavaahariharangit thavaahariharangit self-assigned this Apr 30, 2025
@andrcuns
Copy link
Contributor Author

@thavaahariharangit What type of tests do you have in mind?

These lines are actually covered quite well by existing tests which would fail if the return value of this regex would not be as expected. I did a quick test with a broken regexp:

composer:

rspec ./spec/dependabot/composer/update_checker/version_resolver_spec.rb:268 # Dependabot::Composer::UpdateChecker::VersionResolver latest_resolvable_version with a missing vcs repository source raises a Dependabot::DependencyFileNotResolvable error
rspec ./spec/dependabot/composer/update_checker/version_resolver_spec.rb:290 # Dependabot::Composer::UpdateChecker::VersionResolver latest_resolvable_version with a missing git repository source raises a Dependabot::GitDependenciesNotReachable error
rspec ./spec/dependabot/composer/update_checker_spec.rb:658 # Dependabot::Composer::UpdateChecker#latest_resolvable_version with a git source dependency that's not the dependency we're checking that is unreachable raises a helpful error

python, direct class tests would fail and a 100+ more 😅 :

rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:101 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is enabled with a valid JSON response fetches data from JSON registry first and returns correct package releases
rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:120 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is enabled when JSON response is empty falls back to HTML registry and fetches versions correctly
rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:135 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is disabled fetches data from the HTML registry and returns package releases

uv, pretty much same as with python:

rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:120 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is enabled when JSON response is empty falls back to HTML registry and fetches versions correctly
rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:101 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is enabled with a valid JSON response fetches data from JSON registry first and returns correct package releases
rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:135 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is disabled fetches data from the HTML registry and returns package releases

I could add some basic unit test with hardcoding expected response to that method but I feel like it wouldn't actually add much value, there is ample existing coverage.

@thavaahariharangit
Copy link
Contributor

@andrcuns

Since GitHub Dependabot is an open-source project with a large number of contributors, it's crucial to ensure that any changes you make are accompanied by tests. If another contributor reverts your changes, it will be clear that they need to contact you for clarification.

Additionally, if your changes don't affect any existing tests and don't require new ones, it raises the question of why you're making these changes in the first place.

@andrcuns
Copy link
Contributor Author

andrcuns commented May 1, 2025

@thavaahariharangit

it's crucial to ensure that any changes you make are accompanied by tests.

That didn't really answer my question - What type of tests do you have in mind?. These lines of code are covered by a lot of existing tests, so it would be nice to understand what kind of additional tests you have in mind? Simple method unit test that checks the return value of the method or something else?

Additionally, if your changes don't affect any existing tests and don't require new ones, it raises the question of why you're making these changes in the first place.

I think description states very clearly why I am making this change. As soon as you update uri gem in your project, you will completely break uv, python and composer ecosystems. In my project which uses dependabot-core as a dependency that happened because this project does not define version range for uri gem and I consider that to be a bug. Alternative fix would be to define a range for uri gem, but a better fix would be to fix that in a way that doesn't add such limitation, don't you think?

@thavaahariharangit
Copy link
Contributor

thavaahariharangit commented May 1, 2025

@andrcuns

Based on my understanding:

  • DEFAULT_PARSER validates URLs against modern standards.
  • RFC2396_PARSER supports URLs adhering to older formats, providing broader compatibility.

In my opinion, the ideal solution would accommodate both parsers to ensure comprehensive handling of all URL formats.

However, if the decision is to use only RFC2396_PARSER, please provide a unit test showcasing scenarios where DEFAULT_PARSER fails. Additionally, it would be prudent to introduce a conditional check to dynamically select the appropriate parser based on the specific scenario.

@andrcuns
Copy link
Contributor Author

andrcuns commented May 1, 2025

@thavaahariharangit

Based on my understanding:

DEFAULT_PARSER validates URLs against modern standards.
RFC2396_PARSER supports URLs adhering to older formats, providing broader compatibility.

As things stand right now, DEFAULT_PARSER == RFC2396_PARSER. This is how the project works currently, it uses RFC2396_PARSER which is aliased via DEFAULT_PARSER, so dependabot-core is not using modern standards as things stand right now.

However, if you would update uri to version > 1, alias of DEFAULT_PARSER would change to this "modern standard" because starting from that version it points to RFC3986_PARSER which does not have a key name :ABS_URI.

My intent with this change was to ensure that no matter which version of uri gem is used, behavior remains unchanged which is not the case right now.

Additionally, it would be prudent to introduce a conditional check to dynamically select the appropriate parser based on the specific scenario.

With current implementation, there isn't an alternative scenario really, :ABS_URI exists only in RFC2396_PARSER. I'm not sure what the appropriate alternative would be in RFC3986_PARSER, I don 't have the capacity to make such a refactor and ensure that all usecases will work as expected.

However, if the decision is to use only RFC2396_PARSER, please provide a unit test showcasing scenarios where DEFAULT_PARSER fails. The only test that could be added is that if RFC3986_PARSER is used, the method will fail, but that seems kind of pointless.

I can't really provide such a test because for DEFAULT_PARSER to fail, version of uri gem needs to be updated.

Here is the point I'm trying to make across, output from current dev image.

With current version of uri 0.13.1:

[dependabot-core-dev] ~/uv $ bundle info uri
  * uri (0.13.1)
	Summary: URI is a module providing classes to handle Uniform Resource Identifiers
	Homepage: https://github.com/ruby/uri
	Documentation: https://ruby.github.io/uri/
	Source Code: https://github.com/ruby/uri
	Changelog: https://github.com/ruby/uri/releases
	Bug Tracker: https://github.com/ruby/uri/issues
	Path: /usr/local/lib/ruby/gems/3.3.0/gems/uri-0.13.1
	Default Gem: yes
[dependabot-core-dev] ~/uv $ bundle exec irb
irb(main):001> require "uri"
=> true
irb(main):002> URI::DEFAULT_PARSER
=> #<URI::RFC2396_Parser:0x0000ffff8aa09580>
irb(main):003> URI::DEFAULT_PARSER.regexp[:ABS_URI]
=>
/\A\s*+
        ([a-zA-Z][\-+.a-zA-Z\d]*):                           (?# 1: scheme)
        (?:
           ((?:[\-_.!~*'()a-zA-Z\d;?:@&=+$,]|%[a-fA-F\d]{2})(?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*)                    (?# 2: opaque)
        |
           (?:(?:
             \/\/(?:
                 (?:(?:((?:[\-_.!~*'()a-zA-Z\d;:&=+$,]|%[a-fA-F\d]{2})*)@)?        (?# 3: userinfo)
                   (?:((?:(?:[a-zA-Z0-9\-.]|%\h\h)+|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|\[(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:(?:[a-fA-F\d]{1,4}:)*[a-fA-F\d]{1,4})?::(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))?)\]))(?::(\d*))?))? (?# 4: host, 5: port)
               |
                 ((?:[\-_.!~*'()a-zA-Z\d$,;:@&=+]|%[a-fA-F\d]{2})+)                 (?# 6: registry)
               )
             |
             (?!\/\/))                           (?# XXX: '\/\/' is the mark for hostport)
             (\/(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*(?:\/(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*)*)?                    (?# 7: path)
           )(?:\?((?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))?                 (?# 8: query)
        )
        (?:\#((?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))?                  (?# 9: fragment)
      \s*\z/x

After uri gem update:

[dependabot-core-dev] ~/uv $ bundle info uri
  * uri (1.0.3)
	Summary: URI is a module providing classes to handle Uniform Resource Identifiers
	Homepage: https://github.com/ruby/uri
	Documentation: https://ruby.github.io/uri/
	Source Code: https://github.com/ruby/uri
	Changelog: https://github.com/ruby/uri/releases
	Bug Tracker: https://github.com/ruby/uri/issues
	Path: /home/dependabot/dependabot-updater/vendor/ruby/3.3.0/gems/uri-1.0.3
	Reverse Dependencies:
		dependabot-common (0.310.0) depends on uri (~> 1.0)
[dependabot-core-dev] ~/uv $ bundle exec irb
irb(main):001> require "uri"
=> true
irb(main):002> URI::DEFAULT_PARSER
=> #<URI::RFC3986_Parser:0x0000ffff67ec5808>
irb(main):003> URI::DEFAULT_PARSER.regexp[:ABS_URI]
=> nil

Copy link
Contributor

@thavaahariharangit thavaahariharangit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thavaahariharangit thavaahariharangit merged commit 75a71d5 into dependabot:main May 2, 2025
76 of 78 checks passed
@github-project-automation github-project-automation bot moved this from Scoping to Done in Dependabot May 2, 2025
JamieMagee added a commit that referenced this pull request May 27, 2025
In Ruby 3.4 `URI::DEFAULT_PARSER` changed from `RFC2396` to `RFC3896`[1]. This change makes our use of `RFC2396` explicit instead of implicit.

This migration was partially completed in #12170. This change just finishes the migration.

This is a part of the Ruby 3.4 upgrade #11681

[1]: ruby/uri#107
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants