-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Use explicit uri parser when fetching abs_uri regexp #12170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use explicit uri parser when fetching abs_uri regexp #12170
Conversation
|
Hi @andrcuns , thank you for contributing to dependabot. We are looking into the PR. Thanks ! |
|
@andrcuns Can we have a RSpec test for this changes pls. |
|
@thavaahariharangit What type of tests do you have in mind? These lines are actually covered quite well by existing tests which would fail if the return value of this regex would not be as expected. I did a quick test with a broken regexp: composer: rspec ./spec/dependabot/composer/update_checker/version_resolver_spec.rb:268 # Dependabot::Composer::UpdateChecker::VersionResolver latest_resolvable_version with a missing vcs repository source raises a Dependabot::DependencyFileNotResolvable error
rspec ./spec/dependabot/composer/update_checker/version_resolver_spec.rb:290 # Dependabot::Composer::UpdateChecker::VersionResolver latest_resolvable_version with a missing git repository source raises a Dependabot::GitDependenciesNotReachable error
rspec ./spec/dependabot/composer/update_checker_spec.rb:658 # Dependabot::Composer::UpdateChecker#latest_resolvable_version with a git source dependency that's not the dependency we're checking that is unreachable raises a helpful errorpython, direct class tests would fail and a 100+ more 😅 : rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:101 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is enabled with a valid JSON response fetches data from JSON registry first and returns correct package releases
rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:120 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is enabled when JSON response is empty falls back to HTML registry and fetches versions correctly
rspec ./spec/dependabot/python/package/package_details_fetcher_spec.rb:135 # Dependabot::Python::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_python is disabled fetches data from the HTML registry and returns package releasesuv, pretty much same as with python: rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:120 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is enabled when JSON response is empty falls back to HTML registry and fetches versions correctly
rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:101 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is enabled with a valid JSON response fetches data from JSON registry first and returns correct package releases
rspec ./spec/dependabot/uv/package/package_details_fetcher_spec.rb:135 # Dependabot::Uv::Package::PackageDetailsFetcher#fetch when enable_cooldown_for_uv is disabled fetches data from the HTML registry and returns package releasesI could add some basic unit test with hardcoding expected response to that method but I feel like it wouldn't actually add much value, there is ample existing coverage. |
|
Since GitHub Dependabot is an open-source project with a large number of contributors, it's crucial to ensure that any changes you make are accompanied by tests. If another contributor reverts your changes, it will be clear that they need to contact you for clarification. Additionally, if your changes don't affect any existing tests and don't require new ones, it raises the question of why you're making these changes in the first place. |
That didn't really answer my question -
I think description states very clearly why I am making this change. As soon as you update |
|
Based on my understanding:
In my opinion, the ideal solution would accommodate both parsers to ensure comprehensive handling of all URL formats. However, if the decision is to use only RFC2396_PARSER, please provide a unit test showcasing scenarios where DEFAULT_PARSER fails. Additionally, it would be prudent to introduce a conditional check to dynamically select the appropriate parser based on the specific scenario. |
|
Based on my understanding: As things stand right now, However, if you would update My intent with this change was to ensure that no matter which version of
With current implementation, there isn't an alternative scenario really,
I can't really provide such a test because for DEFAULT_PARSER to fail, version of Here is the point I'm trying to make across, output from current dev image. With current version of uri [dependabot-core-dev] ~/uv $ bundle info uri
* uri (0.13.1)
Summary: URI is a module providing classes to handle Uniform Resource Identifiers
Homepage: https://github.com/ruby/uri
Documentation: https://ruby.github.io/uri/
Source Code: https://github.com/ruby/uri
Changelog: https://github.com/ruby/uri/releases
Bug Tracker: https://github.com/ruby/uri/issues
Path: /usr/local/lib/ruby/gems/3.3.0/gems/uri-0.13.1
Default Gem: yes
[dependabot-core-dev] ~/uv $ bundle exec irb
irb(main):001> require "uri"
=> true
irb(main):002> URI::DEFAULT_PARSER
=> #<URI::RFC2396_Parser:0x0000ffff8aa09580>
irb(main):003> URI::DEFAULT_PARSER.regexp[:ABS_URI]
=>
/\A\s*+
([a-zA-Z][\-+.a-zA-Z\d]*): (?# 1: scheme)
(?:
((?:[\-_.!~*'()a-zA-Z\d;?:@&=+$,]|%[a-fA-F\d]{2})(?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*) (?# 2: opaque)
|
(?:(?:
\/\/(?:
(?:(?:((?:[\-_.!~*'()a-zA-Z\d;:&=+$,]|%[a-fA-F\d]{2})*)@)? (?# 3: userinfo)
(?:((?:(?:[a-zA-Z0-9\-.]|%\h\h)+|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|\[(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:(?:[a-fA-F\d]{1,4}:)*[a-fA-F\d]{1,4})?::(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))?)\]))(?::(\d*))?))? (?# 4: host, 5: port)
|
((?:[\-_.!~*'()a-zA-Z\d$,;:@&=+]|%[a-fA-F\d]{2})+) (?# 6: registry)
)
|
(?!\/\/)) (?# XXX: '\/\/' is the mark for hostport)
(\/(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*(?:\/(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[\-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*)*)? (?# 7: path)
)(?:\?((?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))? (?# 8: query)
)
(?:\#((?:[\-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))? (?# 9: fragment)
\s*\z/xAfter [dependabot-core-dev] ~/uv $ bundle info uri
* uri (1.0.3)
Summary: URI is a module providing classes to handle Uniform Resource Identifiers
Homepage: https://github.com/ruby/uri
Documentation: https://ruby.github.io/uri/
Source Code: https://github.com/ruby/uri
Changelog: https://github.com/ruby/uri/releases
Bug Tracker: https://github.com/ruby/uri/issues
Path: /home/dependabot/dependabot-updater/vendor/ruby/3.3.0/gems/uri-1.0.3
Reverse Dependencies:
dependabot-common (0.310.0) depends on uri (~> 1.0)
[dependabot-core-dev] ~/uv $ bundle exec irb
irb(main):001> require "uri"
=> true
irb(main):002> URI::DEFAULT_PARSER
=> #<URI::RFC3986_Parser:0x0000ffff67ec5808>
irb(main):003> URI::DEFAULT_PARSER.regexp[:ABS_URI]
=> nil |
thavaahariharangit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
In Ruby 3.4 `URI::DEFAULT_PARSER` changed from `RFC2396` to `RFC3896`[1]. This change makes our use of `RFC2396` explicit instead of implicit. This migration was partially completed in #12170. This change just finishes the migration. This is a part of the Ruby 3.4 upgrade #11681 [1]: ruby/uri#107
What are you trying to accomplish?
This prevents
URI::DEFAULT_PARSER.regexp[:ABS_URI]returning nil ifurigem is updated to version beyond1. Because gemspec does not define a range forurigem, if a project usesdependabot-coreand has a more recent version of uri, nil will be returned becauseDEFAULT_PARSERhas changed starting from version 1 and does not define:ABS_URIkey in regexp hash. This problem would probably also happen at some point if some transitive dependency updates theurigem.By using explicit parser, it should preserve the original behavior regardless of version used.
Anything you want to highlight for special attention from reviewers?
How will you know you've accomplished your goal?
I compared the outputs on older and newer versions or
uriand tests should validate this change as well.Checklist