Skip to content

feat 267: route.long_name should not contain route.short_name#1325

Merged
fredericsimard merged 22 commits intoMobilityData:masterfrom
JarvusInnovations:issue/267/route_long_name_contains_short_name
Mar 8, 2023
Merged

feat 267: route.long_name should not contain route.short_name#1325
fredericsimard merged 22 commits intoMobilityData:masterfrom
JarvusInnovations:issue/267/route_long_name_contains_short_name

Conversation

@briandonahue
Copy link
Copy Markdown
Contributor

@briandonahue briandonahue commented Feb 2, 2023

Summary:

Closes #267. Based on implementation suggestion by @aababilov here

check that long name does not start from the short name followed by ' ', '-' or '('.

Expected behavior:

Should generate WARNING level notice that route.long_name should not contain route.short_name as referenced in Best Practices: https://gtfs.org/schedule/best-practices/#routestxt

Please make sure these boxes are checked before submitting your pull request - thanks!

@isabelle-dr isabelle-dr linked an issue Feb 10, 2023 that may be closed by this pull request
Copy link
Copy Markdown
Contributor

@isabelle-dr isabelle-dr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this @briandonahue !
Your new validator notice replaces https://github.com/MobilityData/gtfs-validator/blob/master/RULES.md#route_short_and_long_name_equal. Can you remove it in this PR, so that we can close #267?

We also need the update in RULES.md :)

@briandonahue briandonahue marked this pull request as draft February 10, 2023 20:27
@briandonahue
Copy link
Copy Markdown
Contributor Author

@isabelle-dr changing this to a draft to avoid accidental merging - realizing we could use a few more test cases and document good/bad data examples

@briandonahue
Copy link
Copy Markdown
Contributor Author

@bdferris-v2 Any thoughts on using parameterized tests where they make sense? For this where there are many ways the short_name can be contained in the long_name, it seems like a perfect candidate. Plus some way off day in the future you might be able to use tests params as sample good/bad data in the generated documentation :)

I see this library which looks useful, but wondering if you had any other recommendations or concerns with doing this.

@briandonahue
Copy link
Copy Markdown
Contributor Author

briandonahue commented Feb 10, 2023

@isabelle-dr @bdferris-v2 I've made some significant changes. Documenting the rule made me realize it was missing some use cases and that it may not be perfectly clear and/or covering all potential issues.

So far, I've updated to account for short/long are equal (case insensitive), long cannot start with short and be followed by a space, or disallowed character (, -. I added ) and am wondering if : or other characters should be added?

Here are my current test cases for bad data:

           // short, long
            {"L1", "L1 Long Name"},
            {"L1", "L1-Long Name"},
            {"L1", "L1- Long Name"},
            {"L1", "L1 - Long Name"},
            {"L1", "L1)Long Name"},
            {"L1", "L1) Long Name"},
            {"L1", "L1 ) Long Name"},
            {"L1", "L1(Long Name)"},
            {"L1", "L1 (Long Name)"},
            {"L1", "L1( Long Name)"},

Please let me know if any should be removed or added.

@briandonahue briandonahue marked this pull request as ready for review February 10, 2023 22:55
Copy link
Copy Markdown

@julianharty julianharty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know enough of this codebase to approve this PR. The code looks clean, I've added some comments/suggestions. Let me know if I can help further.

Copy link
Copy Markdown
Contributor

@isabelle-dr isabelle-dr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments in-line, thank you so much for working on this @briandonahue!
A great addition to this validator :)

@briandonahue briandonahue changed the title feat: route.long_name should not contain route.short_name feat 267: route.long_name should not contain route.short_name Feb 23, 2023
Copy link
Copy Markdown
Collaborator

@bdferris-v2 bdferris-v2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo the small changes requested, this LGTM.

@briandonahue
Copy link
Copy Markdown
Contributor Author

@bdferris-v2 Made those fixes, thanks.

@isabelle-dr
Copy link
Copy Markdown
Contributor

One last small change before we merge :)

@bdferris-v2
Copy link
Copy Markdown
Collaborator

@isabelle-dr because the acceptance test workflow only has permission to post the comment with the report summary if the PR is coming from a branch of the gtfs-validator repo itself, as opposed to a fork of the repo, as is the case with this PR (and many coming from external contributors). This is discussed in more detail here. Unfortunately, I've not seen a good way around this.

@davidgamez
Copy link
Copy Markdown
Member

@bdferris-v2 Any thoughts on using parameterized tests where they make sense? For this where there are many ways the short_name can be contained in the long_name, it seems like a perfect candidate. Plus some way off day in the future you might be able to use tests params as sample good/bad data in the generated documentation :)

I see this library which looks useful, but wondering if you had any other recommendations or concerns with doing this.

That's a good call. Definitely the bad list test case is a great example for parameterizing a test. We can have this "feature" for free if we decide to upgrade the JUnit version. JUnit 5 Parameterized Tests
Let's keep the topic alive and have the conversion out of scope of this PR.

Copy link
Copy Markdown
Member

@davidgamez davidgamez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@briandonahue
Copy link
Copy Markdown
Contributor Author

@isabelle-dr because the acceptance test workflow only has permission to post the comment with the report summary if the PR is coming from a branch of the gtfs-validator repo itself, as opposed to a fork of the repo, as is the case with this PR (and many coming from external contributors). This is discussed in more detail here. Unfortunately, I've not seen a good way around this.

@isabelle-dr @bdferris-v2 This was created before I had access to the main repo. Should I close this PR and open a new one from a local branch?

@bdferris-v2
Copy link
Copy Markdown
Collaborator

Two things. The acceptance test still ran and you can access the logs in the Checks tab to see the result:

✅ Rule acceptance tests passed.
New Errors: 0 out of 1419 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1419 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
0 out of 1419 sources (~0 %) are corrupted.
Commit: 4a956ff
Download the full acceptance test report here (report will disappear after 90 days).
✅ Rule acceptance tests passed.

That said, per #1349, the acceptance test only tracks changes to ERROR notices, not warnings. As this PR only changes a warning notice, it wouldn't show up as a change in the acceptance test. That's something I'd like to fix.

@isabelle-dr
Copy link
Copy Markdown
Contributor

@briandonahue good to merge?

@briandonahue
Copy link
Copy Markdown
Contributor Author

@briandonahue good to merge?

@isabelle-dr Yes! 😄

@fredericsimard fredericsimard merged commit aeaecb7 into MobilityData:master Mar 8, 2023
@briandonahue briandonahue deleted the issue/267/route_long_name_contains_short_name branch March 8, 2023 16:34
ryon pushed a commit to JarvusInnovations/gtfs-validator that referenced this pull request Apr 1, 2023
…tyData#1325)

* fix: 267 - route.long_name should not contain route.short_name

* formatting

* Remove unnecessary comment

* Remove unescaped regex, do startsWith check then regex on remainder

* update RULES.md

* update RULES.md examples

* Add more test examples and remove redundant warning

* remove redundant test data

* fix typo

* Remove redundant rule

* 'should not', not 'can not'

* simplify test data

* Missed a spot to remove redundant rule

* Update RULES.md

Better examples!

Co-authored-by: isabelle-dr <[email protected]>

* Missed a spot to remove redundant rule

* Make requested updates to rules docs

* Address PR feedback

* Correct field name

---------

Co-authored-by: Kevin Clough <[email protected]>
Co-authored-by: isabelle-dr <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Smarter check for if route_long_name contains route_short_name

7 participants