fix: ShapeIncreasingDistanceValidator#1083
Conversation
- additional unit test - update documentation to match latest development
| return shape.shapePtLon() == otherShape.shapePtLon() | ||
| && shape.shapePtLat() == otherShape.shapePtLat(); |
There was a problem hiding this comment.
One of the errors we've been seeing a lot is when there are two shape points that are very close to each other (<1 meter apart) and the shape distance was equal. It could've been that the shape distance units in a particular feed were in kilometers and that due to a rounding process the distance was equal and therefore probably an OK value. It would be great if this check here could return true if the coordinates are less than 1 meter apart.
There was a problem hiding this comment.
One question is why 1 meter, why not 80cm or 1.20m?
There was a problem hiding this comment.
1 meter is arbitrary honestly. Maybe @barbeau can provide some more educated insights into GPS precision and measurement.
RULES.md
Outdated
|
|
||
| <a name="EqualShapeDistanceNotice"/> | ||
|
|
||
| When sorted by `shape.shape_pt_sequence`, two consecutive shape points must not have equal values for `shape_dist_traveled` if their GPS coordinates are the same. |
There was a problem hiding this comment.
I think this should read "if their GPS coordinates are not the same". It's an error to have equal distances with different coordinates, it's a warning to have equal distances and equal coordinates.
There was a problem hiding this comment.
My only other concern is that at least for our feed, which generates equal distances every stop pair, we get 10s of thousands of these notices per validation. I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
There was a problem hiding this comment.
I think this should read "if their GPS coordinates are not the same".
👍🏾
It's an error to have equal distances with different coordinates, it's a warning to have equal distances and equal coordinates.
Or if said coordinates are actually less more than 1 meter away according to #1083 (comment).
My only other concern is that at least for our feed, which generates equal distances every stop pair, we get 10s of thousands of these notices per validation. I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
As of now a maximum of 1000 of notices are exported by type and severity, and the validation report gives information about the total number of notices per type and severity.
I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
I see this as separate from the bug fixed here. We could allow users to reduce this limit in a separate PR by making the limit user configurable. cc @isabelle-dr.
- additional unit test - update documentation to match latest development
|
Some stats: As of now:
Before the fix:
Also, with this fix no additional error type is introduced (i.e. no dataset go from valid to invalid). cc @isabelle-dr |
|
Hello, A few thoughts/questions about the original issue and the modification of
Here are other options we could look at:
This makes a lot of sense. As @lionel-nj mentioned, I'd be interested to look at what would be the best threshold to use.
|
I'm in favor of having two distinct rules with different error levels.
I like option 2.
I suppose this could technically be OK in a 3-dimensional sense if there was a transit vehicle that traveled along a pattern that involved an elevator with different "stops" at different levels. However, I don't think that currently exists anywhere yet, so it may be best to keep this as an error.
I'll defer to @barbeau's expertise here.
Yes, this should be an error, but I'd be surprised if it occurred in the wild. |
This would be ideal (i.e., create the threshold based on a percentile based on real data), but to my knowledge not easy to calculate at the moment without a lot of manual effort. One way to approach this is based on precision of decimal places for latitude and longitude. From http://wiki.gis.com/wiki/index.php/Decimal_degrees:
So, if you cap the number of decimal places to 5 the smallest difference between values you can represent is 1.11 meters. Therefore, it would be possible for two values less than 1.11 meters apart to be capped/rounded to the same latitude and longitude value. So perhaps 1.11 meters is a good value? That should capture any "same" values that result from precision/rounding issues at 5 decimal places. That's also better accuracy that you'd typically see with an off-the-shelf consumer-grade GNSS unit capturing a vehicle path in motion. |
I prefer option 2 here.
We have historically scheduled holds at stops as two consecutive departures from the same stop, so those would have shown up as equal distance in stop_times. I believe we now schedule those holds as actual holds with arrival and departure times or post-process them to merge the pair of stops into an arrival and departure time for one stop. The spec seems to allow our old practice, so we'd expect that equal stop time distances would be allowed, if not encouraged.
1.11 m is fine, but realistically, anything less that 13 m (length of our regular buses) is basically the same from an operations perspective.
If the shape contains a loop then you could definitely have two different shape points with the same coordinates and different |
|
Thanks for the comments on this PR 🙏
Good point. I was thinking of consecutive stops there, which would be an error I think. Let's keep this as a potential candidate for later if we see a need.
Opened #1084 to follow up. So in conclusion, we agree on the option 2:
|
- create EqualShapeDistanceSameCoordinatesNotice - create EqualShapeDistanceDiffCoordinatesNotice - update docstrings
- update unit tests
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
| return shape.shapePtLon() != otherShape.shapePtLon() | ||
| && shape.shapePtLat() != otherShape.shapePtLat() |
There was a problem hiding this comment.
You can have different coordinates while either lon XOR lat is identical. Shouldn't this be !(shape.shapePtLon() == otherShape.shapePtLon() && shape.shapePtLat() == otherShape.shapePtLat())
There was a problem hiding this comment.
Shouldn't this be !(shape.shapePtLon() == otherShape.shapePtLon() && shape.shapePtLat() == otherShape.shapePtLat())
Thanks for flagging that! Modified in 574cd7b.
| if (MAX_DISTANCE_SHAPEPOINTS_METERS | ||
| < getDistanceMeters(curr.shapePtLatLon(), prev.shapePtLatLon())) { | ||
| noticeContainer.addValidationNotice( | ||
| new EqualShapeDistanceDiffCoordinatesNotice(prev, curr)); | ||
| } | ||
| } else if (curr.shapePtLat() == prev.shapePtLat() | ||
| && curr.shapePtLon() == prev.shapePtLon()) { | ||
| // equal shape_dist_traveled and same coordinates | ||
| noticeContainer.addValidationNotice( | ||
| new DecreasingOrEqualShapeDistanceNotice( | ||
| curr.shapeId(), | ||
| curr.csvRowNumber(), | ||
| curr.shapeDistTraveled(), | ||
| curr.shapePtSequence(), | ||
| prev.csvRowNumber(), | ||
| prev.shapeDistTraveled(), | ||
| prev.shapePtSequence())); | ||
| new EqualShapeDistanceSameCoordinatesNotice(prev, curr)); |
There was a problem hiding this comment.
What if coordinates are different and getDistanceMeters(curr.shapePtLatLon(), prev.shapePtLatLon()) <= MAX_DISTANCE_SHAPEPOINTS_METERS ?
There was a problem hiding this comment.
In this case, no notice should be triggered, see:
|
This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: |
maximearmstrong
left a comment
There was a problem hiding this comment.
Some questions and comments in-line before approval :)
.../main/java/org/mobilitydata/gtfsvalidator/validator/StopTimeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Maxime Armstrong <[email protected]>
…ilityData/gtfs-validator into fix-ShapeIncreasingDistanceValidator
|
This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: |
.../main/java/org/mobilitydata/gtfsvalidator/validator/StopTimeIncreasingDistanceValidator.java
Show resolved
Hide resolved
| } | ||
|
|
||
| @Test | ||
| public void noShapeDistTravelled_shouldNotGenerateNotice() { |
|
Thank you for your contribution ! @evansiroky @botanize @maximearmstrong @isabelle-dr |
| double shapePtLon, | ||
| int shapePtSequence, | ||
| double shapeDistTraveled) { | ||
| Double shapeDistTraveled) { |
There was a problem hiding this comment.
Sorry for coming in late here, but is there a reason this was changed from double to Double?
There was a problem hiding this comment.
This was done in order to support testing agains null value.
closes #1070
Summary:
This PR provides support to: make warning-level notice when an equal
shape_dist_traveledvalue is encountered and the previous row in the sequence had the same GPS coordinate.Expected behavior:
DecreasingShapeDistanceNoticeas anERROR: should be issued whenshape_dist_traveleddecreases between two consecutive shape points (based onshape_pt_sequence);EqualShapeDistanceDiffCoordinatesNoticeas anERROR: should be issued whenshape_dist_traveledis equal between two consecutives shape points that have different GPS coordinates;EqualShapeDistanceSameCoordinatesNoticeas aWARNING: should be issued whenshape_dist_traveledis equal between two consecutives shape points that have the same GPS coordinates.@evansiroky @botanize your comments and feedback are more than welcome :)
Please make sure these boxes are checked before submitting your pull request - thanks!
gradle testto make sure you didn't break anything[ ] Include screenshot(s) showing how this pull request works and fixes the issue(s)