Skip to content

Proposal to update GTFS Schedule to RFC 2119#277

Merged
sccmcca merged 22 commits intogoogle:masterfrom
MobilityData:rfc2119
Aug 27, 2021
Merged

Proposal to update GTFS Schedule to RFC 2119#277
sccmcca merged 22 commits intogoogle:masterfrom
MobilityData:rfc2119

Conversation

@sccmcca
Copy link
Copy Markdown
Contributor

@sccmcca sccmcca commented Jun 3, 2021

As discussed in #273, this PR contains a proposal to update GTFS Schedule to RFC 2119 in order to disambiguate interpretations and make the spec easier for producing, consuming, and creating products (i.e., validators) around GTFS. Additionally, these changes aim to be useful when articulating obligations in contract requirements.

Note that BCP 14 (containing RFC 2119 and RFC 8174) has not been implemented in this initial proposal. Thus, I have not CAPITALIZED the keywords as seen in GBFS. This has the purpose of making the initial proposal easier to review with pre-existing instances of lowercase RFC 2119 keywords, but we can decide if the CAPITALIZATION is useful and if this proposal should be updated to BCP 14, or alternative styling as seen in IMDF (i.e., only capitalizing "MUST"s, "SHOULD"s, and "MAY"s).

The PR contains separate commits for each applicable section of the GTFS Schedule reference to ease review (the changes are mostly light).

See all the most up-to-date changes in the "Files changed" tab. Feel free to provide in-line comments and any actionable feedback if any sections have mistakenly altered interpretations or otherwise.

Thanks!

Edits

  • 2021-08-03: Round of revisions based on feedback from @isabelle-dr and @aababilov, in commit 3875914.
  • 2021-06-22: Round of revisions based on feedback from @abyrd, in commit fba6020.
  • 2021-06-15: Minor "may vs can" corrections based on feedback, in commit 630bd5d.
  • 2021-06-09: Round of revisions based on community feedback, in commit 73bc764 .
  • 2021-06-03: Commit fb61255 catches some instances where "can" is suitable instead of "may". Generally, "can" is used to denote optional usage by a data consumer or producer, and "may" is used to denote optional behaviour of the specification.

scmcca added 16 commits June 3, 2021 12:13
Changed heading to "File Conventions" from "File Requirements" since, by RFC 2119 definition, this section also contains recommendation.
- Corrected TOC heading
- Corrected tabs, carriage returns or new lines to "must not"
- Typo fix in "Date" field type
- Differentiated "may" and "can"
@google-cla google-cla bot added the cla: yes label Jun 3, 2021
Copy link
Copy Markdown
Contributor

@timMillet timMillet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 note, otherwise LGTM!

@timMillet timMillet requested a review from barbeau June 4, 2021 20:23
Copy link
Copy Markdown

@lionel-nj lionel-nj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @scmcca for working on that! I added a few comments that actually may be addressed in a separate PR according to your workflow :)

@isabelle-dr
Copy link
Copy Markdown
Collaborator

isabelle-dr commented Jun 25, 2021

I think the two sections below could be part of this review as well.
"For times occurring after midnight, enter the time as a value greater than 24:00:00" in Field Types under Time.
"Timezone names never contain the space character but may contain an underscore." in Field Types under Timezone.

@sccmcca
Copy link
Copy Markdown
Contributor Author

sccmcca commented Jul 2, 2021

@isabelle-dr

I think the two sections below could be part of this review as well.
"For times occurring after midnight, enter the time as a value greater than 24:00:00" in Field Types under Time.

I can think of a clarification to make this sentence less instructional and more descriptive:

"Values greater than 24:00:00 are allowed to define times occurring after midnight for the service day on which the trip schedule begins."

What is also missing from the "Time" definition are invalid values. There is no maximum time value described. perhaps after some value like +12 or +24 hours above 24:00:00 a warning is issued in the validator? Any other suggestions?

"Timezone names never contain the space character but may contain an underscore." in Field Types under Timezone.

I'd like to reduce the "Timezone" description more generally to:

"TZ timezone as described in https://www.iana.org/time-zones. Refer to http://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid values.
Example: Asia/Tokyo, America/Los_Angeles or Africa/Cairo."

@isabelle-dr
Copy link
Copy Markdown
Collaborator

@scmcca

What is also missing from the "Time" definition are invalid values. There is no maximum time value described. perhaps after some value like +12 or +24 hours above 24:00:00 a warning is issued in the validator? Any other suggestions?

Currently, the Canonical GTFS Validator accepts values in formats H:MM:SS, HH:MM:SS, as well as HHH:MM:SS, and emits an ERROR otherwise.
There have been discussions on having a separate WARNING if a time field contains a value above a certain threshold. Since this would be an addition to the specification, it could happen in a separate PR or added to the Best Practices if we see a need for such a threshold.

I think those two suggestions work perfectly in the context of this PR!

As per isabelle-dr and aababilov feedback.
@sccmcca sccmcca linked an issue Aug 16, 2021 that may be closed by this pull request
@abyrd
Copy link
Copy Markdown

abyrd commented Aug 17, 2021

@isabelle-dr

I think the two sections below could be part of this review as well.
"For times occurring after midnight, enter the time as a value greater than 24:00:00" in Field Types under Time.

I can think of a clarification to make this sentence less instructional and more descriptive:

"Values greater than 24:00:00 are allowed to define times occurring after midnight for the service day on which the trip schedule begins."

As I understand it, trips may begin after 24:00:00, i.e. times over 24:00:00 are not restricted to trips beginning before 24:00:00. This is because some agencies internally define an overnight service break (when no service runs, maintenance can be done, etc.) and the next service day only begins after that overnight service break. If the service break is 3-5AM for example, a final trip of the day from 2-2:30 AM could reasonably be coded on the previous day using 26:00:00-26:30:00. This pull request is described as being focused on the use of certain specific terms, so I think it would be best to avoid any potential changes to the meaning of the specification.

What is also missing from the "Time" definition are invalid values. There is no maximum time value described. perhaps after some value like +12 or +24 hours above 24:00:00 a warning is issued in the validator? Any other suggestions?

I agree it is reasonable to issue warnings on strange values of the time field. However there are services that run for several days and will exceed these limits - notably some ferries in Norway that are known to be represented in GTFS data. The values that are considered "unusual" are somewhat subjective and might be different in different validation contexts. To me this seems like good material for the "best practices" guide, with the specification addressing only more objective limits.

I'd like to reduce the "Timezone" description more generally to:

"TZ timezone as described in https://www.iana.org/time-zones. Refer to http://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid values.
Example: Asia/Tokyo, America/Los_Angeles or Africa/Cairo."

This wording seems good to me, but this change seems out of scope for the current pull request. Given that some people may decide to read, comment, and vote on the PR based on its title and initial description, it might be better to make such changes in another PR.

@sccmcca
Copy link
Copy Markdown
Contributor Author

sccmcca commented Aug 17, 2021

@abyrd @skinkie

Fair points. I'll revert the changes made to the 24:00:00+ and TZ Timezone descriptions.

After that do you think we're good to go?

scmcca added 2 commits August 17, 2021 15:11
Revert stop_name clarification for separate PR.
@sccmcca
Copy link
Copy Markdown
Contributor Author

sccmcca commented Aug 19, 2021

Thanks to everyone involved in the revisions. This was a lot of work! We seem to have reached a stable consensus.

As per the Specification Amendment Process, I am calling a vote to update GTFS Schedule to RFC 2119 as outlined in this PR.

Voting will end on 2021-08-26 at 23:59:59 UTC.

@sccmcca sccmcca added the Vote to Test Community votes to determine whether the proposal is ready for testing. label Aug 19, 2021
@skinkie
Copy link
Copy Markdown
Contributor

skinkie commented Aug 19, 2021

+1 (OpenGeo)

@westontrillium
Copy link
Copy Markdown
Contributor

westontrillium commented Aug 19, 2021

+1 from Trillium

@abyrd
Copy link
Copy Markdown

abyrd commented Aug 20, 2021

+1 (Conveyal)

@mgilligan
Copy link
Copy Markdown

+1, TriMet

@e-lo
Copy link
Copy Markdown

e-lo commented Aug 20, 2021

+1 Cal-ITP

cc: @mcplanner, @thekaveman, @hunterowens

@flocsy
Copy link
Copy Markdown
Contributor

flocsy commented Aug 22, 2021

+1 Moovit

@sccmcca
Copy link
Copy Markdown
Contributor Author

sccmcca commented Aug 27, 2021

The vote ended on 2021-08-26 at 23:59:59 UTC with 6 votes in favor and 0 votes opposed. As per the Specification Amendment Process, this proposal passes!

Big thanks to everyone involved in bringing RFC 2119 to GTFS Schedule.

@sccmcca sccmcca merged commit 74c415f into google:master Aug 27, 2021
@sccmcca sccmcca removed the Vote to Test Community votes to determine whether the proposal is ready for testing. label Aug 27, 2021
@e-lo
Copy link
Copy Markdown

e-lo commented Aug 27, 2021

A BIG GIANT thanks to @scmcca for leading the process to get to an RFC 2119 compliant spec 🍾 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update GTFS to RFC 2119