This repository was archived by the owner on May 21, 2024. It is now read-only.
📝 This falls under the category of: we (Cal-ITP) know this is an area of great confusion with many competing practices, and we don't 100% know what the best practice should be. I can verify, however, that the currently published best practice is out of date and insufficient.
Current Relevant Best Practice
A single GTFS dataset should contain both current and upcoming service (sometimes called a “merged” dataset). The merge function of Google's transitfeed tool can be used to create a merged dataset from two different GTFS feeds.
At any time, the published GTFS dataset should be valid for at least the next 7 days, and ideally for as long as the operator is confident that the schedule will continue to be operated.
If possible, the GTFS dataset should cover at least the next 30 days of service.
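The 7-day and 30-day windows above can be checked mechanically before publishing. A minimal sketch, assuming the schedule is defined only in calendar.txt (a real check would also account for calendar_dates.txt additions and feed_info.txt's feed_end_date); the sample data and function names are illustrative:

```python
import csv
import io
from datetime import date, datetime, timedelta

# Illustrative calendar.txt body; a real check would read it from the feed zip.
CALENDAR_TXT = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
weekday,1,1,1,1,1,0,0,20240101,20240131
"""

def feed_coverage_end(calendar_txt: str) -> date:
    """Return the latest end_date across all rows of a calendar.txt body."""
    rows = csv.DictReader(io.StringIO(calendar_txt))
    return max(datetime.strptime(r["end_date"], "%Y%m%d").date() for r in rows)

def meets_window(calendar_txt: str, today: date, days: int) -> bool:
    """True if the feed remains valid for at least `days` days from `today`."""
    return feed_coverage_end(calendar_txt) >= today + timedelta(days=days)
```

With the sample above and a `today` of 2024-01-01, the 7-day check passes but a 31-day check would not, since coverage ends 2024-01-31.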
Needs
Reference currently relevant tools: the transitfeed library has largely been deprecated, so the best practice should not reference it or encourage agencies to use it to solve this issue.
Accommodate large feeds: Some agencies with very large and complex feeds have been asked by GTFS consumers to split their feed in two because of its size, even with just a single `service_id`; adding multiple `service_id`s would likely overwhelm those systems. Additionally, feeds hosted on GitHub infrastructure must keep files under 100 MB, and adding duplicate `service_id`s would exceed that limit.
Don't risk validation of the currently posted feed: One of the main reasons for posting future service in advance is to ensure that any issues with the future service data are identified early. Posting (relatively) untested data within the same feed risks rendering the whole feed invalid. While this may not be a major issue for large consumers who ingest the data on a fixed schedule, it hurts the overall accessibility of the feed. This is especially important as agencies add new features/attributes to their feeds or update their business processes for dataset production.
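The 100 MB GitHub limit mentioned above is easy to guard against before publishing. A minimal sketch; the function name is illustrative:

```python
import os

GITHUB_FILE_LIMIT_MB = 100  # GitHub rejects files over 100 MB without Git LFS

def fits_github_limit(path: str, limit_mb: int = GITHUB_FILE_LIMIT_MB) -> bool:
    """True if the zipped feed at `path` is under the per-file size limit."""
    return os.path.getsize(path) < limit_mb * 1024 * 1024
```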
Solutions Considered
Continue to suggest a merged feed with different `service_id`s for:
datasets which don't add new fields/features or dramatically new services, either of which might jeopardize the validity of the dataset containing current service.
small/medium feeds which would produce merged datasets < XXX MB (not sure what this number should be?).
Open question: are there current/relevant tools for merging datasets that support all current attributes/files?
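To illustrate the merged-feed option above, here is a minimal sketch of combining two calendar.txt bodies while keeping `service_id`s distinct. It only shows the disambiguation step; a real merge would also need to rewrite the matching `service_id` references in trips.txt and calendar_dates.txt, and the `_current`/`_future` suffixes are illustrative:

```python
import csv
import io

def merge_calendars(current_txt: str, future_txt: str) -> str:
    """Concatenate two GTFS calendar.txt bodies, suffixing each
    service_id with its source so the merged IDs stay unique.
    Assumes both bodies share an identical header row."""
    out = io.StringIO()
    writer = None
    for suffix, body in (("_current", current_txt), ("_future", future_txt)):
        for row in csv.DictReader(io.StringIO(body)):
            if writer is None:
                writer = csv.DictWriter(out, fieldnames=row.keys())
                writer.writeheader()
            row["service_id"] += suffix
            writer.writerow(row)
    return out.getvalue()
```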
Otherwise, we've suggested the use of a separate permalink URL for future service. For example, Los Angeles Metro publishes their “future-service” feed to a separate Git branch, enabling a permalink download at https://gitlab.com/LACMTA/gtfs_bus/-/blob/future-service/gtfs_bus.zip

Thoughts here?
@gcamp @Cristhian-HA @scmcca @westontrillium and others!