Investigating issues with parsing Flex feeds  #1767

@emmambd

What's the problem?

Of the 4 Flex feeds we have for testing #1721, 3 failed to parse cleanly when run through the validator.

I looked at 51 Flex v2 feeds, including ones that don't yet conform to the official spec, to better understand the problem. About 50% fail to fully parse, and all but 1 of the failing feeds have a problem parsing stop_times.txt.

Outstanding questions

  • How common are stop_times.txt parsing failures with the validator now, just looking at the GTFS Schedule feeds currently in the Mobility Database?
  • How big are the stop_times.txt files for the feeds that fail?
    1 KB, 12 KB, and 2.4 MB
  • What changes would we need to make to ensure that Flex feeds can be validated successfully? Are there incremental changes that are feasible, or do we need a major infrastructure change, e.g., the one suggested in "feat: Column-based storage for GTFS entities" (#1747)?
    No major infra change needed. We need to exclude errors like UNKNOWN_COLUMN from UNPARSABLE_ROWS. However, this might not be necessary because we are adding the Flex rules. See "Explore running validation on feeds with unknown_column notices" (#1770).
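
To illustrate the UNKNOWN_COLUMN point, here is a minimal Python sketch of the idea. It is not the validator's actual code: `parse_table`, `NON_BLOCKING`, and the notice codes are hypothetical names chosen for illustration. The assumption is that an unknown column should produce a notice but should not, on its own, cause a file's rows to be treated as unparsable.

```python
import csv
import io

# Hypothetical set of notice codes that should NOT block parsing.
NON_BLOCKING = {"unknown_column"}


def parse_table(text, known_columns, required_columns):
    """Parse CSV text and return (rows, notices, parsed_ok).

    Illustrative only: unknown columns are reported but ignored,
    while a missing required column still blocks parsing.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    notices = []
    for name in header:
        if name not in known_columns:
            notices.append(("unknown_column", name))
    for name in sorted(required_columns):
        if name not in header:
            notices.append(("missing_required_column", name))
    # Only blocking notices mark the table as unparsable.
    parsed_ok = all(code in NON_BLOCKING for code, _ in notices)
    rows = []
    if parsed_ok:
        for row in reader:
            # Keep known columns only; unknown ones are dropped.
            rows.append({k: v for k, v in row.items() if k in known_columns})
    return rows, notices, parsed_ok
```

With this behavior, a stop_times.txt that carries an extra, unrecognized column would still yield rows for the downstream rules, and the unknown column would be surfaced as a notice rather than as a parsing failure.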

This is a critical set of questions to answer before we pursue more work on #1721.

Labels

flex: Rules and rule changes related to GTFS-Flex.