📄🚀 – Aggregated Ridership Table #244

@evansiroky

Describe the feature you want and how it meets your needs or solves a problem

  1. Caltrans Transit Data Quality Team Use Case

As a data collector on the Caltrans Transit Data Quality Team,
I want to collect aggregated ridership datasets in a standardized format from each agency,
so that I don't have to write 200 data conversion scripts for the 200 agencies in the state
and can quickly publish aggregated ridership data for all transit agencies in the state.

  2. Transit Planner Use Case

As a transit planner,
I want to quickly gather aggregated stop-level ridership data from multiple transit agencies,
so that I can perform transit planning analyses.

  3. Location Intelligence or Geospatial Analytics Use Case

As a curator of data about locations,
I want to gather stop-level ridership data from multiple transit agencies,
so that I can display insights on the number of transit riders at a certain location.

Describe the solution you'd like

The current TIDES specification has no good way to represent aggregated ridership data. Instead, the most it covers is a single day of ridership data in Stop Visits.

In various use cases, it can be helpful to have just a rough estimate of "typical" ridership at a transit stop, on a route, etc. Converting TIDES or any other event-level ridership data to an "average" or "typical" value may involve heavy data cleaning, but such aggregated values are commonly published in a variety of places.
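As a rough illustration of the kind of conversion described above, here is a minimal sketch that turns event-level records into an "average weekday" boardings value per stop. The record layout and the names `stop_visits`, `boardings`, and `average_weekday_boardings` are assumptions made for this example and are not taken from the TIDES schema; real data would also need the cleaning steps mentioned above.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical event-level records: (stop_id, service_date, boardings).
# Field names are illustrative only, not the TIDES Stop Visits schema.
stop_visits = [
    ("A", date(2024, 3, 4), 10),  # Monday
    ("A", date(2024, 3, 4), 6),   # Monday, a second vehicle visit
    ("A", date(2024, 3, 5), 20),  # Tuesday
    ("A", date(2024, 3, 9), 4),   # Saturday, excluded from weekday average
    ("B", date(2024, 3, 4), 3),   # Monday
]


def average_weekday_boardings(visits):
    """Sum boardings per stop per day, then average the weekday totals."""
    daily_totals = defaultdict(int)
    for stop_id, service_date, boardings in visits:
        if service_date.weekday() < 5:  # Monday=0 .. Friday=4
            daily_totals[(stop_id, service_date)] += boardings

    totals_by_stop = defaultdict(list)
    for (stop_id, _service_date), total in daily_totals.items():
        totals_by_stop[stop_id].append(total)

    return {stop_id: mean(totals) for stop_id, totals in totals_by_stop.items()}


averages = average_weekday_boardings(stop_visits)
```

Note that this sketch only averages over days that have at least one record for a stop, so days with no observed visits are not counted as zero. Whether that is the right choice is exactly the kind of methodology detail an aggregated table would need to document.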

Most notably, the NTD presents Unlinked Passenger Trips (UPT) in several of its data products, although these are only ever aggregated to the agency level. One notable example is the NTD Annual Data - Service (by Mode and Time Period) dataset, which includes the columns time_period, time_service_begins, and time_service_ends. The time_period column has these possible values:

  • Average Typical Weekday
  • Average Weekday - Midday
  • Average Weekday - AM Peak
  • Average Weekday - PM Peak
  • Average Weekday - Other
  • Annual Total
  • Average Typical Sunday
  • Average Typical Saturday

Another existing option that may be worth pursuing is the GTFS-RIDE ridership.txt file. This file has quite a bit of flexibility, but lacks a way to indicate what kind of aggregation or averaging, if any, has been applied.

Describe alternatives you've considered

  • Use the GTFS-RIDE ridership.txt - related issue: Add FAQ about GTFS-Ride compatibility/supplementability #102
  • TIDES could add another table that summarizes ridership data across time periods and adds a qualifier indicating whether each value is a "total" count, an "average", or perhaps another option such as a "median".
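To make the second alternative more concrete, here is a minimal sketch of what one row of such a summary table might carry. Every name in it (`AggregationMethod`, `AggregatedRidershipRow`, `aggregation_method`, and the other fields) is hypothetical and not part of TIDES or GTFS-RIDE; the `time_period` value simply reuses one of the NTD labels listed earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AggregationMethod(Enum):
    """Hypothetical qualifier describing how a value was derived."""
    TOTAL = "total"
    AVERAGE = "average"
    MEDIAN = "median"


@dataclass(frozen=True)
class AggregatedRidershipRow:
    """One row of a hypothetical aggregated ridership table."""
    stop_id: Optional[str]    # None when the row is not stop-level
    route_id: Optional[str]   # None when the row is not route-level
    time_period: str          # e.g. an NTD-style label
    aggregation_method: AggregationMethod
    boardings: float


row = AggregatedRidershipRow(
    stop_id="A",
    route_id=None,
    time_period="Average Typical Weekday",
    aggregation_method=AggregationMethod.AVERAGE,
    boardings=18.0,
)
```

Keeping the aggregation method as an explicit field, rather than implying it through the time period label, is one way to address the gap noted above for ridership.txt.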

Additional context and sample data

See the comments about NTD data above. Caltrans has also been collecting some ridership data. See example here.
