Describe the feature you want and how it meets your needs or solves a problem
Caltrans Transit Data Quality Team Use Case
As a data collector on the Caltrans Transit Data Quality Team,
I want to collect aggregated ridership datasets in a standardized format from each agency,
so that I don't have to write a separate data conversion script for each of the 200 agencies in the state,
and so that I can quickly publish aggregated ridership data for all transit agencies in the state.
Transit Planner Use Case
As a transit planner,
I want to quickly gather aggregated stop-level ridership data from multiple transit agencies,
so that I can perform transit planning analyses.
Location Intelligence or Geospatial Analytics
As a curator of data about locations,
I want to gather stop-level ridership data from multiple transit agencies,
so that I can display insights on the number of transit riders at a given location.
Describe the solution you'd like
The current TIDES specification has no good way to represent aggregated ridership data. The coarsest level it covers is a single day of ridership data, in Stop Visits.
In many use cases, a rough estimate of "typical" ridership at a transit stop, on a route, or at a similar level is all that is needed. Converting TIDES or any other event-level ridership data to an "average" or "typical" value may require substantial data cleaning, but such values are commonly published.
Most notably, the NTD presents Unlinked Passenger Trips (UPT) in several of its data products, although these products are only ever aggregated to the agency level. One example is the NTD Annual Data - Service (by Mode and Time Period), which includes the columns time_period, time_service_begins, and time_service_ends. The time_period column has the following possible values:
Average Typical Weekday
Average Weekday - Midday
Average Weekday - AM Peak
Average Weekday - PM Peak
Average Weekday - Other
Annual Total
Average Typical Sunday
Average Typical Saturday
Another option worth pursuing may be the GTFS-RIDE ridership.txt file. This file is quite flexible, but it lacks a way to indicate what kind of aggregation or averaging, if any, has been applied.
TIDES could add a table that summarizes ridership data across time periods, with a qualifier indicating whether each value is a "total" count, an "average", or another statistic such as a "median".
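To make the proposal concrete, here is a minimal Python sketch of how stop-level event records might be rolled up into such a summary table. It is only an illustration: the input field names (stop_id, service_date, boardings), the output columns (time_period, aggregation_type, days_observed), and the function names are all assumptions made for this example, not part of TIDES, GTFS-RIDE, or the NTD. The time_period labels are borrowed from the NTD values listed above.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median

# Hypothetical qualifier values for the proposed summary table.
REDUCERS = {"total": sum, "average": mean, "median": median}


def day_type(d: date) -> str:
    """Map a service date to an NTD-style time_period label."""
    if d.weekday() == 5:
        return "Average Typical Saturday"
    if d.weekday() == 6:
        return "Average Typical Sunday"
    return "Average Typical Weekday"


def summarize(stop_visits, aggregation="average"):
    """Roll event-level records up to one row per stop and time period.

    stop_visits: iterable of dicts with assumed keys
    stop_id (str), service_date (date), boardings (int).
    """
    reducer = REDUCERS[aggregation]

    # Step 1: total boardings per stop per service date.
    daily = defaultdict(int)
    for visit in stop_visits:
        daily[(visit["stop_id"], visit["service_date"])] += visit["boardings"]

    # Step 2: group the daily totals by stop and time period.
    grouped = defaultdict(list)
    for (stop_id, service_date), total in daily.items():
        grouped[(stop_id, day_type(service_date))].append(total)

    # Step 3: reduce each group and record which aggregation was applied.
    return [
        {
            "stop_id": stop_id,
            "time_period": period,
            "aggregation_type": aggregation,
            "boardings": reducer(totals),
            "days_observed": len(totals),
        }
        for (stop_id, period), totals in sorted(grouped.items())
    ]


if __name__ == "__main__":
    sample = [
        {"stop_id": "A", "service_date": date(2024, 1, 8), "boardings": 10},
        {"stop_id": "A", "service_date": date(2024, 1, 8), "boardings": 5},
        {"stop_id": "A", "service_date": date(2024, 1, 9), "boardings": 25},
        {"stop_id": "A", "service_date": date(2024, 1, 13), "boardings": 7},
    ]
    for row in summarize(sample, aggregation="average"):
        print(row)
```

Carrying aggregation_type on every row is what would let a consumer tell a total from an average or a median, which is the gap noted for ridership.txt. This sketch only counts dates that appear in the input, so a date with no records at a stop does not pull the average down; a real methodology would need to decide how to treat such dates.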
Additional context and sample data
See comments about NTD data. Also, Caltrans has been collecting some ridership data. See example here.