
🚀💻 – Validate sample data against TIDES frictionless specification #40

@e-lo

User Stories

Describe the feature you want and how it meets your needs or solves a problem

  1. As a transit agency, I'd like to know how to add my sample data.
  2. As a transit agency, I want to know whether my data and the scripts that support them conform to the latest TIDES spec so that I can make any necessary fixes.
  3. As a TIDES contributor, I'd like to know how proposed spec changes affect actual data so I can evaluate their ROI.
  4. As a transit agency or transit technology developer, I'd like to easily review the sample data, scripts, and the context in which they were developed.
  5. As a smaller transit agency or one that isn't yet automatically generating data, I'd like a template folder that can be used for quickly prototyping my TIDES data.
  6. As a TIDES maintainer, I'd like templates to automatically update based on updates to the spec.

Proposed Solution

List of Solutions for (relevant user story)

(Checking box indicates consensus achieved on approach)

  • Documentation which articulates a desired example folder structure (1)
  • GH Workflow for validation of /TIDES folders (2,4)
  • Automatic documentation generation for example data (4)
  • Fill template directory datapackage.json data with REPLACEME (or similar)
  • Template folder with required and suggested directories and files (including datapackage.json) which can be used to quickly generate new examples (5)
  • Auto-generation of templates based on updates to the spec (6)

BONUS (or potentially another issue/PR):

  • Issue generation for failed validations

Proposed example directory structure:

/samples
    /agency-name     # Unique agency name
        datapackage.json  # Basic example information such as agency-name, CAD-AVL vendor, spec version, and data maintainer (with their GH handle)
        /TIDES            # Data formatted in the TIDES standard (in the future, we could have subfolders for versions if necessary)
        /raw              # Raw input data
        /scripts          # Scripts that turn the raw data into TIDES
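The layout above could be scaffolded with a few lines of Python. This is only a sketch: `scaffold_sample` is a hypothetical helper, and the keys written to datapackage.json are an assumption (the real template would carry whatever fields are agreed on, filled with REPLACEME or similar).

```python
import json
from pathlib import Path

PLACEHOLDER = "REPLACEME"


def scaffold_sample(samples_dir, agency_name):
    """Create <samples_dir>/<agency_name> with the proposed subfolders."""
    agency_dir = Path(samples_dir) / agency_name
    for sub in ("TIDES", "raw", "scripts"):
        (agency_dir / sub).mkdir(parents=True, exist_ok=True)

    # Placeholder descriptor; the choice of keys is an assumption.
    # Resources still have to be added before the package is valid.
    descriptor = {
        "name": agency_name,
        "title": PLACEHOLDER,
        "contributors": [{"title": PLACEHOLDER, "role": "maintainer"}],
        "sources": [{"title": PLACEHOLDER}],
    }
    descriptor_path = agency_dir / "datapackage.json"
    if not descriptor_path.exists():  # never overwrite an edited descriptor
        descriptor_path.write_text(
            json.dumps(descriptor, indent=2) + "\n", encoding="utf-8"
        )
    return agency_dir
```

Calling `scaffold_sample("samples", "agency-name")` twice is safe: existing folders are kept and an existing datapackage.json is left untouched.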

Consensus Building

General Agreement

  • User stories 1-4 are useful, and the implementation proposed in the PR is generally agreed upon

To Discuss

1 - Sources

How should data sources be documented?

Context

  • We are interested in understanding/daylighting/documenting the products or components the data came from.
  • datapackage.json allows users to specify where the data came from in the sources field.
  • sources can be specified at the data-package or resource level.
  • There can be more than one source listed in sources.

Options

  1. resource-level: can relate different resources (e.g. fares vs. APC) to different sources (preferred by @e-lo)
  2. datapackage-level: simplifies entry and reduces the data that must be entered and replicated (preferred by @botanize)
  3. allow either: a potential compromise (I think both @botanize and @e-lo are fine with it, but it is less opinionated)
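To make option 3 concrete, here is a small sketch of how a reader of datapackage.json might resolve the sources for one resource, assuming a resource without its own sources falls back to the package-level list. The descriptor contents (resource names, source titles) and the `effective_sources` helper are hypothetical.

```python
def effective_sources(package, resource_name):
    """Return the sources that apply to one resource.

    Resource-level sources win; otherwise fall back to the
    package-level list (assumed inheritance rule).
    """
    for resource in package.get("resources", []):
        if resource.get("name") == resource_name:
            return resource.get("sources") or package.get("sources", [])
    raise KeyError(resource_name)


# Hypothetical descriptor mixing both levels.
example_package = {
    "name": "agency-name",
    "sources": [{"title": "REPLACEME CAD/AVL system"}],
    "resources": [
        {"name": "example_fares", "sources": [{"title": "REPLACEME fare system"}]},
        {"name": "example_apc"},  # no sources: uses the package-level list
    ],
}

print(effective_sources(example_package, "example_fares"))
print(effective_sources(example_package, "example_apc"))
```

With this fallback, agencies with a single upstream system enter it once, while agencies with several systems can still tie each resource to its own source.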

Discussed in the unresolved PR comments

Note: this would only affect our documentation and the template datapackage.json (if a template is used, see below), since we are not developing (at this time) a datapackage profile that would validate this data.

2 - Template Files

Should we have template files (csvs and datapackage.json) and if so, is it useful to have code that auto-generates them based on changes to the spec?

Context

  1. As a smaller transit agency or one that isn't yet automatically generating data, I'd like a template folder that can be used for quickly prototyping my TIDES data.
  2. As a TIDES maintainer, I'd like templates to automatically update based on updates to the spec.

Options

  1. Template files auto-generated from the spec (currently implemented in PR, preferred by @e-lo)
  2. Template files kept as static files
  3. datapackage.json documented as static text in the README.md, no CSV templates (preferred by @botanize)
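For a sense of how small option 1 can be, here is a sketch that derives header-only CSV templates from Table Schema-style dicts. It is not the code in the PR; `csv_template`, `write_templates`, and the way schemas are passed in are assumptions for illustration.

```python
import csv
import io
from pathlib import Path


def csv_template(schema):
    """Return a header-only CSV string with one column per schema field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([field["name"] for field in schema["fields"]])
    return buffer.getvalue()


def write_templates(schemas, out_dir):
    """Write <out_dir>/<table>.csv for every schema in `schemas`.

    `schemas` maps a table name to its schema dict (hypothetical input).
    Re-running after a spec change regenerates all templates.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for table_name, schema in sorted(schemas.items()):
        path = out_dir / f"{table_name}.csv"
        path.write_text(csv_template(schema), encoding="utf-8")
        written.append(path)
    return written
```

Running this in a workflow triggered by spec changes would keep the templates in sync, at the cost of maintaining the generator; static files (option 2) avoid that code but can drift from the spec.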


Labels

  • 💬 discuss: Issue to be discussed at Contributors meeting
  • 💻 code: Pertains to the infrastructure code
  • 🚀 feature: Adds a new feature - to spec or code
