Add documentation of sample data structure #100
e-lo wants to merge 25 commits into TIDES-transit:main
Use this instead because it is more configurable/extensible than the redirects.

1. Add example data (imperfect for now) generated by script `create_example` in data/example/data
2. Update validate-data GH action to refer to correct repos.

- Linted
- Auto-generated tabular-data-package datapackage.json from script
- Removed random data and the scripts to generate it, kept csv templates
- Added readmes and documentation about each example in directory

- Rendered samples as a table w/out readmes
- Renamed data folders to samples and TIDES
- Added metadata to datapackage.json about vendors, github handles, NTDID, agency name and exposed in documentation
- Updated documentation for consistency
- Renamed 'example' to 'template' to be more consistent with what it is

- changed workflow title to fix link with spaces
- moved datapackage.json up one directory, updated docs, updated source refs
- added missing --schema flag

per request

- Removes refs to template
- Removes refs to github workflows
botanize left a comment
I see a few places we could make some nice usability improvements, let me know what you think about my suggestions.
Once this is created, mapping the data files to the schema, simply run:

```sh
frictionless validate datapackage.json
```
It seems that this just validates the datapackage.json file, not the contents of the datapackage?
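To make the question concrete: a descriptor only lists resource paths, so validating the whole package means reaching the files behind those paths. The following is a minimal stdlib-only sketch, not the frictionless CLI or its API; `list_resource_files` is a hypothetical helper name, and it assumes each resource `path` is a local path relative to the descriptor (a string or a list of strings).

```python
import json
from pathlib import Path


def list_resource_files(descriptor_path):
    """Return (resource name, resolved path, exists) for each declared resource path.

    Hypothetical helper: it only checks that the files are present, it does not
    validate their contents against any schema.
    """
    descriptor_path = Path(descriptor_path)
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    base_dir = descriptor_path.parent
    rows = []
    for resource in descriptor.get("resources", []):
        paths = resource.get("path", [])
        # "path" may be a single string or a list of strings.
        if isinstance(paths, str):
            paths = [paths]
        for rel in paths:
            full = base_dir / rel
            # Remote URLs are not handled here and will simply report False.
            rows.append((resource.get("name"), str(full), full.is_file()))
    return rows
```

If every row reports an existing file and the validation output still mentions only the descriptor, that would suggest the resources are not being read at all.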
```bash
pip install frictionless
frictionless validate path/to/your/datapackage.json
```
Again, when I run this it seems to only validate the datapackage.json file, not the datapackage itself? Or maybe it's just because I'm getting a nonsense error: `package-error The data package has an error: license requires "path" or "name": {'name': 'Apache-2.0'}`. Does this work for you?
Ahhh, I thought you were saying before that it SHOULD only validate the datapackage file and not the resources. Will investigate.
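While investigating, it may help to check the rule quoted in the error message independently of the tool. The sketch below applies only the rule as worded in that message (each license entry needs a `name` or a `path`); `license_problems` is a hypothetical helper, and this does not reproduce or explain what frictionless itself does.

```python
def license_problems(descriptor):
    """Return a list of messages for license entries lacking both 'name' and 'path'.

    Hypothetical helper that mirrors the wording of the quoted error message.
    """
    problems = []
    for i, lic in enumerate(descriptor.get("licenses", [])):
        if not isinstance(lic, dict):
            problems.append(f"licenses[{i}] is not an object: {lic!r}")
        elif not (lic.get("name") or lic.get("path")):
            problems.append(f"licenses[{i}] has neither 'name' nor 'path': {lic!r}")
    return problems


# The entry from the error message passes this check, which supports
# reading the reported error as misleading rather than as a descriptor bug.
print(license_problems({"licenses": [{"name": "Apache-2.0"}]}))
```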
Notable:

- added product_version as optional field
- made name optional in datapackage.json
botanize left a comment
a couple of consistency issues
samples/README.md
Outdated
| **Field** | **Description** | **Required** |
| --------- | --------------- | ------------ |
| `name` | Short sluggable name used to refer to data in this file. | Required |
| `name` | Short [sluggable](https://en.wikipedia.org/wiki/Clean_URL#Slug) name used to refer to data in this file. `name` must be unique within this datapackage. | Required |
Is this sluggable, or does it need to be a slug?
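If the answer is that `name` must already be a slug, the requirement could be checked mechanically along with the uniqueness rule from the table. This is a rough sketch under a loud assumption: the documentation under review does not define "slug", so the pattern here (lowercase letters and digits, separated by single `.`, `_` or `-` characters) is illustrative only, and `is_slug` and `duplicate_names` are hypothetical helpers.

```python
import re
from collections import Counter

# Assumed slug shape: lowercase alphanumeric runs joined by ".", "_" or "-".
SLUG_RE = re.compile(r"[a-z0-9]+(?:[._-][a-z0-9]+)*")


def is_slug(name):
    """True if name already matches the assumed slug pattern."""
    return isinstance(name, str) and SLUG_RE.fullmatch(name) is not None


def duplicate_names(resources):
    """Return the sorted resource names that appear more than once."""
    counts = Counter(r.get("name") for r in resources)
    return sorted(n for n, c in counts.items() if n is not None and c > 1)
```

Under this reading, `vehicle_locations` would be accepted as-is, while `Vehicle Locations` is only "sluggable": it can be turned into a slug but is not one yet.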
samples/README.md
Outdated
| --------- | --------------- | ------------ |
| `title` | A human-readable title. | Required |
| `name` | Identifier string as a URL-friendly slug. | Required |
| `name` | Short identifier string as a [URL-friendly slug](https://en.wikipedia.org/wiki/Clean_URL#Slug). | Recommended |
Same question as line 44 below, plus these descriptions should match
Big question, should the samples live in their own repo, e.g., TIDES-samples? We could still load them in for testing as a git submodule. It would make regular contributions and maintenance on the TIDES-transit repo slightly easier by avoiding large data transfers with pulls.
I would say - let's start simpler (this PR which we have been discussing for months) and move to more complex if/when we need to.
- add documentation
- fix `main.py` dict lookup to support sources
- vendor nesting
@botanize I think I addressed your requests except for getting the validate statement to validate the whole package (which I haven't figured out yet/haven't had time to dive into). Do you think we could merge this in advance of getting that working so that the sample structure is documented?
Dismissing in light of individual labor constraints and a contributor group mandate to merge a few of these outstanding PRs
SorenSpicknall left a comment
Approving in its current state, though we'll still want to follow up on the issue with validation discussed in the PR conversation.
Closing because replaced by #162
Pull Request
This PR documents the desired structure of the samples and how samples are generated.
It splits up the original PR for Issue #40 (PR #75), per the suggestion made by @botanize, and includes: