Skip to content

Add documentation of sample data structure#100

Closed
e-lo wants to merge 25 commits intoTIDES-transit:mainfrom
e-lo:pr1-sample-data
Closed

Add documentation of sample data structure#100
e-lo wants to merge 25 commits intoTIDES-transit:mainfrom
e-lo:pr1-sample-data

Conversation

@e-lo
Copy link
Copy Markdown
Contributor

@e-lo e-lo commented Jan 7, 2023

Pull Request

This PR adds documentation of desired samples structure and generation for sample.

This reflects the splitting up of the original PR reflecting Issue #40 (PR #75) per the suggestion made by @botanize and includes:

  • Pull-requests must address an existing issue
  • Update relevant documentation.

e-lo added 19 commits September 26, 2022 15:41
Use this instead because it is more configurable/extensible than the redirects.
1. Add example data (imperfect for now) generated by script `create_example` in data/example/data
2. Update validate-data GH action to refer to correct repos.
- Linted
- Auto-generated tabular-data-package datapackage.json from script
- Removed random data and the scripts to generate it, kept csv templates
- Added readmes and documentation about each example in directory
-
- Rendered samples as a table w/out readmes
- Renamed data folders to samples and TIDES
- Added metadata to datapackage.json about vendors, github handles, NTDID, agency name and exposed in documentation
- Updated documentation for consistency
- Renamed 'example' to 'template' to be more consistent with what it is
- changed workflow title to fix link with spaces
- moved datapackage.json up one directory, updated docs, updated source refs
- added missing --schema flag
- Removes refs to template
- Removes refs to github workflows
@e-lo e-lo added the 📙 docs Elaborating or updating the documentation – inline or otherwise label Jan 7, 2023
@e-lo e-lo requested a review from botanize January 7, 2023 21:23
@e-lo e-lo self-assigned this Jan 7, 2023
Copy link
Copy Markdown
Contributor

@botanize botanize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a few places we could make some nice usability improvements, let me know what you think about my suggestions.

Comment on lines +32 to +36
Once this is created, mapping the data files to the schema, simply run:

```sh
frictionless validate datapackage.json
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this just validates the datapackage.json file, not the contents of the datapackage?

Copy link
Copy Markdown
Contributor Author

@e-lo e-lo Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigating...

Comment on lines +62 to +65
```bash
pip install frictionless
frictionless validate path/to/your/datapackage.json
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, when I run this it seems to only validate the datapackage.json file, not the datapackage itself? Or maybe it's just because I'm getting a nonsense error, package-error The data package has an error: license requires "path" or "name": {'name': 'Apache-2.0'}, does this work for you?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh - i thought you were talking before about it SHOULD only validate the datapackage file and not the resourcs. Will investigate.

e-lo added 3 commits January 30, 2023 08:30
Notable:
- added product_version as optional field
- made name optional in datapackage.json
Copy link
Copy Markdown
Contributor

@botanize botanize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of consistency issues

| **Field** | **Description** | **Required** |
| --------- | --------------- | ------------ |
| `name` | Short sluggable name used to refer to data in this file. | Required |
| `name` | Short [sluggable](https://en.wikipedia.org/wiki/Clean_URL#Slug) name used to refer to data in this file. `name` must be unique within this datapackage. | Required |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sluggable, or does it need to be a slug?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be sluggable.

| --------- | --------------- | ------------ |
| `title` | A human-readable title. | Required |
| `name` | Identifier string as a URL-friendly slug. | Required |
| `name` | Short identifier string as a [URL-friendly slug](https://en.wikipedia.org/wiki/Clean_URL#Slug). | Recommended |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as line 44 below, plus these descriptions should match

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made more consistent.

@botanize
Copy link
Copy Markdown
Contributor

botanize commented Feb 1, 2023

Big question, should the samples live in their own repo, e.g., TIDES-samples? We could still load them in for testing as a git submodule. It would make regular contributions and maintenance on the TIDES-transit repo slightly easier by avoiding large data transfers with pulls.

@e-lo
Copy link
Copy Markdown
Contributor Author

e-lo commented Feb 1, 2023

Big question, should the samples live in their own repo, e.g., TIDES-samples?

I would say - let's start simpler (this PR which we have been discussing for months) and move to more complex if/when we need to.

e-lo added 2 commits March 27, 2023 10:42
- add documentation
- fix `main.py` dict lookup to support sources - vendor nesting
@e-lo e-lo requested a review from botanize March 27, 2023 17:49
@e-lo
Copy link
Copy Markdown
Contributor Author

e-lo commented Mar 27, 2023

@botanize I think I addressed your requests but for getting the validate statement to work to validate the whole package (which I haven't figured out yet/haven't had time to dive into). Do you think we could merge this in advance of getting that working so that the sample structure is documented?

@e-lo e-lo changed the title Pr1 sample data Add documentation of sample data structure May 3, 2023
@e-lo e-lo requested a review from SorenSpicknall June 14, 2023 18:44
@SorenSpicknall SorenSpicknall dismissed botanize’s stale review June 16, 2023 20:56

Dismissing in light of individual labor constraints and a contributor group mandate to merge a few of these outstanding PRs

Copy link
Copy Markdown
Contributor

@SorenSpicknall SorenSpicknall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving in its current state, though we'll still want to follow up on the issue with validation discussed in the PR conversation.

@e-lo
Copy link
Copy Markdown
Contributor Author

e-lo commented Jul 24, 2023

Closing because replaced by #162

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

📙 docs Elaborating or updating the documentation – inline or otherwise

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants