New Task: For raw ATN **trajectory** data sets, finalize and implement automated NCEI data archive pathway #41

Open
6 of 12 tasks
MathewBiddle opened this issue May 1, 2024 · 26 comments
Assignees
Labels
ATN Issues relating to the Animal Telemetry Network enhancement New feature or request

Comments

@MathewBiddle
Contributor

MathewBiddle commented May 1, 2024

Who is requesting this?

@ioos/marine-life, and the ATN community

What is being requested?

What is the requested deadline and why?

No response

What is the current status quo (i.e., what happens if this does not get done)?

ATN trajectory data are not archived at NCEI and PARR requirements are not met.

What indicates this is done (i.e., how do we know this is complete)?

Provide a description or any other important information.

xref:

@MathewBiddle
Contributor Author

What does "and archive administrative tools" mean in the title?

@laurabrenskelle laurabrenskelle added the ATN Issues relating to the Animal Telemetry Network label May 2, 2024
@MathewBiddle MathewBiddle changed the title New Task: For raw ATN trajectory data sets, finalize and implement automated NCEI data archive pathway and archive administrative tools. New Task: For raw ATN trajectory data sets, finalize and implement automated NCEI data archive pathway May 2, 2024
@MathewBiddle
Contributor Author

@iamchrisser @mmckinzie where are we with this automation? Are the file structures settled?

I'd like to release v1.0 of the template, update the website, and stick a fork in the automation task so we can move to mobilizing these data to obis and move on to the profile stream.

@MathewBiddle
Contributor Author

Getting close. Have a manual process built for now but working on automating. Ran into some issues with seaname mappings.

@mmckinzie

Still trying to resolve issue with :seanames and update :acknowledgement attribute. Files are still being manually generated. Automation process still needs to be developed and tested.

@mmckinzie mmckinzie added the enhancement New feature or request label Jul 11, 2024
@MathewBiddle
Contributor Author

pipeline is being documented in ioos/ioos-atn-data#80

@MathewBiddle
Contributor Author

@MathewBiddle
Contributor Author

I have ported the documentation over to the website https://ioos.github.io/ioos-atn-data/atn-archive.html

@mmckinzie

Still waiting on the next batch of test files from Axiom to confirm :seanames and :acknowledgement issue has been appropriately resolved.

@MathewBiddle
Contributor Author

@lmshowalter3 any updates on a new batch of test files?

@MathewBiddle MathewBiddle changed the title New Task: For raw ATN trajectory data sets, finalize and implement automated NCEI data archive pathway New Task: For raw ATN **trajectory** data sets, finalize and implement automated NCEI data archive pathway Aug 2, 2024
@mmckinzie

The next batch of test files are now available. :seanames and :acknowledgement appear to be updated correctly. Will send to NCEI for review.

I also noticed a new minor issue in the :citation. It should use the following format: [creator_last_name, creator_first_name]; [contributor_last_name(s), contributor_first_name(s)]. [year]. [animal_common_name] ([animal_scientific_name]) location data from a [instrument] (ptt id [ptt_id]) deployed in the [sea_name] from [deployment_start_datetime] to [deployment_end_datetime], deployment id [deployment_id]. [Dataset]. [publisher_name].

Presently files list :publisher_institution instead of :publisher_name.
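
As a rough illustration of the citation format above, a minimal Python sketch could assemble the string from a file's metadata. The dictionary keys and the sample values are hypothetical placeholders, not confirmed attribute names from the ATN template, and the actual Axiom pipeline may build the citation differently. The sketch also assumes that "[Dataset]" is a literal token in the citation.

```python
def build_citation(attrs):
    """Assemble a citation string following the format described above.

    `attrs` is a plain dict. Its keys are illustrative assumptions,
    not confirmed global attribute names from the ATN template.
    """
    # Creator first, then any contributors, each written as "Last, First".
    people = [attrs["creator"]] + list(attrs.get("contributors", []))
    names = "; ".join(f"{last}, {first}" for last, first in people)
    return (
        f"{names}. {attrs['year']}. "
        f"{attrs['animal_common_name']} ({attrs['animal_scientific_name']}) "
        f"location data from a {attrs['instrument']} (ptt id {attrs['ptt_id']}) "
        f"deployed in the {attrs['sea_name']} "
        f"from {attrs['deployment_start']} to {attrs['deployment_end']}, "
        f"deployment id {attrs['deployment_id']}. [Dataset]. "
        # The citation ends with the publisher name, not the publisher institution.
        f"{attrs['publisher_name']}."
    )


# Placeholder values for demonstration only; none of these come from a real file.
example = {
    "creator": ("Doe", "Jane"),
    "contributors": [("Roe", "Richard")],
    "year": 2024,
    "animal_common_name": "example seal",
    "animal_scientific_name": "Genus species",
    "instrument": "satellite tag",
    "ptt_id": "000000",
    "sea_name": "Example Sea",
    "deployment_start": "2020-01-01T00:00:00Z",
    "deployment_end": "2020-06-01T00:00:00Z",
    "deployment_id": "abc123",
    "publisher_name": "Example Publisher",
}
citation = build_citation(example)
print(citation)
```

Keeping the publisher as its own key makes the :publisher_institution vs. :publisher_name mix-up easy to spot in a review of the generated string.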

@mmckinzie mmckinzie removed their assignment Aug 5, 2024
@MathewBiddle
Contributor Author

NCEI just tested the new files. Expect feedback from them soon.

@mmckinzie

NCEI tested/reviewed files. Typo still present in select :seanames. This issue has been subsequently resolved by Axiom and they will generate a new batch of test files to verify.

It was also confirmed that Axiom derived :seanames now match NCEI derived list. Both are now using all positions to determine sea names regardless of quality. This approach may need to be revisited in the future as we may want to consider at least removing the 'Z' locations prior to generating list of sea names to prevent erroneous seas from being included in dataset titles, abstracts and citations.
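
The possible refinement mentioned above (dropping 'Z' locations before deriving the sea-name list) could look roughly like the sketch below. The record layout, the `location_class` key, and the `lookup_sea_name` function are hypothetical stand-ins; the actual Axiom and NCEI sea-name lookups are not shown in this thread.

```python
def sea_names(positions, lookup_sea_name, exclude_classes=()):
    """Return the sorted, de-duplicated sea names for a track.

    positions: iterable of dicts with "lat", "lon", "location_class" (assumed layout).
    lookup_sea_name: caller-supplied function (lat, lon) -> str or None (hypothetical).
    exclude_classes: location classes to drop before the lookup, e.g. ("Z",).
    """
    names = set()
    for pos in positions:
        if pos["location_class"] in exclude_classes:
            continue  # skip positions in an excluded quality class
        name = lookup_sea_name(pos["lat"], pos["lon"])
        if name:
            names.add(name)
    return sorted(names)


# Toy lookup for demonstration only: negative longitudes map to "Sea A".
def toy_lookup(lat, lon):
    return "Sea A" if lon < 0 else "Sea B"


track = [
    {"lat": 36.8, "lon": -75.9, "location_class": "3"},
    {"lat": 36.9, "lon": -75.8, "location_class": "B"},
    {"lat": 10.0, "lon": 40.0, "location_class": "Z"},  # stray, low-quality fix
]

all_positions = sea_names(track, toy_lookup)                       # current approach
without_z = sea_names(track, toy_lookup, exclude_classes=("Z",))   # possible refinement
print(all_positions, without_z)
```

With the toy data, the single 'Z' position is what pulls a second sea into the list, which is the kind of erroneous entry that would otherwise propagate into titles, abstracts and citations.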

NCEI also recommended that we add :place as a global attribute to the template to capture more localized or specific locations or bodies of water not captured in the seaname list. These would become searchable keywords at NCEI. It is relatively easy to add :place to the template, but it would require additional development work by Axiom to be able to harvest appropriate metadata from the ADR (deployment_location field) and RW (Arbitrary Keywords, type listed as 'place'). More generalized locations listed in project titles (e.g., Mid-Atlantic, US West Coast, Mexico) would get added to the RW arbitrary keywords by ATN DC, and the more specific deployment-by-deployment locations (e.g., Virginia Beach, VA; Guadalupe Island; LA-Long Beach Harbor) provided by the PI would be added to ADR metadata.

We'll need to decide if this recommendation is a priority and let Axiom know how we want to proceed. The options are to simply add :place to the template but leave it blank for now, or to fully develop it so the attribute can be properly populated.
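
To make the two options concrete, here is a minimal sketch of how a :place value might be assembled from the two sources named above. The function name, the keyword record layout (`type`/`value` keys), and the "; " delimiter are assumptions for illustration; nothing here reflects an agreed template definition. When no place metadata exists, the function returns an empty string, which corresponds to the "add it but leave it blank" option.

```python
def build_place(rw_keywords, deployment_location=None):
    """Combine place names into one delimited string for a :place attribute.

    rw_keywords: list of dicts like {"type": "place", "value": "Mid-Atlantic"}
                 (assumed shape of the RW arbitrary keywords).
    deployment_location: optional string from the ADR deployment_location field.
    """
    # Keep only RW keywords whose type is 'place'.
    places = [kw["value"] for kw in rw_keywords if kw.get("type") == "place"]
    if deployment_location:
        places.append(deployment_location)
    # Strip whitespace, drop empties, and de-duplicate while preserving order.
    unique = list(dict.fromkeys(p.strip() for p in places if p.strip()))
    # "; " is used because place names such as "Virginia Beach, VA" contain commas.
    return "; ".join(unique)


keywords = [
    {"type": "place", "value": "Mid-Atlantic"},
    {"type": "theme", "value": "telemetry"},
]
populated = build_place(keywords, "Virginia Beach, VA")
blank = build_place([])
print(repr(populated), repr(blank))
```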

@MathewBiddle
Contributor Author

As the :place name attribute recommendation does not break the NCEI archive process and is more of a "nice to have" I suggest we leave it off for now. Sticking in an empty attribute is fine, low hanging fruit (I assume - correct me if I'm wrong). And it allows NCEI to code up using it in the metadata, when we decide it's worth the effort to populate it.

So, add it to the list of things to be addressed in the next go around and celebrate victory.

@mmckinzie please create an issue at https://github.com/ioos/ioos-atn-data/issues for the :place enhancement to the template.

@mmckinzie

issue ticket has been created regarding :place attribute for a v2 enhancement, no more issues found or flagged within submitted v1 test files by NCEI. Recent test files will be pushed to the production folder. Ready to move on to fully automating and operationalizing pipeline to NCEI. Future generated NCEI files can go straight into the prod folder, no need to stage in test folder.

@iwensu0313 iwensu0313 assigned iwensu0313 and unassigned lmshowalter3 Oct 4, 2024
@conniekot

conniekot commented Oct 8, 2024

Notes from ATN DAC Data Coordinator call:

  • NCEI infrastructure is down due to the hurricane; it will likely be a few weeks before everything is back up and running and NCEI is able to pick up files from the production folder and index them under the ATN collection
  • Ready for Axiom to begin operationalizing and automating pipeline from ADR to NCEI production folder for satellite telemetry trajectory files
  • For now, manually generate NCEI files and push to production folder, email NCEI as new files appear
    -- OK to move ADR project 16 files from test directory to prod folder - In Progress
    -- Next batch of files will be from Baird: Odontocete tagging in the main HI (ADR project 32); Followed by DiGiovanni: Harbor Seal Tagging and Tracking in Virginia (ADR project 91) and London: Ribbon and spotted seals in Bering sea (ADR project 240). - Meg to ping Ian/Trello when ready

@MathewBiddle
Contributor Author

@conniekot thanks for the update. I don't think you intended to link to issue #5; did you mean to reference some other request #5?

ATN should continue progress on this while NCEI is on hold. NCEI would prefer to pick up a batch of packages anyways to fully test for all the variability in the files. It looks like there is a good list of items to continue work on, so let's keep pushing forward.

@iwensu0313

iwensu0313 commented Oct 18, 2024

@mmckinzie test files are now in prod for NCEI to pick up whenever they are back up and running https://ncei.axiomdatascience.com/atn/prod/ (let me know if that looks good)

@mmckinzie

@iwensu0313 Great, thanks for letting me know. I will send an email to NCEI notifying them that new files are available in the production folder.

@mmckinzie

The next step towards full automation of the NCEI pipeline by Axiom, to ensure it's fully operational, will be to resolve the issue with, or devise a more appropriate approach for, populating the creator/contributor global attributes within the satellite trajectory NetCDF files, rather than harvesting these metadata attributes from the RW project-level metadata → contacts → citation folks. The current approach conflicts with the DAC's data management/submission guidelines, which allow data contributors to have distinct 'project', 'deployment' and 'dataset' level metadata, and it has implications for metadata presently contained in the portal project pages.

@MathewBiddle
Contributor Author

MathewBiddle commented Oct 24, 2024

Three more packages have been archived! 🎉

For reference, here are the others that have been archived:

I believe the search below should always return the appropriate records from NCEI:
https://data.noaa.gov/onestop/collections?q=%22Integrated%20Ocean%20Observing%20System%20(IOOS)%20Animal%20Telemetry%20Network%22
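
For anyone who wants to rebuild that search link in a script, a small sketch using Python's standard library reproduces the URL above. The helper name is made up for illustration, and treating the surrounding double quotes as an exact-phrase search is an assumption about how OneStop interprets the query.

```python
from urllib.parse import quote

ONESTOP_COLLECTIONS = "https://data.noaa.gov/onestop/collections"


def onestop_search_url(phrase):
    """Build a OneStop collection search URL for a quoted phrase."""
    # Wrap the phrase in double quotes, then percent-encode it.
    # Parentheses are left literal so the result matches the URL in this thread.
    query = quote('"' + phrase + '"', safe="()")
    return f"{ONESTOP_COLLECTIONS}?q={query}"


url = onestop_search_url(
    "Integrated Ocean Observing System (IOOS) Animal Telemetry Network"
)
print(url)
```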

Note that NOAA OneStop indexing from NCEI records might not be very frequent. So, records might be at NCEI but not appearing in OneStop just yet. @laurabrenskelle might know more about that.

@mmckinzie In response to your comment above, can you outline what the current approach is vs. what you are proposing needs to be changed? I don't quite follow the details above. What is the conflict? What needs to be fixed? Can you summarize what the current "manual" process is for generating these packages?

edit: edited to add a third package archived.

@laurabrenskelle
Contributor

@MathewBiddle WAFs are set up to be indexed on I believe a daily, weekly, monthly, or on demand frequency at NCEI. It looks like the IOOS WAF is currently set to process weekly. It does say it is currently processing, but it also says the last time this WAF was processed was in August. I will reach out to the appropriate NCEI team to ask why this might be. Also, if weekly is not enough, we can change the indexing frequency, especially if we anticipate more updates on a regular basis soon.

@MathewBiddle
Contributor Author

Can we say that ATN is now "begin operationally submitting non-embargoed deployments to NCEI"? I think so.

@MathewBiddle
Contributor Author

@iamchrisser

Thanks for writing up that news story. I agree that we can say that we're operational.

Nice work, everyone!

@mmckinzie

I also agree pathway is operational although not yet fully automated.

@MathewBiddle thanks for drafting the news story, I have not yet had a chance to review it or provide edits/comments, but I will get to it tomorrow 10/29.

@mmckinzie

@MathewBiddle Overall I think the news story looks good. I made a few minor suggestions and edits as well as added a few comments.

Projects
Status: In Progress
7 participants