support workflow to deposit corpora into BiodiversityPMC #310
An example of a BiodiversityPMC query, provided by Julien G. on 2024-12-03: https://biodiversitypmc.sibils.org/?query=Halacarid%20mites&tab=plazi#results-section (see the attached screenshot, taken on 2024-12-03). Still looking for an example that shows the (meta)data associated with these records in the search results.
Please see the included JSON export of the search results for the example query https://biodiversitypmc.sibils.org/?query=Halacarid%20mites&tab=plazi#results-section (first 100 lines shown):
{
"medline": [
{
"_index": "sibils_med24_v4.1.5.4",
"_id": "30314157",
"_score": 39.582214,
"_ignored": [
"abstract.keyword",
"annotations_str.keyword"
],
"_source": {
"title": "A checklist of epibiont suctorian and peritrich ciliates (Ciliophora) on halacarid and hydrachnid mites (Acari: Halacaridae Hydrachnidia).",
"abstract": "Based on published records and original data, a list of the epibiont suctorian and peritrich ciliates (Ciliophora) on halacarid and hydrachnid mites is presented. Altogether 13 suctorian and 10 peritrich species from hydrachnid and halacarid mites were listed. From this list, six suctorian and one peritrich species have been reported from halacarid mites, while four suctorian and four peritrich species were found on hydrachnid mites determined up to species level. The remaining specimens were determined upto the generic level. The halacarid and hydrachnid species do not share any suctorian and peritrich species and some of the ciliate species are specific to certain taxonomic groups of the hosts.The host specificity of both suctorian and peritrich ciliates, localization on the host body and environment are discussed. Some ciliate species specific to hydrachnid mites prefer lotic or lentic habitats. In most cases, both suctorian and peritrich ciliates prefer only marine or only fresh water bodies. It was also mentioned that both suctorian and peritrich ciliates have not distinct preferences in localization on their host body.",
"journal": "Zootaxa",
"authors": "Chatterjee Tapas|Dovgal Igor|PeŠiĆ Vladimir|Zawal Andrzej",
"entrez_date": "2018-10-14",
"pmid": "30314157",
"mesh_terms": "D000818:Animals|D016798:Ciliophora|D017753:Ecosystem|D005618:Fresh Water|D008925:Mites",
"sup_mesh_terms": "",
"chemicals": "",
"publication_types": "Journal Article",
"keywords": "suctorians, peritrichs, halacarids, hydrachnids, epibiosis, host, specificity, localization, environment, Acari",
"pmcid": "",
"doi": "10.11646/zootaxa.4457.3.4",
"annotations_str": "mesh mesh_D016798|agriculture agrovoc_c_1618|agriculture agrovoc_c_57|mesh mesh_D057189|species ott_150772|species ott_302424|species ncbitaxon_full_6933|species ncbitaxon_full_5878|mesh mesh_D008925|species ncbitaxon_full_92068|species ott_69277|agriculture agrovoc_c_d1532eb7|species ott_5343665|species ott_804400|chemical pubchemmesh_9576412|species ncbitaxon_full_94797|mesh mesh_D000053|species ott_302424|mesh mesh_D008925|mesh mesh_D016798|agriculture agrovoc_c_1618|mesh mesh_D011996|agriculture agrovoc_c_d1532eb7|species ott_5343665|species ncbitaxon_full_5878|mesh mesh_D008925|agriculture agrovoc_c_d1532eb7|mesh mesh_D008925|mesh mesh_D008925|agriculture agrovoc_c_d1532eb7|agriculture agrovoc_c_d1532eb7|conceptual_entity covocconceptualentities_CE_37|biotic_interaction robiext_ROBI_000072|biotic_interaction robiext_ROBI_000072|conceptual_entity covocconceptualentities_CE_77|agriculture agrovoc_c_2593|agriculture agrovoc_c_3673|species ott_302424|biological_process go_bp_GO:0051179|mesh mesh_D058507|environment envo_ENVO_01000254|biotic_interaction robiext_ROBI_000043|conceptual_entity covocconceptualentities_CE_37|biotic_interaction robiext_ROBI_000072|mesh mesh_D004777|biotic_interaction robiext_ROBI_000015|mesh mesh_D017753|mesh mesh_D008925|agriculture agrovoc_c_3456|agriculture agrovoc_c_d1532eb7|species ott_302424|environment envo_ENVO_00002011|agriculture agrovoc_c_50164|agriculture agrovoc_c_3102|biotic_interaction robiext_ROBI_000072|species ott_302424|biological_process go_bp_GO:0051179|conceptual_entity covocconceptualentities_CE_37|mesh mesh_D000818|species ott_5343665|mesh mesh_D016798|agriculture agrovoc_c_1618|species ncbitaxon_full_5878|environment envo_ENVO_01001110|mesh mesh_D017753|mesh mesh_D005618|mesh mesh_D008925|agriculture agrovoc_c_d1532eb7|agriculture agrovoc_c_48277bfc|environment envo_ENVO_01000254|agriculture agrovoc_c_57|species ott_16124|species ott_150772|biological_process go_bp_GO:0051179|species 
ncbitaxon_full_6933|conceptual_entity covocconceptualentities_CE_77|species ncbitaxon_full_6021|mesh mesh_D058507|biotic_interaction robiext_ROBI_000072|chemical pubchemmesh_9576412|species ott_1003210|mesh mesh_D004777|mesh mesh_D000053|agriculture agrovoc_c_2593|species ncbitaxon_full_39462|conceptual_entity covocconceptualentities_CE_37",
"annotations_material": "mesh|D016798|Ciliophora|ciliophora||agrovoc|c_1618|Ciliophora|ciliophora||agrovoc|c_57|Acari|acari||mesh|D057189|Checklist|checklist||ott|150772|Acari|acari||ott|302424|Ciliophora|ciliates||ncbitaxon_full|6933|Acari|acari||ncbitaxon_full|5878|Ciliophora|ciliophora||mesh|D008925|Mites|mites||ncbitaxon_full|92068|Hydracarina|hydrachnidia||ott|69277|Hydracarina|hydrachnidia||agrovoc|c_d1532eb7|mites|mites||ott|5343665|Ciliophora|ciliophora||ott|804400|Halacaridae|halacaridae||pubchemmesh|9576412|Fenpyroximate|acari||ncbitaxon_full|94797|Halacaridae|halacaridae||mesh|D000053|Acari|acari||mesh|D011996|Records|records||covocconceptualentities|CE_37|host|host||robiext|ROBI_000072|host|host|hosts||covocconceptualentities|CE_77|Specificity|specificity||agrovoc|c_2593|environment|environment||agrovoc|c_3673|hosts|hosts||go_bp|GO:0051179|localization|localization||mesh|D058507|Host Specificity|host specificity|host, specificity||envo|ENVO_01000254|environmental system|environment||robiext|ROBI_000043|parent to|share||mesh|D004777|Environment|environment||robiext|ROBI_000015|habitat|habitats||mesh|D017753|Ecosystem|habitats|ecosystem||agrovoc|c_3456|habitats|habitats||envo|ENVO_00002011|fresh water|fresh water||agrovoc|c_50164|bodies|bodies||agrovoc|c_3102|freshwater|fresh water||mesh|D000818|Animals|animals||envo|ENVO_01001110|ecosystem|ecosystem||mesh|D005618|Fresh Water|fresh water||agrovoc|c_48277bfc|epibiosis|epibiosis||ott|16124|Peritrichia|peritrichs||ncbitaxon_full|6021|Peritrichia|peritrichs||ott|1003210|Suctoria|suctorians||ncbitaxon_full|39462|Suctoria|suctorians"
},
"processed_facets": {
"cellosaurus": [],
"species": [],
"ott": [
"Acari",
"Ciliophora",
"Hydracarina",
"Halacaridae",
"Peritrichia",
"Suctoria"
],
"robi": [],
"journal": [
"Zootaxa"
],
"publication_types": [
"Journal Article"
],
"article_type": [],
"subset": [],
"chemicals": [],
"mesh_terms": [
"D000818:Animals",
"D016798:Ciliophora",
"D017753:Ecosystem",
"D005618:Fresh Water",
"D008925:Mites"
],
"ext": [],
"licence": [],
"language": []
}
},
{
"_index": "sibils_med24_v4.1.5.4",
"_id": "36095786",
"_score": 38.474808,
"_ignored": [
"abstract.keyword",
"annotations_str.keyword"
],
"_source": {
"title": "An annotated checklist of halacarid mites (Acari, Halacaridae) from India.",
"abstract": "A compilation of halacarid mite species (Halacaridae) from India has been carried out based on published records. Indian halacarid records were mostly found among algal habitats, some are also reported associated with pneumatophores and halophytes. Copidognathus is the most dominant genus with 19 species reported from the Indian coast. Reports of halacarid mites from Indian interstitial habitats are mostly doubtful and needs verification. Acarochelopodia delamarei, Copidognathus fabricii, Scaptognathus hallezi, Simognathus minutus should be excluded from Indian record. The real diversity and distribution of Indian halacarid fauna is far from being complete and future investigations may reveal further new taxa.",
"journal": "Zootaxa",
"authors": "Chatterjee Tapas",
"entrez_date": "2022-09-12",
"pmid": "36095786",
"mesh_terms": "D000818:Animals|D017753:Ecosystem|D007194:India|D008925:Mites",
"sup_mesh_terms": "",
"chemicals": "",
"publication_types": "Journal Article",
"keywords": "",
"pmcid": "",
"doi": "10.11646/zootaxa.5141.4.1",
"annotations_str": "mesh mesh_D008925|mesh mesh_D007194|mesh mesh_D057189|agriculture agrovoc_c_57|species ncbitaxon_full_94797|chemical pubchemmesh_9576412|agriculture agrovoc_c_d1532eb7|species ott_150772|species ncbitaxon_full_6933|mesh mesh_D000053|species ott_804400|species ott_804400|species ncbitaxon_full_94797|mesh mesh_D011996|mesh mesh_D007194|mesh mesh_D008925|agriculture agrovoc_c_25112|agriculture agrovoc_c_3456|mesh mesh_D011996|biotic_interaction robiext_ROBI_000015|mesh mesh_D017753|mesh mesh_D055051|species ncbitaxon_full_1027280|species ott_819122|environment envo_ENVO_01000687|mesh mesh_D006301|mesh mesh_D058028|agriculture agrovoc_c_3456|biotic_interaction robiext_ROBI_000015|mesh mesh_D017753|agriculture agrovoc_c_d1532eb7|mesh mesh_D008925|species ott_3534487|species ncbitaxon_full_1027280|species ott_3534875|species ott_3535348|species ncbitaxon_full_2060451|species ncbitaxon_full_2060448|species ott_3535176|agriculture agrovoc_c_2821|conceptual_entity covocconceptualentities_CE_19|agriculture agrovoc_c_15ab5afd|mesh mesh_D000818|environment envo_ENVO_01001110|mesh mesh_D017753|mesh mesh_D007194|mesh mesh_D008925|agriculture agrovoc_c_d1532eb7",
"annotations_material": "mesh|D008925|Mites|mites|mite||mesh|D007194|India|india||mesh|D057189|Checklist|checklist||agrovoc|c_57|Acari|acari||ncbitaxon_full|94797|Halacaridae|halacaridae||pubchemmesh|9576412|Fenpyroximate|acari||agrovoc|c_d1532eb7|mites|mites||ott|150772|Acari|acari||ncbitaxon_full|6933|Acari|acari||mesh|D000053|Acari|acari||ott|804400|Halacaridae|halacaridae||mesh|D011996|Records|records||agrovoc|c_25112|halophytes|halophytes||agrovoc|c_3456|habitats|habitats||robiext|ROBI_000015|habitat|habitats||mesh|D017753|Ecosystem|habitats|ecosystem||mesh|D055051|Salt-Tolerant Plants|halophytes||ncbitaxon_full|1027280|Copidognathus|copidognathus||ott|819122|Copidognathus|copidognathus||envo|ENVO_01000687|coast|coast||mesh|D006301|Health Services Needs and Demand|needs||mesh|D058028|Research Report|reports||ott|3534487|Copidognathus fabricii|copidognathus fabricii||ott|3534875|Acarochelopodia delamarei|acarochelopodia delamarei||ott|3535348|Simognathus minutus|simognathus minutus||ncbitaxon_full|2060451|Simognathus|simognathus||ncbitaxon_full|2060448|Scaptognathus|scaptognathus||ott|3535176|Scaptognathus hallezi|scaptognathus hallezi||agrovoc|c_2821|fauna|fauna||covocconceptualentities|CE_19|distribution|distribution||agrovoc|c_15ab5afd|new taxa|new taxa||mesh|D000818|Animals|animals||envo|ENVO_01001110|ecosystem|ecosystem"
},
"processed_facets": {
"cellosaurus": [],
"species": [],
"ott": [
"Acari",
"Halacaridae",
"Copidognathus",
"Copidognathus fabricii",
"Acarochelopodia delamarei",
"Simognathus minutus",
"Scaptognathus hallezi"
],
"robi": [],
"journal": [
"Zootaxa"
Please see the included CSV export of the search results for the same example query, https://biodiversitypmc.sibils.org/?query=Halacarid%20mites&tab=plazi#results-section . Line 4 of the file appears to be malformed. Rendering the file as a Markdown table with
cat sibils_2024-12-03-10h30.csv | mlr --icsvlite --omd --ifs ';' cat
produces:
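One way to pinpoint rows like the apparently malformed line 4 is to compare each row's field count with the header's. The following is a minimal sketch using Python's standard `csv` module. It assumes the export is semicolon-delimited (as the `--ifs ';'` option in the Miller command suggests) and that the first line is a header; the function name `find_malformed_rows` is made up for illustration.

```python
import csv
import io


def find_malformed_rows(text, delimiter=";"):
    """Return (line_number, found_fields, expected_fields) for each row
    whose field count differs from the header's field count."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    expected = len(header)
    problems = []
    for row in reader:
        if not row:
            continue  # ignore completely blank lines
        if len(row) != expected:
            # reader.line_num is the physical line on which this row ended
            problems.append((reader.line_num, len(row), expected))
    return problems


if __name__ == "__main__":
    # Tiny made-up sample; the real export would be read from the CSV file.
    sample = "a;b;c\n1;2;3\n4;5;6\n7;8\n"
    print(find_malformed_rows(sample))
```

If a quoted field contains an embedded line break, the reported line number is the line on which the row ends, which may differ from the row's position in the file.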
@myrmoteras suggests creating a CSV file like the following:
The underlying workflow could be as depicted in the attached diagram.
With this, we would have created another derived data product: a JSON file and a CSV file that contain all the deposits made within a specific upload activity of a version of a literature corpus. In other words, we'd like a feature that helps export an entire Zenodo community into a format that is compatible with Zotero, EndNote, etc.
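As an illustration of what a Zotero- or EndNote-compatible export could look like, the sketch below turns one record into an RIS entry, a plain-text format both tools can import. The choice of RIS is an assumption, not something decided in this thread. The input field names (`title`, `authors`, `journal`, `doi`) are borrowed from the `_source` objects in the JSON export above and may differ from what a Zenodo deposit provides; `record_to_ris` is a hypothetical helper.

```python
def record_to_ris(source):
    """Render one record as an RIS entry.

    Assumes a journal article whose authors are given as one
    "|"-separated string, as in the JSON export above.
    """
    lines = ["TY  - JOUR"]  # assumption: every record is a journal article
    if source.get("title"):
        lines.append("TI  - " + source["title"])
    for author in source.get("authors", "").split("|"):
        if author:
            # Names are passed through unchanged; no reordering is attempted.
            lines.append("AU  - " + author)
    if source.get("journal"):
        lines.append("JO  - " + source["journal"])
    if source.get("doi"):
        lines.append("DO  - " + source["doi"])
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values copied from the second record in the JSON export above.
    example = {
        "title": "An annotated checklist of halacarid mites (Acari, Halacaridae) from India.",
        "authors": "Chatterjee Tapas",
        "journal": "Zootaxa",
        "doi": "10.11646/zootaxa.5141.4.1",
    }
    print(record_to_ris(example), end="")
```

Concatenating the entries for all deposits of one upload activity would give a single `.ris` file alongside the JSON and CSV files.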
Note that a reference list can be downloaded from https://batlit.org/refs.csv .
The screenshot below was taken from https://batlit.org/refs on 2024-12-05.
@myrmoteras said
The underlying (literature) datasets for the bhl-blr, taxodros and batlit Zenodo communities are available as versioned and signed data packages. So, the deposits in the Zenodo communities are a reflection of the versioned literature datasets (corpora): the originals are found elsewhere. These deposits are not version controlled, because their metadata can be edited without leaving much of a trace.
This is why I propose implementing the following workflow, similar to already existing workflows (e.g., taxodros -> Zenodo, batlit -> Zenodo), in which the taxodros, batlit, and bhl-blr corpora are translated into BiodiversityPMC speak. This way, we have a controlled workflow (i.e., one working from versioned sources) that we can run without having to worry about Web API rate limiting or other constraints (e.g., search results restricted to 10k).
Happy to discuss more, please feel free to schedule a meeting.
Curious to hear your thoughts!