overwrite last_observed in checklist_indicators by alientaxa_cube #70

Open
mvarewyck opened this issue Dec 21, 2023 · 7 comments

@mvarewyck
Collaborator

For the main table, it is based on the file "data_input_checklist_indicators.tsv" which seems to have last observed year 2022:

> exotenData[grepl("Oryctolagus cuniculus", exotenData$scientificName), c("scientificName", "last_observed", "locality")]
                           scientificName last_observed
1: Oryctolagus cuniculus (Linnaeus, 1758)          2022
2: Oryctolagus cuniculus (Linnaeus, 1758)          2022
3: Oryctolagus cuniculus (Linnaeus, 1758)          2022
4: Oryctolagus cuniculus (Linnaeus, 1758)          2022
                         locality
1:                         België
2: Brussels Hoofdstedelijk Gewest
3:                     Vlaanderen
4:                       Wallonië

The checklist will probably lag behind the cube since it is not data-driven. Maybe we can overwrite last_observed based on be_alientaxa_cube.csv.

Originally posted by @SanderDevisscher in #69 (comment)

@mvarewyck
Collaborator Author

I had a quick look:

  • the file be_alientaxa_cube.csv doesn't contain locality information. The timeseries file does contain this info, so we could optionally copy the last year from there? @SanderDevisscher
  • we need to check whether the newly copied value for last year is actually an improvement: it must not introduce NA values or overwrite the existing value with an earlier year
  • we need to make sure the indicators file is created after the helper file for last year is updated. Order in aspbo?

@SanderDevisscher
Collaborator

I had a quick look:

  • the file be_alientaxa_cube.csv doesn't contain locality information. The timeseries file does contain this info, so we could optionally copy the last year from there? @SanderDevisscher

The timeseries is based on be_alientaxa_cube, so that is OK for me.

  • we need to check whether the newly copied value for last year is actually an improvement: it must not introduce NA values or overwrite the existing value with an earlier year

Indeed, we should only overwrite the last observed year when the new year is larger and not NA; otherwise we keep the checklist value.

  • we need to make sure the indicators file is created after the helper file for last year is updated. Order in aspbo?

Current workflow:

  1. data_input_checklist_indicators.tsv is updated every 1st of the month (see get_griis_checklist.yaml)
  2. The PR from step 1 triggers the download of be_alientaxa_cube from Zenodo as part of update_indicators_preprocessing.yaml, which also creates the timeseries.
  3. The PRs from steps 1 and 2 trigger upload_files_processing.yaml

What do you think needs to change?

@SanderDevisscher
Collaborator

Of course, the new year should not be in the future either 😅
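
The overwrite conditions raised in this thread (the new year is not NA, is later than the checklist year, and is not in the future) could be sketched roughly as follows. This is only an illustration, not the actual code in createTabularData(): the function name `pick_last_observed` and its arguments are hypothetical, and filling in a checklist value that is itself NA is an extra assumption not discussed above.

```r
# Hypothetical helper (name and arguments are assumptions for illustration).
# Keeps the checklist year unless the cube/timeseries year is a valid improvement.
pick_last_observed <- function(checklist_year, cube_year,
                               current_year = as.integer(format(Sys.Date(), "%Y"))) {
  # The candidate year is usable only if it is known and not in the future
  usable <- !is.na(cube_year) & cube_year <= current_year
  # Overwrite only when the candidate is later than the checklist value
  # (assumption: also when the checklist value itself is NA)
  improves <- usable & (is.na(checklist_year) | cube_year > checklist_year)
  ifelse(improves, cube_year, checklist_year)
}

# Made-up example values, one element per case
checklist <- c(1201, 2021, 2022, NA, 2020)
cube      <- c(2023, NA, 2019, 2020, 2030)
pick_last_observed(checklist, cube, current_year = 2024)
# 2023 2021 2022 2020 2020
```

The function is vectorised, so it could be applied to whole columns after matching checklist and timeseries rows on taxon and locality; how that matching is done is left open here.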

@mvarewyck
Collaborator Author

What do you think needs to change?

Workflow looks okay. I will update the code in createTabularData() to incorporate the info from timeseries.

@mvarewyck
Collaborator Author

@mvarewyck issue #70 seems to be persistent, example: Cyprinus carpio (141117232)

@mvarewyck mvarewyck reopened this Jul 18, 2024
@mvarewyck
Collaborator Author

@mvarewyck issue #70 seems to be persistent, example: Cyprinus carpio (141117232)

The current data for Cyprinus carpio does indeed have recent observations for some regions, but not for Belgium/Flanders:

> exotenData[exotenData$species == "Cyprinus carpio", c("locality", "nubKey", "scientificName", "first_observed", "last_observed")]
   locality  nubKey                 scientificName first_observed last_observed
     <char>   <int>                         <char>          <int>         <num>
1:   België 4286975 Cyprinus carpio Linnaeus, 1758           1201          1201
2: brussels 4286975 Cyprinus carpio Linnaeus, 1758           2008          2021
3: flanders 4286975 Cyprinus carpio Linnaeus, 1758           1201          1201
4: wallonia 4286975 Cyprinus carpio Linnaeus, 1758           2007          2022

In the timeseries data (where we get the more recent 'last observed' dates from), I can't find this species. I think the timeseries already includes species introduced before 1950 (see the number of rows), @soriadelva @SanderDevisscher, but I can't explain why this species is missing:

> dim(timeseries)
[1] 51397730       12
> timeseries[grep("4286975", as.character(timeseries$taxonKey)), ]
Empty data.table (0 rows and 12 cols): taxonKey,year,eea_cell_code,obs,pa_obs,cobs...

@SanderDevisscher
Copy link
Collaborator

The alienTaxa cube is being reworked (see inbo/aspbo#202) to include species with observations after 1950 independent of their introduction date.
