Messaging should be printed #43

stemangiola · 2025-03-13T03:35:17Z

Thanks for this package

The messaging about msigdbdf is only printed if library(msigdbr) is called

library(msigdbr)
For full functionality, please install the 'msigdbdf' package with:
install.packages('msigdbdf', repos = 'https://igordot.r-universe.dev')

However, if the function is called with the package prefix (so library() is never called), only a shorter message is printed:

> msigdbr::msigdbr() 
The 'msigdbdf' package must be installed to access the full dataset.
# A tibble: 11,597 × 19
   gene_symbol ncbi_gene ensembl_gene    db_gene_symbol db_ncbi_gene
   <chr>       <chr>     <chr>           <chr>          <chr>       
 1 ABCC4       10257     ENSG00000125257 ABCC4          10257       
 2 ABTB3       121551    ENSG00000151136 ABTB3          121551      
 3 ADAMTSL3    57188     ENSG00000156218 ADAMTSL3       57188       
 4 ANKRD13A    88455     ENSG00000076513 ANKRD13A       88455       
 5 ATL1        51062     ENSG00000198513 ATL1           51062       
 6 B4GALNT3    283358    ENSG00000139044 B4GALNT3       283358      
 7 CA10        56934     ENSG00000154975 CA10           56934       
 8 CACNB1      782       ENSG00000067191 CACNB1         782         
 9 CAMK4       814       ENSG00000152495 CAMK4          814         
10 CCDC106     29903     ENSG00000173581 CCDC106        29903       
# ℹ 11,587 more rows
# ℹ 14 more variables: db_ensembl_gene <chr>, source_gene <chr>, gs_id <chr>,
#   gs_name <chr>, gs_collection <chr>, gs_subcollection <chr>,
#   gs_collection_name <chr>, gs_description <chr>, gs_source_species <chr>,
#   gs_pmid <chr>, gs_geoid <chr>, gs_url <chr>, db_version <chr>,
#   db_target_species <chr>
# ℹ Use `print(n = ...)` to see more rows

The instructions to install the package are never printed.

msigdbr/R/msigdbr-check-data.R

Line 10 in 3058edd

message("The 'msigdbdf' package must be installed to access the full dataset.")

I think the instructions for installing the package should be printed on any occasion.
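
A minimal sketch of what this could look like, assuming the check is done with `requireNamespace()`. The helper names `msigdbdf_install_message()` and `check_msigdbdf()` are hypothetical and are not taken from the msigdbr source; the message text and repository URL are the ones shown above.

```r
# Hypothetical helpers for illustration only; not the actual msigdbr code.

# Build one message that carries both the warning and the install command.
msigdbdf_install_message <- function() {
  paste0(
    "The 'msigdbdf' package must be installed to access the full dataset.\n",
    "Please install it with:\n",
    "install.packages('msigdbdf', repos = 'https://igordot.r-universe.dev')"
  )
}

# Emit the full message whenever the data package is missing,
# whether or not library(msigdbr) was called first.
check_msigdbdf <- function() {
  installed <- requireNamespace("msigdbdf", quietly = TRUE)
  if (!installed) {
    message(msigdbdf_install_message())
  }
  invisible(installed)
}
```

Calling `check_msigdbdf()` at the top of the data-returning function would print the same instructions for `msigdbr::msigdbr()` as for an attached package.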

Thanks a lot.

igordot · 2025-03-13T14:42:06Z

Thank you for the suggestion. I was debating how best to handle the messaging. A more extreme approach I considered was not including any data and returning an error, but I decided against that since it would break some of the reverse dependencies. Regardless, I agree it's a good idea to make the message clearer. This was posted on CRAN yesterday, so I would like to give it a couple of days before pushing an update in case anything else comes up.

In general, the new version is a big change from the previous CRAN releases, and there were some intermediate attempts to provide a more seamless experience that did not pass CRAN review. Apologies for any complications.

stemangiola · 2025-03-13T21:12:16Z

What is the reason for not including the data as a dependency? Bioconductor's ExperimentHub allows downloading data when needed, so the data dependency package can be gracefully added to the DESCRIPTION.

igordot · 2025-03-14T02:50:11Z

My original plan was to keep this as a single package on CRAN. This worked well for several years. Eventually I hit the size limit. I tried to submit it as a data-only package and it was not approved. I decided to host the data package on R-universe. The caveat with non-CRAN dependencies is that they need to be optional (DESCRIPTION Suggests field), so they are not installed by default.

ExperimentHub is an interesting option. I have not looked into it extensively, but I believe it would still require an extra step to install the data. More importantly, the data is tied to a specific Bioconductor release. The most recent data release would then only be available on the latest Bioconductor release, which in turn is tied to the latest R release.

stemangiola · 2025-03-14T02:54:03Z

Given the scope of this software, I would think Bioconductor is the right place. You can definitely trigger the data download automatically whenever a function is called from the parent package.

Plus, ExperimentHub gives you the whole caching framework for free.

The way it is now is a bit cumbersome :)

igordot added the "enhancement" (New feature or request) label on Mar 13, 2025
