
15 studies encompassing 24 different datasets (matrices) from GEO breast cancer gene expression experiments. Treatment information and outcomes data is also curated for each patient (datasets were carefully chosen based upon whether I could determine each sample/patient's treatment regimen, and whether the samples had at least one outcomes varia…


kplaney/curatedBreastCancer


curatedBreastCancer

NOTE: as of 2016, I have an updated script for the gene expression processing. Please see breastProcessAndGeneFeatures_script.R in the Coincide repo to access the updated processing function. The same functions are also in the curatedBreastData repo, but not in the Bioconductor package, as I am in the process of updating that package submission. I plan to remove the processing functions from the curatedBreastData package and simply point users to the Coincide package, since Bioconductor encourages developers to keep code out of database packages precisely to avoid this problem of needing to update code more quickly.

I released slightly updated processing code in Bioconductor 3.4 in the fall of 2016. The main difference is a fix for a very small bug that crashed on one dataset (so it did not produce incorrect output, it just stopped); it was a minor ,drop=FALSE indexing issue. I am in the process of fixing the DFS_months_or_MIN months variables for GSE16446, which are in days, not months (you can divide the days by 28 to get months, but this is only needed for this one specific study).
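For readers unfamiliar with the two points above, here is a minimal sketch in base R. It is not the package's actual code: the toy matrix, the gene and sample names, and the `dfs_days` vector are all hypothetical stand-ins. It shows how indexing a single row without `drop = FALSE` silently turns a matrix into a vector (the kind of issue that can make later matrix code stop), and how a days-valued follow-up variable can be divided by 28 to get months.

```r
# Hypothetical toy expression matrix: 3 genes x 2 samples (not real data).
expr <- matrix(1:6, nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Selecting a single row drops the result to a plain named vector by default...
one_gene_vec <- expr["geneA", ]
# ...whereas drop = FALSE keeps a 1 x 2 matrix, so later matrix code still works.
one_gene_mat <- expr["geneA", , drop = FALSE]

is.matrix(one_gene_vec)  # FALSE
is.matrix(one_gene_mat)  # TRUE

# Hypothetical follow-up times recorded in days (as described for GSE16446).
dfs_days <- c(280, 560, 1120)
# Convert days to months using the 28-day month suggested in the text.
dfs_months <- dfs_days / 28
dfs_months  # 10 20 40
```

The conversion should only be applied to the one study whose values are in days; applying it to studies already recorded in months would shrink those values incorrectly.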

The GSE.... R files are mainly very early records of how I processed these files, and are not intended to be re-run directly. However, I do have notes on the different datasets if you need more specifics than are provided in the corresponding publications, https://www.ncbi.nlm.nih.gov/pubmed/24303324 and https://www.ncbi.nlm.nih.gov/pubmed/26961683 (the latter has more details in the supplementary methods section).
