Prototype for storing single-cell data #1020

Draft
wants to merge 482 commits into
base: development
Changes from 5 commits
Commits
7a98c94
Add WebEntityUrlBuilder and RestEntityUrlBuilder
arteymix Oct 26, 2024
7a010d5
Make URLs generated by OntologyController are relative to the context…
arteymix Oct 27, 2024
cb8680f
Allow generating URLs for FactorValue and Characteristic
arteymix Oct 27, 2024
6cb296e
Move URL builders component declarations to the XML config
arteymix Oct 27, 2024
1b84c99
rest: Fix test assuming TSV is the default, MEX is
arteymix Oct 27, 2024
2479d6d
rest: Few more improvements for locked files
arteymix Oct 28, 2024
c137776
Add a CLI for detecting QT from data
arteymix Oct 28, 2024
680ef93
Make sure that a converted lock is not released twice
arteymix Oct 28, 2024
c6fa792
rest: Add examples for design and MEX outputs
arteymix Oct 28, 2024
e2dc7f7
Add a .mex suffix to MEX-structured directories
arteymix Oct 28, 2024
7471c00
Update completion scripts
arteymix Oct 28, 2024
dd9e0f2
Add the necessary fields and logic for storing single-cell sparsity m…
arteymix Oct 29, 2024
96abedc
Improve script for updating REST API docs examples
arteymix Oct 29, 2024
46e6d3b
cli: Print a link to the experiment and add more checks when setting …
arteymix Oct 29, 2024
e7dd185
Add a dev script for updating the test database
arteymix Oct 29, 2024
4ed5901
Add a dev script to deploy the CLI
arteymix Oct 29, 2024
ca1b313
rest: Update REST docs examples
arteymix Oct 30, 2024
368f64a
Ignore .envrc and Jupyter notebook checkpoints
arteymix Oct 30, 2024
a1d98f8
Remove all .csvignore files
arteymix Oct 30, 2024
9aaa67c
rest: Allow the MexMatrixBundler to pre-calculate the size of the archive
arteymix Oct 30, 2024
0b2d93e
rest: Add missing MEX test files for DatasetsWebServiceTest
arteymix Oct 30, 2024
497cec8
Fix SingleCellExpressionExperimentAggregatorServiceTest
arteymix Oct 29, 2024
2639c38
Always regenerate the version file before launch or deploying
arteymix Oct 30, 2024
098d2b6
Fix behavior for log1p and log-scale data (fix #1274)
arteymix Oct 31, 2024
e848c1b
Add methods for obtaining the columns of a SingleCellExpressionDataMa…
arteymix Nov 1, 2024
5a73241
Update maven-site-plugin to 3.21
arteymix Nov 1, 2024
df47515
Add support for detecting log-transformed counts
arteymix Nov 4, 2024
f555487
Update jackson to 2.18.1
arteymix Nov 4, 2024
38c1b97
Update plugins
arteymix Nov 4, 2024
0a1bf4b
Implement log2cpm computation for aggregating count data
arteymix Nov 5, 2024
a493b2a
Various single-cell work
arteymix Nov 5, 2024
7a083e9
Add a task executor to limit the number of tasks that generate data files
arteymix Nov 3, 2024
9e43425
Fix aggregation for last sample and implement sparsity metrics
arteymix Nov 5, 2024
289f31e
Fix tests and two minor bugs in EE service
arteymix Nov 6, 2024
385bbd2
Few more cleanups for DataVector usage and remove unused primitive types
arteymix Nov 6, 2024
7f57646
Fix detection of single-cell data and inclusion of samples in GeoConv…
arteymix Nov 6, 2024
b67f3c4
Only cache skipped archive if we're looking for MEX data
arteymix Nov 6, 2024
06c5a33
Fix MatrixConversionTest using setDataAsDoubles() before setting a QT
arteymix Nov 6, 2024
53a82d0
More cleanups
arteymix Nov 6, 2024
29b2ea2
Re-enable incremental compilation
arteymix Nov 7, 2024
3165185
Improve elements mapping (fix #1234)
arteymix Nov 7, 2024
24277fd
Fix lazy initialization error when checking a dataset accession
arteymix Nov 7, 2024
fedf83a
Remove empty RestEntityUrl file
arteymix Nov 7, 2024
9ae8892
Don't execute deployment script in a terminal
arteymix Nov 7, 2024
c729d0e
Fix more tests
arteymix Nov 7, 2024
1b2470e
Fix exceeding packet size for MySQL by streaming cell IDs
arteymix Nov 7, 2024
8f6f50b
Replace the pipe with a pair of piped streams
arteymix Nov 8, 2024
cc9f32e
Consider the scale type when computing sparsity metrics
arteymix Nov 8, 2024
076e6e1
Improve handling of layered AnnData files
arteymix Nov 8, 2024
9d4b947
Numerous cleanups for Gemma Web [WIP]
arteymix Nov 8, 2024
e3e85c4
Replace JAWR with Webpack
arteymix Nov 8, 2024
f087c19
Merge branch 'feature-single-cell-webpack' into feature-single-cell-u…
arteymix Nov 15, 2024
1e38dee
Set the JSP trim whitespace directive in web.xml
arteymix Nov 15, 2024
89aa08b
Few improvements for the static assets server
arteymix Nov 15, 2024
de6bff5
Add IntelliJ scripts for building and serving static assets
arteymix Nov 15, 2024
7bf620e
Few more improvements for the subset page
arteymix Nov 15, 2024
5461ae6
Fix source mappings in web.xml
arteymix Nov 15, 2024
266dd4d
Fix EE set page
arteymix Nov 15, 2024
39fe57c
Fix visibility of sprintf and visualizeDiffExpressionHandler
arteymix Nov 15, 2024
f45ef66
Merge pull request #1288 from PavlidisLab/feature-single-cell-ui-work
arteymix Nov 15, 2024
016f899
Fix the lineplot and flotr2
arteymix Nov 15, 2024
490771a
Move the JavascriptLogger into the gemma-lib bundle since it requires…
arteymix Nov 15, 2024
99e6320
Fix incorrect assertion in ExperimentalDesignWriter
arteymix Nov 15, 2024
5f8e207
Generate DWR client code
arteymix Nov 18, 2024
96fdd61
More frontend cleanups
arteymix Nov 19, 2024
85485be
Allow edition of single-cell QTs
arteymix Nov 21, 2024
1953fb6
Allow overriding name, type and scale type when loading SC vectors
arteymix Nov 25, 2024
6807ad7
Fix NPE error on the homepage when a Taxon common name is null
arteymix Nov 26, 2024
81353cf
Few improvements for scripts
arteymix Nov 27, 2024
7ec756d
Add a frontend warning when the static asset server is not running
arteymix Nov 27, 2024
56e8a20
Add sparsity heatmap, subset expression data heatmap and other single…
arteymix Nov 27, 2024
2703956
Handle unmapped genes in heatmap
arteymix Dec 2, 2024
2a8835a
Allow anonymous users to query EE subsets by bioassays
arteymix Dec 2, 2024
875f334
Add missing AbstractHeatMapTag class
arteymix Dec 2, 2024
615e990
Add a CliAuthenticationAware interface to decouple the REST API clien…
arteymix Dec 3, 2024
8614551
Remove all @Component annotations from gemma-cli
arteymix Dec 10, 2024
507a3fb
Fix type signature for locateQuantitationType()
arteymix Dec 10, 2024
e78cd91
Update the link to Jenkins in pom.xml
arteymix Dec 12, 2024
7a52af3
Add an interface for single-cell detector that can operate with the s…
arteymix Dec 13, 2024
a86a218
Breakdown StartupListener into ApplicationContextInitializer
arteymix Dec 16, 2024
eaf17a5
Adapt the DiffExAnalyzer to handle pre-existing subset structure
arteymix Dec 2, 2024
bea9e07
Allow aggregating single-cell data when a factor value is missing
arteymix Dec 19, 2024
c329254
Fix missing mock in MexSingleCellDataLoaderPersistenceTest
arteymix Dec 19, 2024
aac87ff
Make injection of CliAuthenticationAware components optional
arteymix Dec 19, 2024
c6091ca
Add missing mocks in ExpressionExperimentQCControllerTest
arteymix Dec 19, 2024
807f233
Mark more tests as slow based on IntelliJ profiling
arteymix Dec 19, 2024
f53367c
Ignore SocketTimeoutException in assumeThatResourceIsAvailable()
arteymix Dec 7, 2024
b4f6142
Allow for missing sample_id column in generic cell-level metadata par…
arteymix Dec 21, 2024
e34358a
Merge pull request #1312 from PavlidisLab/feature-single-cell-metadat…
arteymix Dec 22, 2024
ffc3974
Pick an existing set of subsets if possible
arteymix Dec 26, 2024
08aeb24
Cleanup for .gitignore
arteymix Dec 27, 2024
584061b
Add CLI tools for viewing and manipulating metadata changelogs
arteymix Jan 7, 2025
e8f785c
Add more flexibility for loading cell type assignments
arteymix Jan 7, 2025
e737d45
Indicate the version of HDF5 in gemma-cli --version
arteymix Jan 7, 2025
e1e0e7a
Update completion scripts
arteymix Jan 7, 2025
9e26ee7
Make it clear that the changelog is meant for experiment metadata
arteymix Jan 7, 2025
bfb66dc
Prefill the message when adding metadata file and use Java 8 time API
arteymix Jan 8, 2025
b9e04d5
Remove HDF5 version from --version
arteymix Jan 8, 2025
b50b990
Fix missing changes for b9e04d5 and bfb66d
arteymix Jan 8, 2025
db3d5ca
Update completion scripts
arteymix Jan 8, 2025
59511df
Use consistent project name for SonarQube
arteymix Jan 8, 2025
ae508f5
Rename appendToChangelog to addChangelogEntry
arteymix Jan 10, 2025
82ef947
Add a sanity check for getH5Version()
arteymix Jan 10, 2025
cd7b23a
Add an option to skip download and only retrieve detected single-cell…
arteymix Jan 10, 2025
6482436
Fix collision of completion function names for bash
arteymix Jan 13, 2025
28c6a4f
Add a clean phase before rebuilding Gemma for generating completions
arteymix Jan 13, 2025
95a5353
Fill-in the EF category's value and value URI for cell type
arteymix Jan 13, 2025
2529abc
Remove change to Gemma CLI template run configuration
arteymix Jan 13, 2025
871ca73
Fix slow queries with distinct on CTA, CLC and characteristics
arteymix Jan 14, 2025
d05ed77
Various work to make loading CTA more flexible
arteymix Jan 10, 2025
4750c21
Update completions
arteymix Jan 15, 2025
b02833b
Merge branch 'feature-single-cell' into feature-single-cell-de
arteymix Jan 15, 2025
4651329
Always indicate the strategy used for mapping BAs to sample name
arteymix Jan 15, 2025
14e8236
Retrieve data from GEO for MexSingleCellDataLoaderTest.testGSE141552()
arteymix Jan 15, 2025
7d0dc02
Add a writer for CTA/CLC and endpoints in the REST API
arteymix Jan 15, 2025
bc809c9
Fix mangling of gemma-cli completion function for bash
arteymix Jan 16, 2025
b191d0c
Add missing mock to RNASeqDataAddCliTest
arteymix Jan 16, 2025
1de2880
Merge remote-tracking branch 'origin/feature-single-cell' into featur…
arteymix Jan 16, 2025
4fce89b
Add a check for free memory for transposing AnnData in tests
arteymix Jan 17, 2025
85501d9
Fix sorting of BMs in tests and missing factor type
arteymix Jan 17, 2025
dba2e6b
Add a check for free memory for transposing AnnData in tests
arteymix Jan 17, 2025
bc9932f
Add matchOne() and matchAll() variants that can map multiple identifi…
arteymix Jan 16, 2025
96e8531
Fix query streaming when fetch size is a multiple of the number of re…
arteymix Jan 17, 2025
25205f6
Update completions
arteymix Jan 17, 2025
a728658
Always clean before deploying CLIs or dev servers
arteymix Jan 17, 2025
823d91b
Only build production assets when the release profile is active
arteymix Jan 17, 2025
2dbc1b5
Merge branch 'feature-single-cell' into feature-single-cell-de
arteymix Jan 17, 2025
a4b654f
Indicate progress when aggregating vectors
arteymix Jan 20, 2025
08421c6
Keep the cell type as an index to make faster comparison when aggrega…
arteymix Jan 20, 2025
3f2b06a
Remove BADs after vectors to avoid constraint violations
arteymix Jan 20, 2025
618417d
Use integers to quickly check which cells have which characteristics …
arteymix Jan 20, 2025
fe7185d
Reduce logging frequency when aggregating
arteymix Jan 20, 2025
b06863c
Merge branch 'feature-single-cell' into feature-single-cell-de
arteymix Jan 20, 2025
64f5ac3
Fix DEA CLI
arteymix Jan 22, 2025
81cd64a
Add more transformation tools and improve help menu
arteymix Jan 23, 2025
4628018
Reuse logic for detecting AnnData parameters and add support for raw.X
arteymix Jan 23, 2025
7c3060b
Fix splitting of experiment (fix #1267)
arteymix Jan 17, 2025
e45c968
Fix abbreviation of strings containing multibyte characters
arteymix Jan 24, 2025
26430c7
Remove incomplete AnnData test
arteymix Jan 25, 2025
5c1c2d8
Simplify abbreviating a string with a suffix
arteymix Jan 25, 2025
13e6e9d
Add charset parameter to StringUtils
arteymix Jan 27, 2025
2574ca6
Remove empty test
arteymix Jan 27, 2025
f3fcd6d
Use the new atomic deployment strategy for the CLI
arteymix Jan 20, 2025
737564c
Update completions
arteymix Jan 27, 2025
4c076b9
ci: Make sure that the build directory does not already exist when de…
arteymix Jan 27, 2025
be8280a
Make DE resilient to certain subset failures
arteymix Jan 27, 2025
c5a4d55
Detect and perform necessary transformation on AnnData file to load them
arteymix Jan 29, 2025
6471540
Detect MEX files with empty cells automatically
arteymix Jan 29, 2025
d44b516
ci: Use --link-dest when deploying the CLI to speed up the process
arteymix Jan 29, 2025
42f00bb
Use a relative path for the log4j dev configuration
arteymix Jan 30, 2025
f00fc9e
Add options for transposing and using (or not) raw.X
arteymix Jan 30, 2025
ba83e80
Use assay and sample description for matching external identifiers as…
arteymix Jan 30, 2025
4a8135f
Improve behavior for loader autoconfiguration
arteymix Jan 30, 2025
0ebe99a
Add a script and an IntelliJ config for deploying a CLI manually
arteymix Jan 30, 2025
3140ba7
Add an IntelliJ config for updating completion scripts
arteymix Jan 30, 2025
414feb3
Let the loader configurer deal with defaults
arteymix Jan 31, 2025
2b2779d
Reorganize single-cell classes in a better package structure
arteymix Jan 31, 2025
afa2ca4
Add descriptive statistics for single-cell data
arteymix Jan 31, 2025
8653976
Use a hashmap to speed up cell-level characteristic extraction for An…
arteymix Jan 31, 2025
609998f
Fix new name/type/scale type not applying without a -qtName option
arteymix Jan 31, 2025
bd0e96c
Add a new gemma.scratch.dir option for intermediate sc transformation…
arteymix Jan 31, 2025
d3a58d5
Only show unmatched sample identifiers in the debug logs
arteymix Jan 31, 2025
a0b4289
Make replacing single-cell vectors more flexible
arteymix Jan 31, 2025
0dd72ba
Fix javadoc errors and warnings
arteymix Jan 31, 2025
230f8ad
Make sure we don't transpose again if data was already transposed on-…
arteymix Feb 1, 2025
e7f4559
Add DWR converter for ProtocolValueObject
arteymix Feb 1, 2025
456ed07
cli: Only compress output file if extension is .gz in getDataMatrix
arteymix Feb 4, 2025
127e26e
Fix visualization diff. ex results by subset
arteymix Feb 4, 2025
39eae82
Use unboxed doubles and double vectors whenever possible
arteymix Feb 5, 2025
a2218fb
cli: Add a --renaming-file option
arteymix Feb 5, 2025
059e0f7
Always truncate trailing spaces when abbreviating
arteymix Feb 5, 2025
aaea36f
Produce consistent message when a sample name or cell type factor is …
arteymix Feb 5, 2025
7f1e98d
Handle NaNs in single-cell descriptive statistics
arteymix Feb 5, 2025
46b9c67
Add a test case for GSE244451
arteymix Feb 5, 2025
a9c0283
Add missing package-info.java in aggregate package
arteymix Feb 5, 2025
0088353
Fix incorrect format for writing cell-level characteristics
arteymix Feb 5, 2025
a4666f5
Provide an internal implementation of the static asset server
arteymix Feb 6, 2025
2ee709f
Reuse conversion logic for sampling data matrices
arteymix Feb 6, 2025
bca3c3f
Allow empty slices in H5Dataset
arteymix Feb 6, 2025
db76b1e
Add method in H5Dataset to retrieve individual scalars
arteymix Feb 6, 2025
4623aeb
Fix bug in hasOptionValue() and improve error messages
arteymix Feb 6, 2025
fb76620
Fix StaticAssetResolverTest and remove unused methods
arteymix Feb 6, 2025
8062808
Indicate the number of assigned cells
arteymix Feb 7, 2025
ec1c763
Make the path to npm configurable
arteymix Feb 7, 2025
6ee4bbc
rest: Indicate the number of assigned cells
arteymix Feb 7, 2025
3d3aaa6
rest: Indicate the number of cell since the cellIds are not guarantee…
arteymix Feb 7, 2025
49ce042
rest: Make it possible to exclude large fields single-cell metadata
arteymix Feb 7, 2025
47d6792
rest: Add examples for CTAs and CLCs
arteymix Feb 7, 2025
1aaa14b
Fix check for truncated AnnData file
arteymix Feb 7, 2025
b9b274a
Use GEMMA_USERNAME and GEMMA_PASSWORD for generating DWR client
arteymix Feb 7, 2025
b946e46
Make sure that experiments are thawed lightly when loading SC data
arteymix Feb 10, 2025
baf11ed
Support sequencing metadata for single-cell RNA-Seq (fix #1330)
arteymix Feb 10, 2025
b85e8f5
Add basic capabilities for aggregating single-cell data by sample
arteymix Feb 10, 2025
9902ae3
Add deployment scripts for Web and CLI
arteymix Feb 4, 2025
890c5e9
Make library size adjustment optional when aggregating
arteymix Feb 11, 2025
ff56861
Add methods for retrieving and expecting a single QT and assay from a…
arteymix Feb 11, 2025
ed34777
Print full QT details when suggesting possible values
arteymix Feb 11, 2025
8fde615
rest: Adjust logic for refreshing vectors from cache
arteymix Feb 11, 2025
32cdafc
Refresh processed vectors cache when aggregating single-cell data
arteymix Feb 11, 2025
7d0bc99
Prevent splitting when factor values are unmapped by a cell type
arteymix Feb 11, 2025
2fe6518
Support re-aggregation of single-cell data
arteymix Feb 12, 2025
7ef4fd5
Add long version for options in diffExAnalyze
arteymix Feb 13, 2025
8360a3f
cli: Fix --help for enumerated options
arteymix Feb 13, 2025
65ce1b1
Update completions
arteymix Feb 13, 2025
f75398b
Suppress audit logging in tests
arteymix Feb 18, 2025
19e5baa
Fix BuildInfoTest when running in a different TZ than America/Vancouver
arteymix Feb 18, 2025
a979a93
Allow numberOfAssignedCells to be null in validation
arteymix Feb 19, 2025
12327d9
Don't waste time re-generating the version file
arteymix Feb 19, 2025
9dc6a78
Don't make BioAssayDimension and SingleCellDimension describable
arteymix Feb 19, 2025
f006332
Rename 1.31.11 migration
arteymix Feb 19, 2025
cd08da2
Fix SingleCellIntegrationTest
arteymix Feb 19, 2025
acea62e
cli: Print possible identifiers in EntityLocator in more cases
arteymix Feb 19, 2025
431d452
Don't make it an error to deploy a CLI that already exists
arteymix Feb 20, 2025
748ee7e
Fix javadoc error
arteymix Feb 20, 2025
67a220b
Add PubMed 2025 DTD
arteymix Feb 20, 2025
83c7253
Fix more tests
arteymix Feb 20, 2025
2d9cb8f
Try sample-level files before merging series one for downloading MEX
arteymix Feb 21, 2025
f10d003
Improve descriptive for data vectors
arteymix Feb 21, 2025
4b42b77
Add minimal support for floats, ints and longs single-cell vectors
arteymix Feb 23, 2025
47699d9
Convert vectors to double when generating single-cell matrices
arteymix Feb 24, 2025
f7fe028
Fix MeanVarianceServiceTest
arteymix Feb 24, 2025
25139c0
web: Fix and cleanup globals we use in the frontend
arteymix Feb 24, 2025
2e6ef27
Update completions
arteymix Feb 24, 2025
4f0f954
Always make sure that an experiment is thawed before post-processing it
arteymix Feb 25, 2025
d95d158
Make sure that EEs are thawed when removing raw and processed vectors
arteymix Feb 25, 2025
25418b0
Fix other places where the BAD name is used
arteymix Feb 26, 2025
fc6cbe5
Add boxplots for cell-level data
arteymix Feb 26, 2025
0f1b7bf
Implement aggregation by CLC from the getSingleCellDataMatrix CLI (fi…
arteymix Feb 26, 2025
dab3f78
Minor renaming for SingleCellDataBoxplot
arteymix Feb 26, 2025
c427d96
Report the correct number of vectors when aggregating from the CLI
arteymix Feb 26, 2025
c234d69
Fix options for aggregating and honor fetchSize
arteymix Feb 26, 2025
99cdad6
Fix missing mocks in tests
arteymix Feb 26, 2025
7fce772
Add methods to obtain and stream single-cell vectors without cellIDs …
arteymix Feb 27, 2025
f0adf81
Prepare for pavlab-starter-parent 1.2.18 and baseCode 1.1.24
arteymix Feb 27, 2025
b325eb8
Improve conversion of GEO data
arteymix Feb 28, 2025
cadfd21
Handle 'null' and other missing value indicators in GEO data
arteymix Feb 28, 2025
4b035cf
Bump pavlab-starter-parent to 1.2.19-SNAPSHOT
arteymix Feb 28, 2025
133259a
Parse present/absent indicators
arteymix Feb 28, 2025
14f14cd
Replace all usages of commons-lang with commons-lang3 or equivalent
arteymix Feb 28, 2025
26f10ca
Eliminate FeedReaderController and rome-fetcher
arteymix Feb 28, 2025
fa16931
Update dependencies
arteymix Feb 28, 2025
a1b0831
Remove common prefix when displaying subset names
arteymix Mar 1, 2025
6d0621f
Make sure we use DoubleArrayList.size() or trimToSize() when accessin…
arteymix Mar 2, 2025
ae2ffd3
Add support for all byte array types and add a charset parameter to C…
arteymix Mar 3, 2025
29e606e
Replace sizeWithoutMissingValues usage in ComBat with more efficient …
arteymix Mar 3, 2025
24f0502
Downgrade jackson to 2.18.2 to satisfy swagger
arteymix Mar 3, 2025
378b382
Add missing kotlin-stdlib-jdk8 dependency for slack
arteymix Mar 3, 2025
daee693
cli: Fix parsing of --format option for getSingleCellDataMatrix
arteymix Mar 3, 2025
fef485c
Add options to deleteSingleCellData to remove CTAs and CLCs
arteymix Mar 3, 2025
deea578
Make single-cell sparsity heatmaps render as placeholders if there is…
arteymix Mar 3, 2025
Original file line number Diff line number Diff line change
@@ -30,6 +30,7 @@
  import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
  import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
  import ubic.gemma.model.expression.bioAssayData.ProcessedExpressionDataVector;
+ import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
  import ubic.gemma.model.expression.designElement.CompositeSequence;
  import ubic.gemma.model.expression.experiment.ExpressionExperiment;
  import ubic.gemma.persistence.util.ChannelUtils;
@@ -50,7 +51,7 @@ public class ExpressionDataMatrixBuilder {
  private static final Log log = LogFactory.getLog( ExpressionDataMatrixBuilder.class.getName() );
  private final Map<ArrayDesign, BioAssayDimension> dimMap = new HashMap<>();
  private final Map<QuantitationType, Integer> numMissingValues = new HashMap<>();
- private Collection<DesignElementDataVector> vectors;
+ private Collection<BulkExpressionDataVector> vectors;
  private ExpressionExperiment expressionExperiment;
  private Collection<ProcessedExpressionDataVector> processedDataVectors = new HashSet<>();
  private QuantitationTypeData dat = null;
@@ -59,7 +60,7 @@ public class ExpressionDataMatrixBuilder {
  /**
  * @param vectors collection of vectors. They should be thawed first.
  */
- public ExpressionDataMatrixBuilder( Collection<? extends DesignElementDataVector> vectors ) {
+ public ExpressionDataMatrixBuilder( Collection<? extends BulkExpressionDataVector> vectors ) {
  if ( vectors == null || vectors.size() == 0 )
  throw new IllegalArgumentException( "No vectors" );
  this.vectors = new HashSet<>();
@@ -75,7 +76,7 @@ public ExpressionDataMatrixBuilder( Collection<? extends DesignElementDataVector
  }

  public ExpressionDataMatrixBuilder( Collection<ProcessedExpressionDataVector> processedVectors,
- Collection<? extends DesignElementDataVector> otherVectors ) {
+ Collection<? extends BulkExpressionDataVector> otherVectors ) {
  this.vectors = new HashSet<>();
  this.vectors.addAll( otherVectors );
  this.processedDataVectors = processedVectors;
@@ -87,7 +88,7 @@ public ExpressionDataMatrixBuilder( Collection<ProcessedExpressionDataVector> pr
  * @param vectors raw vectors
  * @return matrix of appropriate type.
  */
- public static ExpressionDataMatrix<?> getMatrix( Collection<? extends DesignElementDataVector> vectors ) {
+ public static ExpressionDataMatrix<?> getMatrix( Collection<? extends BulkExpressionDataVector> vectors ) {
  if ( vectors == null || vectors.isEmpty() )
  throw new IllegalArgumentException( "No vectors" );
  PrimitiveType representation = vectors.iterator().next().getQuantitationType().getRepresentation();
@@ -100,7 +101,7 @@ public static ExpressionDataMatrix<?> getMatrix( Collection<? extends DesignElem
  * @return matrix of appropriate type.
  */
  private static ExpressionDataMatrix<?> getMatrix( PrimitiveType representation,
- Collection<? extends DesignElementDataVector> vectors ) {
+ Collection<? extends BulkExpressionDataVector> vectors ) {
  ExpressionDataMatrix<?> expressionDataMatrix;
  if ( representation.equals( PrimitiveType.DOUBLE ) ) {
  expressionDataMatrix = new ExpressionDataDoubleMatrix( vectors );
@@ -291,7 +292,7 @@ public List<BioAssayDimension> getBioAssayDimensions() {

  ExpressionDataMatrixBuilder.log.debug( "Checking all vectors to get bioAssayDimensions" );
  Collection<BioAssayDimension> dimensions = new HashSet<>();
- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
  ArrayDesign adUsed = this.arrayDesignForVector( vector );
  if ( !dimMap.containsKey( adUsed ) ) {
  dimMap.put( adUsed, vector.getBioAssayDimension() );
@@ -421,7 +422,7 @@ public List<QuantitationType> getPreferredQTypes() {
  }

  for ( BioAssayDimension dimension : dimensions ) {
- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
  if ( !vector.getBioAssayDimension().equals( dimension ) )
  continue;

@@ -566,7 +567,7 @@ private List<QuantitationType> getMissingValueQTypes() {
  List<BioAssayDimension> dimensions = this.getBioAssayDimensions();

  for ( BioAssayDimension dim : dimensions ) {
- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {

  if ( !vector.getBioAssayDimension().equals( dim ) )
  continue;
@@ -591,13 +592,13 @@ private List<QuantitationType> getMissingValueQTypes() {
  /**
  * @return The 'preferred' data vectors - NOT the processed data vectors!
  */
- private Collection<DesignElementDataVector> getPreferredDataVectors() {
- Collection<DesignElementDataVector> result = new HashSet<>();
+ private Collection<BulkExpressionDataVector> getPreferredDataVectors() {
+ Collection<BulkExpressionDataVector> result = new HashSet<>();

  List<BioAssayDimension> dimensions = this.getBioAssayDimensions();
  List<QuantitationType> qtypes = this.getPreferredQTypes();

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
  if ( !( vector instanceof ProcessedExpressionDataVector ) && dimensions
  .contains( vector.getBioAssayDimension() ) && qtypes.contains( vector.getQuantitationType() ) )
  result.add( vector );
@@ -620,7 +621,7 @@ private Collection<ProcessedExpressionDataVector> getProcessedDataVectors() {
  List<BioAssayDimension> dimensions = this.getBioAssayDimensions();
  List<QuantitationType> qtypes = this.getPreferredQTypes();

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
  if ( vector instanceof ProcessedExpressionDataVector && dimensions.contains( vector.getBioAssayDimension() )
  && qtypes.contains( vector.getQuantitationType() ) )
  result.add( ( ProcessedExpressionDataVector ) vector );
@@ -644,7 +645,7 @@ private QuantitationTypeData getQuantitationTypesNeeded() {

  Collection<QuantitationType> checkedQts = new HashSet<>();

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {

  BioAssayDimension dim = vector.getBioAssayDimension();

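The hunks in this file all narrow a wildcard bound from `DesignElementDataVector` to `BulkExpressionDataVector`. Here is a minimal sketch of what that kind of narrowing buys, using hypothetical stand-in classes (`BaseVector`, `BulkVector`, `ProcessedVector`, `SingleCellVector`) rather than the real Gemma model. It assumes the motivation is that only bulk vectors expose a per-assay dimension; the diff itself does not state this.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class BulkVectorBoundSketch {

    /** Hypothetical base type, standing in for DesignElementDataVector. */
    static abstract class BaseVector {
        final String quantitationType;
        BaseVector(String quantitationType) { this.quantitationType = quantitationType; }
    }

    /** Hypothetical bulk type (assumption): one value per assay, so it has a dimension. */
    static class BulkVector extends BaseVector {
        final String dimension;
        BulkVector(String quantitationType, String dimension) {
            super(quantitationType);
            this.dimension = dimension;
        }
        String getDimension() { return dimension; }
    }

    /** Hypothetical processed vector, a subtype of the bulk type. */
    static class ProcessedVector extends BulkVector {
        ProcessedVector(String quantitationType, String dimension) {
            super(quantitationType, dimension);
        }
    }

    /** Hypothetical single-cell type (assumption): no bulk dimension to query. */
    static class SingleCellVector extends BaseVector {
        SingleCellVector(String quantitationType) { super(quantitationType); }
    }

    /**
     * With the bound "? extends BulkVector", getDimension() is available on every
     * element without a cast, and any collection of a bulk subtype is accepted.
     */
    static Set<String> dimensions(Collection<? extends BulkVector> vectors) {
        Set<String> result = new LinkedHashSet<>();
        for (BulkVector vector : vectors) {
            result.add(vector.getDimension());
        }
        return result;
    }

    public static void main(String[] args) {
        List<ProcessedVector> processed = Arrays.asList(
                new ProcessedVector("log2cpm", "dim-A"),
                new ProcessedVector("log2cpm", "dim-B"),
                new ProcessedVector("log2cpm", "dim-A"));
        System.out.println(dimensions(processed));

        // A List<SingleCellVector> would be rejected at compile time:
        // dimensions(Arrays.asList(new SingleCellVector("counts")));
    }
}
```

With the wider bound, the same method would need either a cast or a dimension accessor on the base type. The narrower bound moves that check to compile time.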
Original file line number Diff line number Diff line change
@@ -31,10 +31,7 @@
  import ubic.gemma.core.datastructure.matrix.ExpressionDataMatrixRowElement;
  import ubic.gemma.model.common.auditAndSecurity.eventType.MissingValueAnalysisEvent;
  import ubic.gemma.model.common.quantitationtype.*;
- import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
- import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
- import ubic.gemma.model.expression.bioAssayData.ProcessedExpressionDataVector;
- import ubic.gemma.model.expression.bioAssayData.RawExpressionDataVector;
+ import ubic.gemma.model.expression.bioAssayData.*;
  import ubic.gemma.model.expression.designElement.CompositeSequence;
  import ubic.gemma.model.expression.experiment.ExpressionExperiment;
  import ubic.gemma.persistence.service.common.auditAndSecurity.AuditTrailService;
@@ -122,7 +119,7 @@ public Collection<RawExpressionDataVector> computeMissingValues( ExpressionExper
  timer.stop();
  this.logTimeInfo( timer, procVectors.size() + rawVectors.size() );

- Collection<? extends DesignElementDataVector> builderVectors = new HashSet<>(
+ Collection<? extends BulkExpressionDataVector> builderVectors = new HashSet<>(
  rawVectors.isEmpty() ? procVectors : rawVectors );

  ExpressionDataMatrixBuilder builder = new ExpressionDataMatrixBuilder( builderVectors );
Original file line number Diff line number Diff line change
@@ -45,6 +45,7 @@
  import ubic.gemma.model.common.quantitationtype.QuantitationType;
  import ubic.gemma.model.expression.arrayDesign.ArrayDesign;
  import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
+ import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
  import ubic.gemma.model.expression.designElement.CompositeSequence;
  import ubic.gemma.model.expression.experiment.*;
  import ubic.gemma.model.genome.Taxon;
@@ -481,7 +482,7 @@ public File writeOrLocateDataFile( QuantitationType type, boolean forceWrite ) {
  ExpressionDataFileServiceImpl.log
  .info( "Creating new quantitation type expression data file: " + f.getName() );

- Collection<DesignElementDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
+ Collection<BulkExpressionDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
  Collection<ArrayDesign> arrayDesigns = this.getArrayDesigns( vectors );
  Map<CompositeSequence, String[]> geneAnnotations = this.getGeneAnnotationsAsStringsByProbe( arrayDesigns );

@@ -564,7 +565,7 @@ public File writeOrLocateJSONDataFile( QuantitationType type, boolean forceWrite

  ExpressionDataFileServiceImpl.log.info( "Creating new quantitation type JSON data file: " + f.getName() );

- Collection<DesignElementDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
+ Collection<BulkExpressionDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );

  if ( vectors.size() == 0 ) {
  ExpressionDataFileServiceImpl.log.warn( "No vectors for " + type );
@@ -1173,7 +1174,7 @@ private File writeDesignMatrix( File file, ExpressionExperiment expressionExperi
  return file;
  }

- private void writeJson( File file, Collection<DesignElementDataVector> vectors ) throws IOException {
+ private void writeJson( File file, Collection<BulkExpressionDataVector> vectors ) throws IOException {
  ExpressionDataMatrix<?> expressionDataMatrix = ExpressionDataMatrixBuilder.getMatrix( vectors );
  try ( Writer writer = new OutputStreamWriter( new GZIPOutputStream( new FileOutputStream( file ) ) ) ) {
  MatrixWriter matrixWriter = new MatrixWriter();
@@ -1211,7 +1212,7 @@ private void writeMatrix( File file, Map<CompositeSequence, String[]> geneAnnota

  }

- private void writeVectors( File file, Collection<DesignElementDataVector> vectors,
+ private void writeVectors( File file, Collection<BulkExpressionDataVector> vectors,
  Map<CompositeSequence, String[]> geneAnnotations ) throws IOException {
  ExpressionDataMatrix<?> expressionDataMatrix = ExpressionDataMatrixBuilder.getMatrix( vectors );
  this.writeMatrix( file, geneAnnotations, expressionDataMatrix );
Original file line number Diff line number Diff line change
@@ -25,8 +25,8 @@
  import ubic.gemma.model.expression.arrayDesign.ArrayDesign;
  import ubic.gemma.model.expression.bioAssay.BioAssay;
  import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
- import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
  import ubic.gemma.model.expression.bioAssayData.RawExpressionDataVector;
+ import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
  import ubic.gemma.model.expression.biomaterial.BioMaterial;
  import ubic.gemma.model.expression.designElement.CompositeSequence;
  import ubic.gemma.model.expression.experiment.ExpressionExperiment;
@@ -204,7 +204,7 @@ public ExpressionDataMatrixRowElement getRowElement( int index ) {
}

@SuppressWarnings("unused") // useful interface
- protected abstract void vectorsToMatrix( Collection<? extends DesignElementDataVector> vectors );
+ protected abstract void vectorsToMatrix( Collection<? extends BulkExpressionDataVector> vectors );

int getColumnIndex( BioAssay bioAssay ) {
return columnAssayMap.get( bioAssay );
@@ -368,11 +368,11 @@ int setUpColumnElements() {
/**
* Selects all the vectors passed in (uses them to initialize the data)
*/
- void selectVectors( Collection<? extends DesignElementDataVector> vectors ) {
+ void selectVectors( Collection<? extends BulkExpressionDataVector> vectors ) {
QuantitationType quantitationType = null;
int i = 0;
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
- for ( DesignElementDataVector vector : sorted ) {
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ for ( BulkExpressionDataVector vector : sorted ) {
if ( this.expressionExperiment == null )
this.expressionExperiment = vector.getExpressionExperiment();
QuantitationType vectorQuantitationType = vector.getQuantitationType();
@@ -397,14 +397,14 @@ void selectVectors( Collection<? extends DesignElementDataVector> vectors ) {

}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
Collection<QuantitationType> qTypes ) {
this.quantitationTypes.addAll( qTypes );

- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int i = 0;

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( qTypes.contains( vectorQuantitationType ) ) {
if ( this.expressionExperiment == null )
@@ -421,14 +421,14 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
List<QuantitationType> qTypes ) {
this.quantitationTypes.addAll( qTypes );
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int rowIndex = 0;
for ( QuantitationType soughtType : qTypes ) {
- for ( DesignElementDataVector vector : sorted ) {
+ for ( BulkExpressionDataVector vector : sorted ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( vectorQuantitationType.equals( soughtType ) ) {
if ( this.expressionExperiment == null )
@@ -446,14 +446,14 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
QuantitationType quantitationType ) {
this.quantitationTypes.add( quantitationType );

- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int i = 0;

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( vectorQuantitationType.equals( quantitationType ) ) {
if ( this.expressionExperiment == null )
@@ -470,18 +470,18 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( ExpressionExperiment ee, QuantitationType quantitationType ) {
+ Collection<BulkExpressionDataVector> selectVectors( ExpressionExperiment ee, QuantitationType quantitationType ) {
Collection<RawExpressionDataVector> vectors = ee.getRawExpressionDataVectors();
return this.selectVectors( quantitationType, vectors );
}

- private Collection<DesignElementDataVector> selectVectors( QuantitationType quantitationType,
- Collection<? extends DesignElementDataVector> vectors ) {
- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ private Collection<BulkExpressionDataVector> selectVectors( QuantitationType quantitationType,
+ Collection<? extends BulkExpressionDataVector> vectors ) {
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
this.quantitationTypes.add( quantitationType );
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
int i = 0;
- for ( DesignElementDataVector vector : sorted ) {
+ for ( BulkExpressionDataVector vector : sorted ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( this.expressionExperiment == null )
this.expressionExperiment = vector.getExpressionExperiment();
@@ -512,12 +512,12 @@ private void getBioMaterialGroupsForAssays( Map<BioMaterial, Collection<BioAssay
}
}

- private List<DesignElementDataVector> sortVectorsByDesignElement(
- Collection<? extends DesignElementDataVector> vectors ) {
- List<DesignElementDataVector> vectorSort = new ArrayList<>( vectors );
- Comparator<DesignElementDataVector> cmp = Comparator
- .comparing( ( DesignElementDataVector vector ) -> vector.getDesignElement().getName(), Comparator.nullsLast( Comparator.naturalOrder() ) )
- .thenComparing( ( DesignElementDataVector vector ) -> vector.getDesignElement().getId() );
+ private List<BulkExpressionDataVector> sortVectorsByDesignElement(
+ Collection<? extends BulkExpressionDataVector> vectors ) {
+ List<BulkExpressionDataVector> vectorSort = new ArrayList<>( vectors );
+ Comparator<BulkExpressionDataVector> cmp = Comparator
+ .comparing( ( BulkExpressionDataVector vector ) -> vector.getDesignElement().getName(), Comparator.nullsLast( Comparator.naturalOrder() ) )
+ .thenComparing( ( BulkExpressionDataVector vector ) -> vector.getDesignElement().getId() );
vectorSort.sort( cmp );
return vectorSort;
}
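
The ordering used by `sortVectorsByDesignElement` is unchanged by the type rename: vectors are sorted by design element name, null names go last, and ties are broken by ID. As a rough, standalone sketch of that ordering, the snippet below uses hypothetical stand-in classes (`Element`, `DataVector`) rather than the real Gemma model classes, which are not reproduced here. Note that `Comparator.nullsLast` only guards against a null name; a null design element would still throw a `NullPointerException` in the key extractor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

class SortByDesignElementSketch {

    /** Hypothetical stand-in for a design element: only the fields the comparator reads. */
    static final class Element {
        private final Long id;
        private final String name; // may be null

        Element( Long id, String name ) {
            this.id = id;
            this.name = name;
        }

        Long getId() { return id; }

        String getName() { return name; }
    }

    /** Hypothetical stand-in for a bulk expression data vector. */
    static final class DataVector {
        private final Element designElement;

        DataVector( Element designElement ) {
            this.designElement = designElement;
        }

        Element getDesignElement() { return designElement; }
    }

    /** Same comparator shape as in the diff: name (nulls last), then ID. */
    static List<DataVector> sortByDesignElement( Collection<? extends DataVector> vectors ) {
        List<DataVector> sorted = new ArrayList<>( vectors );
        Comparator<DataVector> cmp = Comparator
                .comparing( ( DataVector v ) -> v.getDesignElement().getName(),
                        Comparator.nullsLast( Comparator.naturalOrder() ) )
                .thenComparing( ( DataVector v ) -> v.getDesignElement().getId() );
        sorted.sort( cmp );
        return sorted;
    }

    /** Sorts a small made-up sample and returns the element IDs in sorted order. */
    static List<Long> demoIds() {
        List<DataVector> input = Arrays.asList(
                new DataVector( new Element( 3L, "b" ) ),
                new DataVector( new Element( 2L, null ) ),
                new DataVector( new Element( 5L, "a" ) ),
                new DataVector( new Element( 1L, "a" ) ) );
        List<Long> ids = new ArrayList<>();
        for ( DataVector v : sortByDesignElement( input ) ) {
            ids.add( v.getDesignElement().getId() );
        }
        return ids;
    }

    public static void main( String[] args ) {
        System.out.println( demoIds() ); // prints [1, 5, 3, 2]
    }
}
```

The two elements named "a" are ordered by ID, "b" follows, and the element with a null name comes last.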
@@ -20,7 +20,7 @@

import ubic.gemma.model.expression.bioAssay.BioAssay;
import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
- import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
+ import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
import ubic.gemma.model.expression.designElement.CompositeSequence;

import java.util.Collection;
@@ -130,7 +130,7 @@ public void set( int row, int column, Object value ) {
}

@Override
- protected void vectorsToMatrix( Collection<? extends DesignElementDataVector> vectors ) {
+ protected void vectorsToMatrix( Collection<? extends BulkExpressionDataVector> vectors ) {
throw new UnsupportedOperationException();
}
