Releases · Alipsa/matrix

31 Jan 19:05

perNyfelt

matrix-xchart-0.2.3

bd88543

Matrix Xchart 0.2.3

add @CompileStatic to all 17 classes for performance and type safety (100% static compilation, no @CompileDynamic needed)
complete empty test methods in HistogramChartTest (testDensityHistogram, testFrequencyHistogramCustom)
add comprehensive GroovyDoc to all abstract classes (AbstractChart, AbstractXYChart, AbstractCategoryChart)
add comprehensive edge case tests for improved test coverage
replace Math calls with NumberExtension methods for idiomatic Groovy (Math.sqrt() → .sqrt())
expand GroovyDoc on all public methods with detailed parameter descriptions and examples
remove commented println debug statements
fix IndexOutOfBoundsException in AbstractChart.makeFillTransparent() method
fix multiple calculation issues in chart rendering
update README version references and complete StickChart description

31 Jan 18:57

perNyfelt

matrix-tablesaw-0.2.2

eba98b7

Matrix Tablesaw 0.2.2

Build Configuration Improvements

Added compileTestJava configuration with deprecation and unchecked warnings enabled
Added -Xlint:unchecked flag to compileGroovy for improved Groovy code quality checks
Build configuration now consistent with matrix-core module standards

Code Quality Improvements

Added @SuppressWarnings("unchecked") annotation to TableUtil.createColumn() method to properly handle intentional unchecked generic casts
Removed duplicate BigDecimalColumn type check in classForColumnType() method (dead code removal)

Bug Fixes & Improvements

BigDecimalColumn enhancements:
- Fixed asBytes() method to use UTF-8 encoding explicitly instead of platform default charset, ensuring consistent byte representation across all platforms
- Cleaned up asBytes() method documentation and removed outdated TODO comments
- Extended toBigDecimal() method to handle additional Number subtypes:
  - BigDecimal - now returns the value as-is without conversion (prevents precision loss from unnecessary double conversion)
  - AtomicInteger - converted via get() for precision
  - AtomicLong - converted via get() for precision
  - DoubleAccumulator - converted via doubleValue()
- Added comprehensive Javadoc explaining conversion behavior for all Number types
- Improved test coverage to properly exercise toBigDecimal(Number) conversion path for BigDecimal inputs
- Updated test assertions to use UTF-8 encoding for deterministic byte array comparisons
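
The conversion rules listed above can be sketched roughly as follows. This is an illustration only, not the actual BigDecimalColumn source: the class name `BigDecimalConversionSketch`, the fallback branches for other Number types, and the choice to encode `toPlainString()` in `asBytes` are assumptions made for the example.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.DoubleAccumulator;

// Hypothetical helper for illustration; not the library's implementation.
class BigDecimalConversionSketch {

  static BigDecimal toBigDecimal(Number n) {
    if (n == null) return null;
    // Returned as-is: a round trip through double could lose precision.
    if (n instanceof BigDecimal) return (BigDecimal) n;
    // Atomic integer types are read with get() and converted exactly.
    if (n instanceof AtomicInteger) return BigDecimal.valueOf(((AtomicInteger) n).get());
    if (n instanceof AtomicLong) return BigDecimal.valueOf(((AtomicLong) n).get());
    // DoubleAccumulator only exposes a double view of its value.
    if (n instanceof DoubleAccumulator) return BigDecimal.valueOf(n.doubleValue());
    // Assumed fallbacks for the remaining Number types.
    if (n instanceof Integer || n instanceof Long || n instanceof Short || n instanceof Byte) {
      return BigDecimal.valueOf(n.longValue());
    }
    return BigDecimal.valueOf(n.doubleValue());
  }

  // An explicit charset gives the same bytes on every platform.
  static byte[] asBytes(BigDecimal value) {
    return value.toPlainString().getBytes(StandardCharsets.UTF_8);
  }
}
```

Passing the charset explicitly matters because `String.getBytes()` without arguments depends on the JVM's default charset, which can differ between machines.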

Documentation

Enhanced Javadoc/Groovydoc documentation:
- BigDecimalColumnFormatter - Added comprehensive class documentation with usage examples, documented all factory methods, constructors, and formatting methods
- GdataFrameJoiner - Added class documentation explaining join types, documented all join method variants with parameter descriptions
- Verified existing documentation in BigDecimalComparator, XlsxWriteOptions, and GdataFrameReader
All public APIs now have production-quality documentation

Testing

Added JaCoCo code coverage reporting infrastructure
- Current coverage: 54% instruction coverage, 58% branch coverage
- Coverage thresholds: 50% overall, 15% per class (baseline to prevent regression)
- Coverage reports available in HTML and XML formats
- Excluded low-coverage infrastructure classes from strict requirements
Added test coverage for atomic type conversions in BigDecimalColumn
Added test for BigDecimal precision preservation
All 85 tests passing (2 new tests added)

Deprecations

Deprecated OdsReadOptions.builder(Reader reader) - ODS is a binary format, not text-based
Deprecated OdsReadOptions.builderFromString(String contents) - ODS is a binary format, not text-based
Deprecated OdsReadOptions.builderFromUrl(String url) - ODS is a binary format, not text-based
Note: These deprecated methods will be removed in v0.3.0
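
To illustrate why character-based inputs do not suit ODS: an .ods file is a ZIP package, so its first bytes are the ZIP local file header signature rather than readable text. The check below is a minimal sketch; `OdsSignatureSketch` and `looksLikeZip` are hypothetical names and not part of OdsReadOptions.

```java
// Hypothetical helper for illustration; not part of the matrix-tablesaw API.
class OdsSignatureSketch {

  /** Returns true if the bytes start with the ZIP local file header signature "PK\u0003\u0004". */
  static boolean looksLikeZip(byte[] firstBytes) {
    return firstBytes != null
        && firstBytes.length >= 4
        && firstBytes[0] == 0x50   // 'P'
        && firstBytes[1] == 0x4B   // 'K'
        && firstBytes[2] == 0x03
        && firstBytes[3] == 0x04;
  }
}
```

Reading such content through a Reader or a String forces a charset decode that can corrupt the binary payload, which is consistent with the deprecation rationale given above.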

Dependency Updates

com.github.miachm.sods:SODS [1.6.8 -> 1.7.0]
org.apache.poi:poi-ooxml [5.4.1 -> 5.5.1]

31 Jan 18:49

perNyfelt

matrix-sql-2.3.0

721e090

Matrix Sql 2.3.0

add option to control whether column names are quoted when creating a table
add an execute method to MatrixSql to run arbitrary SQL (update, delete, insert etc.)
MatrixSqlFactory.create attempts to infer and set the JDBC driver when missing
MatrixSql connection lifecycle fixes (reconnect after close)
safer, prepared-statement updates with match-column validation
ResultSet improvements: updateRow is a no-op for detached sets, null-safe primitive/stream getters, strict unwrap contract
close metadata ResultSets for table discovery utilities
Dependency upgrades:
- commons-io:commons-io [2.20.0 -> 2.21.0]
- se.alipsa.groovy:data-utils [2.0.3 -> 2.0.4]
- se.alipsa:maven-3.9.11-utils [1.0.0 -> 1.1.0]
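
As an illustration of the "prepared-statement updates with match-column validation" item, the sketch below builds a parameterized UPDATE statement and rejects match columns that are not among the known columns. It is an assumption about the general technique, not MatrixSql's code; `UpdateSqlSketch` and `buildUpdate` are hypothetical names, and identifier quoting is left out.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not the MatrixSql implementation.
class UpdateSqlSketch {

  static String buildUpdate(String table, List<String> columns, List<String> matchColumns) {
    if (matchColumns == null || matchColumns.isEmpty()) {
      throw new IllegalArgumentException("At least one match column is required");
    }
    // Validate before building SQL so a typo cannot produce an unintended WHERE clause.
    for (String m : matchColumns) {
      if (!columns.contains(m)) {
        throw new IllegalArgumentException("Unknown match column: " + m);
      }
    }
    StringJoiner set = new StringJoiner(", ");
    for (String c : columns) {
      if (!matchColumns.contains(c)) set.add(c + " = ?");
    }
    if (set.length() == 0) {
      throw new IllegalArgumentException("No columns left to update");
    }
    StringJoiner where = new StringJoiner(" AND ");
    for (String m : matchColumns) where.add(m + " = ?");
    // Values are bound later through PreparedStatement.setObject, never concatenated.
    return "UPDATE " + table + " SET " + set + " WHERE " + where;
  }
}
```

The resulting string would be passed to `Connection.prepareStatement`, with the updated values bound first and the match values last.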

31 Jan 17:31

perNyfelt

matrix-spreadsheet-2.3.0

5e1dd50

Matrix Spreadsheet 2.3.0

Major architectural refactoring with significant performance improvements

Breaking Changes

removed POI and SODS implementations - FastExcel is now the single XLSX backend, FastOds is the single ODS backend
removed ExcelImplementation and OdsImplementation enums - no implementation selection needed
explicitly reject legacy .xls files (XLSX only)
deprecated SpreadsheetExporter in favor of SpreadsheetWriter

New Features

add append/replace support for existing XLSX and ODS files (preserves sheets and metadata)
add flexible start position support when writing data (e.g., write to cell B5)
add map-based multi-sheet API: writeSheets(Map<String, Position>)
add new ODS streaming writer/appender with table attributes/column reuse for styling
add profiling support for ODS operations via -Dmatrix.spreadsheet.ods.profile=true
XLSX append now inherits sheetFormatPr, column widths, page margins, and fixes relId collisions
add comprehensive sheet name sanitization with automatic de-duplication
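
Sheet name sanitization with de-duplication could look roughly like the sketch below. It applies Excel's rules (at most 31 characters, none of `\ / ? * [ ] :`, names unique regardless of case); the replacement character, the `_2`, `_3` suffix scheme and the class name `SheetNameSketch` are assumptions, not the library's actual behavior.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not the matrix-spreadsheet implementation.
class SheetNameSketch {
  static final int MAX_LEN = 31;

  // Replace characters Excel forbids in sheet names and cut to the maximum length.
  static String sanitize(String name) {
    String s = (name == null || name.isBlank())
        ? "Sheet"
        : name.replaceAll("[\\\\/?*\\[\\]:]", "_");
    return s.length() > MAX_LEN ? s.substring(0, MAX_LEN) : s;
  }

  // Sanitize every requested name and append _2, _3, ... when two names collide.
  static List<String> uniqueNames(List<String> requested) {
    Set<String> used = new HashSet<>();
    List<String> result = new ArrayList<>();
    for (String r : requested) {
      String base = sanitize(r);
      String candidate = base;
      int n = 1;
      while (!used.add(candidate.toLowerCase(Locale.ROOT))) {
        String suffix = "_" + (++n);
        int keep = Math.min(base.length(), MAX_LEN - suffix.length());
        candidate = base.substring(0, keep) + suffix;
      }
      result.add(candidate);
    }
    return result;
  }
}
```

Without the de-duplication step, two different requested names such as `a/b` and `a:b` would both map to `a_b`, and the second sheet could silently overwrite the first.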

Performance Improvements

ODS read performance: 65-80% faster (medium files: 4.86s → 1.43s, large files: 262s → 53s)
switched to Aalto StAX parser for 64% speedup
adaptive row capacity sizing to minimize ArrayList resizing
type-aware value extraction with switch dispatch
optimized trailing empty row detection
Null Object pattern for profiling eliminates branching overhead (~14% improvement)
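
The Null Object pattern mentioned above replaces repeated "is profiling enabled?" checks with a do-nothing implementation chosen once at startup. The sketch below is illustrative only; the `Profiler`, `NoOpProfiler` and `RecordingProfiler` types are hypothetical, and only the system property name is taken from these release notes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical types for illustration; not the matrix-spreadsheet implementation.
interface Profiler {
  void record(String phase, long nanos);
  Map<String, Long> totals();

  // Decide once; callers then invoke record() unconditionally.
  static Profiler fromSystemProperty() {
    return Boolean.getBoolean("matrix.spreadsheet.ods.profile")
        ? new RecordingProfiler()
        : NoOpProfiler.INSTANCE;
  }
}

// Null Object: same interface, no behavior, so hot loops need no if-checks.
final class NoOpProfiler implements Profiler {
  static final NoOpProfiler INSTANCE = new NoOpProfiler();
  private NoOpProfiler() {}
  public void record(String phase, long nanos) { /* intentionally empty */ }
  public Map<String, Long> totals() { return Map.of(); }
}

// Real implementation: accumulates elapsed nanoseconds per phase.
final class RecordingProfiler implements Profiler {
  private final Map<String, Long> totals = new LinkedHashMap<>();
  public void record(String phase, long nanos) { totals.merge(phase, nanos, Long::sum); }
  public Map<String, Long> totals() { return totals; }
}
```

Reader code can then call `profiler.record(...)` everywhere without testing a flag on each row or cell.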

Bug Fixes

fix missing return statement in ValueExtractor.getDouble() (percentage parsing)
fix 1-based sheet indexing consistency across all importers
fix sheet name collision prevention (sanitization could cause silent data loss)
fix invalid sheet number handling in URL imports
fix null row guards in FExcelReader
fix percentage parsing to be locale-independent
fix race condition in SpreadsheetExporter static field sharing

Code Quality

add column count validation to all write methods
add robust cleanup for temp files and XML stream resources
add XXE protection with hardened XML parsing
replace 15+ println statements with proper logging
remove ~500 lines of dead code (POI, SODS implementations)
extract duplicate header building logic (DRY improvements)
comprehensive test coverage: 79.74% (105 tests passing)
add benchmarking suite for performance validation
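
As a sketch of what "hardened XML parsing" commonly means for a StAX parser: DTD processing and external entity resolution are switched off on the factory before any reader is created. The two properties below are standard `javax.xml.stream` constants; the class and method names are hypothetical, and the exact settings the library uses may differ.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not the matrix-spreadsheet implementation.
class HardenedStaxSketch {

  static XMLInputFactory hardenedFactory() {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    // Do not process DTDs at all.
    factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
    // Never resolve external entities (the core of XXE attacks).
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
    return factory;
  }

  // Small usage example: count start elements using the hardened factory.
  static int countElements(String xml) {
    try {
      XMLStreamReader reader = hardenedFactory().createXMLStreamReader(new StringReader(xml));
      int count = 0;
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) count++;
      }
      reader.close();
      return count;
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }
}
```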

Dependencies

remove org.apache.poi:poi and org.apache.poi:poi-ooxml
remove com.github.miachm.sods:SODS
remove org.apache.logging.log4j:log4j-api (migrated to matrix-core Logger)
add com.fasterxml:aalto-xml 1.3.4 (high-performance StAX parser)
upgrade com.github.javaparser:javaparser-core [3.26.4 -> 3.27.0]
migrate from log4j to matrix-core Logger (supports slf4j if present, otherwise System.out/err)

31 Jan 17:18

perNyfelt

matrix-smile-0.1.0

2cd664d

Matrix Smile 0.1.0

Initial release providing comprehensive integration between Matrix and Smile (Statistical Machine Intelligence and Learning Engine).

Core Features

DataframeConverter: Bidirectional conversion between Matrix and Smile DataFrame with support for 18 data types
SmileUtil: Pandas-like utility functions for data exploration and manipulation
- Statistical summary (describe), column information (info), frequency tables
- Sampling (random, by count, by fraction, with seed)
- Head/tail operations, null detection and counting
Gsmile Extension Module: Natural Groovy syntax extensions for Matrix and DataFrame
- Matrix extensions: toSmileDataFrame(), smileDescribe(), smileSample()
- DataFrame extensions: toMatrix(), subscript operators (getAt), filtering, iteration
- Comprehensive test coverage (24 extension method tests)

Machine Learning Wrappers

SmileClassifier: Wrappers for classification algorithms
- Logistic Regression, Decision Trees, Random Forest, Gradient Boosted Trees
- Support Vector Machines, K-Nearest Neighbors, Naive Bayes, AdaBoost
- Model training, prediction, and evaluation with confusion matrices
SmileRegression: Wrappers for regression algorithms
- Linear Regression, Ridge Regression, LASSO, Elastic Net
- Regression Trees, Gradient Boosted Trees, Random Forest
- Model fitting, prediction, and RMSE calculation
SmileCluster: Wrappers for clustering algorithms
- K-Means, Hierarchical Clustering, DBSCAN, DENCLUE, CLARANS
- Cluster assignment and centroids calculation
SmileDimensionality: Dimensionality reduction techniques
- PCA (Principal Component Analysis), MDS (Multidimensional Scaling)
- t-SNE (t-Distributed Stochastic Neighbor Embedding)

Statistical Analysis (SmileStats)

Probability Distributions:
- Discrete: Binomial, Geometric, Poisson, Hypergeometric
- Continuous: Normal, Exponential, Gamma, Beta, Chi-Squared, T, F, Weibull
- PDF, CDF, quantile, and random sample generation
Hypothesis Testing:
- t-tests (one-sample, two-sample, paired)
- Chi-squared test, F-test, Kolmogorov-Smirnov test
- Correlation tests (Pearson, Spearman, Kendall) with significance testing
Correlation Analysis:
- Correlation matrices with p-values
- Support for Pearson, Spearman, and Kendall correlation methods
Random Sampling: Generate random samples from various distributions

Feature Engineering (SmileFeatures)

Data Loading: Load datasets from Smile's built-in data repository
Feature Scaling:
- StandardScaler (z-score normalization with fit/transform workflow)
- MinMaxScaler (range normalization)
- MaxAbsScaler (maximum absolute value scaling)
- RobustScaler (median and IQR-based scaling)
Feature Encoding:
- One-hot encoding for categorical variables
- Label encoding for ordinal variables
Feature Selection:
- Sum, difference, product, ratio feature creation
Imputation: Missing value handling with mean, median, mode, or constant strategies
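
The fit/transform workflow behind z-score normalization can be sketched as follows: `fit` learns the mean and standard deviation from training data, and `transform` applies z = (x - mean) / std to any data. This is not the SmileFeatures API; the class name is hypothetical, and the use of the population standard deviation and the zero-variance handling are assumptions.

```java
// Hypothetical class for illustration; not the SmileFeatures implementation.
class StandardScalerSketch {
  private double mean;
  private double std;

  // Learn mean and standard deviation from the training values.
  StandardScalerSketch fit(double[] values) {
    double sum = 0;
    for (double v : values) sum += v;
    mean = sum / values.length;
    double squared = 0;
    for (double v : values) squared += (v - mean) * (v - mean);
    // Assumption: population standard deviation (divide by n, not n - 1).
    std = Math.sqrt(squared / values.length);
    return this;
  }

  // Apply z = (x - mean) / std; a constant column maps to 0.0.
  double[] transform(double[] values) {
    double[] out = new double[values.length];
    for (int i = 0; i < values.length; i++) {
      out[i] = std == 0 ? 0.0 : (values[i] - mean) / std;
    }
    return out;
  }
}
```

Separating the two steps lets the statistics learned from a training set be reused on test data, so both are scaled the same way.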

Code Quality

Comprehensive @CompileStatic annotation throughout for type safety and performance
Modern Groovy 5.0+ switch expression syntax (arrow operators)
Extensive GroovyDoc (207 documentation comment blocks)
Comprehensive test coverage (274 tests across 10 test files, 100% test file coverage)
Idiomatic Groovy code (as double instead of .doubleValue(), NumberExtension usage)

Dependencies

com.github.haifengl:smile-core 4.4.2
Requires Java 21 (Smile 4.x not compatible with Java 22+)
Requires Groovy 5.0+ (for modern switch expression syntax)

31 Jan 17:08

perNyfelt

matrix-parquet-0.4.0

ca09366

Matrix Parquet 0.4.0

remove parquet-carpet dependency (MatrixCarpetIO) - now using native Parquet implementation
add support for nested structures: structs (POJOs, maps) and repeated fields (arrays)
add URL, Path, InputStream, and byte[] input support to MatrixParquetReader (API consistency with matrix-csv and matrix-json)
add BigDecimal precision and scale control in MatrixParquetWriter write methods
add in-memory write support via InMemoryOutputFile and InMemoryPositionOutputStream (eliminates temporary files)
add timezone support for timestamp handling (optional parameter in reader/writer methods)
MatrixParquetWriter can now write to either a file or directory (using matrix name for filename)
use matrixName as Parquet schema name if present
add @CompileStatic to MatrixParquetReader for performance and type safety
add comprehensive input validation to both reader and writer (null checks, empty matrix, file existence)
add safeFileName sanitization for directory targets (strips path separators and unsafe characters)
fix bug: time precision schema/implementation mismatch (now uses MICROS for timestamps, MILLIS for time)
fix bug: BigDecimal schema inference incorrectly set minimum scale to 2
extract magic strings to constants for maintainability
add comprehensive GroovyDoc to MatrixParquetReader and MatrixParquetWriter
add extensive test coverage including edge cases, validation, and round-trip verification
cache reflection metadata for struct handling (performance optimization)
upgrade dependencies
- org.apache.hadoop:hadoop-common [3.4.1 -> 3.4.2]
- org.apache.hadoop:hadoop-mapreduce-client-core [3.4.1 -> 3.4.2]
- org.apache.parquet:parquet-column [1.15.2 -> 1.16.0]
- org.apache.parquet:parquet-hadoop [1.15.2 -> 1.16.0]
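
To make the BigDecimal precision and scale items more concrete, the sketch below infers a decimal precision and scale wide enough for every value in a column, without forcing a minimum scale. It only illustrates the idea; `DecimalSchemaSketch` is a hypothetical name and MatrixParquetWriter's real inference may work differently.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical helper for illustration; not the MatrixParquetWriter implementation.
class DecimalSchemaSketch {

  /** Returns {precision, scale} large enough to hold every non-null value. */
  static int[] inferPrecisionAndScale(List<BigDecimal> values) {
    int maxScale = 0;          // no artificial minimum such as 2
    int maxIntegerDigits = 1;  // always leave room for at least one integer digit
    for (BigDecimal v : values) {
      if (v == null) continue;
      // Negative scales (e.g. 1E+3) are treated as scale 0.
      int scale = Math.max(v.scale(), 0);
      // Increasing the scale never requires rounding, so setScale cannot throw here.
      BigDecimal normalized = v.setScale(scale);
      maxScale = Math.max(maxScale, scale);
      maxIntegerDigits = Math.max(maxIntegerDigits, normalized.precision() - scale);
    }
    return new int[] {maxIntegerDigits + maxScale, maxScale};
  }
}
```

With this approach a column of whole numbers gets scale 0, while a column mixing `123.4` and `0.005` gets three decimals and three integer digits.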

31 Jan 16:57

perNyfelt

matrix-json-2.1.2

58dc8a1

Matrix Json 2.1.2

deprecate JsonImporter and JsonExporter in favor of JsonReader and JsonWriter
change implementation to use Jackson streaming API instead of JsonSlurper for improved memory efficiency (O(1) memory regardless of JSON size)
add duplicate key detection in flatten() to prevent silent data loss
add URL and Path support to JsonImporter (matching matrix-csv API)
add static export methods to JsonExporter for API consistency
add comprehensive test coverage for edge cases (empty arrays, single rows, null handling)
add input validation to prevent writing empty or null matrices
fix JsonImporter mutation and iteration assumptions
fix TOCTOU race condition in JsonWriter file creation
replace broad exception catches with specific exception types
upgrade dependencies
- com.fasterxml.jackson.core:jackson-core [2.20.0 -> 2.20.1]
- com.fasterxml.jackson.core:jackson-databind [2.20.0 -> 2.20.1]
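
Duplicate key detection during flattening guards against the case where a nested path and a literal key collapse to the same column name. The sketch below illustrates the idea with plain maps; it is not the library's flatten() code, and the `.` separator and the class name `FlattenSketch` are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the matrix-json implementation.
class FlattenSketch {

  static Map<String, Object> flatten(Map<String, ?> nested) {
    Map<String, Object> flat = new LinkedHashMap<>();
    flattenInto("", nested, flat);
    return flat;
  }

  private static void flattenInto(String prefix, Map<?, ?> node, Map<String, Object> flat) {
    for (Map.Entry<?, ?> entry : node.entrySet()) {
      String key = prefix.isEmpty()
          ? String.valueOf(entry.getKey())
          : prefix + "." + entry.getKey();
      Object value = entry.getValue();
      if (value instanceof Map) {
        // Recurse into nested objects, extending the dotted path.
        flattenInto(key, (Map<?, ?>) value, flat);
      } else if (flat.containsKey(key)) {
        // Fail loudly instead of silently overwriting the earlier value.
        throw new IllegalArgumentException("Duplicate key after flattening: " + key);
      } else {
        flat.put(key, value);
      }
    }
  }
}
```

For example, an object with both a literal key `a.b` and a nested `a` containing `b` produces the flattened name `a.b` twice, which this sketch reports rather than keeping only the last value.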

31 Jan 16:49

perNyfelt

matrix-gsheets-0.1.1

372ba32

Matrix Gsheets 0.1.1

Move the actual implementation for GsheetsReader and GsheetsWriter, along with the utility methods, to GsUtil so that GsImporter and GsExporter are just thin wrappers.

31 Jan 19:31

perNyfelt

matrix-groovy-ext-0.1.0

4906a48

Matrix Groovy Ext 0.1.0

Initial version

Number extensions allowing for more idiomatic Groovy code.

31 Jan 14:47

perNyfelt

matrix-csv-2.2.2

329466c

Matrix csv 2.2.2

deprecate CsvImporter and CsvExporter in favor of CsvReader and CsvWriter
upgrade commons-csv from 1.14.0 to 1.14.1
fix bug: empty CSV files now handled correctly (no more IndexOutOfBoundsException)
fix typos in error messages ("extected" → "expected")
add comprehensive test coverage for edge cases (empty CSV, header-only, single row/column, mismatched columns)
add null validation to CsvExporter methods
add GroovyDoc documentation to CsvImporter and CsvExporter classes and methods
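
The empty-file fix can be illustrated with a guard that runs before the header row is read. The sketch below uses a deliberately naive parser (comma split, no quoting) and is not the CsvReader or commons-csv code; `EmptyCsvSketch` and its `Table` type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not the matrix-csv implementation.
class EmptyCsvSketch {

  /** Result of the naive parse: header names plus data rows. */
  static final class Table {
    final List<String> header;
    final List<List<String>> rows;

    Table(List<String> header, List<List<String>> rows) {
      this.header = header;
      this.rows = rows;
    }
  }

  static Table parse(String csv) {
    List<List<String>> records = new ArrayList<>();
    for (String line : csv.split("\\R")) {
      if (!line.isEmpty()) records.add(Arrays.asList(line.split(",", -1)));
    }
    // Guard: with no records, records.get(0) would throw IndexOutOfBoundsException.
    if (records.isEmpty()) return new Table(List.of(), List.of());
    return new Table(records.get(0), records.subList(1, records.size()));
  }
}
```

A header-only file passes the guard and yields a table with column names and zero rows.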
