Proposal: cleanup of dense and structured matrix types #43983
Replies: 11 comments
-
Yes please. This would be a major improvement re: #8001. Doesn't address the brokenness of sparse at all, but one step at a time. Banded will cover many of the dense structured types currently in lapack as special cases, and the algebra of banded matrix operations will be cool to see done right. For some reason I thought it had already been done in Julia, but apparently not. For many of the reductions and binary operators I think adding a banded type will necessitate rethinking how the generic fallback implementations are done, in order for those operations not to be painfully slow for narrow-banded matrices. This will likely expose many of the same issues as currently exist for sparse matrices. We should perhaps consider an iterator-based protocol for more generic operations on matrices. That way the fallback for the iterator protocol could be a dense |
Beta Was this translation helpful? Give feedback.
-
I found this nice guide to the LAPACK types. The only note I really have is that Hessenberg |
Beta Was this translation helpful? Give feedback.
-
True. Anecdotally from @andreasnoack's experiments it seems like the only real gains in going through LAPACK and BLAS are in routines that can take advantage of BLAS3 (matrix-matrix) kernels. It's difficult to see how BLAS3 can be used for banded algebra.
Thanks, fixed. |
Beta Was this translation helpful? Give feedback.
-
@jiahao, I don't understand this comment:
Since a |
Beta Was this translation helpful? Give feedback.
-
Assuming I've read the diagram correctly, banded storage stores nonzeros column by column, so the sub-/super-/diagonals would be stored as vectors of stride |
Beta Was this translation helpful? Give feedback.
-
Ah, I see; I misread it. This is annoying, although I suppose we could transpose the data on the fly when it is needed. |
Beta Was this translation helpful? Give feedback.
-
My pull request #9779 relates to this issue. The way I have been thinking about the As far as I can see, this is not the case for I think it is okay to make the assumption that a Hessenberg matrix is lower Hessenberg as I haven't seen an application of lower Hessenberg matrices. Hence, the type doesn't need an If it is necessary to introduce a trapezoid matrix type, I think we should follow the approach in #9779. This will result in more types compared to the present solution, but it is easier to work with the versions without the parameters in a way that insures good type inference. |
Beta Was this translation helpful? Give feedback.
-
When working with FixedArrays I noted that currently - and this is also what you propose - Diagonal is a storage and not a symmetry property (for example Diagonal{T} is parametrized with eltype, different from e.g. Triangular{T, AbstractMatrix{T}}.) A different possibility is to have a parametrization Diagonal{T,AbstractVector{T}} and have Diagonal behave closer to Triangular matrices. |
Beta Was this translation helpful? Give feedback.
-
@mschauer this issue was focused on LAPACK, where |
Beta Was this translation helpful? Give feedback.
-
I was thinking especially about fixed-size diagonal matrices. But also the diagonals of subarrays you mention is a nice example. Further the 2d-Laplacian operator induces a tridiagonal matrix of tridiagonal matrices of which for example the Cholesky factor is a bidiagonal matrix of bidiagonal matrices. These these structured matrices of structured matrices are a recurring pattern. As example take the discrete Poisson equation, which could be directly solved by the abstract fallback version of the tridiagonal matrix solution algorithm. |
Beta Was this translation helpful? Give feedback.
-
It's really a shame that we didn't get this for 1.0. Is there any way we could still do this without breaking everyone's code? |
Beta Was this translation helpful? Give feedback.
-
The goal of this proposal to replace the existing zoo of non-sparse matrix types (i.e. not
AbstractSparse
) with the following 9 types:Matrix
(Conventional storage)Banded{n,m,order}
(Band storage: am x n
banded matrix with bandwidth(kl, ku)
is stored in a(kl+ku+1) x n
matrix iforder == :row
and an x (kl+lu+1)
matrix iforder == :col
)Packed
(Packed storage)RFP
(rectangular full packed; see LAWNS 199)uplo
label referencing whether the upper or lower part of the matrix is stored and sometimes anisunit
label referring to whether the diagonal is all ones or not.Hermitian{T,AbstractMatrix{T},UpLo}
Hessenberg{T,AbstractMatrix{T},UpLo}
Symmetric{T,AbstractMatrix{T},UpLo}
Triangular{T,AbstractMatrix{T},UpLo,IsUnit}
Trapezoid{T,AbstractMatrix{T},UpLo,IsUnit}
(essentially a generalization ofTriangular
to rectangular matrices)which I argue will be sufficient to wrap the vast majority of LAPACK functionality going forward.
My thoughts about this proposal:
This proposal reduces the number of existing structured matrix types from 9 to appropriate compositions of 3 symmetry label types (
Hermitian
,Symmetric
,Triangular
) and 2 storage types (Banded
,RFP
).Bidiagonal{T}
Banded{1,0,:col} or {0,1,:col}
Diagonal{T}
Banded{0,0,:col/:row}
Hermitian{T,S<:AbstractMatrix{T},UpLo}
SymTridiagonal{T}
Symmetric{Banded{1,0,:col}}
SymmetricRFP{T}
Symmetric{RFP{T}}
Symmetric{T,S<:AbstractMatrix{T}}
TriangularRFP{T}
Triangular{RFP{T}}
Triangular{T,S<:AbstractMatrix{T},UpLo,IsUnit}
Tridiagonal{T}
Banded{1,1,:col}
Redefining
Bidiagonal
,Tridiagonal
andSymTridiagonal
in terms ofBanded
provides a consistent interface for storing and retrieving super-/sub-/diagonals.The
Banded
matrix type is closed under the ring of matrix addition and multiplication, which is not true ofUnion(Diagonal,Bidiagonal,Tridiagonal,SymTridiagonal)
. A cute side-effect of the algebraic closure is that the type parametersm
andn
specifying the number of sup-/sub-diagonals obey a (max-plus) tropical semiring algebra which mirrors the underlying matrix algebra:You can exploit part of the algebra of
m,n
by defining+
using diagonal dispatch and a new promotion rule:and similarly for
-
.The full proposal allows representations of all 29 LAPACK matrix types (as of v3.5.0) by appropriate compositions of the aforementioned 9 types, and tuples where necessary to represent generalized problems:
BB
BD
Bidiagonal{:L or :U}
Bidiagonal{:uplo}
is atypealias
forBanded{1,0,:col} or {0,1,:col}
DI
Diagonal
Diagonal
is atypealias
forBanded{0,0,:col)
(theorder
parameter is mostly irrelevant)GB
Banded{n,m}
GE
(,OR
,UN
)Matrix
GG
(Matrix, Matrix)
GT
Tridiagonal
Tridiagonal
is atypealias
forBanded{1,1,:col}
HB
(,PB
)Hermitian{Banded{n,m}}
HE
(,PO
,PS
)Hermitian{T, Matrix{T}}
HG
(Hessenberg, Triangular)
HP
Hermitian{Packed}
HS
Hessenberg{Matrix}
SB
(,PB
)Symmetric{Banded{n,0}}
SF
(,PF
)SymmetricRFP
Symmetric{RFP}
SP
Symmetric{Packed}
ST
,PT
SymTridiagonal
Symmetric{Banded{1,0,:col}}
SY
,PO
Symmetric{T, Matrix{T}}
TB
Triangular{Banded{n,0} or {0,n}}
TF
TriangularRFP
Triangular{RFP}
TG
(Triangular, Triangular)
TP
(,PP
,OP
,UP
)Triangular{Packed}
TR
Triangular{T, Matrix{T}}
TZ
Trapezoid{Matrix}
It is necessary to have two different representations of banded matrices since two memory layouts are required. The LAPACK storage format for band storage stores a
m x n
banded matrix with bandwidth(kl, ku)
as(kl+ku+1) x n
matrix; however, LAPACK's routines for tridiagonal/bidiagonal matrices expect to be passed the super-/sub-/diagonals as separate vectors. We cannot store these in the correct memory order using justBanded{m,n,:row}
, but it is possible to do so withBanded{m,n,:col}
.Banded{m,n,:row}
andBanded{m,n,:col)
as two separate types (sayBanded
andBandedCol
, but if we do that then there is no unique representation ofDiagonal
; bothBanded{0,0}
andBandedCol{0,0}
would have equivalent memory mappings. One possible solution is to make theDiagonal
constructor consistently create one of them. For functions computing onDiagonal
matrices, it would be possible to makeDiagonal
atypealias
ofUnion(Banded{0,0},BandedCol{0,0})
. However, theDiagonal
constructor still needs to preferentially select one or the other. For this reason it seems likeBanded{m,n,:row/:col}
is a better choice.Symmetric
,Hermitian
,Hessenberg
andTriangular
labels modify the storage requirements; composing these types with aBanded(m,n)
matrix makes eitherm
orn
semantically meaningless and used only to describe the storage of irrelevant elements.I've specifically ignored the need to annotate matrices with spectral properties like positive semi-/definiteness. There is no need to introduce a special matrix type for the purposes of dispatching onto LAPACK routines of the
P?
type, since they can all be dispatched fromCholesky
factorization objects.PosDef
orPosSemiDef
type along the lines ofHermitian
, for example, but then the composition of the two types of labels likeSymmetric{PosDef{M}}
orPosDef{Symmetric{M}}
produces new types that are distinguishable by their order of composition, even though the order of labeling ought not to make a difference.The
OP
,UP
,OR
andUN
matrix types are special matrix types resulting from QR factorizations and are never used in any other context. Dispatching onto their appropriate LAPACK routines can be handled entirely from within factorization objects likeQR*
andHessenberg
and there should be no need to expose them as new matrix types. Similarly,BB
is only ever used as part of the representation of the result of the cosine-sine decomposition, and can be dispatched on entirely through a hypothetical newCS
factorization object.This proposal turns matrix types likeThe type parameters give a minimum bound on the size of the matrix but are insufficient to determine the matrix dimensions exactly.Hessenberg
andTridiagonal
into immutable fixed size matrices, since constructing them as special cases ofBanded{m,n}
means that the type parametersm
andn
are completely sufficient to determine the size of the matrix, which means that the matrices must have known dimensions at the time of construction. This could be a major breaking change, although I don't have a good feel for how extensively these special matrix types are used in the wild.Ideally,
UpLo
andorder
should be enums with just two allowed values each ((:U
,L
) and (:row
,:col
)) but it seems much clearer to represent them with symbols instead ofBool
s since we don't have enums (yet).I haven't worked out the details yet, but I suspect that this reorganization of the matrix types will lead to correspondingly cleaner matrix
Factorization
objects and type hierarchy.The proposal here is the result of extensive discussion with @simonbyrne and @andreasnoack.
Beta Was this translation helpful? Give feedback.
All reactions