TensorFactors

Documentation for TensorFactors.

TensorFactors.colnormalize! — Method
colnormalize!(A::AbstractMatrix{T}) where {T <: Number}

Normalizes each column of matrix A in place to have unit Euclidean norm.

For each column r, this method computes its Euclidean norm. If the norm is positive, the column is rescaled in place so that its 2-norm becomes one(T).

This overload is useful when only column normalization is needed and no separate weight vector is maintained.

Arguments

  • A: Input matrix whose columns are normalized in place.

Returns

  • A: The normalized matrix A.
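
A minimal usage sketch of the weight-free overload, assuming only the signature above; the matrix values are illustrative:

using TensorFactors, LinearAlgebra

A = [3.0 0.0; 4.0 2.0]   # column 2-norms are 5 and 2
colnormalize!(A)
norm(A[:, 1]) ≈ 1.0      # true: each nonzero column now has unit 2-norm
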
TensorFactors.colnormalize! — Method
colnormalize!(
    A::AbstractMatrix{T},
    lambda::AbstractVector{T},
) where {T <: Number}

Normalizes each column of matrix A in place and absorbs the corresponding column norms into the weight vector lambda.

For each column r, this method computes its Euclidean norm. If the norm is positive, the column is rescaled to have unit norm, and lambda[r] is multiplied by the original column norm.

This routine is commonly used in tensor factorization algorithms to separate per-column scaling factors from factor matrices while keeping the product represented by A and lambda unchanged.

Arguments

  • A: Input matrix whose columns are normalized in place.
  • lambda: Vector of column weights. Its length must equal size(A, 2).

Returns

  • A, lambda: The normalized matrix A and the updated weight vector lambda.
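
A companion sketch for the weighted overload, again with illustrative values; lambda picks up the original column norms:

using TensorFactors, LinearAlgebra

A = [3.0 0.0; 4.0 2.0]
lambda = ones(2)
colnormalize!(A, lambda)
# lambda ≈ [5.0, 2.0]; A * Diagonal(lambda) reproduces the original columns
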
TensorFactors.cp_als — Method
cp_als(
    X::AbstractArray{T, N},
    cp_rank::Int;
    max_iter::Int=10000,
    dloss_rtol::Float64=1e-6,
    loss_rtol::Float64=1e-8,
    show_trace::Bool=false,
    show_every::Int=100,
) where {T <: Real, N}

Computes a rank-cp_rank CANDECOMP/PARAFAC (CP) decomposition of an arbitrary-order tensor X using alternating least squares (ALS).

This method iteratively updates each factor matrix by solving the normal equations associated with one mode while keeping all other factors fixed. MTTKRP terms are computed without explicitly forming matricized tensors or Khatri-Rao products, and the Gram matrices of the factors are reused across iterations to reduce computational cost. The relative reconstruction loss is monitored throughout the optimization, and the iteration stops when either the loss becomes sufficiently small or the change in loss falls below the specified tolerance.

Arguments

  • X: Input tensor of order N, with N >= 2.
  • cp_rank: Target CP rank.

Keyword Arguments

  • max_iter: Maximum number of ALS iterations.
  • dloss_rtol: Tolerance on the change in loss between successive iterations. Iteration stops when abs(last_loss - loss) < dloss_rtol.
  • loss_rtol: Relative tolerance on the loss itself. Iteration stops when loss < loss_rtol.
  • show_trace: If true, prints iteration progress and current loss.
  • show_every: Frequency, in iterations, at which progress information is printed when show_trace=true.

Returns

  • factors: A tuple of N factor matrices, where factors[n] has size (size(X, n), cp_rank).
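
A minimal usage sketch for an order-4 tensor, assuming the signature above; the shape and rank are illustrative:

using TensorFactors

X = randn(4, 5, 6, 3)    # order-4 tensor
factors = cp_als(X, 2)   # rank-2 CP fit with default tolerances
size(factors[1])         # (4, 2): one factor matrix per mode
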
TensorFactors.cp_als — Method
cp_als(
    X::AbstractArray{T, 3},
    cp_rank::Int;
    max_iter::Int=10000,
    dloss_rtol::Float64=1e-8,
    loss_rtol::Float64=1e-8,
    show_trace::Bool=false,
    show_every::Int=100,
) where {T <: Real}

Computes a rank-cp_rank CANDECOMP/PARAFAC (CP) decomposition of a 3rd-order tensor X using alternating least squares (ALS).

This is a specialized implementation for 3-way tensors. It iteratively updates the factor matrices A, B, and C while holding the others fixed. Each update is formed from the corresponding MTTKRP contraction and solved using the Hadamard product of the Gram matrices of the other two factors. Factor columns are normalized during the iteration, and the accumulated scaling is stored in lambda to improve numerical stability.

The optimization monitors the relative reconstruction error

||X - X̂||_F / ||X||_F

and terminates when either the error itself becomes sufficiently small or the change in error between successive iterations falls below the specified tolerance.

Arguments

  • X: Input 3rd-order tensor of size (I, J, K).
  • cp_rank: Target CP rank.

Keyword Arguments

  • max_iter: Maximum number of ALS iterations.
  • dloss_rtol: Absolute tolerance on the change in relative reconstruction error between successive iterations. Iteration stops when abs(last_loss - loss) < dloss_rtol.
  • loss_rtol: Tolerance on the relative reconstruction error itself. Iteration stops when loss < loss_rtol.
  • show_trace: If true, prints iteration progress and current relative error.
  • show_every: Frequency, in iterations, at which progress information is printed when show_trace=true.

Returns

  • lambda: Length-cp_rank vector containing the absorbed column scalings.
  • A: Factor matrix of size (size(X, 1), cp_rank).
  • B: Factor matrix of size (size(X, 2), cp_rank).
  • C: Factor matrix of size (size(X, 3), cp_rank).

The reconstructed tensor is represented implicitly as the CP model with weights lambda and factors A, B, and C.

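A minimal usage sketch for the 3-way method, assuming the return values listed above; absorbing lambda into A is one way to materialize the reconstruction:

using TensorFactors, LinearAlgebra

X = randn(4, 5, 6)
lambda, A, B, C = cp_als(X, 2)
X̂ = cp_contract((A * Diagonal(lambda), B, C))   # explicit rank-2 reconstruction
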
TensorFactors.cp_contract — Function
cp_contract(factors::NTuple{N, <:AbstractMatrix{T}}) where {T <: Number, N}

Contract a CP (Canonical Polyadic) decomposition with N factor matrices into a full tensor.

This function reconstructs a full N-dimensional tensor from its factor matrices. To maximize performance and keep allocations under control, the implementation uses metaprogramming to generate specialized methods for N ranging from 2 to 10. Each method leverages @tullio for efficient, multi-threaded tensor contraction; unpacking the factors into concrete matrix variables lets the macro's symbolic analyzer optimize intermediate allocations.

Arguments

  • factors: An N-element tuple of matrices, where each matrix factors[n] represents the factor matrix for the n-th mode.

Returns

  • X: The reconstructed N-dimensional tensor.
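
A minimal usage sketch, assuming the signature above; sizes are illustrative:

using TensorFactors

A, B, C = randn(4, 2), randn(5, 2), randn(6, 2)
X = cp_contract((A, B, C))   # size (4, 5, 6); X[i, j, k] = Σ_r A[i, r] * B[j, r] * C[k, r]
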
TensorFactors.cp_factors_to_flat — Method
cp_factors_to_flat(
    cp_factors::NTuple{N, <:AbstractMatrix{T}},
) where {T <: Real, N}

Flattens a tuple of CP factor matrices into a single parameter vector.

This function packs the factor matrices in cp_factors into one contiguous vector by concatenating the entries of each matrix in column-major order. The factors are stored sequentially in the same order as they appear in the input tuple, making this function the inverse of flat_to_cp_factors when the same cp_rank and row sizes are used.

This is useful when CP factors need to be represented in flattened form, for example for gradient-based optimization, parameter serialization, or interoperability with generic vector-based numerical routines.

Arguments

  • cp_factors: Tuple of N factor matrices, where all factors have the same number of columns equal to the CP rank.

Returns

  • p: A flat vector containing all factor entries in column-major order.
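
A minimal round-trip sketch, assuming the signatures of cp_factors_to_flat and flat_to_cp_factors documented on this page:

using TensorFactors

factors = (randn(4, 2), randn(5, 2), randn(6, 2))
p = cp_factors_to_flat(factors)
length(p) == 2 * (4 + 5 + 6)                    # true
A, B, C = flat_to_cp_factors(p, 2, (4, 5, 6))   # recovers the same factor values
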
TensorFactors.cp_loss — Method
cp_loss(
    Gts::NTuple{N, <:AbstractMatrix{T}},
    A_n::AbstractMatrix{T},
    mttkrp_n::AbstractMatrix{T},
    X_norm2::U,
) where {N, T <: Number, U <: Real}

Computes the squared CP reconstruction loss from precomputed Gram matrices and an MTTKRP term for one mode.

This method is intended for use inside ALS iterations, where the Gram matrices of the factor matrices and the mode-n MTTKRP term have already been computed. Rather than re-evaluating the full loss from the factors and tensor directly, it uses the identity

||X - X̂||_F^2 = ||X||_F^2 - 2⟨A_n, MTTKRP_n⟩ + ||X̂||_F^2

where ||X̂||_F^2 is obtained from the Hadamard product of the Gram matrices. This provides a fast way to monitor convergence with minimal additional cost.

Arguments

  • Gts: Tuple of N Gram matrices, where Gts[k] = factors[k]' * factors[k].
  • A_n: Factor matrix for the mode used in the MTTKRP evaluation.
  • mttkrp_n: MTTKRP term corresponding to the same mode as A_n.
  • X_norm2: Squared Frobenius norm of the input tensor, i.e. sum(abs2, X).

Returns

  • The squared Frobenius loss ||X - X̂||_F^2.
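
A minimal usage sketch, assuming the cp_loss signature above and the mttkrp! signature documented later on this page; data and rank are illustrative:

using TensorFactors

X = randn(4, 5, 6)
factors = ntuple(n -> randn(size(X, n), 2), 3)
Gts = map(F -> F' * F, factors)    # Gram matrices of the factors
M = zeros(size(X, 3), 2)
mttkrp!(M, X, factors, Val(3))     # mode-3 MTTKRP
loss = cp_loss(Gts, factors[3], M, sum(abs2, X))
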
TensorFactors.cp_loss — Method
cp_loss(
    p::AbstractVector{T},
    X::AbstractArray{U, 3},
    cp_rank::Int,
    norm2_X::V,
    GtA, GtB, GtC, H
)::T where {T <: Real, U <: Real, V <: Real}

Computes the CP reconstruction loss for a 3rd-order tensor X from a flattened parameter vector p, using preallocated workspace buffers.

This function interprets p as the concatenation of the three CP factor matrices A, B, and C, each with cp_rank columns, via flat_to_cp_factors. It then evaluates the squared Frobenius reconstruction loss

||X - X̂||_F^2

for the CP model induced by these factors.

To reduce allocations, the function expects the squared Frobenius norm of X, norm2_X = ||X||_F^2, to be provided by the caller, together with workspace buffers for Gram matrices and their Hadamard product. Internally, it computes an MTTKRP term with respect to the third mode, forms the Gram matrices A' * A, B' * B, and C' * C, and combines them to evaluate the loss efficiently.

This representation is useful when CP factors are optimized in flattened form, for example in vector-based first-order or second-order optimization routines.

Arguments

  • p: Flat parameter vector encoding the factor matrices A, B, and C in column-major order.
  • X: Input 3rd-order tensor.
  • cp_rank: CP rank, i.e. the common number of columns of the factor matrices.
  • norm2_X: Precomputed squared Frobenius norm of X, i.e. sum(abs2, X).
  • GtA: Preallocated buffer for the Gram matrix A' * A.
  • GtB: Preallocated buffer for the Gram matrix B' * B.
  • GtC: Preallocated buffer for the Gram matrix C' * C.
  • H: Preallocated buffer for the Hadamard product GtA .* GtB .* GtC.

Returns

  • The squared Frobenius loss ||X - X̂||_F^2.

Notes

  • The workspace buffers must have compatible sizes, typically (cp_rank, cp_rank).
  • The return value is clamped below by zero(T) to avoid small negative values caused by floating-point roundoff.
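
A minimal usage sketch with caller-allocated buffers, assuming the signature above; sizes are illustrative:

using TensorFactors

X = randn(4, 5, 6)
R = 2
p = randn(R * sum(size(X)))   # flattened A, B, C
GtA, GtB, GtC, H = zeros(R, R), zeros(R, R), zeros(R, R), zeros(R, R)
loss = cp_loss(p, X, R, sum(abs2, X), GtA, GtB, GtC, H)
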
TensorFactors.cp_loss — Method
cp_loss(
    factors::NTuple{N, <:AbstractMatrix{T}},
    X::AbstractArray{T, N},
    norm2_X::T,
) where {N, T <: Number}

Computes the CP decomposition loss for an arbitrary-order tensor X from its factor matrices factors using pure BLAS operations.

This method expands the squared Frobenius norm analytically to avoid explicit tensor reconstruction and Khatri-Rao product formation. It sequentially contracts the tensor with the factor vectors using matrix-vector multiplications, allowing the loss to be evaluated efficiently with low memory overhead. This yields substantial speedups and is well suited to high-performance and GPU-compatible tensor factorization workflows.

Arguments

  • factors: Tuple of N factor matrices defining the CP decomposition, where factors[k] has size (size(X, k), R) and all factor matrices share the same column dimension R.
  • X: Input tensor of order N.
  • norm2_X: Precomputed squared Frobenius norm of X, i.e. sum(abs2, X).

Returns

  • The squared Frobenius loss ||X - X̂||_F^2, where X̂ is the CP reconstruction induced by factors.
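
A minimal usage sketch for the allocation-light, arbitrary-order form, assuming the signature above:

using TensorFactors

X = randn(4, 5, 6, 3)
factors = ntuple(n -> randn(size(X, n), 2), 4)
loss = cp_loss(factors, X, sum(abs2, X))   # ||X - X̂||_F^2 without forming X̂
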
TensorFactors.cp_loss_fg! — Method
cp_loss_fg!(
    G,
    p::AbstractVector{T},
    X::AbstractArray{U, 3},
    cp_rank::Int,
    norm2_X::V,
    GtA, GtB, GtC, H_A, H_B, H_C
) where {T <: Real, U <: Real, V <: Real}

Computes both the CP reconstruction loss and its gradient for a 3rd-order tensor X from a flattened parameter vector p, writing the gradient in-place to G using preallocated workspace buffers.

This function interprets p as the concatenation of the three CP factor matrices A, B, and C, each with cp_rank columns, via flat_to_cp_factors. It also interprets G as storage for the flattened factor gradients gA, gB, and gC.

Internally, the function first forms the Gram matrices A' * A, B' * B, and C' * C, then constructs the Hadamard products needed for the CP gradient. It computes MTTKRP terms for each factor, uses them both to evaluate the squared Frobenius reconstruction loss

||X - X̂||_F^2

and to overwrite G with the corresponding gradient in flattened form.

The squared Frobenius norm of X is supplied by the caller through norm2_X, and all matrix intermediates are written into caller-provided buffers to avoid unnecessary allocations.

This function is useful in optimization routines that require the objective value and gradient to be evaluated together from a single flattened parameter vector.

Arguments

  • G: Output gradient vector, overwritten in-place with the gradient of the loss with respect to p.
  • p: Flat parameter vector encoding the factor matrices A, B, and C in column-major order.
  • X: Input 3rd-order tensor.
  • cp_rank: CP rank, i.e. the common number of columns of the factor matrices.
  • norm2_X: Precomputed squared Frobenius norm of X, i.e. sum(abs2, X).
  • GtA: Preallocated buffer for the Gram matrix A' * A.
  • GtB: Preallocated buffer for the Gram matrix B' * B.
  • GtC: Preallocated buffer for the Gram matrix C' * C.
  • H_A: Preallocated buffer for the Hadamard product GtB .* GtC.
  • H_B: Preallocated buffer for the Hadamard product GtA .* GtC.
  • H_C: Preallocated buffer for the Hadamard product GtA .* GtB.

Returns

  • The squared Frobenius loss ||X - X̂||_F^2.
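
A minimal usage sketch, assuming the signature above; all workspace buffers are (cp_rank, cp_rank):

using TensorFactors

X = randn(4, 5, 6)
R = 2
p = randn(R * sum(size(X)))
G = similar(p)   # gradient output
GtA, GtB, GtC = zeros(R, R), zeros(R, R), zeros(R, R)
H_A, H_B, H_C = zeros(R, R), zeros(R, R), zeros(R, R)
loss = cp_loss_fg!(G, p, X, R, sum(abs2, X), GtA, GtB, GtC, H_A, H_B, H_C)
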
TensorFactors.cp_loss_grad! — Method
cp_loss_grad!(
    g::AbstractVector{T},
    p::AbstractVector{T},
    X::AbstractArray{U, 3},
    cp_rank::Int,
    GtA, GtB, GtC, H_A, H_B, H_C
) where {T <: Real, U <: Real}

Computes the gradient of the 3rd-order CP reconstruction loss with respect to a flattened parameter vector p, writing the result in-place to g, using preallocated workspace buffers.

This function interprets p as the flattened CP factor matrices A, B, and C, and interprets g as storage for the corresponding factor gradients gA, gB, and gC via flat_to_cp_factors. It evaluates the gradient of the squared Frobenius loss

||X - X̂||_F^2

from the standard CP gradient formula based on MTTKRP terms and Gram matrices of the factor matrices.

To avoid unnecessary allocations, all intermediate Gram matrices and Hadamard products are written into caller-provided buffers. The output vector g is overwritten in-place and no new flattened gradient vector is allocated.

This is useful in optimization workflows where CP factors are represented as a single parameter vector and gradients must be returned in the same flattened form.

Arguments

  • g: Output gradient vector, overwritten in-place with the gradient of the loss with respect to p.
  • p: Flat parameter vector encoding the factor matrices A, B, and C in column-major order.
  • X: Input 3rd-order tensor.
  • cp_rank: CP rank, i.e. the common number of columns of the factor matrices.
  • GtA: Preallocated buffer for the Gram matrix A' * A.
  • GtB: Preallocated buffer for the Gram matrix B' * B.
  • GtC: Preallocated buffer for the Gram matrix C' * C.
  • H_A: Preallocated buffer for the Hadamard product GtB .* GtC.
  • H_B: Preallocated buffer for the Hadamard product GtA .* GtC.
  • H_C: Preallocated buffer for the Hadamard product GtA .* GtB.

Returns

  • nothing.
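
A gradient-only sketch, assuming the signature above; note that norm2_X is not needed here:

using TensorFactors

X = randn(4, 5, 6)
R = 2
p = randn(R * sum(size(X)))
g = similar(p)
bufs = ntuple(_ -> zeros(R, R), 6)   # GtA, GtB, GtC, H_A, H_B, H_C
cp_loss_grad!(g, p, X, R, bufs...)   # g now holds the flattened gradient
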
TensorFactors.cp_opt — Method
cp_opt(
    method::Optim.AbstractOptimizer,
    X::AbstractArray{T, N},
    cp_rank::Int;
    max_iter::Int=typemax(Int),
    show_trace::Bool=false,
    show_every::Int=100,
    p0::Union{Nothing, AbstractVector{T}}=nothing,
) where {T <: Real, N}

Fits a rank-cp_rank CP decomposition to a tensor X by minimizing the CP reconstruction loss with an optimizer from Optim.jl.

This function represents the CP factor matrices as a single flattened parameter vector and solves the resulting unconstrained optimization problem using the optimizer specified by method. If no initial parameter vector is provided, one is initialized randomly from a standard normal distribution. After optimization, the minimizer is reshaped into a tuple of CP factor matrices using flat_to_cp_factors.

The objective minimized is the squared Frobenius reconstruction loss ||X - X̂||_F^2, where X̂ is the rank-cp_rank CP reconstruction induced by the optimized factor matrices.

Arguments

  • method: Optimizer from Optim.jl, such as LBFGS() or ConjugateGradient().
  • X: Input tensor.
  • cp_rank: Target CP rank.

Keyword Arguments

  • max_iter: Maximum number of optimization iterations.
  • show_trace: If true, prints optimization progress information.
  • show_every: Frequency, in iterations, at which progress information is printed when show_trace=true.
  • p0: Optional initial flattened parameter vector. If nothing, a random initialization of length cp_rank * sum(size(X)) is used.

Returns

  • cp_factors: A tuple of factor matrices defining the fitted CP decomposition, where cp_factors[n] has size (size(X, n), cp_rank).
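
A minimal usage sketch, assuming the signature above and an optimizer from Optim.jl:

using TensorFactors, Optim

X = randn(4, 5, 6)
cp_factors = cp_opt(LBFGS(), X, 2)
size.(cp_factors)   # ((4, 2), (5, 2), (6, 2))
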
TensorFactors.flat_to_cp_factors — Method
flat_to_cp_factors(
    p::AbstractVector{T},
    cp_rank::Int,
    row_sizes::NTuple{N, Int},
) where {T <: Real, N}

Reshapes a flat parameter vector p into a tuple of CP factor matrices.

This function interprets p as the concatenation of N factor matrices stored in column-major order, where the n-th factor has size (row_sizes[n], cp_rank). It returns a tuple whose entries are views into the original vector reshaped as matrices, so no data is copied during the transformation.

This is useful when CP factors are stored or optimized in flattened form, for example in gradient-based optimization routines or parameter packing/unpacking utilities.

Arguments

  • p: Flat parameter vector containing all factor entries.
  • cp_rank: Common column dimension of each factor matrix, i.e. the CP rank.
  • row_sizes: Tuple specifying the number of rows in each factor matrix.

Returns

  • cp_factors: A tuple of N factor matrices, where cp_factors[n] has size (row_sizes[n], cp_rank).
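
A minimal usage sketch, assuming the signature above; since the factors are views into p, mutating one mutates the other:

using TensorFactors

p = randn(2 * (4 + 5 + 6))
A, B, C = flat_to_cp_factors(p, 2, (4, 5, 6))
A[1, 1] = 0.0   # also changes the corresponding entry of p
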
TensorFactors.mttkrp! — Method
mttkrp!(
    M::AbstractMatrix{T},
    X::AbstractArray{T, N},
    factors::NTuple{N, <:AbstractMatrix{T}},
    ::Val{n}
) where {T <: Number, N, n}

Computes the Matricized Tensor Times Khatri-Rao Product (MTTKRP) of an arbitrary-order tensor X along mode n, storing the result in the preallocated matrix M.

This implementation avoids explicitly forming either the matricized tensor or the Khatri-Rao product. Instead, it evaluates each rank-r column independently through a sequence of tensor contractions using BLAS-backed matrix-vector multiplications. The computation proceeds by contracting modes N, N-1, ..., n+1 first, then 1, 2, ..., n-1, while keeping mode n uncontracted. This reduces intermediate memory usage, enables efficient in-place execution, and is suitable for high-performance CPU and GPU-compatible tensor factorization workflows.

Arguments

  • M: Preallocated output matrix of size (size(X, n), R), where R is the column dimension of the factor matrices.
  • X: Input tensor of order N.
  • factors: Tuple of N factor matrices, where factors[k] has size (size(X, k), R).
  • Val(n): Compile-time mode index specifying the MTTKRP mode.

Returns

  • M: The updated output matrix containing the mode-n MTTKRP result.
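
A minimal usage sketch, assuming the signature above; the output buffer must be preallocated:

using TensorFactors

X = randn(4, 5, 6)
factors = ntuple(n -> randn(size(X, n), 3), 3)
M = zeros(size(X, 1), 3)
mttkrp!(M, X, factors, Val(1))   # mode-1 MTTKRP written into M
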
TensorFactors.ttm — Method
ttm(
    A::AbstractArray{T, N},
    M::AbstractMatrix,
    mode::Int
) where {T, N}

Computes the n-mode product (Tensor-Times-Matrix) of a tensor A with a matrix M.

This operation projects the mode-th dimension of the tensor A onto the row space of the matrix M. Internally, the function unfolds the tensor along mode, performs a standard matrix multiplication M * A_{(mode)}, reshapes the result, and restores the original order of the dimensions.

Arguments

  • A: The N-dimensional input tensor.
  • M: The matrix to multiply with the tensor. The number of columns in M must exactly match the length of A's mode-th dimension (size(M, 2) == size(A, mode)).
  • mode: The integer specifying the dimension of the tensor to be multiplied.

Returns

  • res: A new N-dimensional tensor. Its size is identical to A, except for the mode-th dimension, which is updated to the number of rows in M (size(M, 1)).
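
A minimal usage sketch, assuming the signature above:

using TensorFactors

A = randn(4, 5, 6)
M = randn(7, 5)   # size(M, 2) must equal size(A, 2)
B = ttm(A, M, 2)
size(B)           # (4, 7, 6): mode 2 mapped from length 5 to 7
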
TensorFactors.tucker_contract — Function
tucker_contract(core::AbstractArray{T, N}, factors::NTuple{N, <:AbstractMatrix{T}}) where {T <: Number, N}

Contract a Tucker decomposition, consisting of a core tensor and N factor matrices, into a full tensor.

This function reconstructs a full N-dimensional tensor as the multilinear product of a core tensor G with N factor matrices. To maximize performance and keep allocations under control, the implementation uses metaprogramming to generate specialized methods for N ranging from 3 to 10. Each method leverages @tullio for efficient, multi-threaded tensor contraction; because metaprogramming unpacks the factor matrices into distinct variables, the @tullio symbolic analyzer can generate highly optimized kernel loops without the overhead of indexing into a collection.

Arguments

  • core: An N-dimensional core tensor of size (r1, r2, ..., rN).
  • factors: An N-element tuple of matrices, where each matrix factors[n] of size (In, rn) represents the factor matrix for the n-th mode.

Returns

  • X: The reconstructed N-dimensional tensor of size (I1, I2, ..., IN).
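
A minimal usage sketch, assuming the signature above; core and factor sizes are illustrative:

using TensorFactors

core = randn(2, 3, 2)
factors = (randn(10, 2), randn(11, 3), randn(12, 2))
X = tucker_contract(core, factors)   # size (10, 11, 12)
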
TensorFactors.tucker_hosvd — Method
tucker_hosvd(
    A::AbstractArray{T,N},
    ranks::NTuple{N,Int}
) where {T <: Number,N}

Computes the Truncated Higher-Order Singular Value Decomposition (HOSVD) of tensor A.

This function decomposes an N-dimensional tensor into a compressed core tensor and a set of orthogonal factor matrices, one per mode. To maximize performance and minimize memory overhead, it avoids computing full SVDs of the very wide unfolded matrices. Instead, it forms the symmetric Gram matrix A_{(n)} * A_{(n)}' using fast BLAS SYRK routines, followed by an eigenvalue decomposition. Memory allocations are further reduced by pooling matrix buffers based on the tensor's unique dimensions.

Arguments

  • A: The N-dimensional input tensor to be decomposed.
  • ranks: An N-element tuple of integers specifying the target truncation rank for each respective mode.

Returns

  • S: The truncated N-dimensional core tensor of size ranks.
  • factors: A tuple of N orthogonal factor matrices. The n-th matrix, factors[n], has a size of (size(A, n), ranks[n]).
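
A minimal usage sketch, assuming the binding tucker_hosvd from the heading above:

using TensorFactors

A = randn(10, 11, 12)
S, factors = tucker_hosvd(A, (2, 3, 2))
Â = tucker_contract(S, factors)   # rank-(2, 3, 2) approximation of A
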
TensorFactors.unfold_mode — Method
unfold_mode(
    A::AbstractArray{T, N},
    mode::Int
) where {T, N}

Unfolds (matricizes) an N-dimensional tensor A into a matrix along the specified mode.

This function permutes the dimensions of the tensor so that the target mode becomes the first dimension, while preserving the relative order of the remaining dimensions. It then reshapes the result into a 2D matrix. The returned matrix is explicitly materialized into a contiguous array (rather than a lazy view) to guarantee optimal BLAS performance in downstream matrix multiplications.

Arguments

  • A: The N-dimensional input tensor to be unfolded.
  • mode: The integer specifying the dimension along which to unfold the tensor.

Returns

  • unfolded_matrix: A 2D matrix of size (size(A, mode), J), where J is the product of all other dimensions of the tensor.
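
A minimal usage sketch, assuming the signature above:

using TensorFactors

A = randn(4, 5, 6)
U = unfold_mode(A, 2)
size(U)   # (5, 24): mode 2 becomes the rows, the remaining modes are flattened into columns
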