DOLFINx 0.11.0.0
DOLFINx C++
dolfinx::la::impl Namespace Reference

Fetch the rows of B that correspond to the ghost columns of A.

Classes

struct  Sparsity
 Lightweight sparsity descriptor satisfying the SparsityImplementation concept required by the MatrixCSR constructor.

Functions

template<typename T>
std::tuple< std::shared_ptr< common::IndexMap >, std::vector< std::int32_t >, std::vector< std::int32_t >, std::vector< T > > fetch_ghost_rows (const dolfinx::la::MatrixCSR< T > &A, const dolfinx::la::MatrixCSR< T > &B)
 Fetch the rows of Matrix B which are referenced by the ghost columns of Matrix A.
template<typename T>
std::tuple< std::vector< std::int64_t >, std::vector< std::int32_t >, std::vector< std::int32_t >, std::vector< T > > matmul (const dolfinx::la::MatrixCSR< T > &A, const dolfinx::la::MatrixCSR< T > &B, std::shared_ptr< const common::IndexMap > new_col_map, std::span< const std::int32_t > ghost_row_ptr, std::span< const std::int32_t > ghost_cols, std::span< const T > ghost_vals)
 Compute the sparsity pattern and values of C = A*B in a single pass.
template<int BS0, int BS1, typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_csr (U &&data, const V &cols, const W &row_ptr, const X &x, const Y &xrows, const Y &xcols, OP op, typename Y::value_type num_rows)
 Incorporate data into a CSR matrix.
template<int BS0, int BS1, typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_blocked_csr (U &&data, const V &cols, const W &row_ptr, const X &x, const Y &xrows, const Y &xcols, OP op, typename Y::value_type num_rows)
 Incorporate blocked data with given block sizes into a non-blocked MatrixCSR.
template<typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_nonblocked_csr (U &&data, const V &cols, const W &row_ptr, const X &x, const Y &xrows, const Y &xcols, OP op, typename Y::value_type num_rows, int bs0, int bs1)
 Incorporate non-blocked data into a blocked matrix (data block size=1).
template<typename T, int BS1>
void spmv (std::span< const T > values, std::span< const std::int64_t > row_begin, std::span< const std::int64_t > row_end, std::span< const std::int32_t > indices, std::span< const T > x, std::span< T > y, int bs0, int bs1)
 Sparse matrix-vector product implementation.
template<typename T, int BS1>
void spmvT (std::span< const T > values, std::span< const std::int64_t > row_begin, std::span< const std::int64_t > row_end, std::span< const std::int32_t > indices, std::span< const T > x, std::span< T > y, int bs0, int bs1)
 Sparse matrix-vector transpose product implementation.
template<typename T, int BS0 = -1, int BS1 = -1>
std::tuple< std::vector< std::int32_t >, std::vector< std::int64_t >, std::vector< T > > local_transpose (const dolfinx::la::MatrixCSR< T > &A)
 Compute the local (diagonal-block) transpose of A.

Detailed Description

Fetch the rows of B that correspond to the ghost columns of A.

For computing the product A*B, each rank needs the rows of B whose global indices match the ghost columns of A. Those ghost columns are owned by remote ranks, so we request those rows via neighbourhood communication.

Parameters
A: Matrix whose ghost column indices determine which rows of B are needed.
B: Matrix whose rows are fetched.
Returns
The fetched ghost rows of B in CSR form, together with an extended column IndexMap covering any new ghost columns introduced by those rows.
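As a rough illustration of which rows are requested, the sketch below collects the global indices of the ghost columns that a rank's local part of A references. It assumes that local column indices at or above the owned count are ghosts and that their global indices are listed in a `ghosts` array. The name `ghost_rows_needed` and its parameters are hypothetical and not part of DOLFINx, and no communication is performed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a DOLFINx function): list the global row indices
// of B that this rank would have to request from other ranks.
// Assumption: local column c >= size_local is a ghost whose global index is
// ghosts[c - size_local].
std::vector<std::int64_t>
ghost_rows_needed(const std::vector<std::int32_t>& cols,
                  std::int32_t size_local,
                  const std::vector<std::int64_t>& ghosts)
{
  std::vector<std::int64_t> rows;
  for (std::int32_t c : cols)
  {
    if (c >= size_local)
    {
      assert(static_cast<std::size_t>(c - size_local) < ghosts.size());
      rows.push_back(ghosts[c - size_local]);
    }
  }

  // Each remote row is requested once, however many entries reference it
  std::sort(rows.begin(), rows.end());
  rows.erase(std::unique(rows.begin(), rows.end()), rows.end());
  return rows;
}
```

The sorted, duplicate-free list is the kind of request a rank could then send to the owning ranks.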

Function Documentation

◆ fetch_ghost_rows()

template<typename T>
std::tuple< std::shared_ptr< common::IndexMap >, std::vector< std::int32_t >, std::vector< std::int32_t >, std::vector< T > > fetch_ghost_rows ( const dolfinx::la::MatrixCSR< T > & A,
const dolfinx::la::MatrixCSR< T > & B )

Fetch the rows of Matrix B which are referenced by the ghost columns of Matrix A.

Parameters
A: MatrixCSR
B: MatrixCSR
Returns
Tuple (new index map, row_ptr, cols, values) describing the received rows.

◆ insert_blocked_csr()

template<int BS0, int BS1, typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_blocked_csr ( U && data,
const V & cols,
const W & row_ptr,
const X & x,
const Y & xrows,
const Y & xcols,
OP op,
typename Y::value_type num_rows )

Incorporate blocked data with given block sizes into a non-blocked MatrixCSR.

Note
The matrix block size is 1 (bs=1). The matrix sparsity must be correct to accept the data.
See ::insert_csr for data layout.
Template Parameters
BS0: Row block size of data.
BS1: Column block size of data.
OP: The operation (usually "set" or "add").
Parameters
[out] data: CSR matrix data.
[in] cols: CSR column indices.
[in] row_ptr: Pointer to the ith row in the CSR data.
[in] x: The m by n dense block of values (row-major) to add to the matrix.
[in] xrows: Row indices of x.
[in] xcols: Column indices of x.
[in] op: The operation (set or add).
[in] num_rows: Maximum row index that can be set. Used when debugging to check that rows beyond a permitted range are not being set.
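A minimal sketch of the blocked-into-scalar case follows. It assumes that block row index `xrows[r]` expands to scalar rows `xrows[r] * BS0 + i`, that block column `xcols[c]` starts at scalar column `xcols[c] * BS1`, and that the BS1 scalar columns of a block are stored contiguously within each sorted row. The name `insert_blocked_sketch` is hypothetical and the `num_rows` check is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical sketch: insert BS0 x BS1 blocks into a CSR matrix whose own
// block size is 1. x is row-major with xrows.size() * BS0 rows and
// xcols.size() * BS1 columns.
template <int BS0, int BS1, typename T, typename OP>
void insert_blocked_sketch(std::vector<T>& data,
                           const std::vector<std::int32_t>& cols,
                           const std::vector<std::int64_t>& row_ptr,
                           const std::vector<T>& x,
                           const std::vector<std::int32_t>& xrows,
                           const std::vector<std::int32_t>& xcols, OP op)
{
  const std::size_t nc = xcols.size();
  for (std::size_t r = 0; r < xrows.size(); ++r)
  {
    for (int i = 0; i < BS0; ++i)
    {
      // Scalar row of the matrix addressed by block row xrows[r], offset i
      const std::int32_t row = xrows[r] * BS0 + i;
      auto first = cols.begin() + row_ptr[row];
      auto last = cols.begin() + row_ptr[row + 1];
      for (std::size_t c = 0; c < nc; ++c)
      {
        // First scalar column of block column xcols[c]
        const std::int32_t col0 = xcols[c] * BS1;
        auto it = std::lower_bound(first, last, col0);
        assert(it != last && *it == col0);
        assert(std::distance(it, last) >= BS1);
        const std::size_t d = std::distance(cols.begin(), it);
        for (int j = 0; j < BS1; ++j)
          op(data[d + j], x[((r * BS0 + i) * nc + c) * BS1 + j]);
      }
    }
  }
}
```

Passing a lambda such as `[](double& a, double b) { a += b; }` as `op` gives the "add" behaviour, and `a = b` gives "set".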

◆ insert_csr()

template<int BS0, int BS1, typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_csr ( U && data,
const V & cols,
const W & row_ptr,
const X & x,
const Y & xrows,
const Y & xcols,
OP op,
typename Y::value_type num_rows )

Incorporate data into a CSR matrix.

Template Parameters
BS0: Row block size (of both matrix and data).
BS1: Column block size (of both matrix and data).
OP: The operation (usually "set" or "add").
Parameters
[out] data: CSR matrix data.
[in] cols: CSR column indices.
[in] row_ptr: Pointer to the ith row in the CSR data.
[in] x: The m by n dense block of values (row-major) to add to the matrix.
[in] xrows: Row indices of x.
[in] xcols: Column indices of x.
[in] op: The operation (set or add).
[in] num_rows: Maximum row index that can be set. Used when debugging to check that rows beyond a permitted range are not being set.
Note
In the case of block data, where BS0 or BS1 are greater than one, the layout of the input data is still the same. For example, the following can be inserted into the top-left corner with xrows={0,1} and xcols={0,1}, BS0=2, BS1=2 and x={0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}.

0  1  | 2  3
4  5  | 6  7
8  9  | 10 11
12 13 | 14 15
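The layout in this note can be made concrete with a small sketch. It assumes each stored block is kept row-major in `data` (block d occupies `data[d * BS0 * BS1 ...]`) and that column indices within a row are sorted, so a binary search locates the target block. The name `insert_csr_sketch` is hypothetical and the `num_rows` debug check is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical sketch: insert a dense row-major array x, viewed as
// xrows.size() x xcols.size() blocks of size BS0 x BS1, into a blocked CSR
// matrix with the same block size.
template <int BS0, int BS1, typename T, typename OP>
void insert_csr_sketch(std::vector<T>& data,
                       const std::vector<std::int32_t>& cols,
                       const std::vector<std::int64_t>& row_ptr,
                       const std::vector<T>& x,
                       const std::vector<std::int32_t>& xrows,
                       const std::vector<std::int32_t>& xcols, OP op)
{
  const std::size_t nc = xcols.size();
  assert(x.size() == xrows.size() * nc * BS0 * BS1);
  for (std::size_t r = 0; r < xrows.size(); ++r)
  {
    auto first = cols.begin() + row_ptr[xrows[r]];
    auto last = cols.begin() + row_ptr[xrows[r] + 1];
    for (std::size_t c = 0; c < nc; ++c)
    {
      // Locate block column xcols[c] in the sorted block row
      auto it = std::lower_bound(first, last, xcols[c]);
      assert(it != last && *it == xcols[c]);
      const std::size_t d = std::distance(cols.begin(), it);

      // x is one dense row-major array, so row (r * BS0 + i) of x has
      // nc * BS1 entries and block column c starts at offset c * BS1
      for (int i = 0; i < BS0; ++i)
        for (int j = 0; j < BS1; ++j)
          op(data[(d * BS0 + i) * BS1 + j],
             x[((r * BS0 + i) * nc + c) * BS1 + j]);
    }
  }
}
```

With the 16 values from the note and a "set" operation, the top-left stored block becomes {0, 1, 4, 5} and the block to its right becomes {2, 3, 6, 7}.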

◆ insert_nonblocked_csr()

template<typename OP, typename U, typename V, typename W, typename X, typename Y>
void insert_nonblocked_csr ( U && data,
const V & cols,
const W & row_ptr,
const X & x,
const Y & xrows,
const Y & xcols,
OP op,
typename Y::value_type num_rows,
int bs0,
int bs1 )

Incorporate non-blocked data into a blocked matrix (data block size=1).

Note
Matrix sparsity must be correct to accept the data.
See ::insert_csr for data layout.
Parameters
[out] data: CSR matrix data.
[in] cols: CSR column indices.
[in] row_ptr: Pointer to the ith row in the CSR data.
[in] x: The m by n dense block of values (row-major) to add to the matrix.
[in] xrows: Row indices of x.
[in] xcols: Column indices of x.
[in] op: The operation (set or add).
[in] num_rows: Maximum row index that can be set. Used when debugging to check that rows beyond a permitted range are not being set.
[in] bs0: Row block size of matrix.
[in] bs1: Column block size of matrix.
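The reverse case can be sketched in the same spirit. The assumption here is that a scalar row index r addresses block row `r / bs0` at offset `r % bs0` inside the block (and likewise for columns with `bs1`), with blocks stored row-major. The name `insert_nonblocked_sketch` is hypothetical and the `num_rows` check is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical sketch: insert scalar-indexed data (data block size 1) into a
// CSR matrix with runtime block size bs0 x bs1.
template <typename T, typename OP>
void insert_nonblocked_sketch(std::vector<T>& data,
                              const std::vector<std::int32_t>& cols,
                              const std::vector<std::int64_t>& row_ptr,
                              const std::vector<T>& x,
                              const std::vector<std::int32_t>& xrows,
                              const std::vector<std::int32_t>& xcols, OP op,
                              int bs0, int bs1)
{
  const std::size_t nc = xcols.size();
  assert(x.size() == xrows.size() * nc);
  for (std::size_t r = 0; r < xrows.size(); ++r)
  {
    // Split the scalar row index into block row and offset within the block
    const std::int32_t brow = xrows[r] / bs0;
    const std::int32_t roff = xrows[r] % bs0;
    auto first = cols.begin() + row_ptr[brow];
    auto last = cols.begin() + row_ptr[brow + 1];
    for (std::size_t c = 0; c < nc; ++c)
    {
      const std::int32_t bcol = xcols[c] / bs1;
      const std::int32_t coff = xcols[c] % bs1;
      auto it = std::lower_bound(first, last, bcol);
      assert(it != last && *it == bcol);
      const std::size_t d = std::distance(cols.begin(), it);
      op(data[(d * bs0 + roff) * bs1 + coff], x[r * nc + c]);
    }
  }
}
```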

◆ local_transpose()

template<typename T, int BS0 = -1, int BS1 = -1>
std::tuple< std::vector< std::int32_t >, std::vector< std::int64_t >, std::vector< T > > local_transpose ( const dolfinx::la::MatrixCSR< T > & A)

Compute the local (diagonal-block) transpose of A.

Iterates over owned rows of A and the owned-column entries within each row (indices [row_ptr[i], off_diag_offset[i])), building the transpose CSR in one bucket-fill pass. The resulting column indices are original row indices (0-based local), so columns within every output row are already in ascending order.

Parameters
A: MatrixCSR
Template Parameters
T: Scalar type.
BS0: Row block size. Must match the row block size of A, or be -1 to use the non-optimized path.
BS1: Column block size. Must match the column block size of A, or be -1 to use the non-optimized path.
Returns
Tuple (cols, row_ptr, values) for the transposed diagonal block.
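The bucket-fill idea can be sketched for block size 1 as a count, prefix-sum and fill sequence. The sketch takes raw CSR arrays instead of a MatrixCSR and treats `off_diag_offset[i]` as the absolute end position of the owned-column entries of row i, following the half-open range quoted above. The name `local_transpose_sketch` and the `num_owned_cols` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch: transpose the owned-column part of a scalar CSR
// matrix. Entries in [row_ptr[i], off_diag_offset[i]) are transposed; ghost
// column entries after off_diag_offset[i] are ignored.
template <typename T>
std::tuple<std::vector<std::int32_t>, std::vector<std::int64_t>,
           std::vector<T>>
local_transpose_sketch(const std::vector<std::int64_t>& row_ptr,
                       const std::vector<std::int64_t>& off_diag_offset,
                       const std::vector<std::int32_t>& cols,
                       const std::vector<T>& values,
                       std::int32_t num_owned_cols)
{
  const std::size_t num_rows = off_diag_offset.size();

  // Count the entries in each column, then prefix-sum into row pointers
  std::vector<std::int64_t> row_ptr_t(num_owned_cols + 1);
  for (std::size_t i = 0; i < num_rows; ++i)
    for (std::int64_t j = row_ptr[i]; j < off_diag_offset[i]; ++j)
    {
      assert(cols[j] < num_owned_cols);
      ++row_ptr_t[cols[j] + 1];
    }
  for (std::int32_t c = 0; c < num_owned_cols; ++c)
    row_ptr_t[c + 1] += row_ptr_t[c];

  // Fill each bucket. Rows are visited in ascending order, so the column
  // indices written into every output row are already sorted.
  std::vector<std::int64_t> next(row_ptr_t.begin(), row_ptr_t.end() - 1);
  std::vector<std::int32_t> cols_t(row_ptr_t.back());
  std::vector<T> values_t(row_ptr_t.back());
  for (std::size_t i = 0; i < num_rows; ++i)
    for (std::int64_t j = row_ptr[i]; j < off_diag_offset[i]; ++j)
    {
      const std::int64_t pos = next[cols[j]]++;
      cols_t[pos] = static_cast<std::int32_t>(i);
      values_t[pos] = values[j];
    }

  return std::make_tuple(std::move(cols_t), std::move(row_ptr_t),
                         std::move(values_t));
}
```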

◆ matmul()

template<typename T>
std::tuple< std::vector< std::int64_t >, std::vector< std::int32_t >, std::vector< std::int32_t >, std::vector< T > > matmul ( const dolfinx::la::MatrixCSR< T > & A,
const dolfinx::la::MatrixCSR< T > & B,
std::shared_ptr< const common::IndexMap > new_col_map,
std::span< const std::int32_t > ghost_row_ptr,
std::span< const std::int32_t > ghost_cols,
std::span< const T > ghost_vals )

Compute the sparsity pattern and values of C = A*B in a single pass.

Uses a dense accumulator (one entry per possible output column) to simultaneously detect nonzero columns and accumulate A[i,j]*B[j,k], eliminating the separate value-fill loop and the per-entry lower_bound search that a two-pass approach requires.

Parameters
A: Left matrix.
B: Right matrix (local rows only).
new_col_map: Extended column IndexMap for C, from fetch_ghost_rows.
ghost_row_ptr: CSR row pointer for the ghost rows of B.
ghost_cols: Local column indices (w.r.t. new_col_map) for ghost rows.
ghost_vals: Values for the ghost rows of B.
Returns
Tuple (row_ptr, off_diag_offsets, cols, vals). off_diag_offsets[i] is the number of diagonal-block entries in row i, computed during the same sort step that establishes column order.
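The dense-accumulator technique can be sketched for the purely local, scalar case as follows. The sketch ignores ghost rows and off_diag_offsets, and the `Csr` struct and `matmul_sketch` name are hypothetical stand-ins for illustration, not the DOLFINx types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal CSR container used only for this sketch
struct Csr
{
  std::vector<std::int64_t> row_ptr;
  std::vector<std::int32_t> cols;
  std::vector<double> vals;
};

// Sketch: compute C = A * B row by row, discovering the pattern of C and
// accumulating its values in the same pass.
Csr matmul_sketch(const Csr& A, const Csr& B, std::int32_t num_cols_B)
{
  Csr C;
  C.row_ptr.push_back(0);
  std::vector<double> acc(num_cols_B, 0.0); // one slot per output column
  std::vector<char> seen(num_cols_B, 0);    // marks columns hit in this row
  std::vector<std::int32_t> touched;        // columns hit in this row

  const std::size_t num_rows = A.row_ptr.size() - 1;
  for (std::size_t i = 0; i < num_rows; ++i)
  {
    touched.clear();
    for (std::int64_t p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p)
    {
      const std::int32_t j = A.cols[p];
      for (std::int64_t q = B.row_ptr[j]; q < B.row_ptr[j + 1]; ++q)
      {
        const std::int32_t k = B.cols[q];
        assert(k < num_cols_B);
        if (!seen[k])
        {
          seen[k] = 1;
          touched.push_back(k);
        }
        acc[k] += A.vals[p] * B.vals[q];
      }
    }

    // Emit the row in ascending column order and reset only touched slots
    std::sort(touched.begin(), touched.end());
    for (std::int32_t k : touched)
    {
      C.cols.push_back(k);
      C.vals.push_back(acc[k]);
      acc[k] = 0.0;
      seen[k] = 0;
    }
    C.row_ptr.push_back(static_cast<std::int64_t>(C.cols.size()));
  }
  return C;
}
```

Resetting only the touched slots keeps the per-row cost proportional to the number of products formed, not to the width of the accumulator.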

◆ spmv()

template<typename T, int BS1>
void spmv ( std::span< const T > values,
std::span< const std::int64_t > row_begin,
std::span< const std::int64_t > row_end,
std::span< const std::int32_t > indices,
std::span< const T > x,
std::span< T > y,
int bs0,
int bs1 )

Sparse matrix-vector product implementation.

Template Parameters
T
BS1
Parameters
values
row_begin
row_end
indices
x
y
bs0
bs1
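The parameters above are undocumented on this page. As a sketch only, and assuming they carry the same meaning as in spmvT with y accumulated into, a scalar (bs0 = bs1 = 1) CSR product could look like the following. The name `spmv_sketch` is hypothetical, and `std::vector` stands in for `std::span` for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar sketch of y += A x for a CSR matrix. Row i occupies
// positions [row_begin[i], row_end[i]) of values and indices.
template <typename T>
void spmv_sketch(const std::vector<T>& values,
                 const std::vector<std::int64_t>& row_begin,
                 const std::vector<std::int64_t>& row_end,
                 const std::vector<std::int32_t>& indices,
                 const std::vector<T>& x, std::vector<T>& y)
{
  assert(row_begin.size() == row_end.size());
  assert(y.size() >= row_begin.size());
  for (std::size_t i = 0; i < row_begin.size(); ++i)
  {
    // Gather: each stored entry reads x at its column index
    T sum = 0;
    for (std::int64_t j = row_begin[i]; j < row_end[i]; ++j)
    {
      assert(static_cast<std::size_t>(indices[j]) < x.size());
      sum += values[j] * x[indices[j]];
    }
    y[i] += sum;
  }
}
```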

◆ spmvT()

template<typename T, int BS1>
void spmvT ( std::span< const T > values,
std::span< const std::int64_t > row_begin,
std::span< const std::int64_t > row_end,
std::span< const std::int32_t > indices,
std::span< const T > x,
std::span< T > y,
int bs0,
int bs1 )

Sparse matrix-vector transpose product implementation.

Computes y += A^T x where A is given in CSR format. The transpose is applied implicitly: for each stored entry j in row i, which represents A(i, indices[j]), the contribution A(i, indices[j]) * x[i] is scattered into y[indices[j]].

Note
y is accumulated into (not overwritten). Callers should zero y before the first call if a fresh result is required.
The value block layout is row-major within each block: for block entry j the element at row-offset k0 and column-offset k1 is stored at values[j * bs0 * bs1 + k0 * bs1 + k1].
Template Parameters
T: Scalar type of the matrix and vector entries.
BS1: Compile-time column block size. Pass -1 to use the runtime value bs1 instead.
Parameters
[in] values: Nonzero values of A, stored block-row-major. Length: nnz * bs0 * bs1.
[in] row_begin: Start positions in values/indices for each row of A. Length: number of rows of A.
[in] row_end: End positions in values/indices for each row of A. Length: number of rows of A.
[in] indices: Column indices of each nonzero block entry of A. Length: nnz.
[in] x: Input vector, indexed by the rows of A. Length: num_rows * bs0.
[in,out] y: Output vector, indexed by the columns of A, accumulated into. Length: num_cols * bs1.
[in] bs0: Row block size (runtime value).
[in] bs1: Column block size (runtime value, used when BS1 == -1).
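The scatter described above, including the documented block layout, can be sketched with runtime block sizes only. The name `spmvT_sketch` is hypothetical, `std::vector` stands in for `std::span`, and the compile-time BS1 specialisation is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of y += A^T x for a blocked CSR matrix. Element
// (k0, k1) of stored block j is values[j * bs0 * bs1 + k0 * bs1 + k1].
template <typename T>
void spmvT_sketch(const std::vector<T>& values,
                  const std::vector<std::int64_t>& row_begin,
                  const std::vector<std::int64_t>& row_end,
                  const std::vector<std::int32_t>& indices,
                  const std::vector<T>& x, std::vector<T>& y, int bs0,
                  int bs1)
{
  assert(x.size() >= row_begin.size() * bs0);
  for (std::size_t i = 0; i < row_begin.size(); ++i)
  {
    for (std::int64_t j = row_begin[i]; j < row_end[i]; ++j)
    {
      assert(static_cast<std::size_t>((indices[j] + 1) * bs1) <= y.size());
      // Scatter: block row i of x contributes to block indices[j] of y
      for (int k0 = 0; k0 < bs0; ++k0)
        for (int k1 = 0; k1 < bs1; ++k1)
          y[indices[j] * bs1 + k1]
              += values[j * bs0 * bs1 + k0 * bs1 + k1] * x[i * bs0 + k0];
    }
  }
}
```

Because y is accumulated into, calling the function twice on the same y doubles the result unless y is zeroed in between.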