Sparse matrix-vector products

The standard native Julia sparse matrix (SparseMatrixCSC) is stored in the compressed sparse column format (CSC). Matrix-vector products with this type of matrix do not use multithreading in Julia (although there is non-Julia software that does use multiple threads and obtains small speedups).

Many of our matrices $$A$$ , get a banded structure when we form $A^\top A$ . A banded structure is a matrix that has a small number of diagonals with nonzero elements. For example, if $$A$$ is a first-order finite-difference matrix for a 2D domain, $A^\top A$ also has a banded structure. We store the matrices $B = A^\top A$ in the compressed diagonal storage format (CDS) (sometimes referred to as the DIA format) if $B \in \mathbb{R}^{N \times N}$ has a banded structure. This means we store all $$d$$ nonzero diagonals as a dense matrix $R \in \mathbb{R}^{N \times d}$ and a vector $n \in \mathbb{R}^d$ with the indices of nonzero diagonals.

Matrix-vector products in the CDS format are faster for banded matrices and also achieve a nice speedup using multiple threads. The multi-treading is implemented in pure Julia using only a few lines of code.

Below, we compare timings for 2D and 3D models of various sizes for various number of threads on a dedicated cluster node with $$2$$ CPUs per node, 10 cores per CPU (Intel Ivy Bridge 2.8 GHz E5-2680v2) and 128 GB memory per node. We show results for matrix-vector products in the CSC and CDS format for a laplacian matrix. The CSC MVPs in Julia are computed as the transpose of the matrix, which is a bit faster than with the matrix itself. The matrices $$B$$ are always symmetric so we don’t need to compute additional transposes of matrices.

Figure1Times for matrix-vector products with Laplacian matrices for 2D models (top) and 3D models (bottom). Compressed column storage (CSC, native in `Julia`) and compressed diagonal storage (CDS, our implementation). Multi-threading does not pay off for too small models.

Figure2Times for matrix-vector products with Laplacian matrices for 2D and 3D models. Compressed column storage (CSC, native in `Julia`) and compressed diagonal storage (CDS, our implementation).