Presentation

Pipirima: Predicting Patterns in Sparsity to Accelerate Matrix Algebra
Description: Sparsity, a feature of data in many applications, provides optimization opportunities such as reducing unnecessary computation, data transfers, and storage, but it also causes several challenges. For instance, even in state-of-the-art sparse accelerators, sparsity can result in load imbalance, a performance bottleneck. To address such challenges, our key insight is that if we can quickly anticipate the locations of the non-zero values while reading or streaming a compressed sparse matrix, we can leverage this knowledge to accelerate sparse matrix processing. To enable this, we propose Pipirima, a lightweight prediction-based sparse accelerator. Inspired by traditional branch predictors, Pipirima uses simple, resource-friendly counters to predict the patterns of non-zero values in sparse matrices. We evaluate Pipirima on sparse matrix-vector multiplication (SpMV) and sparse matrix-dense matrix multiplication (SpMM) kernels over CSR-compressed matrices derived from both scientific computing and transformer models. On average, our experiments show 6× and 4× speedups over Tensaurus for SpMM and SpMV, respectively, on SuiteSparse workloads. Pipirima also shows a 40× speedup over ExTensor for SpMM. On less sparse transformer workloads, we achieve 8.3× and 48.2× speedups over Tensaurus and ExTensor, respectively. Pipirima consumes 5.621 mm² of area and 544.93 mW of power in 45 nm technology, with the predictor-related components being the least expensive ones.
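To make the idea of counter-based prediction concrete, the following is a minimal software sketch, not Pipirima's actual hardware design. It assumes one 2-bit saturating counter per column, in the spirit of a classic branch predictor, and measures how often the counters correctly guess whether a column holds a non-zero in the next CSR row. The names `spmv_csr`, `ColumnPredictor`, and `prediction_accuracy` are hypothetical, and the per-column organization is an assumption made for illustration only; the abstract does not specify how the counters are indexed.

```python
def spmv_csr(indptr, indices, data, x):
    """Reference SpMV (y = A @ x) over a CSR matrix given as plain lists."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


class ColumnPredictor:
    """One 2-bit saturating counter per column (assumed organization)."""

    def __init__(self, n_cols):
        # Counter values range over 0..3; a value >= 2 predicts "non-zero".
        self.counters = [0] * n_cols

    def predict(self, col):
        return self.counters[col] >= 2

    def update(self, col, is_nonzero):
        c = self.counters[col]
        self.counters[col] = min(3, c + 1) if is_nonzero else max(0, c - 1)


def prediction_accuracy(indptr, indices, n_cols):
    """Stream the CSR rows in order and score the predictor per position."""
    predictor = ColumnPredictor(n_cols)
    correct = 0
    total = 0
    for row in range(len(indptr) - 1):
        nonzero_cols = set(indices[indptr[row]:indptr[row + 1]])
        for col in range(n_cols):
            actual = col in nonzero_cols
            if predictor.predict(col) == actual:
                correct += 1
            predictor.update(col, actual)  # train after each observation
            total += 1
    return correct / total
```

When consecutive rows share a similar non-zero pattern, the counters warm up after a couple of rows and then predict that pattern; how such predictions are turned into better load balance inside the accelerator is beyond what this sketch covers.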