Re: Sparse Matrix-Vector Multiply (again) and Bit-Vector Compression
Please share the asm for the key SpMV kernel(s).
In any case, the execution time of masked operations is very implementation- and machine-dependent.
It is a function of how aggressively the hardware addresses the vector register elements: is addressing always sequential, or are addresses generated out of sequence based on the bit mask (in v0)?
Given the growing emphasis on HPCG performance, SpMV is an extremely important computational kernel.
In my experience, the dominant cost in HPCG is memory latency (short vectors), along with node-to-node networking.
| I am now investigating how to efficiently implement sparse matrix X (dense) vector multiplications (spMV) using RISCV vectors using bit-vector format of
| compressing the sparse matrix. The inner loop of sequential spMV algorithm simply multiply-accumulates non-zeroes of a row of M with corresponding values of the
| dense vector (say V) to form an output element. To implement this efficiently where bit-vectors are used to compress M is where I have a question/observation.
| In bit-vector compression, the sparse matrix metadata is a bit map of 0s and 1s. A 0 indicates that the matrix value is a 0. A 1 indicates that the matrix value is
| non-zero. A values array -- say M_Vals[.] -- stores these non-zero values.
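
For reference, the bit-vector scheme described in the quoted text can be sketched in plain scalar C as follows. This is only an illustration under stated assumptions, not the original poster's code: the function name spmv_bitvec, the row-major bitmap layout, the LSB-first bit order within each byte, and the use of double are all hypothetical choices. Only M_Vals (here m_vals) and the dense vector V (here v) come from the original message. A vectorized version would presumably load bitmap bits into the mask register and perform the multiply-accumulate under that mask, which is where the implementation-dependent cost discussed above comes in.

```c
#include <stddef.h>
#include <stdint.h>

/* y = M * v for a rows x cols matrix M stored in bit-vector form.
 * Assumed layout (hypothetical): bit (r * cols + c) of bitmap is 1 when
 * M[r][c] is non-zero, bits are numbered LSB-first within each byte, and
 * m_vals holds the non-zero values packed in the same row-major order. */
void spmv_bitvec(size_t rows, size_t cols,
                 const uint8_t *bitmap,
                 const double *m_vals,
                 const double *v,
                 double *y)
{
    size_t next = 0; /* index of the next unread entry in m_vals */

    for (size_t r = 0; r < rows; r++) {
        double acc = 0.0;
        for (size_t c = 0; c < cols; c++) {
            size_t bit = r * cols + c;
            /* Multiply-accumulate only where the bitmap marks a non-zero. */
            if ((bitmap[bit >> 3] >> (bit & 7)) & 1u) {
                acc += m_vals[next++] * v[c];
            }
        }
        y[r] = acc;
    }
}
```

Note that the index into m_vals is a running count of the 1 bits seen so far, so the packed values are consumed strictly in order, while the reads of v are selected by the bitmap.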