Re: 64-bit instruction encoding wish list
Open to feedback here, but my thought was that I would not need vslide1up if I could control the reduction destination. A loop around the instructions

    vredsum vd[rs1], vs1[rs1], vs2[*]
    rs1 = rs1 + 1

would allow a vector-width worth of reductions to be written into adjacent elements, without sliding data through the vector between reductions. This is probably cheaper than shifting data through vd.

Do you see any issues with this line of thinking?

Best Regards
Nagendra

On Wed, Mar 11, 2020 at 12:32 PM Claire Wolf <claire@...> wrote:
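For reference, the loop proposed in the message above can be modelled in plain C. This is only a sketch of one reading of the notation, not a defined instruction: it assumes `vredsum vd[rs1], vs1[rs1], vs2[*]` means "add every element of vs2 to element rs1 of vs1 and write the result to element rs1 of vd, leaving the other elements of vd untouched". The names `VL`, `vredsum_indexed`, and `reduce_rows` are hypothetical, and a fresh vs2 is assumed to be supplied on each iteration.

```c
#include <stddef.h>
#include <stdint.h>

#define VL 4 /* hypothetical vector length in elements */

/* Assumed semantics of the proposed indexed reduction:
 * vd[idx] = vs1[idx] + sum(vs2[0..VL-1]); other elements of vd are unchanged. */
static void vredsum_indexed(int32_t vd[VL], const int32_t vs1[VL],
                            const int32_t vs2[VL], size_t idx)
{
    int32_t acc = vs1[idx];
    for (size_t i = 0; i < VL; i++)
        acc += vs2[i];
    vd[idx] = acc;
}

/* The loop from the message: one reduction per iteration, each landing in the
 * next element of vd, so no slide is needed between reductions. */
static void reduce_rows(int32_t vd[VL], const int32_t vs1[VL],
                        const int32_t rows[VL][VL])
{
    for (size_t rs1 = 0; rs1 < VL; rs1++)
        vredsum_indexed(vd, vs1, rows[rs1], rs1);
}
```

After `VL` iterations, `vd` holds `VL` independent sums in adjacent elements, which is the result that would otherwise need a slide between each reduction.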