Re: 64-bit instruction encoding wish list
Claire Wolf <claire@...>
To me, reading the destination element index from a scalar register sounds like a significant microarchitectural overhead. Can you describe why this is needed?
I would assume that the use cases for this are sequences of reduce operations interleaved with vector slide operations, which the new instruction would replace. If the reduction is a multi-cycle op and the slide is a single-cycle op, then I would assume that getting rid of the single-cycle op won't change much in terms of performance. And if it's just to squeeze out the last bit of extra performance, then an alternative might be to implement instruction fusion between the reduce and the vector slide (although that might cause issues with instruction throughput if some of the instructions are already 64 bits wide).
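To put rough numbers on the cost argument, here is a back-of-the-envelope sketch. The function names and cycle counts are assumptions made up for illustration, not measurements of any real implementation, and the model assumes the reduce and slide ops issue strictly one after another with no overlap.

```python
def cycles_with_slides(n_results, reduce_cycles, slide_cycles=1):
    """Each result costs one reduction plus one slide to move it into place."""
    return n_results * (reduce_cycles + slide_cycles)


def cycles_without_slides(n_results, reduce_cycles):
    """Hypothetical reduce that writes its result directly to the indexed element."""
    return n_results * reduce_cycles


def relative_saving(n_results, reduce_cycles, slide_cycles=1):
    """Fraction of total cycles saved by dropping the slide ops."""
    before = cycles_with_slides(n_results, reduce_cycles, slide_cycles)
    after = cycles_without_slides(n_results, reduce_cycles)
    return (before - after) / before


if __name__ == "__main__":
    # Assumed latencies: try a few reduction costs against a 1-cycle slide.
    for reduce_cycles in (2, 4, 8, 16):
        saving = relative_saving(8, reduce_cycles)
        print(f"reduce={reduce_cycles:2d} cycles: saving {saving:.1%}")
```

Under these assumptions the saving is slide_cycles / (reduce_cycles + slide_cycles), so it shrinks as the reduction gets slower. If the hardware can overlap the slide with the next reduction, the real saving would be smaller still.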
On Wed, 11 Mar 2020 at 16:15, Nagendra Gulur <nagendra.gd@...> wrote: