Re: Duplicate Counting Instruction


Hi Krste,

I read through your code; thanks for correcting my errors. Using 'or' is a good idea for handling
multiple duplicates. Here I'd like to explain why I made things a bit more complicated in my code.

Your code also fixes duplicates from least to most, using the v0 mask. But if we are sure that vhash
takes many more cycles to execute, we can try to execute vhash only once, since it outputs all
duplicate lanes at one time. We can store the duplicates in a mask register and fix them using
this register as a reference. We resolve duplicates from least to most as usual, masking off each
duplicate lane once it is resolved. As a result, we only need to iterate the patch-up loop, without
re-executing vhash.
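
To make the idea concrete, here is a minimal scalar sketch in Python of the scheme described above. It is only a model, not the actual vector code: duplicate_mask is a hypothetical stand-in for vhash, and I am assuming it flags every lane whose index already appeared in a lower-numbered lane. The names histogram_update, counts and idx are made up for the illustration.

```python
def duplicate_mask(idx):
    """Hypothetical stand-in for vhash: True for each lane whose index
    was already seen in a lower-numbered lane (assumed semantics)."""
    seen = set()
    mask = []
    for i in idx:
        mask.append(i in seen)
        seen.add(i)
    return mask


def histogram_update(counts, idx):
    # The expensive duplicate detection runs exactly once.
    dup = duplicate_mask(idx)

    # Main pass: gather, add, scatter for the non-duplicate lanes only.
    # These lanes all hold distinct indices, so there is no memory hazard.
    gathered = [counts[i] for i in idx]
    for lane, i in enumerate(idx):
        if not dup[lane]:
            counts[i] = gathered[lane] + 1

    # Patch-up loop: resolve duplicates from the lowest lane upward,
    # masking off each lane once it is fixed. vhash is not re-executed.
    pending = list(dup)
    while any(pending):
        lane = pending.index(True)   # lowest pending duplicate lane
        counts[idx[lane]] += 1       # fix this one lane
        pending[lane] = False        # mask it off
    return counts


if __name__ == "__main__":
    print(histogram_update([0] * 8, [3, 1, 3, 3, 1, 7]))
```

The saved mask plays the role of the reference register: the loop only clears bits in it, so the number of patch-up iterations equals the number of duplicate lanes.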

The code does not require any change to the vhash instruction design; it just demonstrates
that vhash works for memory hazard problems.
