Sequence to insert an element
Hi,
what is a reasonable sequence to insert an element into an arbitrary position in the vector?
I considered the following sequence (assume the input vector is v12)
vid.v v1
vmseq.vx v0, v1, <index>
vmerge.vxm v1, v12, <value>, v0.t
But I think this is problematic for sew=8 as there may be overflow if vlmax(sew=8)>256.
It may be possible for lmul={1,2,4} sew=8 to compute vid and vmseq using lmul={2,4,8} sew=16, respectively but the lmul=8,sew=8 case won't work as there is no lmul=16,sew=16.
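The aliasing concern can be illustrated with a small element-wise model in Python. This is only a sketch: it assumes the values written by vid.v wrap modulo 2^SEW and that the scalar compared by vmseq.vx is truncated to SEW bits. The function name and the VLMAX of 512 (for example VLEN=512, SEW=8, LMUL=8) are hypothetical.

```python
def insert_via_mask(v12, index, value, sew):
    """Element-wise model of vid.v / vmseq.vx / vmerge.vxm at a given SEW."""
    wrap = 1 << sew                                # assumption: values wrap modulo 2**SEW
    vid = [i % wrap for i in range(len(v12))]      # vid.v: element i holds i, truncated to SEW bits
    mask = [x == index % wrap for x in vid]        # vmseq.vx: scalar truncated to SEW bits as well
    # vmerge.vxm: take the scalar where the mask is set, the source element elsewhere
    return [value if m else e for m, e in zip(mask, v12)]


vlmax = 512                                        # hypothetical VLMAX larger than 256
src = [0] * vlmax

narrow = insert_via_mask(src, 3, 7, sew=8)         # ids computed with sew=8
wide = insert_via_mask(src, 3, 7, sew=16)          # ids computed with sew=16

hits_narrow = [i for i, e in enumerate(narrow) if e == 7]
hits_wide = [i for i, e in enumerate(wide) if e == 7]
print(hits_narrow, hits_wide)
```

With 8-bit ids the value lands at element 3 and also at element 259, while the 16-bit ids touch only element 3, which is why the wider sew/lmul pairing helps where it exists.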
I also came up with this other sequence, but it doesn't look great to me:
vslidedown.vx v1, v12, <index>
vmv.s.x v1, <value>
vslideup.vx v1, v1, <index>
vsetvl x0, <index>,sew,lmul,tu,mu
vmv.v.v v1, v12 # should leave the tail undisturbed
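For reference, here is a rough element-wise model of what the slide-based sequence is meant to compute, written in Python. It is a sketch under some assumptions: vl equals VLMAX for the slides, and the slide-up writes to a separate destination (called v2 here, a hypothetical choice), since as far as I know the V specification reserves vslideup encodings whose destination overlaps the source. The function name is made up for illustration.

```python
def insert_via_slides(v12, index, value):
    """Element-wise model of the slidedown / vmv.s.x / slideup / partial-copy sequence."""
    vl = len(v12)                                  # assumption: vl == VLMAX

    # vslidedown.vx v1, v12, <index>: v1[i] = v12[i + index], zero past the end of the source
    v1 = [v12[i + index] if i + index < vl else 0 for i in range(vl)]

    # vmv.s.x v1, <value>: write the scalar into element 0
    v1[0] = value

    # vslideup.vx v2, v1, <index>: v2[i] = v1[i - index] for i >= index,
    # elements below index are left as they were (modelled as None here)
    v2 = [None] * vl
    for i in range(index, vl):
        v2[i] = v1[i - index]

    # vsetvl with AVL = <index>, then vmv.v.v v2, v12 with tail undisturbed:
    # only the first <index> elements are overwritten
    for i in range(index):
        v2[i] = v12[i]
    return v2


result = insert_via_slides([10, 11, 12, 13, 14, 15], 2, 99)
print(result)
```

In this model only element 2 changes and every other element keeps its value from v12, at the cost of two slides plus a vl change.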
Thanks a lot,
--
Roger Ferrer Ibáñez - roger.ferrer@...
Barcelona Supercomputing Center - Centro Nacional de Supercomputación
http://bsc.es/disclaimer
David Horner
On 2020-10-16 11:10 a.m., Roger Ferrer Ibanez wrote:
Hi,
what is a reasonable sequence to insert an element into an arbitrary position in the vector?
I considered the following sequence (assume the input vector is v12)
vid.v v1
vmseq.vx v0, v1, <index>
vmerge.vxm v1, v12, <value>, v0.t
But I think this is problematic for sew=8 as there may be overflow if vlmax(sew=8)>256.
The mask could be built with sew=16, as the mask is ordinal based.
And there are tricks to set it up, for example a direct load (register move) to v0 to set the correct bit.
The mask could be built in v2 and transferred under mask to clear lower or higher aliasing.
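One way to read the "set the correct bit" suggestion is sketched below in Python. It assumes the mask layout is one bit per element, independent of sew, and it models v0 as a plain integer. How that bit pattern actually gets into v0 is left open, and the function name is hypothetical.

```python
def insert_via_direct_mask(v12, index, value):
    """Model of merging under a mask that has exactly one bit set."""
    mask_bits = 1 << index                         # model of v0: bit i is the mask bit of element i
    # vmerge under that mask: only the element whose mask bit is set takes the scalar
    return [value if (mask_bits >> i) & 1 else e for i, e in enumerate(v12)]


vlmax = 512                                        # hypothetical VLMAX larger than 256
merged = insert_via_direct_mask([0] * vlmax, 259, 7)
hits = [i for i, e in enumerate(merged) if e == 7]
print(hits)
```

Because no element ids are materialised at sew=8, there is nothing to alias in this model even when VLMAX exceeds 256.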
It may be possible for lmul={1,2,4} sew=8 to compute vid and vmseq using lmul={2,4,8} sew=16, respectively but the lmul=8,sew=8 case won't work as there is no lmul=16,sew=16.
I also came up with this other sequence, but it doesn't look great to me:
vslidedown.vx v1, v12, <index>
vmv.s.x v1, <value>
vslideup.vx v1, v1, <index>
vsetvl x0, <index>,sew,lmul,tu,mu
vmv.v.v v1, v12 # should leave the tail undisturbed
Thanks a lot,