issue #393.4 - Towards a simple fractional LMUL design - third iteration.


David Horner
 

This is the fourth installment of the Simple Fractional LMUL design.

It does not yet address SLEN or the suggested fractional SLEN (fracSLEN).

It should however clarify the ramifications of the last installment and address some necessary design decisions on register number use in instructions.

-------------------------------------------------------------------------- --------------------------------------------------------------------------

Let's consider in detail some rows from the last table:

LMUL        1/16  1/8   1/4   1/2   1     2     ...
group type
....
16:2              x  a = 0, 2, 4, 6, ...
8:1               x  a = odd
....
16:4                    x  a = 0, 4, 8, 12, ...
8:2                     x  a = 2, 6, 10, 14, ...
....
16:8                          x  a = 0, 8, 16, 24
8:4                           x  a = 4, 12, 20, 28
4:2                           x  a = 2, 6, 10, 14, ...
....
8:8                                 x
4:4                                 x
2:2                                 x

The most prominent feature is how extensively the register numbers have to be shared, especially for LMUL=1/4 and 1/2.

Detailing the register group addressability in a given fractional LMUL

Looking first at column LMUL=1/8, we see a clean division between 16:2 group and 8:1 group addresses.
This is exactly what we might expect: as with LMUL=2, half the register addresses identify register groups.
Unlike LMUL=2, the otherwise unused register addresses are put to use, specifically to address the LMUL=1/8 fractional registers (8:1 group).

Looking next at column LMUL=1/4, there is a three-way division of addresses.
The allocation to 16:4 is the expected multiples of 4, just as for LMUL=4.
The allocation to 4:1 is also as expected: the odd addresses are for this level's fractional registers.
However, 8:2 does not have all the usual addresses available, because the multiples of 4 are already used by 16:4. So 4:1 has 16 register addresses, and 8:2 and 16:4 have 8 each.

And finally, looking at column LMUL=1/2, there are four addressing groups.
2:1 has all the odd addresses, as expected (so it is not on the chart).
16:8 has the expected four multiples-of-8 addresses.
8:4 has the multiples of 4 minus the multiples of 8 (reserved for 16:8).
4:2 therefore has the remaining multiples of 2.
So 2:1 has 16 register addresses, 4:2 has 8, 8:4 has 4 and 16:8 has 4.
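
The address split in the three columns follows a single pattern, which can be checked with a short Python sketch. It assumes 32 register addresses (0 to 31) and generates the n:m labels so that m/n equals LMUL, which matches every entry on the chart. The function names `group_type` and `address_counts` are made up for illustration, and only LMUL=1/2, 1/4 and 1/8 are covered.

```python
from collections import Counter

def group_type(a, denom):
    """Return (n, m) for register address a at LMUL = 1/denom (denom in 2, 4, 8)."""
    if denom not in (2, 4, 8):
        raise ValueError("sketch covers LMUL = 1/2, 1/4 and 1/8 only")
    max_m = 16 // denom                  # largest group at this level is 16:max_m
    # a & -a is the largest power of two dividing a; address 0 is divisible by all.
    align = (a & -a) if a else max_m
    m = min(align, max_m)
    return (m * denom, m)

def address_counts(denom):
    """Count how many of the 32 register addresses fall in each group type."""
    counts = Counter(group_type(a, denom) for a in range(32))
    return {f"{n}:{m}": c for (n, m), c in sorted(counts.items())}

print(address_counts(8))   # {'8:1': 16, '16:2': 16}
print(address_counts(4))   # {'4:1': 16, '8:2': 8, '16:4': 8}
print(address_counts(2))   # {'2:1': 16, '4:2': 8, '8:4': 4, '16:8': 4}
```

The counts agree with the totals stated for each column, and for LMUL=1/2 the 8:4 addresses come out as 4, 12, 20 and 28.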

I question the value of providing the 16:m groups. Eliminating them would free more register addresses for the other groups. However, for now we will continue to consider them.

Use of groups that are not “SEW instruction” addressable from any LMUL level

I included 2:2, 4:4 and 8:8 groups even though they conceptually exist in the LMUL=1 level and are not addressable by LMUL=1 (for reasons explained before).

This would be like allowing LMUL=8 widening and narrowing instructions.
These instructions would write to one of 2 register groups at 0 and 16, each using 16 base-arch registers.
I propose such a change in #397.

This is especially useful in fractional LMUL, as more address space is available to the LMUL=1/2 instructions. Furthermore, without these groups there is no widening or narrowing for LMUL=1/2 through fractional groups.

**Mixed group type usage**

SEW instructions have access to 16 fractional registers and 16 fractional groups at all levels except LMUL=1/16.
This raises the question of interoperability of these mismatched structured operands.
I propose:

For op.vv SEW instructions

  • the register numbers identify the category of each argument.
  • when both input operands are of the same category the arguments are processed accordingly
  • when the operand categories mismatch, the lowest grouping is used for both arguments.
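
One possible reading of the op.vv rules, sketched in Python. The helper names `category` and `vv_category` are hypothetical. The category of a register is derived from its number using the address pattern of the LMUL=1/2, 1/4 and 1/8 columns, and "lowest grouping" is assumed here to mean the category with the fewest registers per group (smallest m in n:m). That reading is an assumption, not something settled by the text.

```python
def category(reg, denom):
    """(n, m) category implied by register number reg at LMUL = 1/denom (2, 4 or 8)."""
    max_m = 16 // denom
    align = (reg & -reg) if reg else max_m   # largest power of two dividing reg
    m = min(align, max_m)
    return (m * denom, m)

def vv_category(vs1, vs2, denom):
    """Category applied to both inputs of an op.vv SEW instruction."""
    c1, c2 = category(vs1, denom), category(vs2, denom)
    if c1 == c2:
        return c1                            # same category: process accordingly
    # ASSUMPTION: "lowest grouping" = fewest registers per group (smallest m)
    return min(c1, c2, key=lambda c: c[1])

print(vv_category(8, 16, 2))   # both operands are 16:8
print(vv_category(8, 4, 2))    # 16:8 mixed with 8:4
print(vv_category(8, 3, 2))    # 16:8 mixed with 2:1
```

Under this reading, mixing a 16:8 operand with an odd-numbered register at LMUL=1/2 processes both as single 2:1 fractional registers.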

For op.vs and op.vx instructions

  • the vector operand category is determined by its register number

For widening instructions the same rules hold. The vd must be a multiple of twice the input category depth. But it is not constrained to be one of those addressable from the next higher LMUL.
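
The vd constraint can be illustrated with a minimal Python sketch. It assumes that "category depth" means m, the number of registers in the input n:m category, and that there are 32 register numbers. The function name `valid_widening_vd` is hypothetical.

```python
def valid_widening_vd(input_m):
    """Register numbers usable as vd when the inputs are in an n:input_m category."""
    # ASSUMPTION: "category depth" is read as m, the registers per input group.
    step = 2 * input_m
    return [vd for vd in range(32) if vd % step == 0]

print(valid_widening_vd(1))   # inputs are single fractional registers
print(valid_widening_vd(4))   # inputs are n:4 groups
print(valid_widening_vd(8))   # inputs are n:8 groups
```

With a depth of 8 only register numbers 0 and 16 remain, which lines up with the two 16-register groups mentioned for LMUL=8 widening.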

For narrowing operations,

  • the source register number determines the category using the next higher LMUL
  • the target register determines the category using the current LMUL.
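
A small Python sketch of the narrowing rule, under the same assumptions about how a register number maps to a category in the LMUL=1/2, 1/4 and 1/8 columns. The names `category` and `narrowing_categories` are hypothetical, and the sketch only handles current LMUL=1/4 and 1/8, where both the current level and the next higher level are among those three columns.

```python
def category(reg, denom):
    """(n, m) category implied by register number reg at LMUL = 1/denom (2, 4 or 8)."""
    max_m = 16 // denom
    align = (reg & -reg) if reg else max_m   # largest power of two dividing reg
    m = min(align, max_m)
    return (m * denom, m)

def narrowing_categories(vs, vd, denom):
    """(source, target) categories for a narrowing op at current LMUL = 1/denom."""
    if denom not in (4, 8):
        raise ValueError("sketch covers current LMUL = 1/4 and 1/8 only")
    source = category(vs, denom // 2)        # next higher LMUL
    target = category(vd, denom)             # current LMUL
    return source, target

print(narrowing_categories(4, 2, 4))   # source 8:4, target 8:2
print(narrowing_categories(8, 4, 4))   # source 16:8, target 16:4
print(narrowing_categories(3, 1, 8))   # source 4:1, target 8:1
```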

SLEN and fracSLEN will be in the next installment.

to be continued .....