issue #393.4 - Towards a simple fractional LMUL design - third iteration.
This is the fourth installment of the Simple Fractional LMUL design.
It does not yet address SLEN or the suggestion for a fractional SLEN (fracSLEN).
It should however clarify the ramifications of the last
installment and address some necessary design decisions on
register number use in instructions.
--------------------------------------------------------------------------
Let's consider in detail some rows from the last table:
LMUL | 1/16 | 1/8 | 1/4 | 1/2 | 1 | 2 ... |
---|---|---|---|---|---|---|
.... | ||||||
group type | ||||||
16:2 | x | a=0, 2,4,6, ... | ||||
8:1 | a= odd | |||||
.... | ||||||
16:4 | x | a= 0,4, 8,12 ... | ||||
8:2 | x | a= 2,6, 10,14, ... | ||||
.... | ||||||
16:8 | x | a= 0,8, 16,24 | ||||
8:4 | x | a= 4,12, 20,28 | ||||
4:2 | x | a= 2,6, 10,14, ... | ||||
.... | ||||||
8:8 | x | |||||
4:4 | x | |||||
2:2 | x | |||||
LMUL | 1/16 | 1/8 | 1/4 | 1/2 | 1 | 2 |
The most prominent feature is how extensively the register numbers have to be shared, especially for LMUL=1/4 and 1/2.
**Detailing the register group addressability in a given fractional LMUL**
Looking first at column LMUL=1/8, we see a clean division between
16:2 group and 8:1 group addresses.
This is exactly what we might expect, and the same as LMUL=2, where
half the register addresses refer to register groups.
Unlike LMUL=2, the otherwise unused register addresses are used,
specifically to address the LMUL=1/8 fractional registers
(8:1 group).
Looking next at column LMUL=1/4, there is a three-way division of
addresses.
The allocation to 16:4 is the expected multiples of 4, just as for
LMUL=4.
The allocation to 4:1 is also as expected: the odd
addresses are for this level's fractional registers.
However, 8:2 does not have all the usual even addresses available,
because the multiples of 4 are already used by 16:4. So 4:1 has 16 register
addresses, 8:2 has 8 register addresses, as does 16:4.
And finally, looking at column LMUL=1/2, there are four addressing
groups.
2:1 has all the odd addresses as expected (so it is not on the chart).
16:8 has the expected four multiples-of-8 addresses.
8:4 has the multiples of 4 minus the multiples of 8
(reserved for 16:8).
4:2 therefore has the remaining multiples of 2.
So 2:1 has 16 register addresses, 4:2 has 8, 8:4 has 4 and 16:8
has 4.
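
The address split described above for LMUL=1/8, 1/4 and 1/2 can be sketched as a small classifier. This is only an illustration of the allocation as I read it from the text, not a proposed encoding: the names `classify` and `address_counts` are made up for this example, and 32 register addresses and a finest fraction of 1/16 are assumed.

```python
from collections import Counter


def classify(lmul_denom: int, a: int) -> str:
    """Return the group type 'n:m' selected by register address `a`
    at LMUL = 1/lmul_denom (sketch; covers LMUL = 1/2, 1/4, 1/8 only)."""
    if lmul_denom not in (2, 4, 8):
        raise ValueError("only LMUL = 1/2, 1/4 and 1/8 are covered here")
    if not 0 <= a < 32:
        raise ValueError("register address must be in 0..31")
    if a % 2 == 1:
        # Odd addresses select the plain fractional registers of this level.
        return f"{lmul_denom}:1"
    # Number of trailing zero bits; address 0 is aligned to everything.
    tz = (a & -a).bit_length() - 1 if a else 5
    # Assumption: the finest fraction is 1/16, so groups stop at 16:m.
    max_k = {2: 3, 4: 2, 8: 1}[lmul_denom]
    k = min(tz, max_k)
    return f"{lmul_denom << k}:{1 << k}"


def address_counts(lmul_denom: int) -> Counter:
    """Count how many of the 32 addresses each group type receives."""
    return Counter(classify(lmul_denom, a) for a in range(32))


if __name__ == "__main__":
    for denom in (8, 4, 2):
        print(f"LMUL=1/{denom}:", dict(address_counts(denom)))
```

Running it reproduces the per-column counts listed above, which is a quick way to check any alternative allocation (for example, one without the 16:m groups).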
I question the value of providing the 16:m groups. Eliminating them would free up more register addresses for the other groups. However, for now we will continue to consider them.
**Use of groups that are not “SEW instruction” addressable from any LMUL level**
I included 2:2, 4:4 and 8:8 groups even though they conceptually exist in the LMUL=1 level and are not addressable by LMUL=1 (for reasons explained before).
This would be like allowing LMUL=8 widening and narrowing
instructions.
These instructions would write to one of 2 register groups at 0
and 16, each using 16 base-arch registers.
I propose such a change in #397.
This is especially useful in fractional LMUL, as more address space is available to the LMUL=1/2 instructions. Further, without these groups there is no widening or narrowing for LMUL=1/2 through fractional groups.
**Mixed group type usage**
SEW instructions have access to 16 fractional registers and 16
fractional groups at all levels except LMUL=1/16.
This raises the question of interoperability of these mismatched
structured operands.
I propose:
For op.vv SEW instructions
- the register numbers identify the category of each argument.
- when both input operands are of the same category the arguments are processed accordingly
- when the operand categories mismatch, the lowest grouping is used for both arguments.
For op.vs and op.vx instructions
- the vector operand category is determined by its register number
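
As a rough sketch of the op.vv rule above: each operand's category is treated as an n:m pair, and on a mismatch both operands are processed with the lower grouping. Note the assumption that "lowest grouping" means the smallest m (fewest fractional registers per group); the `Category` type and `resolve_vv` name are hypothetical and only serve the illustration.

```python
from typing import NamedTuple


class Category(NamedTuple):
    fraction: int  # n in n:m, i.e. the group is built from 1/n registers
    depth: int     # m in n:m, number of fractional registers in the group


def resolve_vv(vs1: Category, vs2: Category) -> Category:
    """Category used to process both inputs of an op.vv instruction."""
    if vs1 == vs2:
        # Same category: operands are processed as they are.
        return vs1
    # Mismatch: assume "lowest grouping" is the category with the smallest m.
    return min(vs1, vs2, key=lambda c: c.depth)


if __name__ == "__main__":
    # LMUL=1/4: a 16:4 operand combined with an 8:2 operand.
    print(resolve_vv(Category(16, 4), Category(8, 2)))
    # LMUL=1/4: an 8:2 operand combined with a plain 4:1 register.
    print(resolve_vv(Category(8, 2), Category(4, 1)))
```

For op.vs and op.vx there is nothing to resolve, since only the vector operand carries a category.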
For widening instructions the same rules hold. The vd must be a multiple of twice the input category depth. But it is not constrained to be one of those addressable from the next higher LMUL.
For narrowing operations,
- the source register number determines the category using the next higher LMUL
- the target register determines the category using the current LMUL.
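
The widening and narrowing rules above could be checked along these lines. This is a sketch under several assumptions: "input category depth" is taken to be m in n:m, "next higher LMUL" is taken to be twice the current fraction, only fractional levels are covered, and the helper names (`classify`, `widening_vd_ok`, `narrowing_categories`) are invented for the example.

```python
def classify(lmul_denom: int, a: int) -> tuple[int, int]:
    """Group type (n, m) selected by address `a` at LMUL = 1/lmul_denom."""
    if lmul_denom not in (2, 4, 8) or not 0 <= a < 32:
        raise ValueError("unsupported LMUL or register address")
    # Assumption: the finest fraction is 1/16, so groups stop at 16:m.
    max_k = {2: 3, 4: 2, 8: 1}[lmul_denom]
    k = 0
    while k < max_k and a % (2 << k) == 0:
        k += 1
    return (lmul_denom << k, 1 << k)


def widening_vd_ok(lmul_denom: int, vs: int, vd: int) -> bool:
    """vd must be a multiple of twice the depth of the input category."""
    _, depth = classify(lmul_denom, vs)
    return vd % (2 * depth) == 0


def narrowing_categories(lmul_denom: int, vs2: int, vd: int):
    """Source category from the next higher LMUL, target from the current one."""
    if lmul_denom not in (4, 8):
        raise ValueError("sketch covers current LMUL = 1/4 and 1/8 only")
    source = classify(lmul_denom // 2, vs2)  # next higher LMUL
    target = classify(lmul_denom, vd)        # current LMUL
    return source, target


if __name__ == "__main__":
    # LMUL=1/2, input v2 is a 4:2 group, so vd must be a multiple of 4.
    print(widening_vd_ok(2, vs=2, vd=4), widening_vd_ok(2, vs=2, vd=6))
    # LMUL=1/4 narrowing from v4 (8:4 at LMUL=1/2) into v2 (8:2 at LMUL=1/4).
    print(narrowing_categories(4, vs2=4, vd=2))
```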
SLEN and fracSLEN will be in the next installment:
to be continued .....