[RISC-V] [tech-*] STRATEGIC FEATURE COEXISTENCE was: ([tech-fast-int] usefulness of PUSHINT/POPINT from [tech-code-size])


David Horner
 

These are all important considerations.

However, what they have in common when considering Allen's question:

This discussion is bringing up an issue that needs wider discussion about extensions in general.

is that they are all tactical considerations in the context of our current framework of instruction space allocation. What we will find is that these trade-off considerations reinforce the dilemma that Allen raises: how do we manage these conflicting "necessities/requirements" of different target environments?

As I have hinted already, we need not only a tactical analysis of feature trade-offs in different domains, but also a strategic approach to supporting them.

The concern is nothing new. It has been raised, if only obliquely, many times before on the [Google] groups.riscv.org lists (-dev and -sw especially) and the lists.riscv.org TG threads.
The vector group, especially, has grappled with it in the context of the current V encoding being a subset of a [hypothetical] 64-bit encoding.

Specific proposals have been mentioned, but at the time there was no political will or, perhaps more fairly, no common perception that there was a compelling reason to address it systematically. The [then] common thinking was that the 48- and 64-bit instruction spaces would be used as the 32- and 16-bit spaces are exhausted, and everyone would be happy. Well, that naive hope has not materialized, and many now envision clashes that will hurt RISC-V progress, through either fragmentation or stagnation, as tactical approaches and considerations are implemented or debated.

Previously two major strategic approaches were hinted at, even if they were not outright proposed.

Hardware Support - this has been explicitly proposed in many flavours, and is currently in the minds of many.
     The idea is a mode shift analogous to
        ARM's transition to Thumb and back, and
        Intel's myriad of operating modes: real, protected, virtual, long, and their disparate instantiations.
     I agree that implementations should have considerable freedom in how they provide hardware-selectable functionality.
     However, a framework to support that should be provided by riscv.org.
     Recent discussion and document tweaks about misa (the Machine ISA register) suggest that this mechanism,
          though valuable, is inadequate as robust support for the explosion of features.
     An expanded framework will be necessary, perhaps along the lines of the two-level performance counter definitions.
     The conflict of mapping overlapping groups of instructions onto the same encoding space is not easily addressed by this mechanism; the sketch below illustrates the gap.
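
To make that last point concrete, here is a minimal Python sketch (not from the thread). The only real convention assumed is misa's one-enable-bit-per-extension-letter layout (bit 0 = 'A' through bit 25 = 'Z'); the colliding extension names and the contested opcode value are invented for illustration:

```python
# Sketch: a misa-style enable vector can say *whether* an extension is on,
# but not *which* of two enabled extensions owns a contested opcode.

def misa_has(misa: int, letter: str) -> bool:
    """Check the per-extension enable bit (bit 0 = 'A' ... bit 25 = 'Z')."""
    return bool((misa >> (ord(letter.upper()) - ord("A"))) & 1)

misa = (1 << 8) | (1 << 12) | (1 << 5)            # 'I', 'M', 'F' enabled
print(misa_has(misa, "F"), misa_has(misa, "V"))   # True False

# Hypothetical: two future extensions whose default encodings land on the
# same major opcode (all names and opcode values below are made up).
OPCODE_CLAIMS = {0b1011011: ["Xpushpop", "Xdsp"]}

def decode_major(opcode: int, enabled) -> str:
    claimants = [ext for ext in OPCODE_CLAIMS.get(opcode, []) if ext in enabled]
    if len(claimants) > 1:
        # An enable-bit vector alone cannot say which semantics apply here.
        return "ambiguous: " + "/".join(claimants)
    return claimants[0] if claimants else "reserved/illegal"

print(decode_major(0b1011011, {"Xpushpop"}))            # unambiguous
print(decode_major(0b1011011, {"Xpushpop", "Xdsp"}))    # the unresolved conflict
```

An enable vector can gate features on and off, but once two enabled features claim the same encoding, something beyond misa has to decide which mapping is in force.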

which leads us to

Software Support:

The Generalized Proposal:
Under this proposal, future extensions are not mapped to a fixed, exclusive, universal encoding,
but rather to an appropriately sized [based initially on the 32-bit isize] minor [22-bit], major [25-bit], or quadrant [30-bit] encoding,
which is allocated to a concrete instruction encoding at link/load time to match the hardware [or the hardware's dynamic configuration, as above].
This handles the green-field encodings.
Each feature could have a default minor/major/quadrant encoding designation.
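
As a sketch of what "default designation plus link/load-time allocation" could look like, here is a small Python illustration; the feature names, slot numbers, and free-slot list are all made up, and only major-opcode slots are handled to keep it short:

```python
# Sketch, under the proposal: each feature carries a *default* designation
# (minor/major/quadrant slot size) rather than a fixed universal opcode, and
# the linker/loader picks a concrete slot that the target hardware supports.
# All feature names and slot numbers below are illustrative placeholders.

from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureDesignation:
    name: str
    kind: str          # "minor" (~22-bit payload), "major" (~25-bit), or "quadrant" (~30-bit)
    default_slot: int  # slot used when no link/load relocation is performed

# Hypothetical catalogue of future extensions.
CATALOGUE = [
    FeatureDesignation("Xpushpop", "major", default_slot=0x0B),
    FeatureDesignation("Xdsp",     "major", default_slot=0x0B),  # collides with the default above
    FeatureDesignation("Xcrypto",  "minor", default_slot=0x2A),
]

def allocate(features, free_major_slots):
    """Give each requested 'major' feature a concrete slot on this hardware."""
    free = list(free_major_slots)
    mapping = {}
    for f in features:
        if f.kind != "major":
            continue  # minor/quadrant allocation would follow the same pattern
        slot = f.default_slot if f.default_slot in free else free[0]
        free.remove(slot)
        mapping[f.name] = slot
    return mapping

# Two features that collide on their default slot end up on disjoint slots.
print(allocate(CATALOGUE[:2], free_major_slots=[0x0B, 0x2B, 0x5B]))
# -> {'Xpushpop': 11, 'Xdsp': 43}   (i.e. 0x0B and 0x2B)
```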

Brown-field encodings can also be managed: simply, if the related co-encoded feature is present; with more complexity, and perhaps extensive opcode remapping, if they are blended into another feature's encodings.

An implementation method would be to assign a fixed, exclusive, universal prefix to each feature.
Each instruction would then be emitted by the compiler as a [prefix]:[instruction with default encoding] pair.
If the initial prefixes are also nops [most of which are currently designated as hints],
then the code would be executable on machines that use the default mapping
without any link/load intervention [at lower performance, granted].
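
A minimal sketch of that link/load-time step, assuming only that 32-bit instructions keep their major opcode in bits [6:0]; the prefix words, the example instruction, and the per-feature slot table are placeholders rather than proposed encodings:

```python
# Sketch: rewrite the [prefix]:[instruction] pairs emitted by the compiler so
# each instruction's major opcode matches this hardware's allocation for that
# feature. On hardware that uses the default mapping, the hint/nop prefixes
# simply execute as-is and no rewriting is needed.

# Placeholder 32-bit prefix words, assumed to sit in the hint/nop space.
FEATURE_PREFIX = {
    0x0000_F013: "Xpushpop",
    0x0000_E013: "Xdsp",
}

OPCODE_MASK = 0x7F  # major opcode field: bits [6:0] of a 32-bit instruction

def relocate(words, hw_slot_for_feature):
    """hw_slot_for_feature: feature name -> major opcode allocated on this hardware."""
    out, i = [], 0
    while i < len(words):
        feature = FEATURE_PREFIX.get(words[i])
        if feature is not None and i + 1 < len(words):
            insn = words[i + 1]
            new_opcode = hw_slot_for_feature.get(feature, insn & OPCODE_MASK)
            out.append(words[i])                            # keep (or later elide) the prefix
            out.append((insn & ~OPCODE_MASK) | new_opcode)  # swap in the allocated major opcode
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

code = [0x0000_F013, 0x0200_005B, 0x0000_0013]  # prefix, placeholder Xpushpop insn, plain nop
print([hex(w) for w in relocate(code, {"Xpushpop": 0x2B})])
# -> ['0xf013', '0x200002b', '0x13']
```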

This approach is backward compatible with the other established extensions:
most notably F, which consumes 7 major opcode spaces [and *only* 5 with Zfinx (Zifloat?)], and
AMO, which also consumes the majority of a major opcode.

This strategic change has a number of immediate and significant benefits:
  1) custom-reserved major opcodes effectively become unreserved, as "standard" extensions can be mapped there as well.
       Their custom-reserved nature then applies only to the designated default allocation; "standard" extensions will not default to them.
  2) as mentioned above, if the prefix is a nop then link/load support is not needed for direct execution [only for efficiency].
  3) the transition to wider encodings can be simplified, as easily as the compiler emitting, for that feature, the designated prefix that encodes for 64-bit instructions.
So, two assigned fixed exclusive encodings per feature may be useful: one a 64-bit encoding and one a nop (sketched below).
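
A small sketch of benefit 3 (again Python, with both prefix words invented for illustration): the compiler picks, per feature, either the nop/hint prefix for portable 32-bit pairs or the alternate prefix that routes the same operation into a wider format:

```python
# Sketch of benefit 3: each feature owns two fixed prefix encodings -- a
# hint/nop prefix (pairs run unmodified on default-mapping hardware) and a
# prefix that routes the same operation into the 64-bit instruction format.
# Both values are placeholders chosen only for illustration.

PREFIXES = {
    "Xpushpop": {"nop32": 0x0000_F013,   # hint-space placeholder
                 "fmt64": 0x0000_003F},  # placeholder suggesting the 64-bit length encoding
}

def emit_pair(feature: str, insn_word: int, route: str = "nop32"):
    """Compiler side: choose which of the feature's two prefixes to emit."""
    return [PREFIXES[feature][route], insn_word]

print([hex(w) for w in emit_pair("Xpushpop", 0x0200_005B)])           # portable 32-bit pair
print([hex(w) for w in emit_pair("Xpushpop", 0x0200_005B, "fmt64")])  # 64-bit-format route
```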

I do not intend to stifle any of the tactical discussions of the co-usefulness of features and profile domains.
These are meaningful and useful considerations.

Rather, I hope that by having a framework for the coexistence of features, those discussions can proceed in a more guided way,
so that discoveries can be incorporated into a framework-centric corpus of understanding of the trade-offs and cooperative benefits of features/profiles.


On 2020-10-23 11:45 p.m., Robert Chyla wrote:
I agree with Greg's statements. For me, 'code-size' is very important for small, deeply embedded/IoT-class systems.

Work in other groups (bitmanip) will also benefit code size, but I think it is not their primary focus, as those extensions also improve code speed.

Linux-like big processors usually have DDR RAM, so code size is 'unlimited'.
It should not hurt, as code-size advances will also benefit such big systems, but we should not forget about the 'cheap to implement' = 'logic size' factors.

IMO 'code-size' and 'code-speed' will be pulling the same rug (ISA space) in opposite directions. We must balance it properly - having the rug in one piece is IMO most important.

Regards,
/Robert

On 10/23/2020 5:11 PM, Greg Favor wrote:
It seems like a TG, probably through the statement of its charter, should clearly define what types or classes of systems it is focused on optimizing for (if there is an intended focus) and what types or classes of systems it does not expect to be appropriate for.   More concretely, it seems like there are a few TG's developing extensions oriented towards embedded real time systems and/or low-cost embedded systems.  These are extensions that would probably not be implemented in full-blown Linux-class systems.  Those extensions don't need to worry about being acceptable to such system designs, and can optimize for the requirements and constraints of their target class(es) of systems.

Unless I'm mistaken, this TG falls in that category.  And as long as the charter captures this, then the extension it produces can be properly evaluated against its goals and target system applications (and not be judged wrt other classes of systems).  And key trade-off considerations - like certain types of implementation approaches being acceptable or unacceptable for the target system applications - should probably be agreed upon early on.

Greg

On Fri, Oct 23, 2020 at 4:34 PM Allen Baum <allen.baum@...> wrote:
This discussion is bringing up an issue that needs wider discussion about extensions in general.
RISC-V is intended to be an architecture that supports an extremely wide range of implementations,
ranging from very low gate count microcontrollers, to high end superscalar out-of-order processors.
How do we evaluate an extension that only makes sense at one end or the other?

I don't expect a vector, or even hypervisor extensions in a low gate count system.
There are other extensions that are primarily aimed at specific application areas as well.

A micro sequenced (e.g. push/pop[int]) op might be fairly trivial to implement in a low gate count system
(e.g. without VM, but with PMPs) and have significant savings in code size, power, and increased performance.
They may have none of those, or less significant, advantages in a high end implementation --
and/or might be very difficult or costly to implement in them (e.g. for TLB miss, interrupt, & exception handling)
 (I am not claiming that these specific ops do, but just pretend there is one like that)

Should we avoid defining instructions and extensions like that? 
Or just allow that some extensions just don't make sense for some class of implementation?
Are there guidelines we can put in place to help make those decisions? 
This same (not precisely the same) kind of issue is rearing its head in other places, e.g. range-based CMOs.


--
Regards,
Robert Chyla, Lead Engineer, Debug and Trace Probe Software
IAR Systems
1211 Flynn Rd, Unit 104
Camarillo, CA  93012 USA
Office: +1 805 383 3682 x104
E-mail: Robert.Chyla@... Website: www.iar.com


Allen Baum
 

Are we talking about something that is effectively bank switching the opcodes here?
Something like that was proposed very early on, using a CSR (like MISA maybe - the details are lost to me) to enable and disable them.
The specific issue that brought it up is if someone developed a custom extension, did a lot of work, and then some other extension came along that stepped on those opcodes - and the implementation wanted to use both of them.
The author thought it was pretty obvious this kind of thing was going to happen. I don't think that exact scenario will, but running out of standard 32b opcodes with ratified extensions might. 
We're already starting to look at the long tail - extensions that are specialized to specific workloads, but highly advantageous to them.
I'm guessing we will get to the point that these extensions will not have to coexist inside a single app, though - so a bank switching approach (non-user mode at the least, perhaps not within an app at all) could potentially work, but it sounds ugly to make the tools understand the configuration.




David Horner
 

My take: this is analogous to ASCII (7-bit) and EBCDIC (8-bit) both competing in the 8-bit, byte-addressable character space.

Initial solutions were fragmentation, then code pages (selectable character sets).

Eventually Unicode became the standard that allowed universal adoption and definition, and down-sizing to domains that needed a specific 8-bit byte encoding/mapping: printers, ttys, etc.

Just as the C extension initially relied on the linker/loader to do the code replacement, so too would the uni-op-code approach initially rely on the linker/loader.

As the tool chain becomes more sophisticated, software that self-conforms to the hardware configuration (à la Linux) will be developed.



David Horner
 


On 2020-10-26 12:48 a.m., Allen Baum wrote:
Are we talking about something that is effectively bank switching the opcodes here?
That is one approach. It is a consideration that has recently been mentioned wrt misa.
Something like that was proposed very early on, using a CSR (like MISA maybe - the details are lost to me) to enable and disable them.

I remember Luke Kenneth Casson Leighton <lkcl@...> was in on the discussions.

A variety of CSR and related approaches were considered.

The specific issue that brought it up is if someone developed a custom extension, did a lot of work, and then some other extension came along that stepped on those opcodes - and the implementation wanted to use both of them.
The author thought it was pretty obvious this kind of thing was going to happen. I don't think that exact scenario will, but running out of standard 32b opcodes with ratified extensions might.

Exactly.

Also, in lkcl's case, the "vectorization" extension of all opcodes is [was proposed to be] of this nature.

We're already starting to look at the long tail - extensions that are specialized to specific workloads, but highly advantageous to them.
I'm guessing we will get to the point that these extensions will not have to coexist inside a single app, though - so a bank switching approach (non-user mode at the least, perhaps not within an app at all) could potentially work, but it sounds ugly to make the tools understand the configuration.
Agreed. Hence the uni-op-code approach, which can co-exist with any of these strategies but provides a framework to manage them (just as ASCII and EBCDIC extensions are comparably managed).

