Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Brian Grayson
 

I noticed another typo that I don't think has been pointed out -- reuse of bit 59.

bit [59]  VSINH       -  If set, then counting of events in VS-mode is inhibited
bit [59]  VUINH       -  If set, then counting of events in VU-mode is inhibited

Brian

On Mon, Feb 1, 2021 at 2:08 PM Greg Favor <gfavor@...> wrote:
On Mon, Feb 1, 2021 at 10:00 AM Brian Grayson <brian.grayson@...> wrote:
Given the discussions about cache-ops and the name for them on tech-cmo and the desire to avoid "co", "COF" (which can also mean "change of flow") may not be the best choice for the extension short-name. What about just "Sshpm", as this extension is what really allows the HPM to be well-utilized by tools like perf? Or is that too confusing since hpm already exists?

Using 'hpm' probably would be a bit confusing.  But I'll look into alternatives.  Btw, since a new extension naming standard is being developed, ultimately this (and all other extensions) will need to conform to the new scheme (although the 'Ss' part of this name is expected to be consistent with that new scheme).  Also note that CMO group extensions will have "Zi*" names and the concern over use of "co" or "cop" as a root name was particularly in that context (i.e. wrt other Unpriv spec extensions, while this extension is in the "S" name space for Priv extensions).  But in any case I'll explore alternatives that may be acceptable.
 
Is there a reason there is no mcountovf? It would simplify the software for an M-mode tool, and for cores that don't have an S-mode.

This has been discussed (with the lead architects; I'll stop repeatedly mentioning this).  And in standard RISC philosophy form, it was considered to have insufficient justification.  For a core with S-mode and if M-mode wants to examine the bits for counters that have not been "delegated" down to S-mode via mcounteren, then M-mode can either use a three-instruction sequence to read a version of scountovf unaffected by mcounteren, or it can directly check the individual mhpmevent.OF bits that it cares about.  The latter also applies for a core without S-mode that implements this extension.  (Further, I imagine a "no S-mode" CPU probably only implements a small number of counters.)
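The second option above (checking the individual mhpmevent.OF bits) amounts to assembling the overflow bitmap by hand. A minimal sketch in Python, modeling the CSR values as plain integers rather than reading real hardware. The OF bit position (63) and the one-bit-per-counter-index layout of the result are assumptions made for illustration; neither is stated in this thread, and `overflow_bitmap` is a hypothetical helper name.

```python
# Assumed position of the OF flag inside each mhpmevent value (not from this thread).
OF_BIT = 63


def overflow_bitmap(mhpmevent):
    """Build an overflow bitmap from individual event-selector values.

    mhpmevent maps a counter index (3..31, i.e. mhpmevent3..mhpmevent31)
    to the integer value read from that CSR. Bit i of the result is set
    when counter i has its OF flag set.
    """
    bitmap = 0
    for index, value in mhpmevent.items():
        if not 3 <= index <= 31:
            raise ValueError("mhpmevent index must be in 3..31")
        if (value >> OF_BIT) & 1:
            bitmap |= 1 << index
    return bitmap
```

M-mode software that only cares about a few counters would pass just those indices, which is the case where checking the bits one by one is cheap.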
 
How is overflow defined for an implementation that implements 32<n<64 bits in the counter registers? Although the registers are architecturally 64 bits, an implementation may not want to support all of them.

The Priv spec says "The mhpmcounters are WARL registers that support up to 64 bits of precision".  This allows complete flexibility for how many implemented bits there are.

Since count values are defined as unsigned, there is always an equivalent unsigned 64-bit current count value irrespective of the implemented size.  So overflow is well-defined (modulo the issue down below).
 
Mandating full 64-bit counters may make an implementation area-prohibitive for the smallest of perfmon-enabled embedded cores. I think this could be specified like this: "An implementation may implement fewer than 64 bits for the hpmcounter CSRs. On such an implementation, software can query the bit width of the hpmcounter registers by taking advantage of the WARL behavior: writing all 1's and reading back to see which bits retained the set value.

This would be an issue to raise with the existing Priv spec, not with this extension.  But as noted above, this isn't really an issue since it is already comprehended by the Priv spec.
 
Also, on such implementations, overflow is defined to occur when the highest implemented bit transitions from 1 to 0." Given that, software can do the right thing regardless of implemented bit width.

Good point.  The current proposed definition doesn't properly comprehend the WARL nature of the hpmcounter registers.  I'll switch to a definition along the lines of what you describe (I agree that that is what is needed).  Count values remain as unsigned values and overflow is unsigned overflow of the implemented bits.
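The width probe and the "overflow of the implemented bits" definition discussed above can be illustrated with a toy model. This is a sketch in Python, not hardware code: `WarlCounter` and `probe_width` are hypothetical names, and the model assumes the implemented bits are the contiguous low-order bits of the register.

```python
class WarlCounter:
    """Toy model of an hpmcounter CSR with fewer than 64 implemented bits."""

    def __init__(self, implemented_bits):
        if not 1 <= implemented_bits <= 64:
            raise ValueError("implemented_bits must be in 1..64")
        self.mask = (1 << implemented_bits) - 1
        self.value = 0

    def write(self, value):
        # WARL behavior: bits that are not implemented do not retain the write.
        self.value = value & self.mask

    def read(self):
        return self.value

    def increment(self, n=1):
        """Advance the count; return True on unsigned overflow of the implemented bits."""
        total = self.value + n
        self.value = total & self.mask
        return total > self.mask


def probe_width(counter):
    """Write all 1s, read back, and report how many bits retained the value."""
    saved = counter.read()
    counter.write((1 << 64) - 1)
    width = counter.read().bit_length()
    counter.write(saved)  # restore the original count
    return width
```

For a 48-bit counter holding all 1s, a single increment wraps the value to 0 and reports overflow, which is the point at which the highest implemented bit goes from 1 to 0.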
 
Greg


Re: Preferred manner of supporting bus errors in RISC-V

Greg Chadwick
 

Hello,

Thanks for raising this Arjan, it's been a low-priority item on my TODO list to
open a discussion on bus errors for a while now (I work on Ibex amongst other
things at lowRISC).

I think RISC-V should allow implementations to choose whether they want
precise or imprecise bus errors, which I think is the case now. However, as you
point out, the specification is pretty silent on the matter. Some wording around
what the possibilities might be, and ensuring the specification doesn't prevent
certain options from working without good reason, seems prudent.

In particular we have the issue of the mcause exception code for bus errors that
you raise. I believe codes 1, 5 and 7 are meant to be PMP faults only. Ibex is
non-conforming at the moment due to its use of the same code for both PMP and
bus errors. I think SweRV may do the same (look at the EH1 source here:
https://github.com/chipsalliance/Cores-SweRV/blob/7332edc0adaa7e9a0c842d169154429e8d987786/design/lsu/lsu_lsc_ctl.sv#L211
when generating its exception packet it combines access and bus errors together
and only alters type for misaligned or not).  The Andes/Gowin N25 also looks to
use the PMP mcause codes for precise bus errors (see page 87 of
https://www.gowinsemi.com/upload/database_doc/586/document_ja/5de4c10ca33c9.pdf).

I don't really mind if we introduce a new code here or broaden the definition of
'access fault' to include non-PMP errors like bus errors. It could even be left
implementation-defined, though I'd prefer a specification-defined bus error
mcause.
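To make the ambiguity concrete, here is a small sketch in Python of how a trap handler might classify an mcause value. It relies on the standard mcause layout (Interrupt flag in the most significant bit, exception code in the remaining bits) and the three access-fault codes named in this thread; the function names are hypothetical.

```python
# Access-fault exception codes discussed in this thread.
ACCESS_FAULTS = {
    1: "instruction access fault",
    5: "load access fault",
    7: "store/AMO access fault",
}


def decode_mcause(mcause, xlen=64):
    """Split mcause into (is_interrupt, code) for the given XLEN."""
    is_interrupt = bool((mcause >> (xlen - 1)) & 1)
    code = mcause & ((1 << (xlen - 1)) - 1)
    return is_interrupt, code


def classify(mcause, xlen=64):
    """Return a coarse description of the trap cause."""
    is_interrupt, code = decode_mcause(mcause, xlen)
    if is_interrupt:
        return "interrupt"
    return ACCESS_FAULTS.get(code, "other exception")
```

On a core that reports both PMP violations and precise bus errors with code 5, `classify` returns "load access fault" in both cases, so software cannot tell the two apart from mcause alone. A dedicated bus-error code would simply add another entry to the table.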

I did also have some concerns around how precise bus errors interact with
interrupts. In particular, if you have an outstanding memory access (that may or
may not see a bus error) and receive an interrupt, is it permissible to
effectively ignore the interrupt until the potential bus error is resolved?
Again I think the specification gives implementations room to do different
things here, as it's up to the implementation how an interrupt becomes pending
(see some extensive discussion here:
https://github.com/riscv/riscv-isa-manual/issues/544). Some extra wording
somewhere to make it clear this is a possibility could be useful.

Cheers,

Greg Chadwick


Preferred manner of supporting bus errors in RISC-V

Arjan Bink
 

Hi all,

We want to add support for ‘bus errors’ in our RISC-V design (e.g. signaled via AXI bresp/rresp signals). I studied a couple of different RISC-V architectures and I do not see a common approach for dealing with this.

Some examples:

  • SiFive uses a ‘bus error unit’ that converts bus errors into regular interrupts
  • Ibex implements precise bus errors and causes exceptions using RISC-V defined mcause exception codes (i.e. instruction access fault (exception code 1), load access fault (exception code 5), store/AMO access fault (exception code 7))
  • SweRV-EL2 maps imprecise bus errors onto custom NMIs (and they also have precise bus errors).

The RISC-V Privileged specification hardly mentions this topic, but has the following quotes that might be related:

“Non-maskable interrupts (NMIs) are only used for hardware error conditions”

“Precise PMA traps might not always be possible, for example, when probing a legacy bus architecture that uses access failures as part of the discovery mechanism. In this case, error responses from slave devices will be reported as imprecise bus-error interrupts.”

In our design we will have a PMP (so exception codes 1, 5, 7 are used to report precise PMP exceptions), precise instruction bus errors, and imprecise data bus errors. What is the intended manner of dealing with these precise instruction bus errors and imprecise data bus errors? Should we cause NMIs for them? Should we map them to a regular exception (non-interrupt) with mcause exception codes 1, 5, 7? That would be confusing, as software could then not distinguish them from the PMP errors, and codes 5 and 7 would be used for both precise PMP exceptions and imprecise data bus exceptions. Usage of an external ‘bus error unit’ does not seem appropriate as it could easily cause an interrupt on a speculative (and never actually executed) instruction fetch.

So, is there any common or recommended manner of dealing with bus errors?

Best regards,

Arjan


Re: RISC-V H-extension freeze consideration

Greg Favor
 

I generally agree with Andrew.  At the same time I'll also observe that, practically speaking, the AIA is coming soon and it very much directly interacts with the H extension.  So waiting a little longer to see that at least stabilize is a good (and probably necessary) compromise.  (Past that we can then come back to arguing when to draw the line on freezing the H extension spec.)  Further, there are other extensions happening now and over the next few months (from the virt-mem group, pointer masking from the J group, and a couple of fast-track extensions) that it would be good to stabilize if not freeze in conjunction with the H extension.  So, in my own opinion, we're getting close.  Not a few weeks, but not quarters either.  (I'll also say that the "pressure is on" to intelligently try and get through this period of time sooner rather than later.)

Having all these things that interact with virtualization being finalized together (I'm being loose for now wrt official "stable" versus "freeze" milestones, to focus on the general idea) is a good thing for the reasons Andrew mentioned.  Most important (risk-wise) to me is seeing the virt-mem extensions and AIA stuff stabilize.

Now, when it comes to the IOMMU, its architecture needs to (or strongly should) follow the CPU virtualization architecture.  But I think it is an acceptable compromise to not hold up all the preceding because of the IOMMU architecture.  I see very low risk of realizing from the IOMMU architecture that something in the H extension should have been done differently.  Maybe some extra feature will be identified, but that could be done as an extension to the H extension (and I think that is also low risk of happening).  I'll also note that the AIA will cover how an IOMMU handles virtualization of I/O interrupts (aka MSIs).  Which leaves normal translation of I/O addresses to follow the mold of the Supervisor and Hypervisor architectures.  (And, for completeness, many of the other "interesting" aspects of an IOMMU architecture I believe can and should comport with the Supervisor and Hypervisor architectures as needed.)

In short, I think a reasonable compromise is to wait a "little" bit longer for most of the above "coming soon" things, and to decouple the IOMMU timeline from freezing the H extension and related extensions.

Greg


Re: RISC-V H-extension freeze consideration

Andrew Waterman
 

In other architectures, those devices are needlessly complex in part because they weren’t co-designed with the ISA. Yes, they can be independently designed, but possibly with regrettable consequences.


Re: RISC-V H-extension freeze consideration

Anup Patel
 

On all major architectures (x86 and ARM64), the virtualization-aware interrupt controllers and IOMMUs are totally independent from ISA virtualization support.

We already have the required ISA support in H-extension for a virtualization-aware interrupt controller.

The IOMMUs are totally independent of CPU virtualization support on all major architectures and I don’t see how H-extension needs to change for IOMMU support.

Regards,

Anup


Re: RISC-V H-extension freeze consideration

Andrew Waterman
 



On Tue, Feb 2, 2021 at 7:54 PM Andrew Waterman <andrew@...> wrote:
I’m not in support of freezing it yet. My concern is that development of virtualization-aware interrupt controllers and IOMMUs will lead to reconsideration of some of the details. All of these items are logically interlocking, even if physically disjoint. It’s entirely possible that we will make no changes as a result of that further development, but it’s far from certain.

Furthermore, the hypervisor extension is of substantially greater with those other items completed, so we aren’t losing out as much as it might seem by postponing the freeze.

* substantially greater utility

Shouldn’t write emails on phone...


Re: RISC-V H-extension freeze consideration

Andrew Waterman
 

I’m not in support of freezing it yet. My concern is that development of virtualization-aware interrupt controllers and IOMMUs will lead to reconsideration of some of the details. All of these items are logically interlocking, even if physically disjoint. It’s entirely possible that we will make no changes as a result of that further development, but it’s far from certain.

Furthermore, the hypervisor extension is of substantially greater utility with those other items completed, so we aren’t losing out as much as it might seem by postponing the freeze.


RISC-V H-extension freeze consideration

Anup Patel
 

Hi All,

The RISC-V H-extension v0.6.1 draft was released almost a year back in
May 2020. There have been no changes in the H-extension specification
since then.

Meanwhile, we have RISC-V H-extension v0.6.1 implemented in QEMU,
Spike, and Rocket FPGA. We also have three different hypervisors ported
to RISC-V H-extension v0.6.1:
1. Xvisor RISC-V (Works on QEMU, Spike, and Rocket FPGA)
2. KVM RISC-V (Works on QEMU, Spike, and Rocket FPGA)
3. Bao (Works on Rocket FPGA)

Unfortunately, the RISC-V H-extension not being in a frozen state is now gating
further software development because major open source projects (such
as Linux RISC-V and GCC RISC-V) have adopted a policy of accepting patches
only for frozen or ratified RISC-V extensions.

A few examples of gated software development:
1. KVM RISC-V is not merged in upstream Linux RISC-V. The KVM RISC-V
   patches were already reviewed and acked by maintainers in July 2019.
   We have been rebasing the KVM RISC-V patches with every kernel
   release for 1.5+ years.
2. GCC RISC-V is not accepting patches for H-extension related instructions.
3. KVMTOOL RISC-V is not merged because KVM RISC-V is not merged in
   upstream Linux RISC-V.
4. QEMU KVM RISC-V acceleration is not merged because KVM RISC-V is
   not merged in upstream Linux RISC-V.
5. Various feature additions (such as SBI v0.2, nested, etc.) can't happen
   (or can't be merged) until KVM RISC-V is merged in Linux RISC-V.
6. The Libvirt library is blocked on QEMU KVM RISC-V acceleration being
   available. The Libvirt library is a crucial piece in open-source cloud
   solutions (such as OpenStack).
7. As time passes, more items (such as KVM RISC-V AIA support) will
   get blocked if KVM RISC-V is not merged upstream.

We would request the TSC to consider freezing the RISC-V H-extension v0.6.1
draft specification. The remaining items in the done checklist for ratification can
certainly be completed while the H-extension is in the frozen state.

Best Regards,
Anup Patel


Re: [RISC-V] [tech-unixplatformspec] [RISC-V] [tech-privileged] [Announcement] Successful KVM RISC-V bring up on FPGA (Rocket core with H extension)

atishp@...
 

On Tue, 2021-02-02 at 03:14 -0800, mark wrote:
congrats!

can we send something out to tech-announce about it?

Sure. I will send that once I have the detailed instructions available
in the public domain.

--
Regards,
Atish


Re: [Announcement] Successful KVM RISC-V bring up on FPGA (Rocket core with H extension)

mark
 

congrats!

can we send something out to tech-announce about it?

--------
sent from a mobile device. please forgive any typos.



Re: [Announcement] Successful KVM RISC-V bring up on FPGA (Rocket core with H extension)

Andrew Waterman
 

Awesome!




[Announcement] Successful KVM RISC-V bring up on FPGA (Rocket core with H extension)

atishp@...
 

Hi,
We are glad to announce that we are able to boot Linux in a KVM guest on
an FPGA (Rocket chip + H extension v0.6.1). We now have three
hypervisors working on hardware with the H extension.

1. KVM [1]
2. Xvisor [2]
3. Bao [3]

KVM bring-up was done using FireSim, and the detailed instructions will
be available very soon. Here are the software versions used for
bring-up. Please find the attached boot log.

OpenSBI: v0.9
Linux kernel: 5.11-rc5 + KVM patches (v16) + a few kernel fixes [4].
Kvmtool: Upstream + RISC-V KVMTOOL patches (v6) [5]


We would like to thank Sandro & Jose, who implemented the H extension.
The Rocket-H design is available as an AFI image to be used within
FireSim or on a standalone FPGA board.

We would also like to thank Andrew/John/Greg & others for defining such
a clean specification as we did not discover any significant issues
while doing the bring up. As there are no changes proposed to the H
extension in the last year, we believe the current version of the H
extension can be considered as a freeze candidate. Please let us know
if that is not the case.

--
Regards,
Atish

[1] https://github.com/kvm-riscv/howto/wiki
[2] https://github.com/xvisor/xvisor
[3] https://github.com/sandro2pinto/bao-rocket-h-firesim
[4] https://github.com/atishp04/linux/commits/rocket_kvm_working
[5] https://github.com/kvm-riscv/kvmtool


Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Greg Favor
 

On Mon, Feb 1, 2021 at 10:00 AM Brian Grayson <brian.grayson@...> wrote:
Given the discussions about cache-ops and the name for them on tech-cmo and the desire to avoid "co", "COF" (which can also mean "change of flow") may not be the best choice for the extension short-name. What about just "Sshpm", as this extension is what really allows the HPM to be well-utilized by tools like perf? Or is that too confusing since hpm already exists?

Using 'hpm' probably would be a bit confusing.  But I'll look into alternatives.  Btw, since a new extension naming standard is being developed, ultimately this (and all other extensions) will need to conform to the new scheme (although the 'Ss' part of this name is expected to be consistent with that new scheme).  Also note that CMO group extensions will have "Zi*" names and the concern over use of "co" or "cop" as a root name was particularly in that context (i.e. wrt other Unpriv spec extensions, while this extension is in the "S" name space for Priv extensions).  But in any case I'll explore alternatives that may be acceptable.
 
Is there a reason there is no mcountovf? It would simplify the software for an M-mode tool, and for cores that don't have an S-mode.

This has been discussed (with the lead architects; I'll stop repeatedly mentioning this).  And in standard RISC philosophy form, it was considered to have insufficient justification.  For a core with S-mode and if M-mode wants to examine the bits for counters that have not been "delegated" down to S-mode via mcounteren, then M-mode can either use a three-instruction sequence to read a version of scountovf unaffected by mcounteren, or it can directly check the individual mhpmevent.OF bits that it cares about.  The latter also applies for a core without S-mode that implements this extension.  (Further, I imagine a "no S-mode" CPU probably only implements a small number of counters.)
 
How is overflow defined for an implementation that implements 32<n<64 bits in the counter registers? Although the registers are architecturally 64 bits, an implementation may not want to support all of them.

The Priv spec says "The mhpmcounters are WARL registers that support up to 64 bits of precision".  This allows complete flexibility for how many implemented bits there are.

Since count values are defined as unsigned, there is always an equivalent unsigned 64-bit current count value irrespective of the implemented size.  So overflow is well-defined (modulo the issue down below).
 
Mandating full 64-bit counters may make an implementation area-prohibitive for the smallest of perfmon-enabled embedded cores. I think this could be specified like this: "An implementation may implement fewer than 64 bits for the hpmcounter CSRs. On such an implementation, software can query the bit width of the hpmcounter registers by taking advantage of the WARL behavior: writing all 1's and reading back to see which bits retained the set value.

This would be an issue to raise with the existing Priv spec, not with this extension.  But as noted above, this isn't really an issue since it is already comprehended by the Priv spec.
 
Also, on such implementations, overflow is defined to occur when the highest implemented bit transitions from 1 to 0." Given that, software can do the right thing regardless of implemented bit width.

Good point.  The current proposed definition doesn't properly comprehend the WARL nature of the hpmcounter registers.  I'll switch to a definition along the lines of what you describe (I agree that that is what is needed).  Count values remain as unsigned values and overflow is unsigned overflow of the implemented bits.
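
The width probe and the overflow rule discussed in this reply can be modelled in a few lines. This is only a sketch in Python: the `Counter` class, its methods, and the 40-bit width are illustrative assumptions, not part of the proposal. It also assumes the implemented bits are the contiguous low-order bits. On real hardware the probe would be a CSR write of all ones followed by a CSR read.

```python
class Counter:
    """Toy model of a WARL hpmcounter with fewer than 64 implemented bits."""

    def __init__(self, implemented_bits):
        # Assumption: the implemented bits are the contiguous low-order bits.
        self.mask = (1 << implemented_bits) - 1
        self.value = 0

    def write(self, value):
        # WARL: unimplemented upper bits do not retain written values.
        self.value = value & self.mask

    def read(self):
        return self.value

    def increment(self, n=1):
        """Add n events; return True if the implemented bits overflowed."""
        total = self.value + n
        self.value = total & self.mask
        return total > self.mask


def probe_width(counter):
    """Write all ones and count how many bits read back as set."""
    saved = counter.read()
    counter.write((1 << 64) - 1)
    width = bin(counter.read()).count("1")
    counter.write(saved)  # restore the previous count
    return width


c = Counter(implemented_bits=40)   # 40 is an arbitrary example width
print(probe_width(c))              # 40
c.write((1 << 40) - 1)             # every implemented bit set
print(c.increment())               # True: highest implemented bit went 1 -> 0
print(c.read())                    # 0
```

With this definition, software that knows the probed width can preload a counter so that it overflows after a chosen number of events, whatever the implemented width is.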
 
Greg


Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Brian Grayson
 

Given the discussions about cache-ops and the name for them on tech-cmo and the desire to avoid "co", "COF" (which can also mean "change of flow") may not be the best choice for the extension short-name. What about just "Sshpm", as this extension is what really allows the HPM to be well-utilized by tools like perf? Or is that too confusing since hpm already exists?

I like the concept of putting overflow and filtering control into the mhpmevent registers -- single write to completely configure a counter.

Is there a reason there is no mcountovf? It would simplify the software for an M-mode tool, and for cores that don't have an S-mode.

How is overflow defined for an implementation that implements 32<n<64 bits in the counter registers? Although the registers are architecturally 64 bits, an implementation may not want to support all of them. Mandating full 64-bit counters may make an implementation area-prohibitive for the smallest of perfmon-enabled embedded cores. I think this could be specified like this: "An implementation may implement fewer than 64 bits for the hpmcounter CSRs. On such an implementation, software can query the bit width of the hpmcounter registers by taking advantage of the WARL behavior: writing all 1's and reading back to see which bits retained the set value. Also, on such implementations, overflow is defined to occur when the highest implemented bit transitions from 1 to 0." Given that, software can do the right thing regardless of implemented bit width.

Brian


Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Greg Favor
 

On Mon, Feb 1, 2021 at 6:29 AM Phil McCoy <pnm@...> wrote:
Could you clarify how this extension interacts with mideleg?  I assume interrupt 13 would be taken in M-mode by default unless it is delegated to S-mode, but it would be nice to state this explicitly.

You're correct.  Standard mideleg functionality applies.  I'll incorporate a clarification note.
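
As a minimal sketch of the standard mideleg behavior confirmed here (the function name is hypothetical, and interrupt enables and the current privilege mode are deliberately ignored), the mode that handles interrupt 13 depends only on mideleg bit 13:

```python
LCOFI = 13  # interrupt number proposed for the count overflow interrupt


def lcofi_handler_mode(mideleg):
    """Return the mode that takes interrupt 13 for a given mideleg value."""
    return "S" if (mideleg >> LCOFI) & 1 else "M"


print(lcofi_handler_mode(0))            # M (default: not delegated)
print(lcofi_handler_mode(1 << LCOFI))   # S (delegated via mideleg)
```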
 
For implementations that support the Hypervisor extension, hideleg, hvip, hip, hie, vsip and vsie would also be of interest.

As you note, this starts getting into adding a number of bits and associated functionality.  The broader arch consistency question is whether this is the best path for the architecture going forward as other local interrupts come into being.

This has specifically been discussed with the lead Priv architects and the view is that there is a different and better way to support "delegation" of local interrupts into a VM.  (In particular, the new virtualization-aware next gen interrupt architecture will properly support this.  A working group on this will be starting shortly; I believe a public announcement is imminent.)

Greg


Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Phil McCoy
 

Could you clarify how this extension interacts with mideleg?  I assume interrupt 13 would be taken in M-mode by default unless it is delegated to S-mode, but it would be nice to state this explicitly.

For implementations that support the Hypervisor extension, hideleg, hvip, hip, hie, vsip and vsie would also be of interest.

Thanks,
Phil


Re: Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Greg Favor
 

One typo crept by me and some other pre-reviewers:  scountovf contains shadow copies of the OF bits in the 29 mhpmevent CSRs (i.e. mhpmevent3-mhpmevent31).

Greg


On Sun, Jan 31, 2021 at 10:38 PM Greg Favor <gfavor@...> wrote:
Hi all,

Recently the TSC established a lightweight "fast track" architecture extension process that small, straightforward, relatively uncontentious arch extension proposals can utilize.  This is the second of two Privileged architecture related small extensions - that a number of people/companies have expressed desire for over the past year - that Andrew and I discussed trying to help move through this process sooner rather than later (especially since this entails much more than simply developing a spec).  The following starts with an intro for context, and then provides the draft spec.

Note that the draft spec is written as the actual changes to be made to existing paragraphs of Priv spec text (or additional paragraphs and/or sections within the existing text).  The surrounding sentence(s) of a change are included for context.  Text in square brackets is temporary commentary that is not part of the proposed spec changes.

In anticipation of some questions that may arise in people's minds, I'll note that this extension has been extensively reviewed by the lead architects of the Privileged and Hypervisor architectures for consistency with the current architecture (including little things like extension, CSR, and bit/field names).  Various changes were made along the way because of this.

===============================================================================
Introduction

The current Privileged specification defines mhpmevent CSRs to select and control event counting by the associated hpmcounter CSRs, but provides no standardization of any fields within these CSRs.  For at least Linux-class rich-OS systems it is desirable to standardize certain basic features that are broadly desired (and have come up over the past year plus on RISC-V lists, as well as have been the subject of past proposals).  This enables there to be standard upstream software support that eliminates the need for implementations to provide their own custom software support.  (Implementations are free, of course, to not implement this extension.)

This proposal serves to accomplish exactly this within the existing mhpmevent CSRs (and correspondingly avoids the unnecessary creation of whole new sets of CSRs - past just one new CSR).

Below is a one-page draft spec of the proposal - which sticks to addressing two basic well-understood needs that have been requested by various people.  The proposed extension name is "Sscof" ('Ss' for Privileged arch and Supervisor-level extensions, and 'cof' for Count Overflow and Filtering).  There are other features that various people may desire (and that even I would desire) that don't have clear-cut, non-contentious, and relatively broad support.  These can be grist for separate discussions and possibly another arch extension by a motivated party that gathers a sufficient degree of consensus.

One such feature worth highlighting is a WrEn bit in mhpmevent that allows lower privilege modes that can read the associated hpmcounter CSR (based on the *counteren CSRs) to also write it, in essence enabling direct S/VS-mode and U/VU-mode write access instead of always requiring OpenSBI calls up to M-mode.  But this feature has had some contention, involves some details to properly support virtualization, and requires allocating a second set of "User-Read-Write" hpmcounter CSR numbers (since the current hpmcounter CSRs are "User-Read-Only").  If there is a broad upwelling of support and justification for this feature, and some party willing to put together a complete spec (including virtualization support), then this could be another fast-track extension.

Lastly note that the new count overflow interrupt will be treated as a standard local interrupt that is assigned to bit 13 in the mip/mie/sip/sie registers.  (This has been discussed and agreed to with key Priv Arch people.)

This posting to this email list starts an initial review period (over the next few weeks) for people to provide feedback, questions, comments, etc.

================================================================================
Proposed Spec

=======================================================================
=======================  Machine-Level ISA Additions  ========================

Hardware Performance Monitor
[ This extension expands the hardware performance monitor description and extends the mhpmevent registers to 64 bits (in RV32) as follows: ]

The hardware performance monitor includes 29 additional 64-bit event counters and 29 associated 64-bit event selector registers - the mhpmcounter3–mhpmcounter31 and mhpmevent3–mhpmevent31 CSRs.

The mhpmcounters are WARL registers that support up to 64 bits of precision on RV32 and RV64. 

The mhpmeventn registers are WARL registers that control which event causes the corresponding counter to increment and what happens when the corresponding count overflows. Currently just a few bits are defined here.  Past this, the actual selection and meaning of events is defined by the platform, but (mhpmevent == 0) is defined to mean "no event" and that the corresponding counter will never be incremented.  Typically the lower bits of mhpmevent will be used for event selection purposes.  

On RV32 only, reads of the mcycle, minstret, mhpmcountern, and mhpmeventn CSRs return the low 32 bits, while reads of the mcycleh, minstreth, mhpmcounternh, and mhpmeventnh CSRs return bits 63–32 of the corresponding counter or event selector.  [ The proposed CSR numbers for mhpmeventnh are 0x723 - 0x73F. ]
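
For RV32, the split described above amounts to the following. This is a sketch with hypothetical helper names. The only numbers taken from the text are the proposed 0x723 - 0x73F range, and the base of 0x720 is an assumption chosen so that mhpmevent3h through mhpmevent31h land exactly on that range.

```python
# Assumption: mhpmeventNh is at 0x720 + N, which yields 0x723..0x73F for N = 3..31.
MHPMEVENTH_BASE = 0x720


def mhpmeventh_csr(n):
    """CSR number of the high half of mhpmevent<n> on RV32."""
    if not 3 <= n <= 31:
        raise ValueError("mhpmevent registers are numbered 3..31")
    return MHPMEVENTH_BASE + n


def split_rv32(value64):
    """Return (low, high) 32-bit halves of a 64-bit event selector."""
    return value64 & 0xFFFFFFFF, (value64 >> 32) & 0xFFFFFFFF


print(hex(mhpmeventh_csr(3)), hex(mhpmeventh_csr(31)))  # 0x723 0x73f
low, high = split_rv32((1 << 63) | 0x42)
print(hex(low), hex(high))                              # 0x42 0x80000000
```

Bit 63 of the 64-bit selector therefore appears as bit 31 of the high-half CSR on RV32.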

The following bits are added to mhpmevent:

bit [63]  OF            -  Overflow status and interrupt disable bit that is set when counter overflows

bit [62]  MINH        -  If set, then counting of events in M-mode is inhibited
bit [61]  SINH         -  If set, then counting of events in S/HS-mode is inhibited
bit [60]  UINH         -  If set, then counting of events in U-mode is inhibited
bit [59]  VSINH       -  If set, then counting of events in VS-mode is inhibited
bit [59]  VUINH       -  If set, then counting of events in VU-mode is inhibited
bit [58]  0                -  Reserved for possible future modes
bit [57]  0                -  Reserved for possible future modes

Each of the five 'x'INH bits, when set, inhibits counting of events while in privilege mode 'x'.  All-zeroes for these bits results in counting of events in all modes.
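
A short sketch of how software might build and interpret these filter bits. The helper names and the event selector 0x1A are hypothetical. Note that the list above shows bit 59 twice, which a reply in this thread points out as a typo, so placing VUINH at bit 58 here is an assumption.

```python
# Bit positions from the draft; VUINH = 58 is an assumption (see note above).
OF, MINH, SINH, UINH, VSINH, VUINH = 63, 62, 61, 60, 59, 58

INHIBIT_BIT = {"M": MINH, "S": SINH, "U": UINH, "VS": VSINH, "VU": VUINH}


def counts_in_mode(mhpmevent, mode):
    """True if events are counted in `mode`, i.e. its xINH bit is clear."""
    return ((mhpmevent >> INHIBIT_BIT[mode]) & 1) == 0


# Count a platform-defined event (0x1A is arbitrary) in U-mode only.
event = 0x1A | (1 << MINH) | (1 << SINH) | (1 << VSINH) | (1 << VUINH)
print([m for m in INHIBIT_BIT if counts_in_mode(event, m)])  # ['U']
```

Because all-zero inhibit bits mean "count everywhere", existing software that only writes an event selector into the low bits keeps its current behavior.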

The OF bit is set when the corresponding hpmcounter overflows, and remains set until written by software.  Since hpmcounter values are unsigned values, overflow is defined as unsigned overflow.  [ This matches x86 and ARMv8. ]  Note that there is no loss of information after an overflow since the counter wraps around and keeps counting while the sticky OF bit remains set.  [ For a 64-bit counter it will be an awfully long time before another overflow could possibly occur. ]

If supervisor mode is implemented, the 32-bit scountovf register contains read-only shadow copies of the OF bits in all 32 mhpmevent registers.

If an hpmcounter overflows while the associated OF bit is zero, then a "count overflow interrupt request" is generated.  If the OF bit is one, then no interrupt request is generated.  Consequently the OF bit also functions as a count overflow interrupt disable for the associated hpmcounter.

----------------------------  Non-Normative Text    ----------------------------
There are not separate overflow status and overflow interrupt enable bits.  In practice, enabling overflow interrupt generation (by clearing the OF bit) is done in conjunction with initializing the counter to a starting value.  Once a counter has overflowed, it and the OF bit must be reinitialized before another overflow interrupt can be generated.
----------------------------------------------------------------------------------------
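
The combined status and interrupt-disable role of the OF bit can be illustrated with a toy model. This is a sketch only: the class, its field names, and the single `lcofip` flag standing in for the mip/sip LCOFIP bit are assumptions, and all 64 counter bits are assumed to be implemented.

```python
class HpmCounter:
    """Toy model of one counter and the OF bit of its mhpmevent register."""

    MASK = (1 << 64) - 1  # assumes all 64 counter bits are implemented

    def __init__(self):
        self.count = 0
        self.of = 1      # start with overflow interrupts disabled
        self.lcofip = 0  # stands in for the mip/sip LCOFIP bit

    def arm(self, events_until_interrupt):
        """Preload the counter and clear OF so the next overflow interrupts."""
        self.count = (self.MASK + 1 - events_until_interrupt) & self.MASK
        self.of = 0

    def tick(self, n=1):
        total = self.count + n
        self.count = total & self.MASK  # wrap around and keep counting
        if total > self.MASK:           # unsigned overflow
            if self.of == 0:
                self.lcofip = 1         # request an interrupt only if OF was 0
            self.of = 1                 # sticky until software writes it


c = HpmCounter()
c.arm(1000)
c.tick(999)
print(c.of, c.lcofip)  # 0 0
c.tick(1)
print(c.of, c.lcofip)  # 1 1
```

A further overflow without calling `arm` again leaves `of` set and raises no new request, which is why the counter and the OF bit are reinitialized together.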

This "count overflow interrupt request" signal is treated as a standard local interrupt that corresponds to bit 13 in the mip/mie/sip/sie registers.  The mip/sip LCOFIP and mie/sie LCOFIE bits are respectively the interrupt-pending and interrupt-enable bits for this interrupt.  ('LCOFI' represents 'Local Count Overflow Interrupt'.)  [ This proposal doesn't try to introduce per-privilege mode overflow interrupt request signals.  ARMv8 doesn't have this and I don't think x86 does either. ]
 
Generation of a "count overflow interrupt request" by an hpmcounter sets the LCOFIP bit in the mip/sip registers and sets the associated OF bit.  The LCOFIP bit is cleared by software after servicing the count overflow interrupt resulting from one or more count overflows.

----------------------------  Non-Normative Text    ----------------------------
Software can maintain a bit mask to distinguish newly overflowed counters (yet to be serviced by an overflow interrupt handler) from overflowed counters that have already been serviced or that are configured to not generate an interrupt on overflow.
----------------------------------------------------------------------------------------
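
One way such a software bit mask could work is sketched below. The function and variable names are hypothetical, and the OF bits are assumed to have been collected into one integer (for example by reading scountovf).

```python
def newly_overflowed(of_bits, handled_mask):
    """Counters whose OF bit is set but that software has not yet serviced."""
    return of_bits & ~handled_mask


handled = 1 << 3               # counter 3 overflowed earlier and was serviced
of_bits = (1 << 3) | (1 << 5)  # current OF bits: counters 3 and 5
new = newly_overflowed(of_bits, handled)
print([n for n in range(3, 32) if (new >> n) & 1])  # [5]
handled |= new                 # record counter 5 once it has been serviced
```

When a counter is re-armed (its OF bit cleared), the handler would also clear the matching bit in `handled`.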

Machine Interrupt Registers (mip and mie)
[ This extension adds the description of the LCOFIP/LCOFIE bits in these registers (and modifies related text) as follows: ]

LCOFIP is added to mip in Figure 3.14 as bit 13.  LCOFIP is added to mie in Figure 3.15 as bit 13.

If the Sscof extension is implemented, bits mip.LCOFIP and mie.LCOFIE are the interrupt-pending and interrupt-enable bits for local count overflow interrupts.  LCOFIP is read-write in mip and reflects the occurrence of a local count overflow interrupt request resulting from any of the mhpmeventn.OF bits being set.   If the Sscof extension is not implemented, these LCOFIP and LCOFIE bits are hardwired to zeros.

Multiple simultaneous interrupts destined for different privilege modes are handled in decreasing order of destined privilege mode. Multiple simultaneous interrupts destined for the same privilege mode are handled in the following decreasing priority order: MEI, MSI, MTI, SEI, SSI, STI, LCOFI.
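
The same-mode priority order above can be expressed as a simple lookup. This is a sketch with hypothetical names. Bit 13 for LCOFI comes from this proposal, and the other bit numbers are the standard mip/mie positions from the Privileged spec.

```python
# (name, bit number in mip/mie), highest priority first.
PRIORITY = [("MEI", 11), ("MSI", 3), ("MTI", 7),
            ("SEI", 9), ("SSI", 1), ("STI", 5), ("LCOFI", 13)]


def next_interrupt(pending_and_enabled):
    """Pick the interrupt to take among those destined for the same mode."""
    for name, bit in PRIORITY:
        if (pending_and_enabled >> bit) & 1:
            return name
    return None


print(next_interrupt((1 << 13) | (1 << 5)))  # STI
print(next_interrupt(1 << 13))               # LCOFI
```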

=========================================================================
=======================  Supervisor-Level ISA Additions  ========================

Supervisor Interrupt Registers (sip and sie)
[ This extension adds the description of the LCOFIP/LCOFIE bits in these registers (and modifies related text) as follows: ]

LCOFIP is added to sip in Figure 4.6 as bit 13.  LCOFIP is added to sie in Figure 4.7 as bit 13.

If the Sscof extension is implemented, bits sip.LCOFIP and sie.LCOFIE are the interrupt-pending and interrupt-enable bits for local count overflow interrupts.  LCOFIP is read-write in sip and reflects the occurrence of a local count overflow interrupt request resulting from any of the mhpmeventn.OF bits being set.  If the Sscof extension is not implemented, these LCOFIP and LCOFIE bits are hardwired to zeros. 

Each standard interrupt type (LCOFI, SEI, STI, or SSI) may not be implemented, in which case the corresponding interrupt-pending and interrupt-enable bits are hardwired to zeros.  All bits in sip and sie are WARL fields.

Multiple simultaneous interrupts destined for supervisor mode are handled in the following decreasing priority order: SEI, SSI, STI, LCOFI.

Supervisor Count Overflow (scountovf)
[ This extension adds this new CSR. ]

The scountovf CSR is a 32-bit read-only register that contains shadow copies of the OF bits in the 32 mhpmevent CSRs - where scountovf bit X corresponds to mhpmeventX.  The proposed CSR number is 0xD33.

This register enables supervisor-level overflow interrupt handler software to quickly and easily determine which counter(s) have overflowed (without needing to make an execution environment call or series of calls ultimately up to M-mode).  [ ARMv8 and x86 have a similar register for the same reasons. ]

Read access to bit X is subject to the same mcounteren (or mcounteren and hcounteren) CSRs that mediate access to the hpmcounter CSRs by S-mode (or VS-mode).  In M and S modes, scountovf bit X is readable when mcounteren bit X is set, and otherwise reads as zero.  Similarly, in VS mode, scountovf bit X is readable when mcounteren bit X and hcounteren bit X are both set, and otherwise reads as zero. 
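
The access rule in the last paragraph can be sketched as a masking function. The function name and the way the mode is passed in are assumptions made for illustration; `of_bits` stands for the underlying mhpmevent OF bits gathered into one 32-bit value.

```python
def read_scountovf(of_bits, mode, mcounteren, hcounteren=0):
    """Value seen when reading scountovf in `mode` ("M", "S" or "VS")."""
    visible = mcounteren
    if mode == "VS":
        visible &= hcounteren  # VS-mode needs both enable bits set
    return of_bits & visible & 0xFFFFFFFF


of_bits = (1 << 3) | (1 << 4)     # counters 3 and 4 have overflowed
mcounteren = (1 << 3) | (1 << 4)  # both counters enabled for S-mode
hcounteren = 1 << 4               # only counter 4 enabled for VS-mode
print(bin(read_scountovf(of_bits, "S", mcounteren)))               # 0b11000
print(bin(read_scountovf(of_bits, "VS", mcounteren, hcounteren)))  # 0b10000
```

A guest therefore never learns the overflow status of counters it has not been given access to.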


Fast-track extension proposal for "Hardware Performance Monitor count overflow and mode-based event filtering"

Greg Favor
 

Hi all,

Recently the TSC established a lightweight "fast track" architecture extension process that small, straightforward, relatively uncontentious arch extension proposals can utilize.  This is the second of two Privileged architecture related small extensions - that a number of people/companies have expressed desire for over the past year - that Andrew and I discussed trying to help move through this process sooner rather than later (especially since this entails much more than simply developing a spec).  The following starts with an intro for context, and then provides the draft spec.

Note that the draft spec is written as the actual changes to be made to existing paragraphs of Priv spec text (or additional paragraphs and/or sections within the existing text).  The surrounding sentence(s) of a change are included for context.  Text in square brackets is temporary commentary that is not part of the proposed spec changes.

In anticipation of some questions that may arise in people's minds, I'll note that this extension has been extensively reviewed by the lead architects of the Privileged and Hypervisor architectures for consistency with the current architecture (including little things like extension, CSR, and bit/field names).  Various changes were made along the way because of this.

===============================================================================
Introduction

The current Privileged specification defines mhpmevent CSRs to select and control event counting by the associated hpmcounter CSRs, but provides no standardization of any fields within these CSRs.  For at least Linux-class rich-OS systems it is desirable to standardize certain basic features that are broadly desired (and have come up over the past year plus on RISC-V lists, as well as have been the subject of past proposals).  This enables there to be standard upstream software support that eliminates the need for implementations to provide their own custom software support.  (Implementations are free, of course, to not implement this extension.)

This proposal serves to accomplish exactly this within the existing mhpmevent CSRs (and correspondingly avoids the unnecessary creation of whole new sets of CSRs - past just one new CSR).

Below is a one-page draft spec of the proposal - which sticks to addressing two basic well-understood needs that have been requested by various people.  The proposed extension name is "Sscof" ('Ss' for Privileged arch and Supervisor-level extensions, and 'cof' for Count Overflow and Filtering).  There are other features that various people may desire (and that even I would desire) that don't have clear-cut, non-contentious, and relatively broad support.  These can be grist for separate discussions and possibly another arch extension by a motivated party that gathers a sufficient degree of consensus.

One such feature worth highlighting is a WrEn bit in mhpmevent that allows lower privilege modes that can read the associated hpmcounter CSR (based on the *counteren CSRs) to also write it, in essence enabling direct S/VS-mode and U/VU-mode write access instead of always requiring OpenSBI calls up to M-mode.  But this feature has had some contention, involves some details to properly support virtualization, and requires allocating a second set of "User-Read-Write" hpmcounter CSR numbers (since the current hpmcounter CSRs are "User-Read-Only").  If there is a broad upwelling of support and justification for this feature, and some party willing to put together a complete spec (including virtualization support), then this could be another fast-track extension.

Lastly note that the new count overflow interrupt will be treated as a standard local interrupt that is assigned to bit 13 in the mip/mie/sip/sie registers.  (This has been discussed and agreed to with key Priv Arch people.)

This posting to this email list starts an initial review period (over the next few weeks) for people to provide feedback, questions, comments, etc.

================================================================================
Proposed Spec

=======================================================================
=======================  Machine-Level ISA Additions  ========================

Hardware Performance Monitor
[ This extension expands the hardware performance monitor description and extends the mhpmevent registers to 64 bits (in RV32) as follows: ]

The hardware performance monitor includes 29 additional 64-bit event counters and 29 associated 64-bit event selector registers - the mhpmcounter3–mhpmcounter31 and mhpmevent3–mhpmevent31 CSRs.

The mhpmcounters are WARL registers that support up to 64 bits of precision on RV32 and RV64. 

The mhpmeventn registers are WARL registers that control which event causes the corresponding counter to increment and what happens when the corresponding count overflows. Currently just a few bits are defined here.  Past this, the actual selection and meaning of events is defined by the platform, but (mhpmevent == 0) is defined to mean "no event" and that the corresponding counter will never be incremented.  Typically the lower bits of mhpmevent will be used for event selection purposes.  

On RV32 only, reads of the mcycle, minstret, mhpmcountern, and mhpmeventn CSRs return the low 32 bits, while reads of the mcycleh, minstreth, mhpmcounternh, and mhpmeventnh CSRs return bits 63–32 of the corresponding counter or event selector.  [ The proposed CSR numbers for mhpmeventnh are 0x723 - 0x73F. ]

The following bits are added to mhpmevent:

bit [63]  OF            -  Overflow status and interrupt disable bit that is set when counter overflows

bit [62]  MINH        -  If set, then counting of events in M-mode is inhibited
bit [61]  SINH         -  If set, then counting of events in S/HS-mode is inhibited
bit [60]  UINH         -  If set, then counting of events in U-mode is inhibited
bit [59]  VSINH       -  If set, then counting of events in VS-mode is inhibited
bit [58]  VUINH       -  If set, then counting of events in VU-mode is inhibited
bit [57]  0                -  Reserved for possible future modes
bit [56]  0                -  Reserved for possible future modes

Each of the five 'x'INH bits, when set, inhibits counting of events while in privilege mode 'x'.  All-zeroes for these bits results in counting of events in all modes.
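
The mode filtering above can be sketched in Python as a set of bit masks plus two small helpers.  This is only an illustration: the helper names are hypothetical, and the sketch assumes VUINH sits at bit 58, directly below VSINH.

```python
# Bit positions of the new mhpmevent fields.
OF = 1 << 63
MINH = 1 << 62
SINH = 1 << 61
UINH = 1 << 60
VSINH = 1 << 59
VUINH = 1 << 58  # assumption: the bit directly below VSINH

INH_BITS = {"M": MINH, "S": SINH, "U": UINH, "VS": VSINH, "VU": VUINH}


def inhibit_mask(count_in):
    """Return the xINH bits that inhibit every mode NOT listed in count_in."""
    unknown = set(count_in) - set(INH_BITS)
    if unknown:
        raise ValueError("unknown privilege mode(s): %s" % sorted(unknown))
    mask = 0
    for mode, bit in INH_BITS.items():
        if mode not in count_in:
            mask |= bit
    return mask


def counts_in_mode(mhpmevent, mode):
    """True if events are counted in `mode`, i.e. its xINH bit is clear."""
    return (mhpmevent & INH_BITS[mode]) == 0
```

For example, a tool that wants to profile only user code would OR `inhibit_mask(["U"])` into the platform-defined event selector bits.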

The OF bit is set when the corresponding hpmcounter overflows, and remains set until written by software.  Since hpmcounter values are unsigned values, overflow is defined as unsigned overflow.  [ This matches x86 and ARMv8. ]  Note that there is no loss of information after an overflow since the counter wraps around and keeps counting while the sticky OF bit remains set.  [ For a 64-bit counter it will be an awfully long time before another overflow could possibly occur. ]

If supervisor mode is implemented, the 32-bit scountovf register contains read-only shadow copies of the OF bits in the 29 mhpmevent registers (mhpmevent3–mhpmevent31).

If an hpmcounter overflows while the associated OF bit is zero, then a "count overflow interrupt request" is generated.  If the OF bit is one, then no interrupt request is generated.  Consequently the OF bit also functions as a count overflow interrupt disable for the associated hpmcounter.

----------------------------  Non-Normative Text    ----------------------------
There are not separate overflow status and overflow interrupt enable bits.  In practice, enabling overflow interrupt generation (by clearing the OF bit) is done in conjunction with initializing the counter to a starting value.  Once a counter has overflowed, it and the OF bit must be reinitialized before another overflow interrupt can be generated.
----------------------------------------------------------------------------------------
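
To make the overflow behavior concrete, here is a small Python model of one hpmcounter and the OF bit of its mhpmevent register.  It is a sketch of the rules as described in this proposal, not an implementation: the class and method names are hypothetical, and starting with OF set (interrupts disabled until armed) is an assumption of the model.

```python
MASK64 = (1 << 64) - 1


class HpmCounter:
    """Toy model of one hpmcounter plus the OF bit of its mhpmevent."""

    def __init__(self):
        self.value = 0
        self.of = True  # model assumption: no interrupt until software arms it

    def arm(self, period):
        """Initialize the counter to overflow after `period` events; clear OF."""
        if not 1 <= period <= MASK64:
            raise ValueError("period must be in 1..2**64-1")
        self.value = (-period) & MASK64  # 2**64 - period
        self.of = False

    def count(self, events=1):
        """Add events; return True if a count overflow interrupt is requested."""
        total = self.value + events
        self.value = total & MASK64      # the counter wraps and keeps counting
        if total <= MASK64:
            return False                 # no unsigned overflow
        request = not self.of            # a request is made only if OF was zero
        self.of = True                   # OF is sticky until written by software
        return request
```

In this model, `arm(3)` followed by three events produces exactly one interrupt request, and further overflows stay silent until `arm` is called again.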

This "count overflow interrupt request" signal is treated as a standard local interrupt that corresponds to bit 13 in the mip/mie/sip/sie registers.  The mip/sip LCOFIP and mie/sie LCOFIE bits are respectively the interrupt-pending and interrupt-enable bits for this interrupt.  ('LCOFI' represents 'Local Count Overflow Interrupt'.)  [ This proposal doesn't try to introduce per-privilege mode overflow interrupt request signals.  ARMv8 doesn't have this and I don't think x86 does either. ]
 
Generation of a "count overflow interrupt request" by an hpmcounter sets the LCOFIP bit in the mip/sip registers and sets the associated OF bit.  The LCOFIP bit is cleared by software after servicing the count overflow interrupt resulting from one or more count overflows.

----------------------------  Non-Normative Text    ----------------------------
Software can maintain a bit mask to distinguish newly overflowed counters (yet to be serviced by an overflow interrupt handler) from overflowed counters that have already been serviced or that are configured to not generate an interrupt on overflow.
----------------------------------------------------------------------------------------
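
One possible shape for such a bit mask is sketched below in Python.  The function names are hypothetical, and `of_bits` stands for whatever view of the OF bits the handler has (for example, a read of scountovf).  Counters configured to never interrupt would simply have their bit pre-set in `serviced`.

```python
def newly_overflowed(of_bits, serviced):
    """Counters whose OF bit is set but which have not been serviced yet."""
    return of_bits & ~serviced


def service_overflows(of_bits, serviced):
    """Return (counter indices to handle now, updated serviced mask)."""
    pending = newly_overflowed(of_bits, serviced)
    indices = [n for n in range(32) if (pending >> n) & 1]
    return indices, serviced | pending


def rearm(serviced, n):
    """Forget counter n once software has reinitialized it and cleared OF."""
    return serviced & ~(1 << n)
```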

Machine Interrupt Registers (mip and mie)
[ This extension adds the description of the LCOFIP/LCOFIE bits in these registers (and modifies related text) as follows: ]

LCOFIP is added to mip in Figure 3.14 as bit 13.  LCOFIP is added to mie in Figure 3.15 as bit 13.

If the Sscof extension is implemented, bits mip.LCOFIP and mie.LCOFIE are the interrupt-pending and interrupt-enable bits for local count overflow interrupts.  LCOFIP is read-write in mip and reflects the occurrence of a local count overflow interrupt request resulting from any of the mhpmeventn.OF bits being set.   If the Sscof extension is not implemented, these LCOFIP and LCOFIE bits are hardwired to zeros.

Multiple simultaneous interrupts destined for different privilege modes are handled in decreasing order of destined privilege mode. Multiple simultaneous interrupts destined for the same privilege mode are handled in the following decreasing priority order: MEI, MSI, MTI, SEI, SSI, STI, LCOFI.

=========================================================================
=======================  Supervisor-Level ISA Additions  ========================

Supervisor Interrupt Registers (sip and sie)
[ This extension adds the description of the LCOFIP/LCOFIE bits in these registers (and modifies related text) as follows: ]

LCOFIP is added to sip in Figure 4.6 as bit 13.  LCOFIP is added to sie in Figure 4.7 as bit 13.

If the Sscof extension is implemented, bits sip.LCOFIP and sie.LCOFIE are the interrupt-pending and interrupt-enable bits for local count overflow interrupts.  LCOFIP is read-write in sip and reflects the occurrence of a local count overflow interrupt request resulting from any of the mhpmeventn.OF bits being set.  If the Sscof extension is not implemented, these LCOFIP and LCOFIE bits are hardwired to zeros. 

Each standard interrupt type (LCOFI, SEI, STI, or SSI) may not be implemented, in which case the corresponding interrupt-pending and interrupt-enable bits are hardwired to zeros.  All bits in sip and sie are WARL fields.

Multiple simultaneous interrupts destined for supervisor mode are handled in the following decreasing priority order: SEI, SSI, STI, LCOFI.
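
The supervisor-level ordering can be illustrated with a short Python sketch that picks the next interrupt to take from sip and sie.  The function name is hypothetical; the bit positions for SSI, STI, and SEI (1, 5, 9) are those of the existing privileged spec, and bit 13 for LCOFI is the assignment proposed here.

```python
# (name, bit position) in decreasing priority order.
S_PRIORITY = [("SEI", 9), ("SSI", 1), ("STI", 5), ("LCOFI", 13)]


def next_supervisor_interrupt(sip, sie):
    """Name of the highest-priority pending and enabled interrupt, or None."""
    ready = sip & sie
    for name, bit in S_PRIORITY:
        if (ready >> bit) & 1:
            return name
    return None
```

With this ordering, a pending count overflow interrupt is taken only after any pending external, software, and timer interrupts for the same mode.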

Supervisor Count Overflow (scountovf)
[ This extension adds this new CSR. ]

The scountovf CSR is a 32-bit read-only register that contains shadow copies of the OF bits in the 29 mhpmevent CSRs (mhpmevent3–mhpmevent31), where scountovf bit X corresponds to mhpmeventX.  The proposed CSR number is 0xD33.

This register enables supervisor-level overflow interrupt handler software to quickly and easily determine which counter(s) have overflowed (without needing to make an execution environment call or series of calls ultimately up to M-mode).  [ ARMv8 and x86 have a similar register for the same reasons. ]

Read access to bit X is subject to the same mcounteren (or mcounteren and hcounteren) CSRs that mediate access to the hpmcounter CSRs by S-mode (or VS-mode).  In M and S modes, scountovf bit X is readable when mcounteren bit X is set, and otherwise reads as zero.  Similarly, in VS mode, scountovf bit X is readable when mcounteren bit X and hcounteren bit X are both set, and otherwise reads as zero. 
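
The read masking described above amounts to a pair of AND operations, sketched here in Python.  The function and parameter names are hypothetical; `of_bits` is the raw set of OF bits, with bit X taken from mhpmeventX.

```python
def scountovf_view(of_bits, mcounteren, hcounteren=0, virtualized=False):
    """Value read from scountovf given the counter-enable CSRs.

    virtualized=False models a read in M or S mode (masked by mcounteren);
    virtualized=True models a read in VS mode (masked by both CSRs).
    """
    visible = of_bits & mcounteren
    if virtualized:
        visible &= hcounteren
    return visible & 0xFFFFFFFF
```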


Re: Proposal: Delegating Exceptions from VS-mode or VU-mode to U-mode

陈家浩
 

Hi Anup,
 
If I understand correctly, this proposal will not cause the problem in your 2nd point, because not all MMIO accesses trap to user space in this proposal. The proposal still allows MMIO traps to the kernel.
 
We can use the PTE.MMIO field in this proposal only for the MMIO traps that are required to be delegated to user level.  They will not be translated by the G-stage.
 
The in-kernel MMIO traps will go to the G-stage because their PTE.MMIO is cleared by the hypervisor. They will then cause a page fault at the G-stage and trap to the kernel as usual.
 
Regards,
Jiahao
