RISC-V H-extension freeze consideration
Hi All,
The RISC-V H-extension v0.6.1 draft was released almost a year ago, in
May 2020. There have been no changes to the H-extension specification
since then.

Meanwhile, we have RISC-V H-extension v0.6.1 implemented in QEMU,
Spike, and Rocket FPGA. We also have three different hypervisors ported
to RISC-V H-extension v0.6.1:
1. Xvisor RISC-V (Works on QEMU, Spike, and Rocket FPGA)
2. KVM RISC-V (Works on QEMU, Spike, and Rocket FPGA)
3. Bao (Works on Rocket FPGA)

Unfortunately, the RISC-V H-extension not being in the frozen state is now
gating further software development, because major open-source projects
(such as Linux RISC-V and GCC RISC-V) have adopted a policy of accepting
patches only for frozen or ratified RISC-V extensions.

A few examples of gated software development:
1. KVM RISC-V is not merged in upstream Linux RISC-V. The KVM RISC-V
patches were already reviewed and acked by maintainers in July 2019, and
we have been rebasing them onto every kernel release for more than 1.5 years.
2. GCC RISC-V is not accepting patches for H-extension-related instructions.
3. KVMTOOL RISC-V is not merged because KVM RISC-V is not merged in
upstream Linux RISC-V.
4. QEMU KVM RISC-V acceleration is not merged because KVM RISC-V is
not merged in upstream Linux RISC-V.
5. Various feature additions (such as SBI v0.2, nested virtualization, etc.)
can't happen (or can't be merged) until KVM RISC-V is merged in Linux RISC-V.
6. The Libvirt library is blocked on QEMU KVM RISC-V acceleration being
available. Libvirt is a crucial piece in open-source cloud solutions
(such as OpenStack).
7. As time passes, more items (such as KVM RISC-V AIA support) will get
blocked if KVM RISC-V is not merged upstream.

We would request the TSC to consider freezing the RISC-V H-extension v0.6.1
draft specification. The remaining items in the done checklist for ratification
can certainly be completed while the H-extension is in the frozen state.
Best Regards,
Anup Patel
I’m not in support of freezing it yet. My concern is that development of virtualization-aware interrupt controllers and IOMMUs will lead to reconsideration of some of the details. All of these items are logically interlocking, even if physically disjoint. It’s entirely possible that we will make no changes as a result of that further development, but it’s far from certain.

Furthermore, the hypervisor extension is of substantially greater value with those other items completed, so we aren’t losing out as much as it might seem by postponing the freeze.
On all major architectures (x86 and ARM64), the virtualization-aware interrupt controllers and IOMMUs are totally independent of the ISA virtualization support.

We already have the required ISA support in the H-extension for a virtualization-aware interrupt controller.

The IOMMUs are totally independent of CPU virtualization support on all major architectures, and I don’t see how the H-extension needs to change for IOMMU support.
Regards,
Anup
In other architectures, those devices are needlessly complex in part because they weren’t co-designed with the ISA. Yes, they can be independently designed, but possibly with regrettable consequences.
> So, in my own opinion, we're getting close. Not a few weeks, but not quarters either. (I'll also say that the "pressure is on" to intelligently try and get through this period of time sooner than later.)
A quarter has passed, so we're 50% of the way towards talking "quarters". So please let me ask three questions:
- Has any insight been formed in the AIA specification as to whether it will require changes to the Hypervisor specification, and whether these changes can be done as part of the AIA specification (just as pointer masking is already defining vs* CSRs)?
- Has any list been made of which extensions should be frozen before the Hypervisor extension, and is there a clear path towards freezing them in a reasonable time period? Does this list include pointer masking, and if so why (considering that pointer masking is already being specified as if the Hypervisor extension is frozen or ratified first)?
- Is there any date being set for whatever meetings are needed to freeze the Hypervisor extension after all the dependencies are frozen?
If the answer to any of the above three questions is no, what can be done to avoid the frankly ludicrous delay in the approval of a specification that has seen no significant change in one year?
Thanks,
Paolo
> If the answer to any of the above three questions is no, what can be
> done to avoid the frankly ludicrous delay in the approval of a
> specification that has seen no significant change in one year?

There is at least one small but significant change to the hypervisor
extension being discussed, to redefine the "G" bit in G-stage address
translation PTEs to indicate that a page of guest physical address
space is "virtual I/O", meaning the hardware must order VM accesses to
those addresses as though they were I/O accesses, not main memory.
Another minor change planned is to have attempts to write a strictly
read-only CSR always raise an illegal instruction exception, instead
of sometimes raising a virtual instruction exception as currently
specified.
The reason there has been no movement on the hypervisor extension
for several months is not because there is totally nothing to do, but
because I've lacked the time to attend to it simultaneously with a
thousand other things.
If you'd like more progress on the hypervisor extension, feel free to
drive the discussion to get agreement one way or another on the first
point, the "I/O" bit in G-stage PTEs. The issue concerns when a
hypervisor is emulating a device that has memory that is supposed to be
in I/O space but is actually being emulated using main memory. A guest
OS expects accesses to that virtual device memory to be in I/O space
and ordered according to the I/O rules, but that's not currently what
happens.
- John Hauser
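[To make the ordering hazard described above concrete, here is a minimal illustrative C sketch. The device, its addresses, and its register layout are hypothetical, not taken from any specification: a guest driver written for hardware whose frame buffer really sits in I/O space fences with FENCE IO,IO, which no longer orders those accesses if the hypervisor silently backs the buffer with main memory.]

    #include <stdint.h>

    /* Hypothetical guest-physical addresses of an emulated legacy device;
     * the guest believes both regions live in I/O space. */
    #define DEV_FRAMEBUF  ((volatile uint8_t *)0x40000000UL)
    #define DEV_DOORBELL  ((volatile uint32_t *)0x40100000UL)

    static inline void fence_io_io(void)
    {
        /* Orders earlier device I/O accesses before later device I/O accesses. */
        __asm__ __volatile__ ("fence io, io" ::: "memory");
    }

    void guest_draw_and_kick(const uint8_t *pixels, unsigned long n)
    {
        for (unsigned long i = 0; i < n; i++)
            DEV_FRAMEBUF[i] = pixels[i];    /* guest assumes: I/O writes */

        fence_io_io();                      /* guest assumes this is enough */

        *DEV_DOORBELL = 1;                  /* tell the "device" to scan out */
    }

    /* If the hypervisor emulates DEV_FRAMEBUF with ordinary main memory,
     * the pixel stores are classified as main-memory (R/W) accesses, so
     * the FENCE IO,IO does not order them before the doorbell write; that
     * is the hazard the proposed "virtual I/O" G-stage bit is meant to
     * address. */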
John Hauser wrote:
> There is at least one small but significant change to the hypervisor
> extension being discussed, to redefine the "G" bit in G-stage address
> translation PTEs to indicate that a page of guest physical address
> space is "virtual I/O", meaning the hardware must order VM accesses to
> those addresses as though they were I/O accesses, not main memory.

From the hypervisor perspective, the "G" bit in G-stage PTEs is not used at all.

For software-emulated MMIO, the hypervisor does not create any mapping
in the G-stage page table, which ensures that such accesses always trap
and lets the hypervisor trap-and-emulate them.

For pass-through MMIO (such as IMSIC guest MSI files directly accessed
by a guest), the guest physical address translates to the host physical
address of the actual MMIO device in the G-stage, and the host PMAs
will mark all MMIO devices as I/O regions.

At this point, the G bit in the G-stage PTE is unused from a software
perspective. Why do we need to repurpose the G bit when we already
have PMAs marking all MMIO addresses as I/O regions?

Regards,
Anup
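[For illustration, a rough hypothetical C sketch of the two cases described above. The function and structure names are invented; they are not KVM's, Xvisor's, or Bao's actual APIs: a G-stage fault handler that maps pass-through MMIO to the real device's host physical address and leaves emulated MMIO unmapped so that it always traps.]

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical hypervisor-internal types and helpers. */
    struct vcpu;
    struct region {
        uint64_t gpa, size, host_pa;
        bool     passthrough;
    };

    struct region *find_guest_region(struct vcpu *v, uint64_t gpa);
    void gstage_map(struct vcpu *v, uint64_t gpa, uint64_t hpa, uint64_t size);
    void emulate_mmio_access(struct vcpu *v, uint64_t gpa);

    /* Called on a G-stage (guest-physical) access fault. */
    void handle_gstage_fault(struct vcpu *v, uint64_t fault_gpa)
    {
        struct region *r = find_guest_region(v, fault_gpa);

        if (r && r->passthrough) {
            /* Pass-through MMIO (e.g. an IMSIC guest MSI file): map the
             * guest physical address to the real device's host physical
             * address; the host PMAs already mark the device as I/O. */
            gstage_map(v, r->gpa, r->host_pa, r->size);
        } else {
            /* Software-emulated MMIO: no G-stage mapping is ever created,
             * so every access traps here and is emulated in software. */
            emulate_mmio_access(v, fault_gpa);
        }
    }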
Anup Patel wrote:
> Why do we need to repurpose the G bit when we already
> have PMAs marking all MMIO addresses as I/O regions?
To repeat myself:
> The issue concerns when a hypervisor is emulating a
> device that has memory that is supposed to be in I/O space but is actually
> being emulated using main memory. A guest OS expects accesses to that
> virtual device memory to be in I/O space and ordered according to the I/O
> rules, but that's not currently what happens.
The PMAs aren't correct in this situation.
- John Hauser
> What sort of device exposes regions of memory in I/O space? When I think of
> hypervisors emulating devices, all their registers typically *do stuff*
> when you write to them.
Historically, video cards and network cards definitely had memory
buffers in what RISC-V would consider I/O space. Yes, typical video
and networking hardware may work differently today, but can we be
certain there are absolutely no such devices any more of any kind that
we need to care about? And will never be in the future, either?
I'd be fine if the answer is "yes", but I'm sure not willing to commit
to that answer solely on my own incomplete knowledge.
- John Hauser
I wrote:
> Emulating an embedded system within a virtual machine is something we
> want to support, which implies an ability to emulate unsophisticated
> hardware. Such as a video system that has a main video frame buffer
> located in I/O space, as in olden times.

Greg Favor:
> It seems like this G-stage "I/O" bit is going down a questionable rabbit
> hole that:
>
> - Provides functionality that is not provided by other architectures (x86,
> ARM). (E.g. on ARM, main memory, whether cacheable or noncacheable, is
> weakly ordered. There is no way to get strong "I/O" ordering within a page
> without declaring the page as "I/O".)
>
> - Supports "legacy" situations that it is unclear who would actually care
> about. (Or is this going to be the special thing that attracts people to
> RISC-V since the other architectures have apparently ignored a real need in
> the market. Sorry if that strayed too far into sarcasm.)
>
> - Burdens all H implementations with supporting functionality that is
> motivated by an uncertain "legacy" situation. At best this bit should be
> optional.

Doesn't the x86 architecture have "total store ordering", and wouldn't
that fact make the matter moot for it?

If Arm has a way to declare a page as "I/O", is that different than the
"virtual I/O" bit being debated for RISC-V G-stage address translation?

- John Hauser
Greg Favor wrote:
> It seems like this G-stage "I/O" bit is going down a questionable rabbit
> hole that:
> [...]
I may have found the formula to defuse the issue. I think we can
eliminate the need for a "virtual I/O" bit in G-stage page tables by
simply requiring device drivers to act more conservatively. To that
end, I propose inserting some version of the following two paragraphs
into the Unprivileged ISA's section 1.4, "Memory".
First:
A naturally aligned 256-byte subregion of the address space is
called a _paragraph_. The minimum unit of contiguous main memory
is a complete paragraph. That is, if any byte within a paragraph
is main memory, then every byte of the paragraph must be main
memory; and conversely, if any byte of a paragraph is vacant or
I/O, then every non-vacant byte of the paragraph must be considered
I/O.
Second:
If an I/O device has memory that is accessible in the address
space, and if any paragraph of that memory has the properties
that permit the system to label it as main memory, an execution
environment may choose each such paragraph to be either main
memory or I/O. When the same type of I/O device exists in multiple
systems, portable RISC-V software must assume that device memory
that is considered main memory in one execution environment may be
considered I/O in another execution environment, and vice versa.
It may also be appropriate to add a comment to the FENCE section
reminding that software may not know whether some device memory is
considered main memory or I/O, in which case it will need to fence
conservatively for either possibility.
I welcome comments.
- John Hauser
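[As a small illustration of what "fence conservatively for either possibility" could mean for a portable driver, here is a hypothetical helper, not proposed normative text: the fence includes both the memory and the I/O sets, so the ordering holds whether the execution environment classifies the device memory as main memory or as I/O.]

    /* Hypothetical helper for a portable driver that touches device
     * memory without knowing whether this execution environment
     * classifies it as main memory or as I/O. */
    static inline void fence_device_memory(void)
    {
        /* Predecessor and successor sets cover R, W, I and O, so the
         * fence is sufficient under either classification. */
        __asm__ __volatile__ ("fence iorw, iorw" ::: "memory");
    }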
Another avenue to the same goal of eliminating the need for the "virtual I/O" bit in G-stage PTEs: rather than changing the guest device driver to add otherwise unnecessary FENCEs to Strongly Ordered IO regions, or to upgrade FENCE [IO],[IO] to FENCE [RWIO],[RWIO], require the Hypervisor to not break Guest execution (by telling the Guest one thing and doing another)!

Example text:

"When the Hypervisor informs the Guest that a memory region is Main Memory, IO, or Strongly Ordered IO, then it must not remap those addresses to a memory region of a different type (Main Memory, IO, or Strongly Ordered IO), otherwise memory ordering consistency could be lost by the Guest:
- If the Hypervisor informs the Guest that a memory region is IO but remaps it to a Main Memory region, then a FENCE [IO],[IO] executed by the guest might not order the desired accesses.
- If the Hypervisor informs the Guest that a memory region is Main Memory but remaps it to an IO region, then a FENCE [RW],[RW] executed by the guest might not order the desired accesses.
- If the Hypervisor informs the Guest that a memory region is Strongly Ordered but remaps it to a Weakly Ordered region, then only weak memory ordering might be provided."
-- John
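[A minimal sketch, assuming a hypervisor that records the memory type it has advertised to the guest; the enum and helper functions below are hypothetical. It shows how the proposed rule could be checked when a G-stage mapping is created.]

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical memory-type classification. */
    enum mem_type { MEM_MAIN, MEM_IO, MEM_IO_STRONGLY_ORDERED };

    enum mem_type host_pma_type(uint64_t host_pa);          /* from host PMAs   */
    enum mem_type guest_advertised_type(uint64_t guest_pa); /* e.g. device tree */

    /* Per the proposed rule, refuse to back a guest region with host
     * memory of a different type than the one advertised to the guest. */
    bool gstage_mapping_allowed(uint64_t guest_pa, uint64_t host_pa)
    {
        return guest_advertised_type(guest_pa) == host_pma_type(host_pa);
    }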
> > What sort of device exposes regions of memory in I/O space? When I think of
> > hypervisors emulating devices, all their registers typically *do stuff*
> > when you write to them.
>
> Historically, video cards and network cards definitely had memory
> buffers in what RISC-V would consider I/O space.

Yep, typically some kind of video RAM, or a buffer for outgoing/incoming network packets. Some old SCSI controllers also had a "scripts RAM" to program the DMA engine.

In practice, this shouldn't be an issue because typically there is some kind of "doorbell" register that the driver writes to after filling in the on-board RAM. The doorbell register is where the synchronization happens between the writer (driver) and the reader (device). Likewise, interrupt injection is where the synchronization happens between the device's writes and the driver's reads.

All in all, this doesn't seem like an absolute necessity in the first version of the specification. There have been extensions to the S-mode page tables like Svnapot, and this example is similarly niche.

Paolo
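[To illustrate the doorbell point, a hypothetical driver-side sketch; the device addresses and layout are invented for illustration. The buffer fill only has to be ordered before the doorbell write, and that single ordering point is easy to fence conservatively, so this common pattern does not by itself require a dedicated "virtual I/O" attribute.]

    #include <stdint.h>

    /* Hypothetical device with on-board packet RAM and a doorbell register. */
    #define DEV_PKT_RAM   ((volatile uint8_t *)0x41000000UL)
    #define DEV_DOORBELL  ((volatile uint32_t *)0x41100000UL)

    void driver_send_packet(const uint8_t *pkt, unsigned long len)
    {
        for (unsigned long i = 0; i < len; i++)
            DEV_PKT_RAM[i] = pkt[i];        /* fill the on-board RAM */

        /* Conservative fence: orders the buffer fill before the doorbell
         * write whether the RAM is classified as main memory or as I/O. */
        __asm__ __volatile__ ("fence iorw, iorw" ::: "memory");

        /* The doorbell is the synchronization point: the device (or the
         * hypervisor's trap handler) reads the buffer only after this. */
        *DEV_DOORBELL = 1;
    }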