On Tue, Jul 21, 2020 at 6:09 AM Jonathan Behrens <behrensj@...> wrote:
> My understanding is that sfence.vma's are never required by the RISC-V spec, only that failing to do them can cause undesirable but well defined behavior.
The preceding is true, but your following paragraph isn't quite true. In the current architecture spec one isn't free to hijack or reuse ASID=0 in the way I think you are describing. It might be nice if that was allowed, but I don't believe it is. As far as trying to change the arch spec to allow something like this, I'm not pushing for that (and expect that would meet a lot of resistance).
Greg
> I'd suggest that the same be true here. We could consider the Bare mode to reuse ASID=0, and therefore software would have to do a sfence.vma only if there were stale mappings in that ASID that it didn't want to be used. My (completely uninformed) guess is that it shouldn't be too difficult for hardware to ignore global mappings when in Bare mode, but if people think otherwise, then the spec could say those also need to be flushed.
> Overall I agree that this case doesn't need to be fast, but still should be consistent with how RISC-V does things in other places. And if it is possible to make TLB flushes restricted to a single ASID rather than global across all of them, then I think it makes sense to try to achieve that.
> Jonathan
Jonathan Behrens <behrensj@...>
My understanding is that sfence.vma's are never required by the RISC-V spec, only that failing to do them can cause undesirable but well-defined behavior.
I'd suggest that the same be true here. We could consider the Bare mode to reuse ASID=0, and therefore software would have to do an sfence.vma only if there were stale mappings in that ASID that it didn't want to be used. My (completely uninformed) guess is that it shouldn't be too difficult for hardware to ignore global mappings when in Bare mode, but if people think otherwise, then the spec could say those also need to be flushed.
Overall I agree that this case doesn't need to be fast, but still should be consistent with how RISC-V does things in other places. And if it is possible to make TLB flushes restricted to a single ASID rather than global across all of them, then I think it makes sense to try to achieve that.
Jonathan
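To make the difference between a single-ASID flush and a global flush concrete, here is a toy Python model of an ASID-tagged TLB. It is only a sketch under stated assumptions: the names Entry, ToyTlb, flush_asid and flush_all are invented for illustration, and the model assumes that an ASID-scoped sfence.vma leaves global mappings in place. It says nothing about whether Bare mode may reuse ASID=0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One cached translation (hypothetical layout, for illustration only)."""
    asid: int
    vpn: int
    ppn: int
    is_global: bool = False


class ToyTlb:
    """Toy ASID-tagged TLB; not a model of any real implementation."""

    def __init__(self):
        self.entries = []

    def insert(self, entry):
        self.entries.append(entry)

    def lookup(self, asid, vpn):
        # A global entry matches regardless of the current ASID.
        for e in self.entries:
            if e.vpn == vpn and (e.is_global or e.asid == asid):
                return e.ppn
        return None

    def flush_all(self):
        # Roughly what "sfence.vma x0, x0" asks for: drop everything.
        self.entries.clear()

    def flush_asid(self, asid):
        # Roughly "sfence.vma x0, rs2": drop only non-global entries
        # belonging to the given ASID (assumption stated in the lead-in).
        self.entries = [
            e for e in self.entries if e.is_global or e.asid != asid
        ]


def demo_tlb():
    """Build a small TLB used by the usage example."""
    tlb = ToyTlb()
    tlb.insert(Entry(asid=0, vpn=0x10, ppn=0x80))
    tlb.insert(Entry(asid=1, vpn=0x10, ppn=0x90))
    tlb.insert(Entry(asid=0, vpn=0x20, ppn=0xA0, is_global=True))
    return tlb
```

In this model, calling flush_asid(0) on the demo TLB removes only the stale ASID 0 entry, while the ASID 1 entry and the global entry survive. That surviving global entry is the case the message says hardware would either have to ignore in Bare mode or software would have to flush as well.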
Comments below:
On Mon, Jul 20, 2020 at 8:14 PM Bill Huffman <huffman@...> wrote:
> Hi Greg,
> My sense is that the transitions from SvXX to Bare and from Bare to the same SvXX that was previously in force are special transitions. One reason seems to me the extreme simplicity of Bare compared with other modes. It's easier to switch.
Switches to/from Bare mode should be rare. Typically one will switch from Bare mode to a translated mode as part of booting up an OS (e.g. Linux), and then will remain in that mode (until, say, the system crashes and must reboot). Further, all such switches are performed under full software control.
Switching to/from M-mode on the other hand is frequent and often hardware initiated. Also, any sfence.vma on M-mode exit would have to be after the exit (in potentially arbitrary code that happens to be at the return address).
Hence sfence'ing M-mode entry/exit is impractical as well as something that needs to be performant. Whereas Bare mode entry/exit is rare and software-managed.
> If we require sfence.vma after a switch to or from Bare, does that also mean we have to require one after a switch to or from M-mode? If no, why is it different?
If a high-performance design caches "translations" in all modes of operation (including M-mode) in the TLBs, then M-mode translations must be distinguished from S/HS/U mode translations, which must be distinguished from VS/VU mode translations. That is a small set of three translation regimes (to use an ARMv8 term) for hardware to support and handle properly.
If one has to also distinguish Bare and non-Bare modes within the S/HS/U translation regime, that effectively becomes two separate translation regimes. Similarly, with the H-extension and two-stage translations inside VM's, the VS/VU regime needs to become four regimes (the four combinations of Bare and non-Bare stage 1 and stage 2 modes). Consequently TLB entries and surrounding logic now need to distinguish between and properly handle seven translation regimes. All to handle rare cases.
That, like most things, is doable, but isn't the whole point of a RISC architecture to reduce hardware cost and complexity and shift that to software where the software burden is small and the performance cost is minimal?
Greg
P.S. One could imagine instead doing data-dependent implicit sfence.vma operation on CSR writes to the *atp registers, but besides being data-dependent (which RISC-V avoids when it comes to having data-dependent exceptions) that is a rather CISC'y thing to do. Which goes back to my preceding point.
> If yes, it will cost more to switch briefly to M-mode than I'd want it to.
> Bill
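The count of three versus seven translation regimes in the reply above can be checked with a few lines of Python. This is only an enumeration of the cases named in the message; the function name translation_regimes and the tuple labels are invented for the example and are not terms from the spec.

```python
from itertools import product


def translation_regimes(distinguish_bare):
    """List the regimes a TLB would have to tell apart.

    distinguish_bare=False: software fences on every satp.mode change, so
    hardware only separates M, S/HS/U and VS/VU.
    distinguish_bare=True: hardware must also separate Bare from non-Bare.
    """
    regimes = [("M",)]
    if not distinguish_bare:
        regimes += [("S/HS/U",), ("VS/VU",)]
        return regimes

    modes = ("Bare", "non-Bare")
    # S/HS/U splits in two: Bare and non-Bare.
    regimes += [("S/HS/U", mode) for mode in modes]
    # VS/VU splits in four: every stage 1 / stage 2 combination.
    regimes += [
        ("VS/VU", stage1, stage2) for stage1, stage2 in product(modes, repeat=2)
    ]
    return regimes


if __name__ == "__main__":
    for flag in (False, True):
        regimes = translation_regimes(flag)
        print(f"distinguish_bare={flag}: {len(regimes)} regimes")
        for regime in regimes:
            print("   ", " / ".join(regime))
```

The breakdown is 1 (M) + 2 (S/HS/U) + 4 (VS/VU) = 7 when Bare is distinguished, against 1 + 1 + 1 = 3 when it is not.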
Hi Greg,
My sense is that the transitions from SvXX to Bare and from Bare to the same SvXX that was previously in force are special transitions. One reason seems to me the extreme simplicity of Bare compared with other modes. It's easier to switch.
If we require sfence.vma after a switch to or from Bare, does that also mean we have to require one after a switch to or from M-mode? If no, why is it different? If yes, it will cost more to switch briefly to M-mode than I'd want it to.
Bill
On 7/20/20 7:08 PM, Greg Favor wrote:
I would like to get people's views on the question of when an sfence.vma is required after changing the satp.mode field (to see what support there is for the following change/clarification in the Privileged spec).
Currently an sfence.vma is required when changing between SvXX modes, but is not required when changing to/from Bare mode.
In both cases there is an implicit wholesale change to the page tables, i.e. the translation of any given address generally has suddenly changed.
For some designs (that cache Bare mode translations in the TLBs for the sake of caching the PMA and PMP info for an address), having software be required to do an sfence.vma can simplify the hardware.
So the question is whether there should be architectural consistency in requiring sfence'ing after changing satp.mode (i.e. all mode changes require an sfence), versus having some mode cases being treated differently (i.e. changes to/from Bare mode not requiring an sfence)?
My (and Ventana's) bias is towards the former - for our sake and for other future high performance CPU designs by others. But I'm interested to see if others feel similarly or not.
Greg
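The two policies in question can be written down as a pair of small predicates. The Python sketch below encodes the rule as this message describes it, not the Privileged spec text itself. The names SatpMode, fence_required_current and fence_required_proposed are made up for the example, Sv39 and Sv48 simply stand in for "SvXX", and only the mode field is considered (other satp changes are out of scope).

```python
from enum import Enum


class SatpMode(Enum):
    """Illustrative subset of satp.mode values; Sv39/Sv48 stand in for SvXX."""
    BARE = "Bare"
    SV39 = "Sv39"
    SV48 = "Sv48"


def fence_required_current(old, new):
    """Rule as described in the message: a fence is needed only when
    switching between two different SvXX modes, not to/from Bare."""
    if old is new:
        return False  # no mode change at all
    return old is not SatpMode.BARE and new is not SatpMode.BARE


def fence_required_proposed(old, new):
    """Proposed rule: every change of satp.mode needs a fence."""
    return old is not new


if __name__ == "__main__":
    for old in SatpMode:
        for new in SatpMode:
            if old is new:
                continue
            print(
                f"{old.value:>4} -> {new.value:<4} "
                f"current={fence_required_current(old, new)!s:<5} "
                f"proposed={fence_required_proposed(old, new)}"
            )
```

The two predicates differ only on transitions where exactly one side is Bare, which is the case the question is about.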