Re: Fast-track "stimecmp / vstimecmp" extension proposal

Allen Baum
 

Yeah, that makes sense, though I'd say it's not strictly backwards compatible: old M-mode code that sets STIP will silently fail.
I won't complain if we make a blanket statement that backwards compatibility for M-mode code is not a requirement.

On Tue, Dec 1, 2020 at 5:53 PM Greg Favor <gfavor@...> wrote:
On Tue, Dec 1, 2020 at 5:21 PM Allen Baum <allen.baum@...> wrote:
So to know whether M-mode can fake an S-mode timer interrupt (which it can currently do),
M-mode SW needs to attempt to write stimecmp and see if it traps? Or attempt to set the STIP bit and see if it sticks?
You might want some non-normative text about discovery.

This text takes the same approach as much of the rest of the Supervisor chapter in the Priv spec, i.e. the existing and coming new discovery methods apply equally to these new CSRs as to all existing and new s*/h*/vs* CSRs that may be accessible in S/HS modes.  This extension doesn't try to establish anything different.  Or, put differently, the discovery issue is a general issue (for which this extension does not represent any form of special case).

Whether for this Zstc extension or the coming (for example) Zsn and Zsa extensions, there is trapping behavior as one (ugly) form of discovery method.  And then there will be the new discovery method being developed by tech-config which will be the more reasonable way to discover many things about optional extensions, and optional or WARL features within extensions.

So any new non-normative text about CSR discovery really should be added to the Priv/Unpriv specs as a general matter (separate from this and other in-progress Supervisor extensions).  (And each extension innovating its own "better" discovery method would be an undesirable way to go.)
 
Is there some reason for removing the existing behavior of allowing STIP to be writable in mip?

Yes.  This was first pointed out by John, and Andrew and I agreed.  The gist is that when a hart implements stimecmp, for consistency we should require STIP to be read-only - just like MTIP.  (Especially since there is no longer a need for a software-writable bit behind STIP when stimecmp exists.)

Greg


Re: Fast-track "stimecmp / vstimecmp" extension proposal

Allen Baum
 

So to know whether M-mode can fake an S-mode timer interrupt (which it can currently do),
M-mode SW needs to attempt to write stimecmp and see if it traps? Or attempt to set the STIP bit and see if it sticks?
You might want some non-normative text about discovery.
Is there some reason for removing the existing behavior of allowing STIP to be writable in mip?



On Tue, Dec 1, 2020 at 5:00 PM Greg Favor <gfavor@...> wrote:
Hi all,

Recently the TSC established a lightweight "fast track" architecture extension process that small, straightforward, relatively uncontentious arch extension proposals can utilize.  Andrew and I discussed trying to help move a couple of small Privileged architecture related extensions - that a number of people/companies have expressed desire for over the past year - through this process sooner rather than later (especially since this entails much more than simply proposing and reviewing a spec).  For this one Siqi and I are co-sponsors.  The following starts with an intro for context, and then provides the draft spec.

=================================================================
The current Privileged specification only defines a hardware mechanism for generating machine-mode timer interrupts (based on the mtime and mtimecmp registers).  As a result, timer services for S-mode/HS-mode (and for VS-mode) all have to be provided by M-mode, via SBI calls from S/HS-mode up to M-mode (or VS-mode calls to HS-mode and then to M-mode).  M-mode software then multiplexes these multiple logical timers onto its one physical M-mode timer facility, and the M-mode timer interrupt handler passes timer interrupts back down to the appropriate lower privilege mode.
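
The multiplexing described above can be sketched in a few lines. This is a toy Python model, not firmware: the class and method names are hypothetical, and real M-mode software would program the memory-mapped mtimecmp and inject interrupts through mip rather than return values.

```python
NEVER = (1 << 64) - 1  # all-ones compare value, effectively "no deadline"


class TimerMux:
    """Toy model of M-mode sharing one physical timer among logical timers."""

    def __init__(self):
        self.deadlines = {}  # client name -> absolute deadline in mtime ticks

    def next_mtimecmp(self):
        # The single hardware compare register tracks the earliest deadline.
        return min(self.deadlines.values(), default=NEVER)

    def set_timer(self, client, deadline):
        """Model of an SBI-style request; returns the value to put in mtimecmp."""
        self.deadlines[client] = deadline
        return self.next_mtimecmp()

    def on_machine_timer_interrupt(self, mtime):
        """Return the clients whose timers expired and drop their deadlines."""
        expired = sorted(c for c, d in self.deadlines.items() if mtime >= d)
        for client in expired:
            del self.deadlines[client]
        return expired
```

Every set_timer call and every expiry passes through M-mode in this scheme, which is the overhead a per-mode compare register is meant to remove.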

Over the past year or more, there have been a number of requests to provide supervisor mode with its own CSR-based timer interrupt facility that it can directly manage to provide its own timer service (in the form of having its own stimecmp register) - thus eliminating the large overheads for emulating S/HS-mode timers and timer interrupt generation up in M-mode.  This proposal, using the new fast-track architecture extension process, serves to accomplish exactly this.  Further, this proposal lays the groundwork for adding a similar facility to the Hypervisor extension for VS-mode.   (ARMv8, for example, has separate timer registers for its equivalent of M, S/HS, and VS privilege levels.)

Below is a one-page draft spec of the proposal - which sticks to addressing the basic well-understood need that has been requested.  The proposed extension name is "Zstc" ('Zs' for privileged Supervisor arch extensions, and 'tc' for timecmp).

Various further pieces of related architectural functionality have previously been suggested, debated, and questioned - without clear conclusion and agreement as to whether these are truly justified.  These more contentious enhancements are left to be sorted out and pursued by a follow-on effort if desired and justified by sufficient people.

The suggestion to also introduce an mtimecmp CSR has been considered, but ruled out for now as not sufficiently justified over the current definition of mtimecmp as a memory-mapped register (which would have to remain for backward compatibility in any case - resulting in two aliases for this register).  But if there is sufficient interest, a separate email thread can be started to discuss the possible addition of an mtimecmp CSR (as part of this or a separate fast-track extension).  For now I'll leave out details about how the mtimecmp memory-mapped register and mtimecmp CSR could inter-relate (it doesn't have to be complicated in practice).

This extension adds a single CSR - the stimecmp CSR.  This extension also describes what the corresponding addition to the Hypervisor extension would be - the vstimecmp CSR.  The former is the official part of this fast-track arch extension.  The latter is solely informative and reflective of what would then likely be added to the current H-extension draft spec as part of its virtualization of S-mode.

The draft spec below provides all the details.  Note that this extension very specifically strives to maintain maximal consistency with many little details in the existing Privileged architecture and Hypervisor extension (and has already been reviewed by the two lead Privileged and Hypervisor architects to ensure this).

This posting to this email list starts an initial review period (over the next few weeks) for people to provide feedback, questions, comments, etc.

Thanks,
Siqi and Greg

=========================================================================
=======================  Supervisor Extension Additions  ========================

Supervisor Timer Register (stimecmp)
The stimecmp CSR is a 64-bit register and has 64-bit precision on all RV32 and RV64 systems.  In RV32 only, reads of the stimecmp CSR return the low 32 bits, while reads of the stimecmph CSR return the high 32 bits of stimecmp.
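
As a rough illustration of the RV32 split, the Python model below treats stimecmp as one 64-bit value seen through two 32-bit halves. The class and function names are hypothetical, and write behavior is assumed to mirror the read behavior stated above. The three-step write order is also an assumption: it borrows the sequence the Privileged spec suggests for the memory-mapped mtimecmp on RV32, so that the compare value is never transiently smaller than both the old and the new value.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class Stimecmp32View:
    """Hypothetical model: a 64-bit stimecmp accessed as two 32-bit CSRs."""

    def __init__(self, value=0):
        self.value = value & MASK64

    def read_lo(self):   # models a read of stimecmp on RV32
        return self.value & MASK32

    def read_hi(self):   # models a read of stimecmph on RV32
        return self.value >> 32

    def write_lo(self, lo):
        self.value = (self.value & ~MASK32) | (lo & MASK32)

    def write_hi(self, hi):
        self.value = (self.value & MASK32) | ((hi & MASK32) << 32)


def set_deadline_rv32(csr, deadline):
    """Write a 64-bit deadline in three steps; return the intermediate values."""
    trace = []
    csr.write_lo(MASK32)          # no smaller than the old value
    trace.append(csr.value)
    csr.write_hi(deadline >> 32)  # no smaller than the new value
    trace.append(csr.value)
    csr.write_lo(deadline)        # exactly the new value
    trace.append(csr.value)
    return trace
```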

The proposed CSR numbers for stimecmp / stimecmph are 0x108 / 0x109 (within the Supervisor Trap Setup block of CSRs).

A supervisor timer interrupt becomes pending - as reflected in the STIP bit in the mip and sip registers - whenever mtime contains a value greater than or equal to stimecmp, treating the values as unsigned integers.  Writes to stimecmp are guaranteed to be reflected in STIP eventually, but not necessarily immediately.  The interrupt remains posted until stimecmp becomes greater than mtime - typically as a result of writing stimecmp. The interrupt will be taken based on the standard interrupt enable and delegation rules.
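
The pending condition is a plain unsigned 64-bit comparison. A minimal sketch in Python follows (the function name is hypothetical); it models only the steady-state value of STIP and ignores the "eventually, but not necessarily immediately" latency.

```python
MASK64 = (1 << 64) - 1


def stip_pending(mtime, stimecmp):
    """Steady-state STIP: set when mtime >= stimecmp, compared as unsigned 64-bit."""
    return (mtime & MASK64) >= (stimecmp & MASK64)
```

Under this model, writing an all-ones stimecmp keeps the interrupt from pending until mtime itself reaches all ones, and writing zero makes it pending at once.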

----------------------------  Non-Normative Text    ----------------------------
A spurious timer interrupt might occur if an interrupt handler advances stimecmp then immediately returns, because STIP might not yet have fallen in the interim. All software should be written to assume this event is possible, but most software should assume this event is extremely unlikely. It is almost always more performant to incur an occasional spurious timer interrupt than to poll STIP until it falls.  
----------------------------------------------------------------------------------------
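
One way software might tolerate the spurious interrupt described above is to re-check the time against the compare value on entry to the handler. The sketch below is an assumption about handler structure, not part of the draft; the function and parameter names are hypothetical.

```python
def on_supervisor_timer_interrupt(now, stimecmp, period):
    """Return (new_stimecmp, handled) for one entry into the timer handler."""
    if now < stimecmp:
        # Spurious entry: stimecmp is already in the future, STIP was just stale.
        return stimecmp, False
    # Genuine expiry: do the periodic work, then advance the compare value.
    return now + period, True
```

The check costs one comparison per interrupt and avoids polling STIP until it falls.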

----------------------------  Non-Normative Text    ----------------------------
In systems in which a supervisor execution environment (SEE) provides timer facilities via an SBI function call, this SBI call will continue to support requests to schedule a timer interrupt.  The SEE will simply make use of stimecmp, changing its value as appropriate.  This ensures compatibility with existing S-mode software that uses this SEE facility, while new S-mode software takes advantage of stimecmp directly.
----------------------------------------------------------------------------------------

Machine Interrupt Registers (mip and mie)
This extension modifies the description of the STIP/STIE bits in these registers as follows:

If supervisor mode is implemented, bits mip.STIP and mie.STIE are the interrupt-pending and interrupt-enable bits for supervisor-level timer interrupts.  If the stimecmp register is not implemented, STIP is writable in mip, and may be written by M-mode software to deliver timer interrupts to S-mode.  If the stimecmp (supervisor-mode timer compare) register is implemented, STIP is read-only in mip and reflects the supervisor-level timer interrupt signal resulting from stimecmp.  This timer interrupt signal is cleared by writing stimecmp with a value greater than the current mtime value.
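
The two STIP behaviors can be contrasted with a small Python model. Everything here is a hypothetical sketch: the class models only the STIP bit, and stip_is_read_only is one reading of the "see if it sticks" probe raised in this thread, not a recommended discovery method. On real hardware the timer signal could also change while the probe runs.

```python
class StipModel:
    """Models only mip.STIP, with or without stimecmp implemented."""

    def __init__(self, has_stimecmp, timer_signal=False):
        self.has_stimecmp = has_stimecmp
        self.timer_signal = timer_signal  # compare result, used with stimecmp
        self.sw_bit = False               # writable bit, used without stimecmp

    def read_stip(self):
        return self.timer_signal if self.has_stimecmp else self.sw_bit

    def write_stip(self, value):
        if not self.has_stimecmp:
            self.sw_bit = bool(value)
        # With stimecmp implemented the write is ignored: STIP is read-only.


def stip_is_read_only(hart):
    """Write the complement of STIP, see whether it sticks, then restore it."""
    old = hart.read_stip()
    hart.write_stip(not old)
    changed = hart.read_stip() != old
    hart.write_stip(old)
    return not changed
```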

Supervisor Interrupt Registers (sip and sie)
This extension modifies the description of the STIP/STIE bits in these registers as follows:
 
Bits sip.STIP and sie.STIE are the interrupt-pending and interrupt-enable bits for supervisor level timer interrupts.  If implemented, STIP is read-only in sip, and is either set and cleared by the execution environment (if stimecmp is not implemented), or reflects the timer interrupt signal resulting from stimecmp (if stimecmp is implemented).  The sip.STIP bit, in response to timer interrupts generated by stimecmp, is set and cleared by writing stimecmp with a value that respectively is less than or equal to, or greater than, the current mtime value.

Machine Counter-Enable Register (mcounteren)
This extension adds to the description of the TM bit in this register as follows:

In addition, when the TM bit in the mcounteren register is clear, attempts to read the stimecmp register while executing in S-mode will cause an illegal instruction exception. When this bit is set, access to the stimecmp register (if implemented) is permitted in S-mode.
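
The access rule can be written as a small decision function. This is a sketch of the draft text only: the function name and return strings are hypothetical, TM is taken to be bit 1 of mcounteren as in the Privileged spec, and the unimplemented case assumes the usual illegal instruction exception for a nonexistent CSR.

```python
MCOUNTEREN_TM = 1 << 1  # TM is bit 1 of mcounteren


def smode_stimecmp_access(mcounteren, stimecmp_implemented=True):
    """Outcome of an S-mode access to stimecmp under the draft rule."""
    if not stimecmp_implemented:
        return "illegal instruction"  # assumption: nonexistent CSR
    if mcounteren & MCOUNTEREN_TM:
        return "permitted"
    return "illegal instruction"
```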


=========================================================================
=======================  Hypervisor Extension Additions  ========================

Virtual Supervisor Timer Register (vstimecmp)
The vstimecmp CSR is a 64-bit register and has 64-bit precision on all RV32 and RV64 systems.  In RV32 only, reads of the vstimecmp CSR return the low 32 bits, while reads of the vstimecmph CSR return the high 32 bits of vstimecmp.

The proposed CSR numbers for vstimecmp / vstimecmph are 0x208 / 0x209 (within the Virtual Supervisor Registers block of CSRs, and mirroring the CSR numbers for stimecmp/stimecmph).

A virtual supervisor timer interrupt becomes pending - as reflected in the VSTIP bit in the hip register - whenever (mtime + htimedelta) contains a value greater than or equal to vstimecmp, treating the values as unsigned integers.  Writes to vstimecmp and htimedelta are guaranteed to be reflected in VSTIP eventually, but not necessarily immediately.  The interrupt remains posted until vstimecmp becomes greater than (mtime + htimedelta) - typically as a result of writing vstimecmp. The interrupt will be taken based on the standard interrupt enable and delegation rules while V=1.
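
The VS-level condition differs from the S-level one only in the offset. The Python sketch below (hypothetical function name) assumes the sum wraps modulo 2^64, matching how the Hypervisor extension defines the guest's view of time as mtime + htimedelta with overflow ignored, so a large htimedelta acts as a negative offset.

```python
MASK64 = (1 << 64) - 1


def vstimecmp_signal(mtime, htimedelta, vstimecmp):
    """Timer signal from vstimecmp: (mtime + htimedelta) >= vstimecmp, unsigned."""
    guest_time = (mtime + htimedelta) & MASK64  # overflow ignored
    return guest_time >= (vstimecmp & MASK64)
```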

----------------------------  Non-Normative Text    ----------------------------
In systems in which a supervisor execution environment (SEE) implemented by an HS-mode hypervisor provides timer facilities via an SBI function call, this SBI call will continue to support requests to schedule a timer interrupt.  The SEE will simply make use of vstimecmp, changing its value as appropriate.  This ensures compatibility with existing guest VS-mode software that uses this SEE facility, while new VS-mode software takes advantage of vstimecmp directly.
----------------------------------------------------------------------------------------

Hypervisor Interrupt Registers (hvip, hip, and hie)
This extension modifies the description of the VSTIP/VSTIE bits in the hip/hie registers as follows:

Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the logical-OR of hvip.VSTIP and the timer interrupt signal resulting from vstimecmp (if vstimecmp is implemented).  The hip.VSTIP bit, in response to timer interrupts generated by vstimecmp, is set and cleared by writing vstimecmp with a value that respectively is less than or equal to, or greater than, the current (mtime + htimedelta) value.  The hip.VSTIP bit remains defined while V=0 as well as V=1.
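
The logical-OR can be sketched as follows. The function name is hypothetical; VSTIP is taken to be bit 6 of hvip and hip, as in the Hypervisor extension.

```python
VSTIP = 1 << 6  # bit position of VSTIP in hvip and hip


def hip_vstip(hvip, timer_signal, has_vstimecmp=True):
    """Value of the hip.VSTIP bit: hvip.VSTIP OR the vstimecmp timer signal."""
    injected = bool(hvip & VSTIP)                # set by the hypervisor in hvip
    from_timer = has_vstimecmp and timer_signal  # raised by the compare
    return VSTIP if (injected or from_timer) else 0
```

A hypervisor that still emulates guest timers through hvip.VSTIP therefore keeps working, since either source alone raises the pending bit.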

Hypervisor Counter-Enable Register (hcounteren)
This extension adds to the description of the TM bit in this register as follows:

In addition, when the TM bit in the hcounteren register is clear, attempts to read the vstimecmp register while executing in VS-mode will cause a virtual illegal instruction exception if the same bit in mcounteren is 1. When this bit is set, access to the vstimecmp register (if implemented) is permitted in VS-mode.
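
The two counter-enable bits combine into a three-way outcome. In the sketch below the function name and return strings are hypothetical. The virtual illegal instruction case is taken from the text above, while the mcounteren.TM=0 case is an assumption made by analogy with how the Hypervisor extension treats counter accesses.

```python
TM = 1 << 1  # TM is bit 1 of both mcounteren and hcounteren


def vsmode_vstimecmp_access(mcounteren, hcounteren):
    """Outcome of a VS-mode access to vstimecmp (assumed implemented)."""
    if not (mcounteren & TM):
        return "illegal instruction"          # assumption, by analogy
    if not (hcounteren & TM):
        return "virtual illegal instruction"  # stated in the draft text
    return "permitted"
```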


Re: rv(64) address space size

swallach
 

thank you for your comment

1. for now, i am only proposing the addition of the ARM extensions to sv57. the reason being that current linux, for ARM, can function with this extension

2. for sv64, i would propose a more thorough analysis and extension for security. it is unclear to me how this would function within the riscv privileged working group.

3. i am NOT averse to adding other security extensions to sv57. i just want to make sure the ARM extension is specified

i  hope this clarifies my views




----------------

I'm trying to understand from your comments what ISA-related changes you propose for Sv57/Sv64 other than the address space extension; the only one I got is the TTBR0/1 split on ARM/ARM64. Also, I don't see why Sv57/Sv64 should be treated differently security-wise. Could you please clarify?



WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer


Re: rv(64) address space size -benchmarks KASLR benchmarks

swallach
 

i found this after some time.

this is an extensive set of benchmarks comparing KASLR enabled and not enabled

draw your own conclusions

perhaps the summary   tells it all








Re: rv(64) address space size

Nick Kossifidis
 

Hello Steven,

On 2020-11-26 18:04, swallach wrote:
attached are my comments.
like everything else, performance is always a function of the
implementation. same ISA, different implementation, different
performance.
there are several themes to this thread.
1. do nothing for rv57 (keep INTEL)
2. add some trivial stuff to rv57 (va translation) - NOTE the
comment on what ARM has done
3. do something better for rv64 (currently intel has no rv64)
I'm trying to understand from your comments what ISA-related changes you propose for Sv57/Sv64 other than the address space extension; the only one I got is the TTBR0/1 split on ARM/ARM64. Also, I don't see why Sv57/Sv64 should be treated differently security-wise. Could you please clarify?

KAISER/KPTI is a software mitigation for leakages initially related to KASLR (another software mitigation) and later on to kernel memory's contents (Meltdown). It was merged upstream when Meltdown was discovered, and as Jonathan mentioned the performance penalty can get much worse than 0.28% (some benchmarks here -> http://www.brendangregg.com/blog/2018-02-09/kpti-kaiser-meltdown-performance.html), which is why it remains an optional feature, and even when enabled in the kernel it's only activated for CPUs vulnerable to Meltdown (check out X86_BUG_CPU_INSECURE).

A bit off-topic: keep in mind there is an ongoing discussion in the TEE TG about a Supervisor-level PMP mechanism which can further isolate the kernel memory from the user even when no MMU is available.

Regards,
Nick


Re: rv(64) address space size

swallach
 

attached are my comments.

like everything else, performance is always a function of the implementation. same ISA, different implementation, different performance.

there are several themes to this thread.

1. do nothing for rv57 (keep INTEL)
2. add some trivial stuff to rv57 (va translation) - NOTE the comment on what ARM has done
3. do something better for rv64 (currently intel has no rv64)

clearly the application space needs to be considered to answer the above themes.

having done this too many times, my philosophy is to make a system call as efficient as a subroutine call/return. and make pointer dereferencing as secure as we can make it.

Trojan Horse Pointers are EVIL.

have a happy and safe thanksgiving




Re: rv(64) address space size

Jonathan Behrens <behrensj@...>
 

RISC-V can already support KAISER now so I'm not sure why rv64 would need anything special for it?

As a side note, that paper seriously understates the costs of KAISER. On other workloads it can be 2-3 orders of magnitude higher overhead (which is why Linux only enables it on processors that are vulnerable to Meltdown).

Jonathan


On Wed, Nov 25, 2020 at 6:52 PM swallach via lists.riscv.org <steven.wallach=bsc.es@...> wrote:
this would take some time.  but to begin

separating kernel from user provides the first level of isolation.  this can be further used for

hypervisor isolation
trojan horse pointers protection
etc

i attached one snapshot from the KASLR paper and also mark hill's presentation on spectre and meltdown

KASLR does NOT solve the world’s security problems.  it is only the beginning.

and within the context of rv64 (maybe rv47) this helps, and their suggestions may be more adoptable within the current definitional framework

if you have questions on the paper,  for particular issues,  please post them

--------------------------------------------------
imho,  perhaps not for sv57,  but for sv64,  we incorporate some of the suggestions
  of the KASLR people.  there are linux versions that implement their suggestions.  perhaps only for sv64.

Could you clarify what suggestions you think we should implement? The KAISER paper describes a way of mitigating side channel attacks, but do you have specific lessons you think we should learn from it for designing sv64?




WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer



—————



Re: rv(64) address space size

swallach
 

this would take some time.  but to begin

separating kernel from user provides the first level of isolation.  this can be further used for

hypervisor isolation
trojan horse pointers protection
etc

i attached one snapshot from the KASLR paper and also mark hill's presentation on spectre and meltdown

KASLR does NOT solve the world’s security problems.  it is only the beginning.

and within the context of rv64 (maybe rv47) this helps, and their suggestions may be more adoptable within the current definitional framework

if you have questions on the paper,  for particular issues,  please post them

--------------------------------------------------
imho,  perhaps not for sv57,  but for sv64,  we incorporate some of the suggestions
  of the KASLR people.  there are linux versions that implement their suggestions.  perhaps only for sv64.

Could you clarify what suggestions you think we should implement? The KAISER paper describes a way of mitigating side channel attacks, but do you have specific lessons you think we should learn from it for designing sv64?







—————



Re: rv(64) address space size

Jonathan Behrens <behrensj@...>
 

imho,  perhaps not for sv57,  but for sv64,  we incorporate some of the suggestions
  of the KASLR people.  there are linux versions that implement their suggestions.  perhaps only for sv64.

Could you clarify what suggestions you think we should implement? The KAISER paper describes a way of mitigating side channel attacks, but do you have specific lessons you think we should learn from it for designing sv64?

Jonathan

On Wed, Nov 25, 2020 at 5:31 PM swallach via lists.riscv.org <steven.wallach=bsc.es@...> wrote:
this is a good start

imho,  perhaps not for sv57,  but for sv64,  we incorporate some of the suggestions
  of the KASLR people.  there are linux versions that implement their suggestions.  perhaps only for sv64.

but this should be up for discussion



It hasn’t been standardized yet, but there is a placeholder for the Sv57 encoding in the satp.MODE register field. There isn’t a chapter on the Sv57 spec, but it will follow the pattern of Sv39 -> Sv48, with one additional page-table level.

I will suggest to the virtual memory task group that we include Sv57 in the next batch of things we put up for ratification.

On Wed, Nov 25, 2020 at 2:15 PM swallach <steven.wallach@...> wrote:
is this documented?

much appreciated







Re: rv(64) address space size

swallach
 

this is a good start

imho,  perhaps not for sv57,  but for sv64,  we incorporate some of the suggestions
  of the KASLR people.  there are linux versions that implement their suggestions.  perhaps only for sv64.

but this should be up for discussion



It hasn’t been standardized yet, but there is a placeholder for the Sv57 encoding in the satp.MODE register field. There isn’t a chapter on the Sv57 spec, but it will follow the pattern of Sv39 -> Sv48, with one additional page-table level.

I will suggest to the virtual memory task group that we include Sv57 in the next batch of things we put up for ratification.

On Wed, Nov 25, 2020 at 2:15 PM swallach <steven.wallach@...> wrote:
is this documented?

much appreciated







Re: rv(64) address space size

Andrew Waterman
 

It hasn’t been standardized yet, but there is a placeholder for the Sv57 encoding in the satp.MODE register field. There isn’t a chapter on the Sv57 spec, but it will follow the pattern of Sv39 -> Sv48, with one additional page-table level.
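
To make the "one additional page-table level" pattern concrete, here is a minimal sketch of the arithmetic. It assumes Sv57 keeps the same 4 KiB pages and 9-bit VPN fields as Sv39 and Sv48, which is what the pattern above implies but was not yet standardized at the time of writing; the helper names `va_bits` and `split_va` are made up for illustration.

```python
# Sketch only: virtual-address layout of the RV64 paged schemes, assuming
# Sv57 simply adds one more 9-bit level on top of Sv48 (not yet ratified here).
PAGE_OFFSET_BITS = 12    # 4 KiB pages
VPN_BITS_PER_LEVEL = 9   # 512 eight-byte PTEs fit in one 4 KiB table

# Number of page-table levels per scheme; the Sv57 entry is the assumption.
LEVELS = {"Sv39": 3, "Sv48": 4, "Sv57": 5}


def va_bits(levels):
    """Width of the translated virtual address for a given number of levels."""
    return PAGE_OFFSET_BITS + VPN_BITS_PER_LEVEL * levels


def split_va(va, levels):
    """Return ([vpn[levels-1], ..., vpn[0]], page_offset) for a virtual address."""
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    vpns = [
        (va >> (PAGE_OFFSET_BITS + VPN_BITS_PER_LEVEL * i)) & 0x1FF
        for i in reversed(range(levels))
    ]
    return vpns, offset
```

With these assumptions, `va_bits(LEVELS["Sv57"])` gives 57, matching the name, and a walk touches one more table than an Sv48 walk.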

I will suggest to the virtual memory task group that we include Sv57 in the next batch of things we put up for ratification.

On Wed, Nov 25, 2020 at 2:15 PM swallach <steven.wallach@...> wrote:
is this documented?

much appreciated



On Nov 25, 2020, at 5:10 PM, Krste Asanovic <krste@...> wrote:

The basic design is already laid out for expansion to Sv57 and Sv64 following the template of fewer bits.
Krste

On Nov 25, 2020, at 1:15 PM, swallach <steven.wallach@...> wrote:

the current size of the virtual address space is 48 bits.  (per the june 2019 spec - volume II)

as many of you know,  INTEL has increased their address space to 57 bits.  several designers of server and hpc class of riscv systems  have  asked me about this.

so,  what is the current view on this?  will riscv support a 57 bit logical address space?

with the newer class of NVM being implemented,  many systems are looking at directly addressing,  cluster-wide,  ALL of physical memory



Re: rv(64) address space size

swallach
 

is this documented?

much appreciated



On Nov 25, 2020, at 5:10 PM, Krste Asanovic <krste@...> wrote:

The basic design is already laid out for expansion to Sv57 and Sv64 following the template of fewer bits.
Krste

On Nov 25, 2020, at 1:15 PM, swallach <steven.wallach@...> wrote:

the current size of the virtual address space is 48 bits.  (per the june 2019 spec - volume II)

as many of you know,  INTEL has increased their address space to 57 bits.  several designers of server and hpc class of riscv systems  have  asked me about this.

so,  what is the current view on this?  will riscv support a 57 bit logical address space?

with the newer class of NVM being implemented,  many systems are looking at directly addressing,  cluster-wide,  ALL of physical memory





Re: rv(64) address space size

Krste Asanovic
 

The basic design is already laid out for expansion to Sv57 and Sv64 following the template of fewer bits.
Krste

On Nov 25, 2020, at 1:15 PM, swallach <steven.wallach@...> wrote:

the current size of the virtual address space is 48 bits.  (per the june 2019 spec - volume II)

as many of you know,  INTEL has increased their address space to 57 bits.  several designers of server and hpc class of riscv systems  have  asked me about this.

so,  what is the current view on this?  will riscv support a 57 bit logical address space?

with the newer class of NVM being implemented,  many systems are looking at directly addressing,  cluster-wide,  ALL of physical memory


rv(64) address space size

swallach
 

the current size of the virtual address space is 48 bits.  (per the june 2019 spec - volume II)

as many of you know,  INTEL has increased their address space to 57 bits.  several designers of server and hpc class of riscv systems  have  asked me about this.

so,  what is the current view on this?  will riscv support a 57 bit logical address space?

with the newer class of NVM being implemented,  many systems are looking at directly addressing,  cluster-wide,  ALL of physical memory


Re: PTE access type in Virtual Address Translation

Greg Favor
 

On Tue, Nov 17, 2020 at 2:05 AM Gracy Ge <gracy.ge@...> wrote:
In the Privileged spec, the second step of the virtual address translation process says:
[2. Let pte be the value of the PTE at address a+va.vpn[i]×PTESIZE. (For Sv32, PTESIZE=4.)
If accessing pte violates a PMA or PMP check, raise an access exception corresponding to
the original access type.]
What is the required access type for PTE entries? I thought only read access would be needed for non-leaf PTEs, plus write access for leaf PTEs (to update the A/D bits). Is this correct?

The actual PTE access is either a page-table data read or page-table data write (which gets checked against PMAs and PMPs as such).  But the "original access type" refers to the access that is being translated - which may be a load, a store, or an instruction fetch.
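
As a rough illustration of that distinction, the sketch below models a single PTE fetch during a walk. It is not taken from the spec's pseudocode: `pmp_allows` is a hypothetical callback standing in for the PMA/PMP checks, and `memory` is just a dictionary. The point is only that the check is made for a *read* of the PTE, while the fault that gets raised is named after the access being translated.

```python
from enum import Enum


class Access(Enum):
    """The access being translated (the "original access type")."""
    FETCH = "instruction"
    LOAD = "load"
    STORE = "store/AMO"


class AccessFault(Exception):
    """Access-fault exception, labelled with the original access type."""

    def __init__(self, original):
        super().__init__(f"{original.value} access fault")
        self.original = original


def read_pte(memory, pte_addr, original, pmp_allows):
    """Fetch one PTE during a page-table walk.

    pmp_allows(addr, kind) is a hypothetical stand-in for the PMA/PMP checks,
    where kind is "read" or "write". The PTE fetch itself is checked as a read.
    """
    if not pmp_allows(pte_addr, "read"):
        # The failing access was a page-table read, but the exception reported
        # corresponds to the load, store, or fetch that triggered the walk.
        raise AccessFault(original)
    return memory[pte_addr]
```

So a store whose page-table walk hits a PMP-protected PTE would report a store/AMO access fault in this model, even though no store to the PTE was attempted.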

Greg




PTE access type in Virtual Address Translation

Gracy Ge
 

In the Privileged spec, the second step of the virtual address translation process says:
[2. Let pte be the value of the PTE at address a+va.vpn[i]×PTESIZE. (For Sv32, PTESIZE=4.)
If accessing pte violates a PMA or PMP check, raise an access exception corresponding to
the original access type.]
What is the required access type for PTE entries? I thought only read access would be needed for non-leaf PTEs, plus write access for leaf PTEs (to update the A/D bits). Is this correct?
 


Re: Access unprivileged regions from OS

Allen Baum
 

The sPMP proposal has not been discussed in any detail as far as I know, so it is hard to pin down.
The advantage of sPMP is that it should be much lower cost and simpler compared to page-based virtual memory.
The downside is there is no ability to relocate addresses - it is less flexible.

My understanding of the intent (which could be wrong, to be clear) is that it primarily targets an embedded system that has 3 privilege levels (M, S, U) but doesn't need virtual memory. In such a system, the OS (S-mode) privilege level needs the ability to configure memory protection, but the existing PMP can only be configured by M-mode. sPMP was proposed to correct this.

The Trusted Execution Environment (TEE) TG is defining an enhanced PMP, but it still only differentiates between M-mode and non-M-mode; specifically, it doesn't differentiate between S-mode and U-mode protections.
In any case, there have been no sPMP discussions in any TG that I'm aware of yet. That could happen in the TEE TG, but certainly not before the enhanced PMP work is finished.

On Mon, Nov 2, 2020 at 2:18 AM Xinhao (Freddie) Qu via lists.riscv.org <xinhaoqu=huawei.com@...> wrote:

Is sPMP an alternative to page-based permissions? If so, what advantage does it provide over the latter?

 

Thanks,

Freddie

 

From: Andrew Waterman [mailto:andrew@...]
Sent: Friday, October 30, 2020 10:51 PM
To: Bill Huffman <huffman@...>
Cc: Jonathan Behrens <behrensj@...>; Xinhaoqu (Freddie) <xinhaoqu@...>; Andrea Mondelli <andrea.mondelli@...>; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] Access unprivileged regions from OS

 

 

 

On Fri, Oct 30, 2020 at 3:46 PM Bill Huffman <huffman@...> wrote:

That might mean that the sPMP being considered might need a probe of some sort.

I'd tend to think that the user processes running under sPMP will have few enough data regions that software base-and-bounds checks on syscall arguments aren't prohibitively expensive.
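
A minimal sketch of such a software base-and-bounds check follows. sPMP had no agreed definition at this point, so nothing here reflects an actual sPMP register layout: the region list is simply assumed to be a small table of `(base, size)` pairs that the OS keeps per process, and `range_in_regions` is a hypothetical helper name.

```python
def range_in_regions(addr, length, regions):
    """Return True if [addr, addr + length) lies wholly inside one region.

    regions is an assumed per-process list of (base, size) pairs; with only a
    handful of entries, a linear scan on each syscall argument is cheap.
    """
    if addr < 0 or length < 0:
        return False
    end = addr + length  # Python ints cannot wrap; C code must check overflow
    return any(base <= addr and end <= base + size for base, size in regions)
```

A buffer that straddles two adjacent regions is rejected by this version; whether that is the right policy would depend on how the regions end up being defined.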

 

      Bill

On 10/30/20 3:39 PM, Jonathan Behrens wrote:


In the worst case, a software page table walk isn't that expensive.

 

On Fri, Oct 30, 2020 at 6:26 PM Andrew Waterman via lists.riscv.org <andrew=sifive.com@...> wrote:

 

 

On Fri, Oct 30, 2020 at 2:46 PM Bill Huffman <huffman@...> wrote:

 

On 10/30/20 2:32 PM, Andrew Waterman wrote:


 

 

On Fri, Oct 30, 2020 at 8:19 AM Xinhaoqu (Freddie) <xinhaoqu@...> wrote:

Hi Andrew,

 

I’m not sure the sstatus.SUM bit provides the equivalent of LDTR/STTR. That pair of load/store instructions lowers its privilege level, so that if they end up accessing privileged locations, they will fault. On the other hand, when sstatus.SUM==1, even if the page is marked as “User”, supervisor code can still access it.

 

From section 3.1.6.3 in the Privileged ISA spec:

 

The SUM (permit Supervisor User Memory access) bit modifies the privilege with which S-mode loads and stores access virtual memory. When SUM=0, S-mode memory accesses to pages that are accessible by U-mode (U=1 in Figure 4.17) will fault. When SUM=1, these accesses are permitted. SUM has no effect when page-based virtual memory is not in effect. Note that, while SUM is ordinarily ignored when not executing in S-mode, it is in effect when MPRV=1 and MPP=S. SUM is hardwired to 0 if S-mode is not supported.
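
The quoted rule can be restated as a small truth-table model. This is only a sketch of the U-bit/SUM interaction for loads and stores with paging enabled; it ignores the R/W/X bits, MXR, and everything else a real permission check involves, and `data_access_permitted` is a made-up name.

```python
def data_access_permitted(cur_priv, mprv, mpp, sum_bit, pte_u):
    """Model only the U-bit / SUM part of a load/store permission check.

    cur_priv and mpp are "M", "S" or "U"; pte_u is the U bit of the leaf PTE.
    Assumes page-based virtual memory is in effect for S and U accesses.
    """
    # With MPRV=1, loads and stores are checked as though running at MPP.
    priv = mpp if mprv else cur_priv
    if priv == "U":
        return bool(pte_u)                   # U-mode sees only U=1 pages
    if priv == "S":
        return (not pte_u) or bool(sum_bit)  # U=1 pages need SUM=1
    return True                              # plain M-mode: not translated
```

Note that in this model SUM=1 only adds access to U=1 pages; S-mode accesses to U=0 pages are allowed regardless of SUM.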

 

There is nothing mentioning what would happen if load/store instructions in S-mode attempt to access locations that require privilege. To me, that means they are permitted when sstatus.SUM==1. This behaviour is fine in itself, but doesn’t match what the LDTR/STTR instructions do. I think sstatus.SUM’s equivalent is PSTATE.PAN, not LDTR/STTR. In other words, LDTR/STTR has no equivalent in RISC-V, AFAIK.

 

Right.  SUM can be used to solve the same problem as LDTR/STTR, but it is not equivalent.  To avoid the concern you describe, the Linux kernel first performs a bounds check to guarantee the address is in the user process' VA range.  Then, it engages SUM and performs the unprivileged access.
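
The sequence described here (bounds check first, then a short SUM window around the access) can be sketched as below. This is a toy model, not the Linux implementation: `Hart`, `read_user_buffer`, and the dictionary-backed `user_mem` are invented for illustration, and `USER_VA_LIMIT` assumes an Sv39-style layout where user space is the lower half of the address space.

```python
USER_VA_LIMIT = 1 << 38  # assumption: user VAs occupy the lower half of Sv39


class Hart:
    """Toy hart state holding just a model of the sstatus.SUM bit."""

    def __init__(self):
        self.sum = 0


def read_user_buffer(hart, user_mem, addr, length):
    """Copy length bytes from user memory, in the style described above."""
    # Step 1: the bounds check guarantees the whole range is a user address,
    # so a pointer into kernel memory is rejected before SUM is ever set.
    if addr < 0 or length < 0 or addr + length > USER_VA_LIMIT:
        raise PermissionError("address range is outside the user VA range")
    # Step 2: open the SUM window, do ordinary loads, and always close it.
    hart.sum = 1
    try:
        return bytes(user_mem[addr + i] for i in range(length))
    finally:
        hart.sum = 0
```

The bounds check is what stands in for the privilege lowering that LDTR/STTR do in hardware; SUM by itself would not stop a kernel-range pointer.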

That's, of course, a very Linux centric answer.  With more general address maps - or a "bare MMU" - the problem is harder to solve.  Do you have an expectation for that case?

It's not especially Linux-centric; it's more conventional-OS-with-paging-centric.

 

mstatus.MPRV handles the situation adequately for PMP-based protection in M/U systems.

 

S-mode with general address maps is not adequately addressed here, but at least at the moment, that strikes me as a bit too hypothetical of a problem.  If we did need to solve it, an S-mode MPRV feature would also suffice.

 

       Bill

 

 

Regards,

Freddie

 

 

From: tech-privileged@... [mailto:tech-privileged@...] On Behalf Of Andrew Waterman
Sent: Friday, October 30, 2020 9:50 AM
To: Andrea Mondelli <andrea.mondelli@...>
Cc: tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] Access unprivileged regions from OS

 

 

 

On Fri, Oct 30, 2020 at 2:45 AM Andrea Mondelli via lists.riscv.org <andrea.mondelli=huawei.com@...> wrote:

Hi all,

quoting the ARM manual, "sometimes the OS does need to access unprivileged regions, for example, to write to a buffer owned by an application. To support this, the instruction set provides the LDTR and STTR instructions."
According to Volume II: RISC-V Privileged Architectures, Chapter 7, RISC-V has no similar privileged instruction for this.

Is there an alternative way to get the same behavior? I was also thinking of other cases, like checking user parameters when syscalls are made.

 

Yeah.  Set the sstatus.SUM bit, then use regular load and store instructions to access user memory, then clear sstatus.SUM.

 

Any hints?

thanks in advance


ARM's new capability-based security ISA (building on top of ARMv8)

Greg Favor
 

This is just a quick FYI.  (Sorry if this is a bit spam'y; there isn't a clear TG to target with this email.  And I don't expect that this is something that plays into any near-term RISC-V standardization efforts.)

ARM has been working on a capability-based architecture for a while, and has just now made public the arch spec (and related specs).  This work is based on the CHERI project at Cambridge.  If you're curious about it, this link provides a high-level summary and contains a link that will get you to the arch documents:  ARM Morello Program

Greg
