
Bruce Hoult
On Thu, Nov 18, 2021 at 5:07 PM David Horner <ds2horner@...> wrote:
But if there were, the vl would need to be truncated to the first
in sequence that faulted.
That would potentially require back tracking by the handler from
the element load that faulted first, to each of the earlier loads
in the list.
Simple implementations could simply execute it sequentially. Or have the trap handler execute the loads sequentially if any of them fault.
A substantial effort to essentially be thrown away on the next
try to discover the page mappings.
We don't care how slowly malicious code runs.
This was another reason that ff gather was rejected. It does not
play well with the parallel load behaviour that is allowed for
loads.
It plays just as well as any gather does, in the absence of faults.
Faulting is very much NOT expected behaviour. You're probably about to terminate the program anyway, or drop into the debugger. The main requirement is that the user can see which iteration of their loop would have failed if the code had been left as scalar instructions instead of auto-vectorised.
If FF gather were implemented the designer would probably always
trap on any fault.
If the OS determines the virtual addresses are legitimate it
could preemptively page-in/allocate the requested addresses.
If any of the addresses are illegal/illegitimate it could
certainly mark this application suspect and escalate its
management to whatever security features are enabled.
This is a legal option for load sequential first fault but we
have at most 2 pages/regions to deal with.
One region, but it could be many page table entries, given sufficiently long vector registers -- up to 17 with 65536 bit VLEN and LMUL=8.
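The figure of 17 can be checked with a little arithmetic. The sketch below assumes 4 KiB pages (the message does not state a page size) and a contiguous access whose start address may be unaligned; `max_pages_touched` is a name made up for illustration.

```python
PAGE_BYTES = 4096  # assumption: 4 KiB base pages


def max_pages_touched(vlen_bits: int, lmul: int, page_bytes: int = PAGE_BYTES) -> int:
    """Worst-case number of pages spanned by one contiguous register-group load."""
    total_bytes = vlen_bits // 8 * lmul
    # An access of N bytes starting at the last byte of a page covers
    # that byte plus N - 1 more, i.e. 1 + ceil((N - 1) / page_bytes) pages.
    return 1 + -(-(total_bytes - 1) // page_bytes)


print(max_pages_touched(65536, 8))  # 65536-byte register group, unaligned start
print(max_pages_touched(128, 1))    # 16 bytes can straddle one page boundary
```

With VLEN=65536 and LMUL=8 the register group holds 65536 bytes, which is exactly 16 pages when aligned and spills into a 17th when it is not.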
|
|
On 2021-11-17 10:33 p.m., Bill Huffman wrote:
So if I set up indexes on a ff gather so
that I ensure the first one is legal while the rest access
pages I want to probe, how does the higher privilege software
know I learned about all those pages?
First there is currently no ff gather.
But if there were, the vl would need to be truncated to the first
in sequence that faulted.
That would potentially require back tracking by the handler from
the element load that faulted first, to each of the earlier loads
in the list.
A substantial effort to essentially be thrown away on the next
try to discover the page mappings.
This was another reason that ff gather was rejected. It does not
play well with the parallel load behaviour that is allowed for
loads.
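To make the truncation rule concrete, here is a toy Python model of the hypothetical fault-only-first gather. No such instruction exists in RVV; `ff_gather`, the dict standing in for mapped memory, and the `Trap` exception are all invented for illustration. The model probes elements out of order, as a parallel implementation might, and then has to find the lowest-numbered faulting element to decide where to truncate vl.

```python
class Trap(Exception):
    """Stands in for a synchronous exception taken on element 0."""


def ff_gather(memory: dict, base: int, indexes: list, vl: int):
    """Toy model: return (loaded_values, new_vl) for a fault-only-first gather."""
    # Probe in reverse to mimic an implementation that does not issue
    # the element accesses in element order.
    faulted = [i for i in reversed(range(vl)) if base + indexes[i] not in memory]
    if not faulted:
        return [memory[base + indexes[i]] for i in range(vl)], vl
    first_bad = min(faulted)  # lowest element number, not the first one observed
    if first_bad == 0:
        raise Trap(f"element 0 faulted at address {base + indexes[0]:#x}")
    return [memory[base + indexes[i]] for i in range(first_bad)], first_bad


mem = {0x1000: 11, 0x1008: 22, 0x3000: 33}
values, new_vl = ff_gather(mem, 0x1000, [0, 8, 0x5000, 0x2000, 0x9000], 5)
print(values, new_vl)
```

Element 3 is accessible here, but it is discarded because element 2 faulted first in element order. That discarded work is the cost the message is pointing at.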
I suppose an implementation could trap on
any element if it started with vstart=0 and only complete with
zero values loaded if vstart>0.
Then the higher privilege software could
cooperate and restart even if the element failed for these
instructions with vstart pointed at the failed element.
But I don’t think that was the expected
implementation.
Not sure I follow this.
But note:
If FF gather were implemented the designer would probably always
trap on any fault.
If the OS determines the virtual addresses are legitimate it
could preemptively page-in/allocate the requested addresses.
If any of the addresses are illegal/illegitimate it could
certainly mark this application suspect and escalate its
management to whatever security features are enabled.
This is a legal option for load sequential first fault but we
have at most 2 pages/regions to deal with.
Indeed, the OS can emulate the set vl length behaviour. It would
still be an optimization even in the guard page scenario.
Bill
|
|
So if I set up indexes on a ff gather so that I ensure the first one is legal while the rest access pages I want to probe, how does the higher privilege software know I learned about all those pages?
I suppose an implementation could trap on any element if it started with vstart=0 and only complete with zero values loaded if vstart>0. Then the higher privilege software could cooperate and restart even if the element failed for these
instructions with vstart pointed at the failed element. But I don’t think that was the expected implementation.
Bill
From: tech-vector-ext@... <tech-vector-ext@...>
On Behalf Of David Horner
Sent: Wednesday, November 17, 2021 10:12 PM
To: Jonathan Behrens <behrensj@...>
Cc: Andrew Waterman <andrew@...>; vector <tech-vector-ext@...>
Subject: Re: [RISC-V] [tech-vector-ext] RISC-V Vector Extension post-public review updates - fault flagging
The trap if first address bad is stipulated behaviour.
The others are not specified in the vector extension, but
1. the counter is part of the generalized performance spec.
2. Always trap but allow resume is an implementation option.
3. Dynamically limit vl=1 is a hypothetical extension that could have a csr to manage.
Obviously these could be defined as part of the privilege spec and the VM guys probably want to nail them down exactly.
I prefer minimalistic S and M support with least specs to allow implementations latitude.
|
|
The trap if first address bad is stipulated behaviour.
The others are not specified in the vector extension, but
1. the counter is part of the generalized performance spec.
2. Always trap but allow resume is an implementation option.
3. Dynamically limit vl=1 is a hypothetical extension that could have a CSR to manage.
Obviously these could be defined as part of the privilege spec and the VM guys probably want to nail them down exactly.
I prefer minimalistic S and M support with least specs to allow implementations latitude.
On Wed., Nov. 17, 2021, 21:33 Jonathan Behrens, < behrensj@...> wrote: Are the mechanisms you mentioned hypothetical future ISA extensions, or something included in the current vector extension? In particular, I don't see anything about M-mode and/or HS-mode requesting a trap if too many non-faulting fault-first-load instructions are executed which modify vl.
Jonathan
|
|
Jonathan Behrens <behrensj@...>
Are the mechanisms you mentioned hypothetical future ISA extensions, or something included in the current vector extension? In particular, I don't see anything about M-mode and/or HS-mode requesting a trap if too many non-faulting fault-first-load instructions are executed which modify vl.
Jonathan
On Wed, Nov 17, 2021 at 9:19 PM David Horner < ds2horner@...> wrote:
that should have been "The HS is in control, it can
"leak" or not as it sees fit" obviously.
|
|
that should have been "The HS is in control, it can
"leak" or not as it sees fit" obviously.
On 2021-11-17 9:16 p.m., David Horner
via lists.riscv.org wrote:
|
|
On 2021-11-17 8:45 p.m., Andrew Waterman wrote:
On Wed, Nov 17, 2021 at 5:41 PM Jonathan Behrens <behrensj@...> wrote:
On Wed, Nov 17, 2021 at 4:19 PM Jonathan Behrens <behrensj@...> wrote:
The security concern was being able to probe
addresses to find accessible regions without
fear of being killed on touching a
prohibited region. It was noted that this
is still present even for unit-stride in
supervisor mode when using translation to
arbitrarily probe supervisor physical space.
However, I believe these security concerns
are manageable through control mechanisms at
higher privilege levels
Could someone say what these control
mechanisms are? In particular, it seems like
a VS-mode guest operating system could probe
the entire guest physical address space
using fault-on-first load without triggering
any intervention from HS-mode or M-mode.
Perhaps I'm being obtuse, but I'm having
trouble understanding why this specific case is
a concern: it's within VS-mode's purview to know
anything and everything about the guest physical
address space. (The situation is materially
different than S vs. U, because those two share
a VA space, whereas VS' GPA space is disjoint
from HS' VA space.)
The physical address space that the hypervisor
tells the guest about may not match the one installed
in hgatp. For instance, some pages of the guest's
memory might be marked copy-on-write or swapped out to
disk. Or a particular device may supposedly be mapped
into the guest VM, but actually just be an unmapped
region so the host can trap-and-emulate any accesses
to it. Even today it is possible for a guest VM to
indirectly learn that these things might be happening,
but directly being able to check whether a page is
mapped adds a new level.
Yeah, agreed that detecting paged-out pages is a similar
information leak.
The VS having this awareness can be very beneficial.
It allows the OS to better manage its resources. It can switch to
handling other supervisory actions while that data is paged/staged
in.
Nevertheless, the control mechanisms I previously mentioned
apply here as well. The VS is in control, it can "leak" or not as
it sees fit.
(Though I think COW is not relevant here, since we're
only talking about load instructions.)
|
|
On Wed, Nov 17, 2021 at 5:41 PM Jonathan Behrens < behrensj@...> wrote:
On Wed, Nov 17, 2021 at 4:19 PM Jonathan Behrens < behrensj@...> wrote:
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present even
for unit-stride in supervisor mode when using translation to
arbitrarily probe supervisor physical space. However, I believe
these security concerns are manageable through control mechanisms
at higher privilege levels
Could someone say what these control mechanisms are? In particular, it seems like a VS-mode guest operating system could probe the entire guest physical address space using fault-on-first load without triggering any intervention from HS-mode or M-mode.
Perhaps I'm being obtuse, but I'm having trouble understanding why this specific case is a concern: it's within VS-mode's purview to know anything and everything about the guest physical address space. (The situation is materially different than S vs. U, because those two share a VA space, whereas VS' GPA space is disjoint from HS' VA space.)
The physical address space that the hypervisor tells the guest about may not match the one installed in hgatp. For instance, some pages of the guest's memory might be marked copy-on-write or swapped out to disk. Or a particular device may supposedly be mapped into the guest VM, but actually just be an unmapped region so the host can trap-and-emulate any accesses to it. Even today it is possible for a guest VM to indirectly learn that these things might be happening, but directly being able to check whether a page is mapped adds a new level.
Yeah, agreed that detecting paged-out pages is a similar information leak. (Though I think COW is not relevant here, since we're only talking about load instructions.)
|
|
Jonathan Behrens <behrensj@...>
On Wed, Nov 17, 2021 at 4:19 PM Jonathan Behrens < behrensj@...> wrote:
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present even
for unit-stride in supervisor mode when using translation to
arbitrarily probe supervisor physical space. However, I believe
these security concerns are manageable through control mechanisms
at higher privilege levels
Could someone say what these control mechanisms are? In particular, it seems like a VS-mode guest operating system could probe the entire guest physical address space using fault-on-first load without triggering any intervention from HS-mode or M-mode.
Perhaps I'm being obtuse, but I'm having trouble understanding why this specific case is a concern: it's within VS-mode's purview to know anything and everything about the guest physical address space. (The situation is materially different than S vs. U, because those two share a VA space, whereas VS' GPA space is disjoint from HS' VA space.)
The physical address space that the hypervisor tells the guest about may not match the one installed in hgatp. For instance, some pages of the guest's memory might be marked copy-on-write or swapped out to disk. Or a particular device may supposedly be mapped into the guest VM, but actually just be an unmapped region so the host can trap-and-emulate any accesses to it. Even today it is possible for a guest VM to indirectly learn that these things might be happening, but directly being able to check whether a page is mapped adds a new level.
|
|
On Wed, Nov 17, 2021 at 4:19 PM Jonathan Behrens < behrensj@...> wrote:
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present even
for unit-stride in supervisor mode when using translation to
arbitrarily probe supervisor physical space. However, I believe
these security concerns are manageable through control mechanisms
at higher privilege levels
Could someone say what these control mechanisms are? In particular, it seems like a VS-mode guest operating system could probe the entire guest physical address space using fault-on-first load without triggering any intervention from HS-mode or M-mode.
Perhaps I'm being obtuse, but I'm having trouble understanding why this specific case is a concern: it's within VS-mode's purview to know anything and everything about the guest physical address space. (The situation is materially different than S vs. U, because those two share a VA space, whereas VS' GPA space is disjoint from HS' VA space.)
On 2021-11-17 5:36 p.m., Krste Asanovic wrote:
The primary reason was lack of encoding space for
non-unit-stride fault-on-first instructions.
However, we did discuss its merit; if it would trump the encoding
difficulties, see below -
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present even
for unit-stride in supervisor mode when using translation to
arbitrarily probe supervisor physical space. However, I believe
these security concerns are manageable through control mechanisms
at higher privilege levels.
Krste
On Nov 17, 2021, at 2:21 PM, Bruce Hoult <bruce@...> wrote:
On Thu, Nov 18, 2021 at 10:33 AM Bill Huffman <huffman@...> wrote:
From: Bruce Hoult <bruce@...>
Sent: Wednesday, November 17, 2021 4:24 PM
To: Krste Asanovic <krste@...>
Cc: Bill Huffman <huffman@...>; Grigorios Magklis <grigorios.magklis@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] RISC-V Vector Extension post-public review updates
Don't forget some code may want to use a mask in inverted sense for individual instructions, without explicitly creating a new mask. This was not listed in the "wish list for 64 bits" below, but it was in early RVV drafts.
Yes, that needs to be considered as well.
I'm not sure how common that really is, and non-store uses can usually just use a vmerge.vvm at the end anyway, at the expense of possibly using extra registers.
While on the subject of future features, and somewhat related ... the one big thing I've noticed RVV lacking that SVE has is a non-faulting version of indexed loads ("gather") which creates a mask showing which elements were accessible. In SVE this goes into a CSR which can then be moved into a mask register, but of course with sufficient encoding bits you could directly put it into a normal register.
Traditional vector code doesn't really need this, but SVE has an aim to be able to vectorise all loops.
How does this contribute to vectorizing all loops?
Because otherwise you can't safely vectorise loops that do indirect array accesses (e.g. a[b[i]]) with data-dependent control flow.
There are two aspects here:
a) Checking that array indexes are within bounds, which, absent proof
that the indexes are always in bounds, should be done.
We have a viable mechanism for that. The index values are
loaded into a vector register in any event.
Set a mask on the compare of the index values with a scalar
bound.
b) Handling fault exceptions expeditiously by returning a mask of
would-be faults.
This should be the exception case and therefore does not need
expediting.
As Krste says, the OS level can provide a cooperative
mechanism to set a corresponding mask when exceptions occur, if it
is deemed the application [or system code] needs it.
But this feature does not mitigate the out-of-bounds array
fetch. Many out-of-bounds locations can be in accessible memory.
Thus there is no need for instructions to have this feature, whether
supporting all loop constructs or not.
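Point (a) can be illustrated with a scalar model of the compare-and-mask step. This is a plain-Python sketch rather than RVV code; `masked_gather` and the `fill` value are illustrative names. In RVV terms the compare would be a vector-scalar comparison producing a mask (something like `vmsltu.vx`), which then predicates the indexed load.

```python
def masked_gather(a: list, idx: list, fill=0):
    """Gather a[idx[i]] only where 0 <= idx[i] < len(a)."""
    bound = len(a)
    mask = [0 <= i < bound for i in idx]          # compare against scalar bound
    # Masked-off elements never touch memory; they receive the fill value.
    out = [a[i] if m else fill for i, m in zip(idx, mask)]
    return out, mask


a = [10, 20, 30, 40]
out, mask = masked_gather(a, [3, 0, 7, 1, -1])
print(out)
print(mask)
```

Note that the mask is computed from the index values alone, before any access is attempted, so it catches out-of-bounds indexes even when the address they name happens to be mapped.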
I think this was not included for security reasons rather than ignored.
I don't think there is any additional security implication.
I could be wrong, as I'm not an expert on SVE, but I believe that even if the gather operation is done (somewhat) in parallel or in random order, the instruction doesn't actually return a mask indicating all the failed accesses. All mask bits after the first element that was inaccessible are also set to false. The following code will process all the initial elements and then invert the mask and loop back and try to process the tail starting from the first inaccessible element, which will then actually fault if the loop didn't exit or skip that element based on program logic.
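The loop structure described here can be sketched in Python. This models only the behaviour as described in this message (mask bits from the first inaccessible element onward are cleared, and a retry that starts on an inaccessible element faults for real); it is not a statement of what SVE hardware does. All names are hypothetical.

```python
class Fault(Exception):
    pass


def first_fault_gather(memory: dict, addrs: list):
    """Return (values, mask); mask is False from the first bad element onward."""
    if addrs and addrs[0] not in memory:
        raise Fault(f"access to {addrs[0]:#x}")  # first active element faults
    values, mask, ok = [], [], True
    for addr in addrs:
        ok = ok and addr in memory
        mask.append(ok)
        values.append(memory[addr] if ok else None)
    return values, mask


def find_first(memory: dict, addrs: list, wanted):
    """Vectorised search loop: return the position of `wanted`, or None."""
    start = 0
    while start < len(addrs):
        values, mask = first_fault_gather(memory, addrs[start:])
        done = mask.count(True)          # leading run of accessible elements
        for offset in range(done):
            if values[offset] == wanted:
                return start + offset    # data-dependent exit before the bad element
        start += done                    # retry from the first inaccessible element
    return None


mem = {0x10: 5, 0x20: 7, 0x30: 9}
print(find_first(mem, [0x10, 0x20, 0x999, 0x30], 7))
```

Searching for 7 exits before the inaccessible address is ever needed. Searching for 9 with the same inputs would retry starting at the inaccessible element and raise `Fault`, which is the same point at which the scalar loop would have failed.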
|
|
On 2021-11-17 7:18 p.m., Jonathan Behrens wrote:
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present
even for unit-stride in supervisor mode when using translation
to arbitrarily probe supervisor physical space. However, I
believe these security concerns are manageable through control
mechanisms at higher privilege levels
Could someone say what these control mechanisms are?
yes. I will below.
In particular, it seems like a VS-mode guest operating
system could probe the entire guest physical address space
using fault-on-first load without triggering any intervention
from HS-mode or M-mode.
It could, depending upon what implementation details are designed
into the hart.
Control Mechanisms:
If the first address of the vector load is problematic, whether
first fault or not, the instruction will trap.
So, only in the case that an instruction starts on a valid
address and reads past the end of that valid range might the
instruction not fault.
The hart is allowed to fault even then.
A count of fault-first events could trigger a trap, so any
misbehaving applications could be identified and managed.
This appears to be distinctly different from the SVE design.
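A toy model of these control mechanisms, for illustration only. It assumes a flat set of mapped byte addresses; the `Hart` class, its `ff_events` counter, and the `limit` threshold are hypothetical and do not correspond to any defined CSR or performance counter.

```python
class Trap(Exception):
    pass


class Hart:
    def __init__(self, mapped: set, limit: int):
        self.mapped = mapped      # addresses on which an access would succeed
        self.limit = limit        # hypothetical threshold set by higher privilege
        self.ff_events = 0        # count of loads that truncated vl

    def vle_ff(self, base: int, vl: int) -> int:
        """Unit-stride fault-only-first load of vl bytes; returns the new vl."""
        if base not in self.mapped:
            raise Trap("element 0 faulted")          # always traps
        new_vl = vl
        for i in range(1, vl):
            if base + i not in self.mapped:
                new_vl = i                            # truncate instead of trapping
                break
        if new_vl < vl:
            self.ff_events += 1
            if self.ff_events > self.limit:
                raise Trap("too many fault-first truncations")
        return new_vl


hart = Hart(mapped=set(range(0x1000, 0x1010)), limit=2)
print(hart.vle_ff(0x1000, 8))    # fully mapped
print(hart.vle_ff(0x100C, 8))    # runs off the end of the mapped range
```

The counter only advances when vl is actually truncated, so ordinary code that stays inside its mappings never approaches the threshold, while code that repeatedly probes past the end of a valid range eventually traps to the higher privilege level.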
On 2021-11-17 5:36 p.m., Krste Asanovic wrote:
The primary reason was lack of encoding space for
non-unit-stride fault-on-first instructions.
However, we did discuss its merit, and whether it would trump the
encoding difficulties; see below.
The security concern was being able to probe addresses to
find accessible regions without fear of being killed on
touching a prohibited region. It was noted that this is
still present even for unit-stride in supervisor mode when
using translation to arbitrarily probe supervisor physical
space. However, I believe these security concerns are
manageable through control mechanisms at higher privilege
levels.
Krste
On Nov 17, 2021, at 2:21 PM, Bruce Hoult <bruce@...> wrote:
On Thu, Nov 18, 2021 at 10:33 AM Bill Huffman <huffman@...> wrote:
From: Bruce Hoult <bruce@...>
Sent: Wednesday, November 17, 2021 4:24 PM
To: Krste Asanovic <krste@...>
Cc: Bill Huffman <huffman@...>; Grigorios Magklis <grigorios.magklis@...>; tech-vector-ext@...
Subject: Re: [RISC-V] [tech-vector-ext] RISC-V Vector Extension post-public review updates
Don't forget some code may want to use a mask in inverted sense for
individual instructions, without explicitly creating a new mask. This
was not listed in the "wish list for 64 bits" below, but it was in
early RVV drafts.
Yes, that needs to be considered as well.
I'm not sure how common that really is, and non-store uses can usually
just use a vmerge.vvm at the end anyway, at the expense of possibly
using extra registers.
While on the subject of future features, and somewhat related ... the
one big thing I've noticed RVV lacking that SVE has is a non-faulting
version of indexed loads ("gather") which creates a mask showing which
elements were accessible. In SVE this goes into a CSR which can then be
moved into a mask register, but of course with sufficient encoding bits
you could directly put it into a normal register.
Traditional vector code doesn't really need this, but SVE has an aim to
be able to vectorise all loops.
How does this contribute to vectorizing all loops?
Because otherwise you can't safely vectorise loops that do indirect
array accesses (e.g. a[b[i]]) with data-dependent control flow.
There are two aspects here:
a) Checking that array indexes are within bounds, which should be
done absent proof that the indexes are always in bounds.
We have a viable mechanism for that. The index values
are loaded into a vector register in any event.
Set a mask on the compare of the index values with a
scalar bound.
b) Handling fault exceptions expeditiously by returning a
mask of would-be faults.
This should be the exception case and therefore does
not need expediting.
As Krste says, the OS level can provide a cooperative
mechanism to set a corresponding mask when exceptions
occur, if it is deemed the application [or system code]
needs it.
But this feature does not mitigate the out-of-bounds
array fetch. Many out-of-bounds locations can be in
accessible memory.
Thus there is no need for instructions to have this feature,
whether supporting all loop constructs or not.
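The bounds check in point a) can be illustrated with a small sketch. It models the vector registers as Python lists; `masked_gather` and the `fill` value for masked-off elements are names made up for this illustration and do not correspond to any RVV instruction.

```python
def masked_gather(a, idx, bound, fill=0):
    """Gather a[idx[i]] only where idx[i] passes the bounds compare."""
    # The compare of each index against the scalar bound produces the mask.
    mask = [0 <= i < bound for i in idx]
    # The gather then runs under that mask; masked-off elements get `fill`.
    out = [a[i] if m else fill for i, m in zip(idx, mask)]
    return out, mask


a = [10, 11, 12, 13]
vals, mask = masked_gather(a, [2, 7, 0, -1], bound=len(a))
# vals == [12, 0, 10, 0]; mask == [True, False, True, False]
```

On a real machine an index such as 7 could well land in mapped memory beyond the array, so a mask of would-be faults would not flag it, whereas the compare does.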
I think this was not included for security reasons rather than ignored.
I don't think there is any additional security implication.
I could be wrong, as I'm not an expert on SVE, but I believe that even
if the gather operation is done (somewhat) in parallel or in random
order, the instruction doesn't actually return a mask indicating all
the failed accesses. All mask bits after the first element that was
inaccessible are also set to false. The following code will process all
the initial elements and then invert the mask and loop back and try to
process the tail starting from the first inaccessible element, which
will then actually fault if the loop didn't exit or skip that element
based on program logic.
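The loop structure described in that last paragraph can be sketched as follows. This is a toy model of the behaviour as described in the message, not SVE's actual semantics: memory is a dict from address to value, and `gather_ff`, `find_first` and `vlen` are hypothetical names for the illustration.

```python
class Fault(Exception):
    """Raised when the first element of a gather is inaccessible."""


def gather_ff(mem, indices):
    """Hypothetical first-fault gather returning (values, mask).

    The mask is a prefix mask: every bit from the first inaccessible
    element onwards is False, even if later elements are accessible.
    """
    if indices[0] not in mem:
        raise Fault(indices[0])
    values, mask, ok = [], [], True
    for ix in indices:
        ok = ok and ix in mem
        mask.append(ok)
        values.append(mem[ix] if ok else None)
    return values, mask


def find_first(mem, b, key, vlen=4):
    """Vectorised form of: for i in range(len(b)): if mem[b[i]] == key: return i"""
    i = 0
    while i < len(b):
        values, mask = gather_ff(mem, b[i:i + vlen])
        done = sum(mask)              # length of the accessible prefix
        for j in range(done):
            if values[j] == key:
                return i + j          # loop exits before reaching the bad element
        i += done                     # retry from the first inaccessible element
    return -1
```

With `mem = {0: 5, 1: 6, 2: 7, 3: 8}` and `b = [0, 1, 2, 99, 3]`, searching for 7 returns index 2 without touching address 99, while searching for 8 raises `Fault(99)` on the retry, at the same iteration where the scalar loop would have failed.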
Krste Asanovic
A paranoid hypervisor could restrict ff loads to always reduce to vl=1, or only after x failed probes, for example.
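As a rough sketch of such a policy (purely illustrative; `allowed_vl`, `failed_probes` and `max_failed_probes` are hypothetical names, and how a hypervisor would actually enforce this is left open in the discussion):

```python
def allowed_vl(requested_vl, failed_probes, max_failed_probes=None):
    """Hypothetical hypervisor policy for fault-first loads.

    max_failed_probes=None models "always reduce to vl=1"; otherwise the
    clamp applies only once that many truncated loads have been observed.
    """
    if max_failed_probes is None or failed_probes >= max_failed_probes:
        return min(requested_vl, 1)
    return requested_vl
```

Once vl is clamped to 1, a fault-first load only ever touches its first element, and a fault on the first element always traps, so every further probe becomes visible to the higher privilege level.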
Jonathan Behrens <behrensj@...>
The security concern was being able to probe addresses to find
accessible regions without fear of being killed on touching a
prohibited region. It was noted that this is still present even
for unit-stride in supervisor mode when using translation to
arbitrarily probe supervisor physical space. However, I believe
these security concerns are manageable through control mechanisms
at higher privilege levels.
Could someone say what these control mechanisms are? In particular, it seems like a VS-mode guest operating system could probe the entire guest physical address space using fault-on-first load without triggering any intervention from HS-mode or M-mode.
Jonathan