Non-idempotent PMA and table walk accesses


David Kruckemyer
 

Hi all,

I have a simple question: does the architecture allow table walk accesses (reads or writes) to regions with the non-idempotent PMA?

The architecture doesn't explicitly disallow it, so the answer is probably "yes." However, I'm having a hard time understanding a system design in which such a table walk would be practical. Can someone provide a practical use-case for walking non-idempotent locations?

If no such use-case exists, would people object to imposing a restriction on table walk accesses to locations with the non-idempotent PMA? Or at least a comment strongly suggesting that platforms won't support that behavior?

Cheers,
David


andrew@...
 



On Mon, May 18, 2020 at 2:58 PM David Kruckemyer <dkruckemyer@...> wrote:
Hi all,

I have a simple question: does the architecture allow table walk accesses (reads or writes) to regions with the non-idempotent PMA?

The architecture doesn't explicitly disallow it, so the answer is probably "yes." However, I'm having a hard time understanding a system design in which such a table walk would be practical. Can someone provide a practical use-case for walking non-idempotent locations?

If no such use-case exists, would people object to imposing a restriction on table walk accesses to locations with the non-idempotent PMA? Or at least a comment strongly suggesting that platforms won't support that behavior?

The specification machinery exists to allow implementations to impose such a restriction: "For systems with page-based virtual memory, I/O and memory regions can specify which combinations of hardware page-table reads and hardware page-table writes are supported."

I'd support adding a note that permitting page-table accesses to idempotent regions is discouraged.  Banning it seems a little harsh, though I see where you're coming from.
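
For illustration only, the restriction the quoted sentence permits could be modeled as a per-region attribute table that a page-table walker consults before each access. This is a minimal sketch: the `PmaRegion` fields, the example address map, and the `ptw_access_allowed` helper are hypothetical names invented here, not anything the Privileged spec defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PmaRegion:
    """Hypothetical description of one physical memory region."""
    base: int
    size: int
    idempotent: bool
    ptw_read: bool   # hardware page-table reads supported
    ptw_write: bool  # hardware page-table writes supported

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Made-up address map: main memory allows walks, a device window does not.
REGIONS = [
    PmaRegion(0x8000_0000, 0x4000_0000, idempotent=True,
              ptw_read=True, ptw_write=True),
    PmaRegion(0x1000_0000, 0x0001_0000, idempotent=False,
              ptw_read=False, ptw_write=False),
]


def ptw_access_allowed(addr: int, is_write: bool) -> bool:
    """Return True if a hardware table-walk access to addr is supported."""
    for region in REGIONS:
        if region.contains(addr):
            return region.ptw_write if is_write else region.ptw_read
    return False  # address not covered by any region in this sketch
```

In this sketch the platform, not the architecture, decides that the non-idempotent window rejects walks; a different platform could set the two flags differently.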


Cheers,
David


Nikhil Rishiyur
 

Although I haven't seen any such implementation, I would imagine that a non-idempotent region that was, say, counting accesses to each address as a side-effect of each access may be a "benign" kind of non-idempotency for PTWs.

Nikhil



David Kruckemyer
 

That sounds a bit like a performance counter to me, but it does raise an interesting question whether "idempotent" in the architectural sense is idempotent in a mathematical sense (i.e. operations are repeatable with the same result) or in a broader sense (e.g. inclusive of any side-effects even if the values at the location don't change).

I've always assumed the "non-idempotent" attribute meant that a read may not return the last value written or that repeated reads may not return the same value, not that the behavior included side-effects that are observable elsewhere. What is the consensus regarding this?

Cheers,
David




Bill Huffman
 


On 5/18/20 5:10 PM, David Kruckemyer wrote:
That sounds a bit like a performance counter to me, but it does raise an interesting question whether "idempotent" in the architectural sense is idempotent in a mathematical sense (i.e. operations are repeatable with the same result) or in a broader sense (e.g. inclusive of any side-effects even if the values at the location don't change).

I've always assumed the "non-idempotent" attribute meant that a read may not return the last value written or that repeated reads may not return the same value, not that the behavior included side-effects that are observable elsewhere. What is the consensus regarding this?

Cheers,
David

I've always assumed that it included any side-effects that mattered to the program.  It obviously does not include bringing the demise of a chip nearer with tiny amounts of electromigration.  I don't think it includes incrementing performance counters or shifting the results of predictors either.  Not sure how many things actually fit between your definition and mine.  Perhaps not many in real implementations.

      Bill





Greg Favor
 

For reference here is ARMv8's definition of idempotency (which includes side-effects):

The Normal memory attribute is appropriate for locations of memory that are idempotent, meaning that they exhibit all of the following properties:
— Read accesses can be repeated with no side-effects.
— Repeated read accesses return the last value written to the resource being read.
— Read accesses can fetch additional memory locations with no side-effects.
— Write accesses can be repeated with no side-effects if the contents of the location accessed are unchanged between the repeated writes or as the result of an exception, as described in this section.
— Unaligned accesses can be supported.
— Accesses can be merged before accessing the target memory system.

Put very concisely (and stripped down a bit):  Reads and writes can be repeated without side-effects, and reads return the last value written.  In addition accesses can be misaligned and can be merged.
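
As a rough illustration of the concise definition above, the two observable properties (repeated reads have no visible effect, and reads return the last value written) can be probed against toy device models. Everything here is hypothetical: the three classes and the `looks_idempotent` helper are invented for this sketch and only capture effects visible at the location itself.

```python
class PlainMemory:
    """Reads have no side effects and return the last value written."""
    def __init__(self):
        self.value = 0

    def read(self):
        return self.value

    def write(self, v):
        self.value = v


class ReadToClear:
    """A read returns the value and then clears it (a read side effect)."""
    def __init__(self):
        self.value = 0

    def read(self):
        v, self.value = self.value, 0
        return v

    def write(self, v):
        self.value = v


class StatusRegister:
    """Reads have no side effects, but writes are ignored."""
    def __init__(self, status):
        self.status = status

    def read(self):
        return self.status

    def write(self, v):
        pass  # software writes do not change the reported status


def looks_idempotent(make_device, probe=0x5A):
    """Write a probe value, read twice, and check both reads return it."""
    dev = make_device()
    dev.write(probe)
    first = dev.read()
    second = dev.read()
    return first == probe and second == probe


print(looks_idempotent(PlainMemory))
print(looks_idempotent(ReadToClear))
print(looks_idempotent(lambda: StatusRegister(0x01)))
```

A device that merely counts accesses somewhere else would pass this probe, which is exactly the gap between the narrow and broad readings of "idempotent" discussed in this thread.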

Greg



Greg Favor
 

P.S. This stripped-down ARMv8 definition goes a little further than what is currently in the Privileged spec, which only says that accesses are non-idempotent if reads and/or writes have any side effects.

It seems like "reads return the last value written" also needs to be part of the explicit definition.  A location may have no side-effects, but reads still may not always return the last written value (a device status register being the classic example).

PMA-wise, the Priv spec separates out alignment constraints (as a separate PMA), and makes no mention of mergeability or non-mergeability as a PMA.  Shouldn't the latter be an explicit PMA (akin to the write-merging distinctions that x86 and ARMv8 draw between their various memory types)?

Greg



David Kruckemyer
 

To clarify, in my previous email, I was taking a rather narrow view of the idempotent/non-idempotent attribute as applied to a memory location. One could broaden that definition to include system state, so that even if the underlying memory location was idempotent in the "reads return the last value written" sense, system side-effects would make the location non-idempotent. But that doesn't neatly apply to performance counters (and possibly other state).

This just seems to highlight that architectural specification is hard.

Returning to my initial query, I'm still trying to understand what table walks to non-idempotent locations mean generally. If we assume the colloquial meaning of non-idempotent is "don't speculate" (ignoring the side-effects of the location or on other state), it would seem that a table walk access to a non-idempotent location could not be performed until all previous instructions were guaranteed not to trap. Is that the intent?
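
Under that "don't speculate" reading, the walker's decision could be sketched as below. This is only an illustration of the question being asked, not a statement of what the architecture requires; `may_issue_ptw_access` and its parameters are hypothetical.

```python
def may_issue_ptw_access(region_is_idempotent: bool,
                         older_instr_may_trap: bool) -> bool:
    """Decide whether a table-walk access may be sent to memory now.

    Assumes "non-idempotent" is treated purely as "never access
    speculatively", ignoring what the side effects actually are.
    """
    if region_is_idempotent:
        # Repeating or discarding the access is harmless, so the walk
        # may proceed even if it later turns out to be unnecessary.
        return True
    # Non-idempotent: wait until no older instruction can still trap.
    return not older_instr_may_trap
```

For example, `may_issue_ptw_access(False, True)` is `False`: the walk stalls until every older instruction is known not to trap.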

Regardless, I'll put in another plug for commentary stating that the architecture strongly recommends that memory management data structures be located in idempotent regions.

Cheers,
David




andrew@...
 



On Mon, May 18, 2020 at 3:26 PM Andrew Waterman <andrew@...> wrote:


The specification machinery exists to allow implementations to impose such a restriction: "For systems with page-based virtual memory, I/O and memory regions can specify which combinations of hardware page-table reads and hardware page-table writes are supported."

> I'd support adding a note that permitting page-table accesses to idempotent regions is

Of course I meant “non-idempotent”... discouraging page-table accesses to idempotent regions might raise some hackles.

> discouraged.  Banning it seems a little harsh, though I see where you're coming from.


Cheers,
David


Nikhil Rishiyur
 

>    That sounds a bit like a performance counter to me, but it does
>    raise an interesting question whether "idempotent" in the
>    architectural sense is idempotent in a mathematical sense
>    (i.e. operations are repeatable with the same result) or in a
>    broader sense (e.g. inclusive of any side-effects even if the
>    values at the location don't change).

Whether one treats a performance counter as a 'benign side effect' or
not depends on what one does with it.  It could be merely passive
instrumentation to obtain meta-information about program behavior. But
if the counter itself is observable to the program and can affect its
future behavior, then it's more serious.

I would avoid _requiring_ anything about idempotency in the spec, and
just leave it to commentary like this for the system designer to be
aware of the issues and to decide whether PTWs in their system can
happen in idempotent regions or not.

Nikhil


David Kruckemyer
 



On Tue, May 19, 2020 at 5:10 AM Nikhil Rishiyur <nikhil@...> wrote:
>    That sounds a bit like a performance counter to me, but it does
>    raise an interesting question whether "idempotent" in the
>    architectural sense is idempotent in a mathematical sense
>    (i.e. operations are repeatable with the same result) or in a
>    broader sense (e.g. inclusive of any side-effects even if the
>    values at the location don't change).

Whether one treats a performance counter as a 'benign side effect' or
not depends on what one does with it.  It could be merely passive
instrumentation to obtain meta-information about program behavior. But
if the counter itself is observable to the program and can affect its
future behavior, then it's more serious.

Agreed that the counter in your example can be important to system operation.
 

I would avoid _requiring_ anything about idempotency in the spec, and
just leave it to commentary like this for the system designer to be
aware of the issues and to decide whether PTWs in their system can
happen in idempotent regions or not.

OK, I'm fine with this. I still think this PMA confounds the notion of idempotency with speculation control (i.e. whether a location has side-effects can be independent of whether SW wants to limit speculation to that location), which is ultimately a separate issue from the PTW issue.
 

Nikhil