Non-idempotent PMA and table walk accesses
Hi all,
I have a simple question: does the architecture allow table walk accesses (reads or writes) to regions with the non-idempotent PMA?
The architecture doesn't explicitly disallow it, so the answer is probably "yes." However, I'm having a hard time understanding a system design in which such a table walk would be practical. Can someone provide a practical use-case for walking non-idempotent locations?
If no such use-case exists, would people object to imposing a restriction on table walk accesses to locations with the non-idempotent PMA? Or at least a comment strongly suggesting that platforms won't support that behavior?
Cheers,
David
On Mon, May 18, 2020 at 2:58 PM David Kruckemyer <dkruckemyer@...> wrote:
> If no such use-case exists, would people object to imposing a restriction on table walk accesses to locations with the non-idempotent PMA? Or at least a comment strongly suggesting that platforms won't support that behavior?
The specification machinery exists to allow implementations to impose such a restriction: "For systems with page-based virtual memory, I/O and memory regions can specify which combinations of hardware page-table reads and hardware page-table writes are supported."
I'd support adding a note that permitting page-table accesses to non-idempotent regions is discouraged. Banning it seems a little harsh, though I see where you're coming from.
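To make the quoted sentence concrete, here is a minimal sketch of how a platform model might record, per region, whether hardware page-table reads and writes are supported. Everything in it is an assumption for illustration: the `PmaRegion` type, its field names, the `ptw_access_allowed` helper and the address ranges are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PmaRegion:
    """Hypothetical PMA entry for one physical address range."""
    base: int
    size: int
    idempotent: bool
    ptw_read: bool   # hardware page-table reads supported
    ptw_write: bool  # hardware page-table writes supported

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def ptw_access_allowed(regions, addr, is_write):
    """Return True if a hardware page-table access to addr is permitted."""
    for region in regions:
        if region.contains(addr):
            return region.ptw_write if is_write else region.ptw_read
    return False  # address not covered by any region: not permitted


# Made-up platform: main memory allows table walks, a device window does not.
PLATFORM = [
    PmaRegion(base=0x8000_0000, size=0x4000_0000, idempotent=True,
              ptw_read=True, ptw_write=True),
    PmaRegion(base=0x1000_0000, size=0x1000, idempotent=False,
              ptw_read=False, ptw_write=False),
]
```

In this sketch the restriction David asks about is purely a platform choice: a platform that wanted to permit walks in a non-idempotent region would set the two flags on that region, with no change to the checking logic.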
On Mon, May 18, 2020 at 6:26 PM Andrew Waterman <andrew@...> wrote:
> The specification machinery exists to allow implementations to impose such a restriction: "For systems with page-based virtual memory, I/O and memory regions can specify which combinations of hardware page-table reads and hardware page-table writes are supported."
Although I haven't seen any such implementation, I would imagine that a non-idempotent region that was, say, counting accesses to each address as a side-effect of each access may be a "benign" kind of non-idempotency for PTWs.
Nikhil
On Mon, May 18, 2020 at 4:16 PM Rishiyur Nikhil <nikhil@...> wrote:
> Although I haven't seen any such implementation, I would imagine that a non-idempotent region that was, say, counting accesses to each address as a side-effect of each access may be a "benign" kind of non-idempotency for PTWs.
That sounds a bit like a performance counter to me, but it does raise an interesting question whether "idempotent" in the architectural sense is idempotent in a mathematical sense (i.e. operations are repeatable with the same result) or in a broader sense (e.g. inclusive of any side-effects even if the values at the location don't change).
I've always assumed the "non-idempotent" attribute meant that a read may not return the last value written or that repeated reads may not return the same value, not that the behavior included side-effects that are observable elsewhere. What is the consensus regarding this?
Cheers,
David
I've always assumed that it included any side-effects that mattered to the program. It obviously does not include bringing the demise of a chip nearer with tiny amounts of electromigration. I don't think it includes incrementing performance counters or shifting the results of predictors either. Not sure how many things actually fit between your definition and mine. Perhaps not many in real implementations.
Bill
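The two readings of "idempotent" discussed above can be told apart with a small model of the access-counting region Nikhil describes. This is only a sketch: `CountingRegion` and its members are hypothetical names, and the address and value are made up.

```python
class CountingRegion:
    """Hypothetical region that counts accesses per address as a side effect."""

    def __init__(self):
        self._data = {}
        self.access_counts = {}

    def _count(self, addr):
        self.access_counts[addr] = self.access_counts.get(addr, 0) + 1

    def write(self, addr, value):
        self._count(addr)
        self._data[addr] = value

    def read(self, addr):
        self._count(addr)
        return self._data.get(addr, 0)


region = CountingRegion()
region.write(0x100, 0xABCD)                 # store a PTE-like value
first = region.read(0x100)                  # a table walk reads it
count_before = region.access_counts[0x100]
second = region.read(0x100)                 # a repeated walk reads it again
count_after = region.access_counts[0x100]

# Value-only sense: repeated reads return the last value written.
values_repeatable = first == second == 0xABCD
# Broader sense: the repeated read changed state that is visible elsewhere.
has_side_effect = count_after != count_before
```

Under the value-only reading this region behaves idempotently, since both reads return the stored value. Under the broader reading it does not, because the repeated read changes the counter. Whether that matters depends, as Bill says, on whether the counter is something the program can observe.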
On Mon, May 18, 2020 at 5:35 PM Bill Huffman <huffman@...> wrote:
> I've always assumed that it included any side-effects that mattered to the program.
For reference here is ARMv8's definition of idempotency (which includes side-effects):
The Normal memory attribute is appropriate for locations of memory that are idempotent, meaning that they exhibit all of the following properties:
- Read accesses can be repeated with no side-effects.
- Repeated read accesses return the last value written to the resource being read.
- Read accesses can fetch additional memory locations with no side-effects.
- Write accesses can be repeated with no side-effects if the contents of the location accessed are unchanged between the repeated writes or as the result of an exception, as described in this section.
- Unaligned accesses can be supported.
- Accesses can be merged before accessing the target memory system.
Put very concisely (and stripped down a bit): Reads and writes can be repeated without side-effects, and reads return the last value written. In addition accesses can be misaligned and can be merged.
Greg
On Mon, May 18, 2020 at 6:42 PM Greg Favor <gfavor@...> wrote:
> Put very concisely (and stripped down a bit): Reads and writes can be repeated without side-effects, and reads return the last value written. In addition accesses can be misaligned and can be merged.
P.S. This stripped-down ARMv8 definition goes a little further than what is currently in the Privileged spec, which only says that accesses are non-idempotent if reads and/or writes have any side effects.
It seems like "reads return the last value written" also needs to be part of the explicit definition. A location may have no side-effects, but reads still may not always return the last written value (a device status register being the classic example).
PMA-wise the Priv spec separates out alignment constraints (as a separate PMA), and makes no mention of mergeability or non-mergeability as a PMA. Shouldn't the latter be an explicit PMA (akin to the write merging distinctions that x86 and ARMv8 draw between their various memory types)?
Greg
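The device status register case can be sketched in a few lines. The model below is an assumption for illustration only: `StatusRegister`, `device_event` and the values used are hypothetical, and real status registers vary widely in behavior.

```python
class StatusRegister:
    """Hypothetical status register whose value is set by the device, not by software."""

    def __init__(self):
        self._status = 0

    def write(self, value):
        # Software writes are accepted but do not change the reported status.
        pass

    def read(self):
        # Reading has no side effect; it reports the current device state.
        return self._status

    def device_event(self, status):
        # Stands in for the device changing state on its own.
        self._status = status


reg = StatusRegister()
reg.write(0xFF)          # last value written by software
reg.device_event(0x1)    # device updates its status independently
a = reg.read()
b = reg.read()

reads_are_repeatable = a == b            # no read side effect in this model
returns_last_written = a == 0xFF         # False: the read reflects device state
```

In this model no access has a side effect, so a definition based only on side effects would call the location idempotent, yet a read does not return the last value written. That is the gap the P.S. points out.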
> That sounds a bit like a performance counter to me, but it does
> raise an interesting question whether "idempotent" in the
> architectural sense is idempotent in a mathematical sense
> (i.e. operations are repeatable with the same result) or in a
> broader sense (e.g. inclusive of any side-effects even if the
> values at the location don't change).
Whether one treats a performance counter as a 'benign side effect' or
not depends on what one does with it. It could be merely passive
instrumentation to obtain meta-information about program behavior. But
if the counter itself is observable to the program and can affect its
future behavior, then it's more serious.
I would avoid _requiring_ anything about idempotency in the spec, and
just leave it to commentary like this for the system designer to be
aware of the issues and to decide whether PTWs in their system can
happen in non-idempotent regions or not.
Nikhil