Re: 32-bit accesses to mtime/mtimecmp under RV64
I assumed that Dr Mark Hill was talking about 256-bit atomic loads and stores to access the FIFO, not LR/SC.
Also, double-width CAS (and other double-width atomics) is used not just for A-B-A problems, but also for things like atomically inserting into circular lists (e.g. where the list itself has pointers to both the first and the last elements of the singly linked circle).
In general, if your word or address width is W:
For atomic read modify writes:
You need W+V bits for A-B-A problems, where V is whatever number of bits you need for versions or epochs
2W bits for list heads
Of course, 2W subsumes W+V, so we often don't make the distinction.
The other big user of extra-width atomic RMWs is page tables, e.g. 32-bit virtual addresses with 40-bit physical addresses (stored in 64-bit PTEs).
Apart from read-modify-writes, atomic loads and stores of nearly any width (W, 2W, 4W) are useful for active memory devices like FIFOs.
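The W+V case above can be sketched in C11 roughly as follows. This is only an illustration, assuming W = 32 (nodes are named by a 32-bit pool index instead of a pointer) and V = 32, so the pair fits in one 64-bit atomic word. The names `next_of`, `pack`, `stack_push` and `stack_pop` are hypothetical and do not come from this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NIL 0xFFFFFFFFu   /* hypothetical "empty" index */
#define POOL_SIZE 8       /* hypothetical pool size */

/* Link field of each pool node; nodes are identified by index (W = 32). */
static uint32_t next_of[POOL_SIZE];

/* Head word: low 32 bits = index of top node, high 32 bits = version (V = 32).
 * Starts as version 0, empty stack. */
static _Atomic uint64_t head = 0xFFFFFFFFu;

static uint64_t pack(uint32_t version, uint32_t index) {
    return ((uint64_t)version << 32) | index;
}
static uint32_t index_of(uint64_t h)   { return (uint32_t)h; }
static uint32_t version_of(uint64_t h) { return (uint32_t)(h >> 32); }

/* Push: link the node in front of the current top and bump the version. */
static void stack_push(uint32_t node) {
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        next_of[node] = index_of(old);
        desired = pack(version_of(old) + 1, node);
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
}

/* Pop: the CAS compares index AND version, so a head that went
 * A -> B -> A in the meantime no longer matches the stale snapshot. */
static uint32_t stack_pop(void) {
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        if (index_of(old) == NIL)
            return NIL;
        desired = pack(version_of(old) + 1, next_of[index_of(old)]);
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
    return index_of(old);
}
```

With full-width pointers the same layout needs 2W bits, which is where the double-width CAS (or a double-width LR/SC) comes in.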
From: tech-privileged@... <tech-privileged@...> On Behalf Of striker@...
Sent: Wednesday, April 22, 2020 18:42
To: mark.hill@...
Cc: andrew@...; dkruckemyer@...; gfavor@...; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] 32-bit accesses to mtime/mtimecmp under RV64
Interesting. If you had a H/W FIFO, seems like it would be easier to make it work with single-copy atomic loads or stores to read from or write to the FIFO rather than bothering with the tedium of LR/SC pairs?
Yes, you can have multiple HARTs going after the "device" concurrently, but the single-copy atomicity of the loads or stores would seem to keep those accesses to the device separated rather than LR/SC which is more to do an atomic RMW of memory.
I'm having trouble seeing how LR/SC would fit there?
Also, I'll assume you really do intend to use the "double-wide" LR/SC for the CAS emulation?
Aside from whatever this FIFO example might turn out to be?
Derek
----- Original message -----
From: Mark Hill <mark.hill@...>
To: "striker@..." <striker@...>
Cc: "andrew@..." <andrew@...>, "dkruckemyer@..." <dkruckemyer@...>, "gfavor@..." <gfavor@...>, "tech-privileged@..." <tech-privileged@...>
Subject: [EXTERNAL] RE: [RISC-V] [tech-privileged] 32-bit accesses to mtime/mtimecmp under RV64
Date: Wed, Apr 22, 2020 4:36 AM
Another possible use case is access-sensitive devices, for example a FIFO of 128-bit records with multiple RV64 harts reading from the FIFO.
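To make the hazard concrete, here is a small single-threaded C simulation. It assumes a hypothetical pop-on-read FIFO that hands out one 64-bit half per load, and it models two harts whose loads happen to interleave. The device model, `fifo_read64` and the data values are all made up for illustration.

```c
#include <stdint.h>

/* Hypothetical device model: a FIFO of 128-bit records, drained 64 bits per load. */
typedef struct { uint64_t lo, hi; } record_t;

static const record_t fifo[2] = {
    { 0xA0, 0xA1 },   /* record A */
    { 0xB0, 0xB1 },   /* record B */
};
static unsigned cursor;   /* number of 64-bit halves already consumed */

/* Each 64-bit load pops the next half: A.lo, A.hi, B.lo, B.hi. */
static uint64_t fifo_read64(void) {
    const record_t *r = &fifo[cursor / 2];
    uint64_t v = (cursor % 2 == 0) ? r->lo : r->hi;
    cursor++;
    return v;
}

/* Two harts each read "one record" as two 64-bit loads, interleaved. */
static void interleaved_reads(record_t *hart0, record_t *hart1) {
    cursor = 0;
    hart0->lo = fifo_read64();   /* hart 0, first load  */
    hart1->lo = fifo_read64();   /* hart 1, first load  */
    hart0->hi = fifo_read64();   /* hart 0, second load */
    hart1->hi = fifo_read64();   /* hart 1, second load */
}
```

In this model hart 0 ends up with the first half of A and the first half of B, and hart 1 with the two second halves. Neither holds a real record, which is the situation a single 128-bit atomic load would rule out.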
From: tech-privileged@... [mailto:tech-privileged@...] On Behalf Of striker@...
Sent: 22 April 2020 06:21
To: Mark Hill <mark.hill@...>
Cc: andrew@...; dkruckemyer@...; gfavor@...; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] 32-bit accesses to mtime/mtimecmp under RV64
To widen your question even further Mark (no pun intended), do we need 256 bits for RV128?
Yes, RV128 is a bit speculative, but it does at least rate being in the book, so best to have all the consequences of the request here on the table.
Also, I'm curious what you intend to use the bigger ones for?
The only answer here I know of is emulating CAS with the ticket/epoch/whatever counter next to the actual data element to solve CAS A-B-A problems (which, handily, LR/SC naturally avoids anyway).
Is that the one you're after? (Asking because if there's another reason beyond that one, I'm interested in hearing about it).
Derek Williams
----- Original message -----
From: "Dr Mark Hill" <mark.hill@...>
Sent by: tech-privileged@...
To: Andrew Waterman <andrew@...>, David Kruckemyer <dkruckemyer@...>
Cc: Greg Favor <gfavor@...>, "tech-privileged@..." <tech-privileged@...>
Subject: [EXTERNAL] Re: [RISC-V] [tech-privileged] 32-bit accesses to mtime/mtimecmp under RV64
Date: Tue, Apr 21, 2020 1:55 AM
To widen the question slightly further are there any plans to provide atomic load/store pair operations (128-bits for RV64, 64-bits for RV32)?
From: tech-privileged@... [mailto:tech-privileged@...] On Behalf Of Andrew Waterman
Sent: 20 April 2020 23:49
To: David Kruckemyer <dkruckemyer@...>
Cc: Greg Favor <gfavor@...>; tech-privileged@...
Subject: Re: [RISC-V] [tech-privileged] 32-bit accesses to mtime/mtimecmp under RV64
On Mon, Apr 20, 2020 at 3:28 PM David Kruckemyer <dkruckemyer@...> wrote:
On Mon, Apr 20, 2020 at 2:38 PM Andrew Waterman <andrew@...> wrote:
On Mon, Apr 20, 2020 at 11:32 AM David Kruckemyer <dkruckemyer@...> wrote:
On Fri, Apr 17, 2020 at 7:31 PM Andrew Waterman <andrew@...> wrote:
On Fri, Apr 17, 2020 at 7:00 PM Greg Favor <gfavor@...> wrote:
The mtime and mtimecmp registers are defined as 64-bit memory-mapped registers. The priv spec says that - in RV32 - mtimecmp can be written as a pair of 32-bit registers. Since this was made specific to RV32, is there an intended implication in the spec that in RV64 the system must support atomic 64-bit accesses to these registers? Or is it allowable for only non-atomic 64-bit accesses to be supported (i.e. a 64-bit access by a CPU is performed as two 32-bit accesses out in the SoC where mtime/mtimecmp are located)?
The spec strongly implies by omission that 64-bit accesses are atomic for RV64, in that it gives an unusually detailed RV32-specific code example to cope with non-atomicity, but mentions nothing of the sort for RV64. I will add the additional sentence that makes this implication explicit.
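For readers without the spec to hand, the RV32 approach can be sketched in C roughly as below. The write helper mirrors the spec's three-store sequence: low word to all ones, then the high word, then the low word. The read helper is the usual high-low-high retry idiom, added here as an assumption, not something this thread discusses. The pointer `reg` and both function names are hypothetical. `reg[0]` is taken to be the low word and `reg[1]` the high word.

```c
#include <stdint.h>

/* Write a 64-bit mtimecmp using only 32-bit stores (RV32 style).
 * The comparand never transiently drops below both the old and new values,
 * so no spurious timer interrupt is raised by the intermediate state. */
static void write_mtimecmp32(volatile uint32_t *reg, uint64_t value) {
    reg[0] = 0xFFFFFFFFu;               /* low = all ones: no smaller than old value */
    reg[1] = (uint32_t)(value >> 32);   /* high: no smaller than new value */
    reg[0] = (uint32_t)value;           /* low: new value now fully in place */
}

/* Read a 64-bit mtime using only 32-bit loads, retrying if the high
 * word changed between the two high reads (the low word wrapped). */
static uint64_t read_mtime32(const volatile uint32_t *reg) {
    uint32_t hi, lo;
    do {
        hi = reg[1];
        lo = reg[0];
    } while (reg[1] != hi);
    return ((uint64_t)hi << 32) | lo;
}
```

On RV64 with atomic 64-bit accesses, both helpers collapse to a single 64-bit load or store.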
Put differently, must RV64 software not assume that a 64-bit load/store will atomically read/write the register? (Note: ARMv8 explicitly says software must not make such an atomicity assumption for accesses to memory-mapped 64-bit registers.)
In general, this depends on the peripheral and the platform. We aren't trying to preclude interfacing with legacy devices and buses, so of course some 64-bit accesses to some devices will either become non-atomic or signal some sort of error. But it's really quite useful to be able to assume that 64-bit accesses are atomic when interfacing with more modern peripherals that use 64-bit addresses, so we definitely do not want to preclude that, either.
Asking this slightly differently (I think) to clarify....
With respect to mtime/mtimecmp, does an RV64 processor place constraints on the platform, or can the platform place constraints on the RV64 processor? If the former, the implication is that the platform must provide a way for the RV64 processors to access the registers atomically with a 64b load or store. If the latter, the implication is that the platform can require the RV64 processor to access the registers non-atomically with 32b loads or stores, a la RV32.
The second half of my answer was addressing the more general matter. For mtime and mtimecmp specifically, the spec is now clear:
So the only constraint is that when a 64b naturally-aligned access is made to mtime/mtimecmp, the access must be completed atomically if the platform allows 64b naturally-aligned accesses to those registers? A platform is still allowed to signal an error on such accesses and to force an RV64 processor to access those registers with 32b loads and stores, right?
I think your interpretation of that sentence is accurate. FWIW, the insufficiently described Linux platform does assume such accesses are legal (more precisely, the various SBI implementations make that assumption).
Cheers,
David
Cheers,
David
Greg