Re: Proposal for "Multiple LR/SC forward progress guarantee levels."
As the guy who probably worked the hardest to get the forward progress guarantees for LR/SC significantly weakened in the RISC-V ISA, I feel I’m going to have to comment here.
I’ll start with the premise of the presentation (pg 6 and pg 16) that some fair forward progress guarantee (however that’s defined) is somehow “required” for the “queued-spinlock” and for transactional memory.
I do not accept from a vague comment In the linux kernel that fair forward progress is absolutely required for this queued-spinlock to work. I can accept that if you build a bad implementation of LR/SC that’s unbalanced and unfair you will likely have problems with this algorithm, but you’ll also likely just have problems with heavily contended basic locks as well. Don’t do that.
I also have a personal reason to doubt any sort of hard fairness requirement. Power machines (quite large ones: fully coherent with up to 240 cores with 1920 threads in total currently) have run this algorithm in linux for a long time with no particular problem (I own the majority of the LL/SC logic in Power, I would have known if there were significant issues and my kernel maintainer as of yesterday said this was no big deal for him). So, there’s at least one practical proof point that this doesn’t seem to demand a hard guarantee even in a very demanding environment.
So, before anything goes anywhere here, can you, or someone helping you, explain in exquisite detail what the progress problem here really is, if any. I suspect there may be some issue here but I’m not sure it rises to the level that a fully fair forward guarantee is absolutely required.
But even if there is a forward progress problem here, I cannot even begin to see what the proposed return codes are going to do to help you? Just what are you going to do with those codes?
Additionally, I don’t think it will be practical to implement a machine with Level 2 (Strong) or Level 3 (Absolute) guarantees (pg 10 of your presentation) at anything approaching a rational cost in any event. I include below the 70+ page presentation deck we worked up years ago to convince people to NOT do a guarantee. I haven’t re-read this in years, so I may disagree with myself in places, and it was never intended for an audience quite this large, but in general it should hold up.
Additionally, even if you use CAS instead of LR/SC, say, how does the queued_spinlock have a forward progress guarantee? CAS, as far as I understand it, checks the “compare” value and won’t swap if it mismatches. What’s to keep one thread (especially in high contention) from always losing by having an “old” compare value from the last CAS it did that’s already been overwritten by another thread’s CAS.? It’ll fail every time and not progress, right? IF you can handle the CAS failing regularly by some other software path, why can’t that same path be used for the SC that fails?
SO, to summarize,
1) I’m not sure you actually have a problem here. Much more detailed proof is required.
2) Even if there is a problem, I’m not sure how these return codes help or that we really a fair forward progress guarantee to deal with this?
3) You’re going to have a very hard time building a machine with LR/SC guarantees meaningfully stronger than what RISC-V currently has.
From: tech-privileged@... <tech-privileged@...> on behalf of Guo Ren <guoren@...>
Sent: Monday, December 12, 2022 7:00 AM
To: tech-privileged@... <tech-privileged@...>
Subject: [EXTERNAL] [RISC-V] [tech-privileged] Proposal for "Multiple LR/SC forward progress guarantee levels."
This Message Is From an External Sender
This message came from outside your organization.
[Edited Message Follows]
Atomic Forward Guarantee is required in many OS to touch contended variables. Linux queued-spinlock is the atomic forward guarantee user who solves the fairness and cache-line bouncing problems and ensures performance and size. RISC-V Linux has been trying to port qspinlock support since 2019, but it is pended on the RISC-V flexible LR/SC forward Guarantee issue. In this presentation, we will introduce the troubles caused by the LR/SC Forward Guarantee definition in the RISC-V ISA to Linux qspinlock and propose our solution. Use klitmus to define the LR/SC & cmpxchg forward progress guarantee cases, which gives clear guidelines for micro-architecture designers and software programmers.
The current LR/SC forward progress guarantee is written in RISC-V ISA spec - "Eventual Success of Store-Conditional Instructions":
HART 0 HART 1
By contrast, if other harts or devices continue to write to that reservation set, it is not guaranteed that any hart will exit its LR/SC loop. (Written in RISC-V ISA spec: "Eventual Success of Store-Conditional Instructions")
HART 0 HART 1
No Guarantee here! The current LR/SC forward progress guarantee is a weak definition. But Linux queued-spinlock needs forward progress guarantee.
Here is the comment written in Linux qspinlock.h:
First, LR/SC requirement in qspinlock when NR_CPUS < 16K.
RISC-V doesn’t have the sub-word amoswap, so use LR/SC equivalent:
Linux cmpxchg is the CAS primitive, and RISC-V uses LR/SC equivalent.
The second cmpxchg should cause a loop break in any contended situation!
Proposal for discussion:
In current riscv spec: “A” Standard Extension for Atomic Instructions - Load-Reserved/Store-Conditional Instructions
"The failure code with value 1 is reserved to encode an unspecified failure. Other failure codes are reserved at this time, and portable software should only assume the failure code will be non-zero. We reserve a failure code of 1 to mean “unspecified” so that simple implementations may return this value using the existing mux required for the SLT/SLTU instructions. More specific failure codes might be defined in future versions or extensions to the ISA."
Add a failure code of 2 to store-conditional instruction:
Store-conditional returns local failure code
0 - Success
Then, define three forward progress guarantee levels:
Level 1 (Weak) - SC returns 0/1/2
Litmus Tests for (Level 1~3, cmpxchg forward progress guarantee)
Level 3 - Absolute:
Level 2 - Strong:
Level 1 - Weak:
cmpxchg forward progress guarantee:
RISCV LEVE_3-LR-LR-SC (Absolute Level)
P0 | P1 ;
RISCV LEVEL_2-LR-LR-SC (Strong Level)
P0 | P1 ;
Here is the presentation for detail: