[riscv-CMOs:master] reported: Interaction between management instructions and dirty bit #github #CMOs #risv


tech-cmo@lists.riscv.org Integration <tech-cmo@...>
 

[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By dkruckemyer-ventana:

> Because an invalidate is a programmer-initiated instruction to potentially change memory, which is a dirtying action in my book

Ah, but the semantic intent of invalidate is to discard stale cached data, either to expose data in memory that have been written by a "non-coherent" agent or to drop data that were written by a hart but are no longer useful. In the former case, the hart is a consumer (i.e. reader) of data produced by the non-coherent agent, and in the latter case, the hart is executing the op as a performance optimization to prevent consuming memory bandwidth to write back stale data. Although the latter case clearly requires write permission (and therefore it would be appropriate to require write permission for the invalidate), the system may not want to grant write permission to the consumer in the former case, so requiring write permission for the invalidate is not necessarily appropriate.

One could certainly argue that those two use-cases have different semantic intent, and therefore each requires a separate instruction, mostly to handle the permissions/security issues differently.

(What I think this highlights is that "write permission" is fundamentally different from "invalidate permission," but there's not sufficient justification to make this distinction in the page tables. Hence, we have a global switch in CSRs to map the invalidate instruction to a different, "safe" operation for the consumer case.)

Regardless, in both cases, the invalidate is not creating "new" data in memory, like an explicit store does, so the dirty bit is not checked or set on these accesses.
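As a rough illustration of the "global switch in CSRs" mentioned above, the sketch below models a two-bit control field that decides what an invalidate instruction actually does for less-privileged code. The field encoding, the enum, and the function name are assumptions made for this example; nothing in this thread specifies them.

```python
from enum import Enum


class InvalBehavior(Enum):
    """What the hart does when less-privileged code executes an invalidate."""
    ILLEGAL_INSTRUCTION = "raise an illegal-instruction exception"
    FLUSH = "perform a flush (write back, then invalidate)"
    INVALIDATE = "perform a true invalidate (discard without write-back)"


# Hypothetical encoding of the control field (an assumption for illustration).
_ENCODING = {
    0b00: InvalBehavior.ILLEGAL_INSTRUCTION,
    0b01: InvalBehavior.FLUSH,       # the "safe" mapping for the consumer case
    0b11: InvalBehavior.INVALIDATE,  # only for software trusted to discard data
}


def decode_inval_control(field: int) -> InvalBehavior:
    """Map the two-bit control field to the behavior of the invalidate op."""
    if field not in _ENCODING:
        raise ValueError(f"reserved encoding: {field:#04b}")
    return _ENCODING[field]
```

With the field set to the flush mapping, a consumer that only needs to see data written by a non-coherent agent can still execute the instruction, but it can never discard dirty data it was not trusted to discard.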



[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By jrtc27:

You could equally view it as an uncached (cache-bypassing) load, cached store of the uncached data and clean. Look, I get the argument, but I do not agree with it, and I believe it's going to come back to bite. There is surely a reason that Arm switched from requiring read to requiring write permission when moving from AArch32 to AArch64.
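The alternative view described above can be checked on a toy model of a single cache line. The sketch compares a plain invalidate with the sequence "uncached load, cached store, clean" and shows that both leave the same value observable to a later load. The `Line` class and all function names are hypothetical, and the model ignores multiple cache levels, other harts, and timing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    """One memory word with at most one cached copy (toy model)."""
    memory: int
    cached: Optional[int] = None
    dirty: bool = False


def load(line: Line) -> int:
    """Cached load: fill from memory on a miss, then return the cached value."""
    if line.cached is None:
        line.cached = line.memory
        line.dirty = False
    return line.cached


def invalidate(line: Line) -> None:
    """Discard the cached copy without writing it back."""
    line.cached = None
    line.dirty = False


def load_store_clean(line: Line) -> None:
    """Uncached load, cached store of that value, then clean."""
    value = line.memory          # uncached load bypasses the cache
    line.cached = value          # cached store overwrites the cached copy
    line.dirty = True
    line.memory = line.cached    # clean writes the line back to memory
    line.dirty = False


def observable_state(line: Line) -> tuple:
    """Value a subsequent load returns, plus the value held in memory."""
    return load(line), line.memory
```

Starting from a line with `memory=1` and a dirty cached `2`, both operations leave a later load returning `1`. The second formulation contains an explicit store, which is the basis of the argument that the operation is notionally a write.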



[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By dkruckemyer-ventana:

Still not following. Here's what I think is happening:

> 1. Userspace writes value 1 to page

The value 1 is either in a cache or in memory. Regardless, that's what's visible to a hart.

> 2. Kernel marks page as dirty
> 3. Userspace writes 2 to page

Now the value 2 is either in a cache or in memory. The value 1 could be in another cache or in memory (assuming the value 2 didn't get written to memory), but a hart will only ever see the value 2.

> 4. Kernel swaps page out

At this point, the value 2 must be written out to disk, since that is the most up-to-date value, and whatever coherence mechanism is in place must ensure that property. Whether any copies remain with either value is system dependent.

> 5. Userspace reads 2 from page (kernel swaps page in but leaves page on disk)

So the value 2 should be in memory (and any cached copies) and on disk, and the page is clean. Swapping the page in entails invalidating/updating any cached copies, since between steps 4 and 5, the physical addresses could be used for a page from a different context. The value 1 should no longer be readable.

> 6. Userspace invalidates page
> 7. Userspace reads back the original 1 because that never left the LLC (unlikely, but possible)

I'm still not sure how the value 1 remains in a cached copy. The only value that the hart should be able to read is 2.

> 8. Kernel evicts page again, but doesn't re-write out to disk because still dirty

Do you mean still clean (or not dirty)?

> 9. Userspace reads back the 2, despite having seen the 1 (kernel swaps page in)



[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By dkruckemyer-ventana:

>>> 5. Userspace reads 2 from page (kernel swaps page in but leaves page on disk)

>> So the value 2 should be in memory (and any cached copies) and on disk, and the page is clean

> This is the point where you go wrong: there's no reason it needs to be in both memory and cache. Polled IO will put it into the cache, as will boring DMA that acts as an LLC client (it probably wouldn't on miss, but on hit sure it'll just update the line), but that line won't necessarily be written back to replace the 1 in memory. Yes, in the "I did a DMA from disk to the actual physical page via the interconnect behind the LLC" case the actual point of coherence will have 2, but that isn't the only case that can happen.

I get your point. In that case, the value 1 isn't the only value to be concerned about. It could be a bank account number, for example, since the swapped-in page could be located anywhere in the physical address space. At that point, I believe it's the responsibility of the privileged software to ensure one of the following:

  1. The invalidate instruction is mapped to a flush
  2. The copy in physical memory is consistent with all the cached copies (i.e. perform a clean or flush as part of a swap)
  3. The software that can perform an invalidate is trusted enough to observe whatever values it might find

But, given the load value rules, the last one may be out of bounds, since the rules don't really allow random values to appear. (It does appear that the rules need to be tightened up to disallow the scenario where, once an invalidate "rolls back" data to an old value, previously newer values can be observed again.)

Ultimately, the execution environment bears some responsibility for ensuring that memory state is consistent and safe with respect to the invalidate. (There's even a non-normative statement to this effect in the current spec, though arguably it's not thorough enough, or even in the best location in the spec.)
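The swap-in scenario and the first two mitigations listed above can be sketched with a toy model of one cache line in front of one memory word. Everything here is an illustrative assumption (the `ToySystem` class, its method names, and the choice to model the swap-in as a store that lands only in the cache); it is not a description of any real implementation.

```python
class ToySystem:
    """One cache line in front of one memory word (purely illustrative)."""

    def __init__(self, memory_value):
        self.memory = memory_value
        self.cache = None          # None means the line is not cached
        self.dirty = False

    def store(self, value):
        """Cached store: the new value lands in the cache only."""
        self.cache = value
        self.dirty = True

    def load(self):
        """Cached load: fill from memory on a miss."""
        if self.cache is None:
            self.cache = self.memory
            self.dirty = False
        return self.cache

    def clean(self):
        """Write a dirty line back to memory, keeping it cached."""
        if self.cache is not None and self.dirty:
            self.memory = self.cache
            self.dirty = False

    def invalidate(self):
        """Discard the cached copy without writing it back."""
        self.cache = None
        self.dirty = False

    def flush(self):
        """Clean, then invalidate."""
        self.clean()
        self.invalidate()


def swap_in_then_invalidate(inval_as_flush=False, clean_after_swap_in=False):
    """Return (value read before the invalidate, value read after it)."""
    system = ToySystem(memory_value=1)   # memory still holds the old 1
    system.store(2)                      # swap-in via polled I/O hits the cache only
    if clean_after_swap_in:
        system.clean()                   # mitigation 2: make memory consistent
    before = system.load()
    if inval_as_flush:
        system.flush()                   # mitigation 1: invalidate mapped to flush
    else:
        system.invalidate()
    after = system.load()
    return before, after
```

In this model, `swap_in_then_invalidate()` reads 2 and then 1, which is the "roll back" under discussion, while either mitigation makes both reads return 2.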



[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By dkruckemyer-ventana:

>> In the former case, the hart is a consumer (i.e. reader) of data produced by the non-coherent agent

> The sequence for this is:

> 1. invalidate

> 2. initiate + wait for DMA to finish

To be complete.... software needs to perform an invalidate or flush here, since speculation and/or hardware prefetching can cause soon-to-be-stale data to be cached before the non-coherent DMA writes the memory location.

> 3. read data

> The act of DMA'ing into the buffer is notionally a write. Yes, this is an external agent, so the MMU doesn't apply here, but whoever gave you access to that DMA-capable device is already granting you the ability to write to the pages, so I don't see why giving write permission from the MMU's perspective is a problem. Not to mention that in a "real" system you'll also have an IOMMU in the way that you'll have to ask the kernel to set up for you with write permission for those pages. But CBO.INVAL allows you to totally bypass that for any page you can read.

I understand your argument, and now it's my turn for a contrived example.... :)

The consumer may not necessarily be the driver that controls the device. In this case, you have three agents: the consumer, which has only read permission and simply reads memory; the driver, which may actually have write permission and asks for write permission on behalf of the device; and the device, which writes the memory.

In reality, the general consensus has been that the architecture should guide software away from using invalidate for most use-cases, because the instruction has subtle, undesirable security properties. Note that, for non-coherent producer-consumer communication, a flush is sufficient and can be equally performant; however, the "discard stale write data" case has measurable benefits in some systems, so there were arguments for keeping invalidate in the set of management CBOs.
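The consumer sequence discussed earlier in this comment, including the extra invalidate or flush after the DMA completes, can be sketched with a toy single-line model. The class, the method names, and the way a hardware prefetch is modeled as an early cache fill are all assumptions made for illustration.

```python
class ToySystem:
    """One cache line in front of one memory word (purely illustrative)."""

    def __init__(self, memory_value):
        self.memory = memory_value
        self.cache = None          # None means the line is not cached
        self.dirty = False

    def load(self):
        """Cached load: fill from memory on a miss."""
        if self.cache is None:
            self.cache = self.memory
            self.dirty = False
        return self.cache

    def prefetch(self):
        """Speculation or hardware prefetch: same fill as a load."""
        self.load()

    def dma_write(self, value):
        """Non-coherent device write: updates memory, never the cache."""
        self.memory = value

    def invalidate(self):
        """Discard the cached copy without writing it back."""
        self.cache = None
        self.dirty = False

    def flush(self):
        """Write back if dirty, then discard the cached copy."""
        if self.cache is not None and self.dirty:
            self.memory = self.cache
        self.invalidate()


def consume_dma_buffer(op_after_dma):
    """Run the consumer sequence; op_after_dma is 'none', 'invalidate' or 'flush'."""
    system = ToySystem(memory_value="old")
    system.invalidate()            # step 1: invalidate before starting the DMA
    system.prefetch()              # hazard: "old" is cached before the DMA lands
    system.dma_write("new")        # step 2: device writes memory behind the cache
    if op_after_dma == "invalidate":
        system.invalidate()
    elif op_after_dma == "flush":
        system.flush()             # line is clean here, so nothing is written back
    return system.load()           # step 3: read data
```

In this model, skipping the second operation returns the stale value, while either an invalidate or a flush after the DMA returns the new one. The flush works only because the consumer's copy is clean; a dirty copy would be written back over the device's data.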



[riscv-CMOs:master] New Comment on Issue #45 Interaction between management instructions and dirty bit
By jrtc27:

Why wouldn't the driver do the invalidate then in that situation? It knows best whether the DMA is coherent, and is the one initiating it, so it should be the one to ensure coherence when the agent is non-coherent.