Re: cache block copies and clean operations


David Kruckemyer
 

Hi Guy,

I think some context is missing in your quotes. In particular, the bulleted items in section 2.2 are prefaced by the following (italics mine, your quotes in bold):

"A cache-block management instruction performs one of the following operations, relative to the copy of a given cache block allocated in a given cache:

• An invalidate operation deallocates the copy of the cache block
• A clean operation performs a write transfer to another cache or to memory if the data in the copy of the cache block have been modified by a store operation
• A flush operation atomically performs a clean operation followed by an invalidate operation"

So the operation definition in this section pertains to a single cache and describes the behavior of an individual cache. Later sections go on to expand the extent of operations across the set of coherent caches and define the behavior of multiple caches in aggregate. I suppose one way to address your concerns is to use the plural when describing operations on the set of coherent caches.
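
To make the single-cache versus aggregate distinction concrete, here is a minimal Python sketch. It is a toy model and not something taken from the spec: the `Cache` class, its `lines` dict, the `memory` dict, and the `cbo_*` helpers are all hypothetical names, and a clean is assumed to write straight to memory rather than to another cache.

```python
# Toy model with hypothetical names; not taken from the CMO spec.
# Each cache maps a physical block address to [data, dirty].


class Cache:
    def __init__(self):
        self.lines = {}  # block address -> [data, dirty]

    def store(self, addr, data):
        # A store allocates (or updates) the copy and marks it modified.
        self.lines[addr] = [data, True]

    def invalidate(self, addr):
        # Deallocate this cache's copy of the block, if present.
        self.lines.pop(addr, None)

    def clean(self, addr, memory):
        # Write the copy back only if a store has modified it.
        # Assumption: the write transfer goes directly to memory.
        line = self.lines.get(addr)
        if line is not None and line[1]:
            memory[addr] = line[0]
            line[1] = False

    def flush(self, addr, memory):
        # Clean followed by invalidate (trivially atomic in this
        # single-threaded model).
        self.clean(addr, memory)
        self.invalidate(addr)


# Aggregate behavior: the per-cache operation is applied to every
# cache in the (assumed) set of coherent caches.
def cbo_clean(caches, addr, memory):
    for cache in caches:
        cache.clean(addr, memory)


def cbo_flush(caches, addr, memory):
    for cache in caches:
        cache.flush(addr, memory)


def cbo_inval(caches, addr, memory):
    for cache in caches:
        cache.invalidate(addr)
```

Each `Cache` method corresponds to one of the quoted per-copy definitions, while the `cbo_*` functions correspond to the plural reading, in which the operation applies to any and all copies in the coherent caches.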

Additionally, earlier in Section 2.2, the whole paragraph describes what "identifying a cache block" implies:

"A CBO instruction causes one or more operations to be performed on the cache blocks identified by the
instruction. In general, a CBO instruction may identify one or more cache blocks; however, in the initial set of
CMO extensions, CBO instructions identify a single cache block only."

Also, near the definition of a cache block in section 2.1:

"A cache block is identified by a physical address corresponding to the underlying memory locations."

The point of this text is to say that a CBO instruction operates on a single cache block address, not on a range or list of cache block addresses.
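
As a small illustration of "one cache block address," the sketch below maps any byte address to the base of the block that contains it. The 64-byte block size and the `block_base` helper are assumptions made for the example; the actual block size is implementation-specific, and the sketch assumes it is a naturally aligned power of two.

```python
# Assumed block size for illustration only; real systems may differ.
CACHE_BLOCK_SIZE = 64  # bytes, must be a power of two for the mask below


def block_base(addr, block_size=CACHE_BLOCK_SIZE):
    """Return the base address of the aligned block containing addr."""
    return addr & ~(block_size - 1)


# Every byte address inside a block resolves to the same single block.
same_block = block_base(0x1234) == block_base(0x123F)
next_block = block_base(0x1240)
```

Under these assumptions, `0x1234` and `0x123F` identify the same block, while `0x1240` identifies the next one.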

With respect to the definition of invalidate and clean operations, they are allowed to perform any operation within the bounds of normal cache behavior as long as the stated behaviors are performed. For example, an invalidate operation could perform a write transfer all the way to memory (assuming the line was previously modified by a store), but it must deallocate the line from all the relevant caches. We actually take advantage of this allowed behavior to map CBO.INVAL to a flush operation.
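
A rough sketch of why mapping an invalidate onto a flush stays within the definition: both variants below leave no copy in the cache, which is the stated behavior, and they differ only in whether the modified data reaches memory. The dict-based cache, the `memory` dict, and the function names are hypothetical, chosen only for this example.

```python
# Hypothetical model: cache maps block address -> {"data": ..., "dirty": bool}.


def inval_drop(cache, addr, memory):
    # Plain invalidate: deallocate the copy and discard any modified data.
    cache.pop(addr, None)


def inval_as_flush(cache, addr, memory):
    # Invalidate implemented as a flush: write back modified data first,
    # then deallocate. The copy is still gone afterwards.
    line = cache.pop(addr, None)
    if line is not None and line["dirty"]:
        memory[addr] = line["data"]


def is_deallocated(cache, addr):
    # The post-condition an invalidate operation must satisfy.
    return addr not in cache


def run(inval):
    # Start from a dirty copy and apply the given invalidate variant.
    memory = {0x1000: "old"}
    cache = {0x1000: {"data": "new", "dirty": True}}
    inval(cache, 0x1000, memory)
    return is_deallocated(cache, 0x1000), memory[0x1000]
```

In this model, `run(inval_drop)` and `run(inval_as_flush)` both report the block as deallocated; only the second leaves the modified data in memory.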

Finally, the whole point of the definitions is to allow implementations with non-coherent caches (those are the caches outside the set of coherent caches); however, as you point out, the system is required to implement protocol or memory access transactions that perform the proper operations all the way to "memory." In reality, "memory" just needs to be a point where all agents in the system are guaranteed to access the same values (a "point of coherence," if you will), which may mean that certain IP can only be utilized beyond that point. In short, an implementer has to choose the IP that satisfies the semantics of the instructions and deploy it appropriately.

Hope that clarifies things.

Cheers,
David


On Wed, May 11, 2022 at 9:31 AM Guy Lemieux <guy.lemieux@...> wrote:
The spec seems to have some ambiguity about cache block "copies", and
there is an incongruity with the "clean" operation.

--------

COPIES

Consider these two quotes from background.adoc:

"A cache block is identified by a physical address corresponding to
the underlying memory locations."

"... CBO instructions identify a single cache block only."

From the above, the term "single cache block only" is ambiguous, as it
does not address the fact there may be multiple copies within the
coherence domain. Perhaps a better definition would be "CBO
instructions identify and operate upon all copies of a cache block
that are identified by the same physical address within the given
coherence domain" ?

Consider two more quotes from background.adoc:

"CBO instruction is defined to operate only on the copies of a cache
block that are cached in the caches accessible by the explicit memory
accesses performed by the set of coherent agents"

"An invalidate operation deallocates the copy of the cache block"

The second quote suggests only a single copy of the cache block is
deallocated, but the first quote suggests that it operates on all
copies.

I would like to further request clarification on what is done when
INVAL is applied to dirty cache blocks. It appears the implication is
that the modified data gets dropped. However, if this requires an
operation on all copies, then the underlying coherence protocol must
support an "invalidate without writeback" operation. This underlying
assumption should be clearly stated.

Here are a few additional uses which agree with the fact that CBO
operations apply to all copies within the coherence domain:

"Caches introduce the possibility that multiple copies of a given
cache block may be present"

"... if a coherent agent in the set executes a CBO instruction that
specifies the cache block, the resulting operation shall apply to any
and all of the copies in the caches ..."

"... a CBO instruction is defined to operate only on the copies of a
cache block that are cached in the caches ..."

It remains unclear how to adapt the CBO operations to work within a
system that does not have coherent caches. I would assume CBO
operations would only work on all downstream levels of caches in such
a system, which means they do not apply to ALL copies but only to
downstream ones, but if there is no coherence protocol then there is
no way to actually enforce this property on downstream caches; it can
only be applied to the levels of cache that fall within the processor
designer's control (e.g., caches external to the processor are
excluded). Was defining behaviour in systems without coherent caches
considered to be out-of-scope?

------

CLEAN

"A clean operation performs a write transfer to another cache or to
memory if the data in the copy of the cache block have been modified
by a store operation"

This CLEAN operation behaves differently than FLUSH/INVAL. With
FLUSH/INVAL, you are guaranteed that no copies of the cache block
remain within the coherence domain. With CLEAN, it appears that it is
OK to merely move the dirty data to another processor cache within the
coherence domain (as long as it remains dirty). This is incongruent
with the statement that:

"... if a coherent agent in the set executes a CBO instruction that
specifies the cache block, the resulting operation shall apply to any
and all of the copies in the caches ..."

which implies the only acceptable operation is writing back to memory
(not another cache).




