I understand your timing concern for generating an exception based on data values. However, the concern is for data values directly read out from the general registers (GPR/FPR) by the instruction itself and the exception is generated based on these values or some computations of these values, so the timing is tight between the data read out and computation of an exception (such as divide by zero). The concern is usually not for using data values (1 or 2 bits as control) in CSRs written by a previous instruction or even relaxed, by a previous instruction in a different privileged mode (i.e., relaxed in terms of the length of time between the write and the use).
The current mcounteren CSR value is already used to control the generation of an exception of the read operation of the hmpcounter CSRs for S and U-mode. The timing should be similar, if we have a mcounterwen CSR and use its value to control the generation of an exception of the write operation of a counter CSR for S and U-mode.
This is a general discussion. I am fine with dropping the S-mode direct counter update proposal based on its limited use case.