If I understand correctly, this proposal does not cause the problem in your 2nd point, because not every MMIO access traps to user space. The proposal still allows MMIO to trap to the kernel.
In this proposal, we can set the PTE.MMIO field only for the MMIO traps that need to be delegated to user level. Those accesses will not be handled by the G-stage.
The in-kernel MMIO traps will go to the G-stage because their PTE.MMIO is cleared by the hypervisor. They then cause a G-stage page fault and trap to the kernel as usual.
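
To make the split concrete, here is a rough sketch of the routing decision as I understand it. The bit position of PTE_MMIO, the enum, and the function name are all made up for illustration; the proposal defines the real encoding, and this only models which path an access takes, not the hardware behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position for the proposed PTE.MMIO field. The actual
 * encoding is whatever the proposal specifies, not this value. */
#define PTE_MMIO (UINT64_C(1) << 60)

/* Where a guest MMIO access ends up (illustrative names only). */
enum mmio_route {
    ROUTE_USER_LEVEL,    /* PTE.MMIO set: delegated to user level, G-stage not involved */
    ROUTE_KERNEL_GSTAGE  /* PTE.MMIO clear: G-stage page fault, handled in the kernel */
};

/* Pick the path from the PTE.MMIO bit alone, as described above. */
static enum mmio_route route_mmio_access(uint64_t pte)
{
    return (pte & PTE_MMIO) ? ROUTE_USER_LEVEL : ROUTE_KERNEL_GSTAGE;
}
```

So a device emulated in the kernel keeps PTE.MMIO clear and follows the existing G-stage fault path unchanged, while only the regions the hypervisor explicitly marks take the user-level path.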