xTVAL Compliance restriction proposal

Allen Baum
 

When address-related exceptions occur, the xTVAL CSR is written with the faulting effective address.  However, the spec also says:

  Implementations may convert some invalid address patterns into other invalid addresses prior to writing them to xTVAL.

The purpose of this statement is to allow implementations to reduce the number of xTVAL storage bits when addresses are smaller than XLEN.  (If your implementation implements all XLEN bits of mtval, this point is moot, and you can skip to the last paragraph.)

We propose to constrain this behavior to simplify the task of compliance-testing the xTVAL registers.  In particular, allowing implementations to arbitrarily convert any invalid address to any other invalid address is impractical to test.

Some implementations, including the Rocket core, make xTVAL a two's complement value that's just 1 bit wider than the largest virtual address.  This allows xTVAL to represent all valid virtual addresses with the minimal number of physical storage bits, yet still allows it to distinguish properly sign-extended VAs from improperly sign-extended VAs.

The scheme replaces improperly sign-extended upper bits with the inverse of the address's sign bit, address[VASIZE-1]. Or (in pseudo code):

  xTVAL[XLEN-1:VASIZE] = (address[XLEN-1:VASIZE] == sign_extend(address[VASIZE-1]))
    ?  sign_extend(address[VASIZE-1])
    : ~sign_extend(address[VASIZE-1]);

 The effect of this scheme is that valid virtual addresses are always preserved, whereas invalid virtual addresses always remain invalid, even though their MSBs are discarded.
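
As a rough illustration of the pseudo code above, here is a minimal Python sketch of the recoding. The function names (recode_tval, is_canonical) and the default parameters XLEN=64 and VASIZE=39 (as in Sv39) are assumptions chosen for the example; they are not part of the proposal or of any implementation.

```python
def is_canonical(address: int, xlen: int = 64, vasize: int = 39) -> bool:
    """True if bits [xlen-1:vasize] are all copies of bit vasize-1."""
    upper_ones = (1 << (xlen - vasize)) - 1
    upper = (address >> vasize) & upper_ones
    sign = (address >> (vasize - 1)) & 1
    return upper == (upper_ones if sign else 0)


def recode_tval(address: int, xlen: int = 64, vasize: int = 39) -> int:
    """Model the value read back from xTVAL after a hardware write."""
    address &= (1 << xlen) - 1
    low = address & ((1 << vasize) - 1)          # low VASIZE bits are kept
    upper_ones = (1 << (xlen - vasize)) - 1
    sign = (address >> (vasize - 1)) & 1
    proper = upper_ones if sign else 0           # sign_extend(address[VASIZE-1])
    if is_canonical(address, xlen, vasize):
        new_upper = proper                       # valid VA: unchanged
    else:
        new_upper = proper ^ upper_ones          # invalid VA: inverted sign bits
    return (new_upper << vasize) | low


# A valid address is preserved; an invalid one stays invalid after recoding.
print(hex(recode_tval(0xFFFF_FFC0_0000_0000)))
print(hex(recode_tval(0x1234_0000_0000_1000)))
```

Under these assumptions only one bit beyond the low VASIZE bits needs to be stored, since every upper bit of the recoded value is identical.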

We are proposing to restrict xTVAL implementations with fewer than XLEN physical storage bits to support only this invalid address conversion scheme.

-->   If your implementation conforms to this scheme, or if your xTVAL register can represent all XLEN bits unconditionally,  please select (reply with X following) "Yes."  
-->  If your implementation doesn't conform to this scheme, but still conforms to the ISA spec, please select (reply with X following) "No", and carefully explain in the comments field how your implementation recodes invalid addresses.  

"No" responses without a sufficient explanation in the comments will not be considered.

Yes: ___ (our implementation conforms to this constraint or implements XLEN bits)
No:  ___ (our implementation is ISA compliant but doesn't conform to this constraint)

Comment (if you select "no" above): 








Re: xTVAL Compliance restriction proposal

Andrew Waterman
 



On Sun, Jun 14, 2020 at 2:39 PM Allen Baum <allen.baum@...> wrote:
Yes: X (our implementation conforms to this constraint or implements XLEN bits)
No:  ___ (our implementation is ISA compliant but doesn't conform to this constraint)

Comment (if you select "no" above): 








Re: xTVAL Compliance restriction proposal

Greg Favor
 

Yes: X (our implementation conforms to this constraint or implements XLEN bits)










Re: xTVAL Compliance restriction proposal

Greg Favor
 

Allen,

How is compliance testing going to handle this issue for software (versus hardware) writes to *tval CSRs?  For example, when software writes stval with an address, how is the validity judged?  Is it based on S-mode and satp.MODE, or on the state of the current privilege and translation modes?  Similarly, what happens when HS-mode writes vstval?

This is further ambiguity in the Priv spec that needs to be resolved - possibly (?) by you proposing a behavior that compliance will test for and thus require of any compliant implementations (and which the arch spec would then need to incorporate).  I suspect there is no great answer; each has its own ugly points for software and/or for hardware.

Or should the specified behavior for software writes simply be that the written value is always treated as if valid, i.e., store the low N bits in the storage flops and set the special N+1 flop as the sign-extension of the low N bits?  Or that hardware checks whether written bits [63:N] are the sign-extension of the low N bits?  Both of these have their own ugliness, and both would be wrong for Bare and M modes.

Greg









Re: xTVAL Compliance restriction proposal

Andrew Waterman
 



On Mon, Jun 15, 2020 at 10:40 PM Greg Favor <gfavor@...> wrote:
Allen,

How is compliance testing going to handle this issue for software (versus hardware) writes to *tval CSRs?  For example, when software writes stval with an address, how is the validity judged?  Is it based on S-mode and satp.MODE, or on the state of the current privilege and translation modes?  Similarly, what happens when HS-mode writes vstval?

This is further ambiguity in the Priv spec that needs to be resolved - possibly (?) by you proposing a behavior that compliance will test for and thus require of any compliant implementations (and which the arch spec would then need to incorporate).  I suspect there is no great answer; each has its own ugly points for software and/or for hardware.

Or should the specified behavior for software writes simply be that the written value is always treated as if valid, i.e store the low N bits in the storage flops and set the special N+1 flop as sign-extension of the low N bits?  Or that hardware checks whether written bits [63:N] are sign-extension of the low N bits?  Both of these have their own uglinesses, and both would be wrong for Bare and M modes.

I focused on the HW writes to mtval when I described the Rocket core's behavior to Allen.  The behavior for SW writes is that it unconditionally sign-extends from the most-significant implemented bit: in other words, the behavior that the MSBs are checked for a proper sign-extension only applies to HW writes.  This is a reasonable implementation, and in some respects a preferable one.  The purpose of the recoding is to provide information to SW.  Enforcing the same constraints on SW writes is arbitrary, and might even be detrimental in some virtualization cases.
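
For comparison with the hardware-write recoding, the software-write behavior described here might be modeled roughly as below. This is only a sketch: the function name sw_write_tval and the parameter implemented_bits (40 here, i.e. one bit more than a 39-bit virtual address) are assumptions for illustration, not taken from the Rocket source.

```python
def sw_write_tval(value: int, xlen: int = 64, implemented_bits: int = 40) -> int:
    """Model a CSR write: keep the implemented low bits and sign-extend
    unconditionally from the most-significant implemented bit."""
    value &= (1 << xlen) - 1
    low = value & ((1 << implemented_bits) - 1)
    sign = (low >> (implemented_bits - 1)) & 1
    if sign:
        # Fill every unimplemented upper bit with the stored sign bit.
        upper_ones = ((1 << (xlen - implemented_bits)) - 1) << implemented_bits
        return low | upper_ones
    return low


# The written upper bits are not checked, only overwritten on read-back.
print(hex(sw_write_tval(0x1234_0000_0000_1000)))
print(hex(sw_write_tval(0x0000_0080_0000_0000)))
```

Unlike the hardware-write case, no comparison against a proper sign-extension is made; the upper bits written by software are simply dropped.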










Re: xTVAL Compliance restriction proposal

Greg Favor
 

On Mon, Jun 15, 2020 at 11:33 PM Andrew Waterman <andrew@...> wrote:
I focused on the HW writes to mtval when I described the Rocket core's behavior to Allen.  The behavior for SW writes is that it unconditionally sign-extends from the most-significant implemented bit: in other words, the behavior that the MSBs are checked for a proper sign-extension only applies to HW writes.  This is a reasonable implementation, and in some respects a preferable one.  The purpose of the recoding is to provide information to SW.  Enforcing the same constraints on SW writes is arbitrary, and might even be detrimental in some virtualization cases.

That sounds good.

It would be great to see this software write behavior added to the Priv spec (along with whatever hardware write behavior that Allen's polling concludes with).

Thanks,
Greg


Re: xTVAL Compliance restriction proposal

Andrew Waterman
 



On Tue, Jun 16, 2020 at 12:06 AM Greg Favor <gfavor@...> wrote:
On Mon, Jun 15, 2020 at 11:33 PM Andrew Waterman <andrew@...> wrote:
I focused on the HW writes to mtval when I described the Rocket core's behavior to Allen.  The behavior for SW writes is that it unconditionally sign-extends from the most-significant implemented bit: in other words, the behavior that the MSBs are checked for a proper sign-extension only applies to HW writes.  This is a reasonable implementation, and in some respects a preferable one.  The purpose of the recoding is to provide information to SW.  Enforcing the same constraints on SW writes is arbitrary, and might even be detrimental in some virtualization cases.

That sounds good.

It would be great to see this software write behavior added to the Priv spec (along with whatever hardware write behavior that Allen's polling concludes with).

Thanks for your input, as always.

I agree that, if we ultimately agree to tighten the behavior here, we need to specify the behavior for both HW and SW writes.




mtvec question

Joe Xie
 

Hi Andrew, all,

 

The current priv spec reserves the lower 2 bits of mtvec (and stvec) to indicate vectored interrupts. The issue is that if the exception handler is word-aligned but SW misprograms the lower 2 bits to a nonzero value, the result is still a valid value.

So now the exception will start from an incorrect PC, and depending on the instruction word at that address you will see weird behavior, or the core will just fall into an infinite loop. This kind of issue is very annoying to debug.

Any advice on how to overcome the issue other than a SW check?

 

 

 

 


Re: mtvec question

Andrew Waterman
 

I have been bitten by this, too, but I have little in the way of advice.

There are various software approaches to reduce the likelihood of encountering this problem, even if the programmer forgets to insert the alignment directive.  The first one that comes to mind is to put trap handlers in their own ELF section so that the linker script can forcibly align them.

Unfortunately, sometimes the best you can do with these nitty-gritty low-level systems programming issues is: "don't fuck up".  Debugging them is inherently painful.

On Thu, Jun 18, 2020 at 6:22 PM Joe Xie <joxie@...> wrote:


Re: mtvec question

Joe Xie
 

Is bit 1 going to be used in the near future? We are wondering whether we can use bit 1 to indicate an illegal value (WLRL): if bit 1 is written with 1, fire an exception.

 

From: <tech-privileged@...> on behalf of Andrew Waterman <andrew@...>
Date: Friday, June 19, 2020 at 9:37 AM
To: Joe Xie <joxie@...>
Cc: "tech-privileged@..." <tech-privileged@...>, "James Xu (SW-GPU)" <jamesx@...>, Lucien Dunning <ldunning@...>
Subject: Re: [RISC-V] [tech-privileged] mtvec question

 


Re: mtvec question

Andrew Waterman
 


On Thu, Jun 18, 2020 at 10:16 PM Joe Xie <joxie@...> wrote:

Are we going to use bit1 soon in the future? We are wondering if we can use bit1 to indicate there’s illegal value (WLRL) – if bit1 is written with 1 then fire exception.

 


Re: mtvec question

Joe Xie
 

Lol

 

Do you feel it is worth adding a bit in sstatus that restricts csrw stvec to either 1) mask bits 1:0, or 2) fire an exception when a nonzero value is written to bits 1:0?

A separate ELF section can work in some cases; however, the concern is that it may be difficult to force everyone to follow the guidance, and it is pretty annoying to debug the issue on silicon. It is a debug nightmare if that instruction is a jump to some random address.

 


From: Andrew Waterman <andrew@...>
Sent: Friday, June 19, 2020 1:31:19 PM
To: Joe Xie <joxie@...>
Cc: tech-privileged@... <tech-privileged@...>; James Xu (SW-GPU) <jamesx@...>; Lucien Dunning <ldunning@...>
Subject: Re: [RISC-V] [tech-privileged] mtvec question

 


Re: mtvec question

Andrew Waterman
 

I think this is one of dozens of little mistakes you can make in bare-metal RISC-V programming, and adding an sstatus bit for it is IMO not a great allocation of resources.

Hopefully you are developing your M-mode code with the help of a software simulator, in which case you could just add a feature to your software simulator to catch writes to mtvec that set mtvec[1] and issue a warning to the programmer.
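
A check like that could look roughly like the following sketch. The CsrFile class and its write method are hypothetical stand-ins for whatever CSR-write hook a given simulator provides; only the mtvec CSR address (0x305) comes from the privileged spec.

```python
import warnings

MTVEC_CSR = 0x305  # mtvec CSR address per the privileged spec


class CsrFile:
    """Hypothetical, minimal CSR file for a software simulator."""

    def __init__(self):
        self.regs = {}

    def write(self, addr: int, value: int) -> None:
        # Flag writes that set mtvec[1], which usually indicates a
        # misaligned handler address rather than an intended mode.
        if addr == MTVEC_CSR and value & 0b10:
            warnings.warn(
                f"write to mtvec sets bit 1 (value={value:#x}); "
                "is the trap handler aligned?"
            )
        self.regs[addr] = value


csrs = CsrFile()
csrs.write(MTVEC_CSR, 0x8000_0001)  # vectored mode, no warning
```

The write itself is still performed in this sketch, so simulated behavior is unchanged and the programmer only gets a diagnostic.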

On Thu, Jun 18, 2020 at 11:29 PM Joe Xie <joxie@...> wrote:


Re: mtvec question

Allen Baum
 

I think the solution is even simpler.

Even if MTVEC[1] had not been used, data-dependent traps are prohibited in RISC-V (that might be too strong a word; I don't know that it's explicitly prohibited, but it certainly is discouraged for good reason).
Note that divide by 0 doesn't trap, and if anything would trap, you would think that would be first in line.

Nevertheless - MTVEC[1:0] is WARL, and you can restrict the legal values as you see fit (that is, transform anything that is not legal into a legal value, which can include leaving the value unchanged) - but you cannot trap.

And, the implementation gets to decide  what is legal.
Your implementation can declare that mtvec[1] is read-only zero (and so won't support CLIC).
Your implementation can declare that attempts to write 10 or 11 will be ignored, or will be converted to the legal 00 and 01, or even always to 00; or it can declare that only 00 is legal (if you don't support vectored interrupts), so the field is read-only 0.

So you can prevent that case you're worried about fairly easily.

You could go further and restrict the address to be cache-line aligned regardless of vectoring (so mtvec[5:2]=0), or page aligned (mtvec[11:2]=0) (see the last note in priv spec 3.1.7 that discusses this).
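
To make the WARL options above concrete, here is a small Python sketch of how a model might legalize a write to mtvec. The function name legalize_mtvec and the policy names are made up for illustration, and "ignored" is assumed here to mean that the whole write is dropped; an implementation could equally keep only the old mode field.

```python
def legalize_mtvec(old: int, new: int, policy: str = "clear_bit1") -> int:
    """Return the value mtvec holds after software writes `new`.

    No policy traps; an illegal value is always turned into a legal one.
    """
    mode = new & 0b11
    base = new & ~0b11
    if policy == "clear_bit1":        # mtvec[1] is read-only zero
        return base | (mode & 0b01)
    if policy == "ignore_illegal":    # writes of 10 or 11 are ignored
        return old if mode & 0b10 else new
    if policy == "direct_only":       # only 00 is legal (no vectoring)
        return base
    raise ValueError(f"unknown policy: {policy}")


# A handler address that is accidentally off by 2 is forced back to
# a legal mode instead of silently selecting a reserved one.
print(hex(legalize_mtvec(0x8000_0000, 0x8000_1002)))
```

Each policy corresponds to one of the choices listed above, and all of them can be checked by writing a value and reading it back.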


On Fri, Jun 19, 2020 at 12:01 AM Andrew Waterman <andrew@...> wrote:
I think this is one of dozens of little mistakes you can make in bare-metal RISC-V programming, and adding an sstatus bit for it is IMO not a great allocation of resources.

Hopefully you are developing your M-mode code with the help of a software simulator, in which case you could just add a feature to your software simulator to catch writes to mtvec that set mtvec[1] and issue a warning to the programmer.

On Thu, Jun 18, 2020 at 11:29 PM Joe Xie <joxie@...> wrote:

Lol

 

Do you feel it is worth to add a bit in sstatus to restrict csrw stvec to 1) mask bit1~0; 2) fire an exception when writing non-0 value to bit1~0?

 

A separate elf section can work in some cases, however the concern is that it may be difficult to force everyone to follow the guidance and it is pretty annoying to debug the issue on Silicon – It is a debug nightmare if that instruction is a jmp to some random address.

 


From: Andrew Waterman <andrew@...>
Sent: Friday, June 19, 2020 1:31:19 PM
To: Joe Xie <joxie@...>
Cc: tech-privileged@... <tech-privileged@...>; James Xu (SW-GPU) <jamesx@...>; Lucien Dunning <ldunning@...>
Subject: Re: [RISC-V] [tech-privileged] mtvec question

 

External email: Use caution opening links or attachments

 

 

On Thu, Jun 18, 2020 at 10:16 PM Joe Xie <joxie@...> wrote:

Are we going to use bit1 soon in the future? We are wondering if we can use bit1 to indicate there’s illegal value (WLRL) – if bit1 is written with 1 then fire exception.

 

From: <tech-privileged@...> on behalf of Andrew Waterman <andrew@...>
Date: Friday, June 19, 2020 at 9:37 AM
To: Joe Xie <joxie@...>
Cc: "tech-privileged@..." <tech-privileged@...>, "James Xu (SW-GPU)" <jamesx@...>, Lucien Dunning <ldunning@...>
Subject: Re: [RISC-V] [tech-privileged] mtvec question

 

External email: Use caution opening links or attachments

 

I have been bitten by this, too, but I have little in the way of advice.

 

There are various software approaches to reduce the likelihood of encountering this problem, even if the programmer forgets to insert the alignment directive.  The first one that comes to mind is to put a trap handler in its own ELF section so that the linker script can forcibly align them.


Unfortunately, sometimes the best you can do with these nitty-gritty low-level systems programming issues is: "don't fuck up".  Debugging them is inherently painful.

 

On Thu, Jun 18, 2020 at 6:22 PM Joe Xie <joxie@...> wrote:

Hi Andrew, all,

 

The current priv spec reserves the lower 2 bits of mtvec (and stvec) to indicate vectored interrupts. The issue is that if the exception handler is word aligned but SW mis-programs the lower 2 bits to a non-zero value, it is still a valid value.

 

So now the exception will start from an incorrect PC, and depending on the instruction word at that address you will see weird behavior, or the core will just fall into an infinite loop. It is very annoying to debug this kind of issue.
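
To make the failure concrete, here is a small sketch of how mtvec splits into BASE and MODE and how the trap PC follows from them. The function names are made up for illustration; the reserved-MODE case is raised as an error here only because what real hardware does with it is implementation-specific.

```python
def decode_mtvec(mtvec):
    """Split an mtvec value into (base, mode); MODE is mtvec[1:0]."""
    mode = mtvec & 0b11
    base = mtvec & ~0b11  # BASE is always 4-byte aligned
    return base, mode


def trap_target(mtvec, cause, is_interrupt):
    """Return the PC a trap is taken to for the given mtvec value."""
    base, mode = decode_mtvec(mtvec)
    if mode == 0:  # Direct: every trap goes to BASE
        return base
    if mode == 1:  # Vectored: interrupts go to BASE + 4*cause
        return base + 4 * cause if is_interrupt else base
    # MODE >= 2 is reserved; the sketch refuses to guess
    raise ValueError(f"reserved mtvec MODE {mode}")
```

For example, writing a handler address of 0x80000102 decodes as BASE 0x80000100 with reserved MODE 2, so the two low address bits are silently reinterpreted. Writing 0x80000101 sends a machine timer interrupt (cause 7) to 0x8000011C rather than to the handler at 0x80000100.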

 

Any advice on how to overcome the issue other than SW check?

 

 

 

 


Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

John Hauser
 

Greg Favor wrote:
> The Hypervisor extension adds bits to some of the existing M-mode CSR's.
> When this extension is not implemented, these bits are hardwired to zero.
> When the extension _is_ implemented these bits become either read/write or
> (in a few cases) hardwired to one.
>
> On the one hand the hypervisor spec says that when misa.H=0 (i.e. the
> extension is "disabled"), "the hart behaves as though this extension were
> not implemented". But where these various added M-mode CSR bits are
> described, they are defined to exist when "the hypervisor extension is
> implemented".
>
> The former statement implies that these new bits must appear to be
> hardwired to zero when misa.H=0, while the latter statement implies that
> these new bits appear to be read/write or hardwired to one irrespective of
> misa.H (although presumably they have no functional effects when misa.H=0).
>
> Which is the correct architectural intention?

The overriding statement in the document is this one:

    When misa[7] (bit H) is clear, the hart behaves as though this
    extension were not implemented, ....

I sympathize with the desire to enforce an intuitive view that, if
misa.H is writable and set to 0, the hypervisor extension really _is_
implemented, just disabled. However, I believe the statement above
is clear that when misa.H = 0, the hardware must act the same as when
the hypervisor extension is not implemented. If that statement didn't
override an intuitive interpretation of _implemented_ everywhere in the
chapter, then the statement would be null-and-void, and that can't be
right.

Allowing "implementation" to be configurable at run-time may be
non-intuitive, but I claim the hypervisor chapter is consistent with
similar other uses in the document. For example, concerning the FS
field in mstatus, the document says:

    In systems that do not implement S-mode and do not have a
    floating-point unit, the FS field is hardwired to zero.

What about when misa.F and misa.S are both writable and set to zero?
In that case I believe the specification requires that mstatus.FS
be read-only zero, the same as when the F extension and S mode are
not implemented. So what does the document really mean by "do not
implement S-mode" and "do not have a floating-point unit"?

If anything, I think the hypervisor chapter is being slightly more
careful to document the consequences of modifying misa.

- John Hauser
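
As an illustration of the interpretation argued above, the following is a toy model in which CSR bits added by the hypervisor extension read as zero and ignore writes whenever misa.H is clear. The `Hart` class is hypothetical, only the H-added bits of mstatus are modelled, and the two bit positions are an illustrative assumption. What the bits hold after misa.H is set again is a choice this sketch makes arbitrarily (the last stored value is retained); the quoted text does not decide it.

```python
MISA_H = 1 << 7  # misa[7] is the H bit

# Illustrative assumption: mstatus bits that exist only with the H extension.
MSTATUS_H_BITS = (1 << 39) | (1 << 38)


class Hart:
    """Toy hart modelling only misa.H and the H-added mstatus bits."""

    def __init__(self, h_implemented):
        self.h_implemented = h_implemented
        self.misa = MISA_H if h_implemented else 0
        self._mstatus = 0

    def write_misa(self, value):
        # misa.H is writable only if the extension is implemented at all.
        h = (value & MISA_H) if self.h_implemented else 0
        self.misa = (self.misa & ~MISA_H) | h

    def _h_mask(self):
        # The H-added bits exist only while misa.H is set.
        return MSTATUS_H_BITS if self.misa & MISA_H else 0

    def write_mstatus(self, value):
        mask = self._h_mask()
        self._mstatus = (self._mstatus & ~mask) | (value & mask)

    def read_mstatus(self):
        return self._mstatus & self._h_mask()
```

With this model, a hart that implements H but has cleared misa.H gives the same mstatus reads and ignores the same writes as a hart built without H.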


Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

Greg Favor
 

Thanks.  This is the interpretation we expected to be correct.

For the sake of some future readers of the spec that may not apply the broadest meaning of "behaves" when reading "behaves as though this extension were not implemented", it may be worth a few words to note that "behaves" also means that new CSR bits/fields defined by the extension must have simple read/write behavior the same as when the extension is not implemented (i.e. not just that these bits have the same effect or lack of effect as if the extension is not implemented).

Greg




Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

Allen Baum
 

And we should be careful to define the corner cases, e.g. the values that are in the registers when the features are enabled: the values they last held, or undefined, or.... something else.



Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

Greg Favor
 

While I would have to go through all the bits/fields affected by the H extension to double-check, the main issue is the bits/fields that were previously reserved (and hardwired to zero).  So there isn't the general issue of what the "unimplemented" read values should be.  Similarly, the "unimplemented" behavior for CSR's added by the H extension is straightforward.

Greg




Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

Jonathan Behrens <behrensj@...>
 

Couldn't you just change the wording to be "disabled" when referring to having misa.H=0 and leave "unimplemented" to mean having misa.H hardwired to 0?

Jonathan



Re: Appearance of new M-mode CSR bits when Hypervisor is disabled

Greg Favor
 

My response to Allen's post and my use of "unimplemented" in quotes was referring to the H extension spec's statement that "When misa[7] is clear, the hart behaves as though this extension were not implemented."  The key point is that the "disabled" values for the CSR bits/fields in question must match their "unimplemented" values.

Greg


