
Re: Review request: New EFI_RISCV_BOOT_PROTOCOL

Sunil V L
 

On Mon, Jan 10, 2022 at 06:27:37PM +0100, Heinrich Schuchardt wrote:
On 1/10/22 18:02, Sunil V L wrote:
Hi All,

As we discussed in the Platform HSC meeting today, here is the document which details a new RISC-V specific EFI protocol.

https://github.com/riscv-non-isa/riscv-uefi/releases/download/0.1/EFI_RISCV_BOOT_PROTOCOL.pdf

Currently, the main use case of this protocol is to pass the boot hartid to the OS. But this can be extended in future if required. A PoC has been developed using EDK2 and Linux.

More details of this requirement and alternatives discussed are available at http://lists.infradead.org/pipermail/linux-riscv/2021-December/010604.html.

I request your review, and it would be great if you could provide feedback by 01/17.

Thanks!
Sunil


Dear Sunil,

thank you for drafting the protocol specification.
Hi Heinrich,
Thank you very much for your constant feedback and pointers while
drafting this spec.

The interface of a protocol may change from version to version. Therefore I
understand why there must be a path to convey this information. But using a
function like EFI_RISCV_BOOT_PROTOCOL.GetProtocolVersion() makes accessing
this information unnecessarily complicated. Instead consider adding a
version field as first element of the interface like many other UEFI
protocols do. This will also decrease the implementation size. For alignment
reasons make this field UINT64. Other protocols call such a field
"Revision". Please, provide a define for the current version. E.g.

#define EFI_RISCV_BOOT_PROTOCOL_REVISION 0x00010000
#define EFI_RISCV_BOOT_PROTOCOL_LATEST_VERSION \
EFI_RISCV_BOOT_PROTOCOL_REVISION
This makes perfect sense. Will update the spec.

Function EFI_RISCV_BOOT_PROTOCOL.GetBootHartId() looks ok to me and is well
described.
Jessica has suggested changing the parameter from UINT32 * to
UINTN *, since mhartid can be XLEN bits wide. I think this is also great
feedback. Will incorporate it in the next update of the spec.
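Taken together, the two agreed changes might look like the following C sketch: a leading UINT64 Revision field replacing GetProtocolVersion(), and GetBootHartId() taking a UINTN *. The UEFI-style typedefs are stand-ins so the sketch compiles on its own; the normative definitions belong to the spec, not this sketch.

```c
#include <stdint.h>

/* Stand-in UEFI typedefs (assumption: real code uses the EDK2 base types). */
typedef uint64_t  UINT64;
typedef uintptr_t UINTN;
typedef UINTN     EFI_STATUS;
#define EFI_SUCCESS 0

#define EFI_RISCV_BOOT_PROTOCOL_REVISION 0x00010000ULL
#define EFI_RISCV_BOOT_PROTOCOL_LATEST_VERSION \
        EFI_RISCV_BOOT_PROTOCOL_REVISION

typedef struct _EFI_RISCV_BOOT_PROTOCOL EFI_RISCV_BOOT_PROTOCOL;

/* Returns the boot hart ID in *BootHartId; UINTN so it can be XLEN wide. */
typedef EFI_STATUS (*EFI_RISCV_BOOT_PROTOCOL_GET_BOOT_HARTID)(
    EFI_RISCV_BOOT_PROTOCOL *This,
    UINTN                   *BootHartId);

struct _EFI_RISCV_BOOT_PROTOCOL {
    UINT64                                  Revision;  /* first field, per Heinrich's suggestion */
    EFI_RISCV_BOOT_PROTOCOL_GET_BOOT_HARTID GetBootHartId;
};
```

A caller can then compare Revision against the define before relying on newer members, which is the usual UEFI pattern this replaces the version-query function with.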

Thanks!
Sunil


Best regards

Heinrich



Re: Review request: New EFI_RISCV_BOOT_PROTOCOL

Heinrich Schuchardt
 

On 1/11/22 02:07, Chang, Abner (HPS SW/FW Technologist) wrote:
Hi Sunil,
Instead of having a spec for EFI_RISCV_BOOT_PROTOCOL specifically, I suggest having a RISC-V EFI Protocols Specification. This spec would accommodate EFI_RISCV_BOOT_PROTOCOL, future EFI protocols for the RISC-V platform, and those we have implemented in edk2.
Which protocols in EDK II do you refer to?

"git grep -n RISC edk2/ | grep PRO" yields no result.

Once we have agreed on the EFI_RISCV_BOOT_PROTOCOL we should create an issue in bugzilla.tianocore.org and create a Mantis entry to get it merged into the UEFI specification.

Best regards

Heinrich

Thanks
Abner

-----Original Message-----
From: Heinrich Schuchardt <heinrich.schuchardt@...>
Sent: Tuesday, January 11, 2022 1:28 AM
To: Sunil V L <sunilvl@...>
Cc: tech-unixplatformspec@...; Chang, Abner (HPS SW/FW
Technologist) <abner.chang@...>; Anup Patel
<apatel@...>; Atish Patra <atishp@...>; Jessica
Clarke <jrtc27@...>
Subject: Re: Review request: New EFI_RISCV_BOOT_PROTOCOL

On 1/10/22 18:02, Sunil V L wrote:
Hi All,

As we discussed in the Platform HSC meeting today, here is the document
which details a new RISC-V specific EFI protocol.

https://github.com/riscv-non-isa/riscv-uefi/releases/download/0.1/EFI_RISCV_BOOT_PROTOCOL.pdf

Currently, the main use case of this protocol is to pass the boot hartid to
the OS. But this can be extended in future if required. A PoC has been
developed using EDK2 and Linux.

More details of this requirement and alternatives discussed are available at
http://lists.infradead.org/pipermail/linux-riscv/2021-December/010604.html.

I request your review, and it would be great if you could provide feedback
by 01/17.

Thanks!
Sunil


Dear Sunil,

thank you for drafting the protocol specification.

The interface of a protocol may change from version to version.
Therefore I understand why there must be a path to convey this
information. But using a function like
EFI_RISCV_BOOT_PROTOCOL.GetProtocolVersion() makes accessing this
information unnecessarily complicated. Instead consider adding a version
field as first element of the interface like many other UEFI protocols
do. This will also decrease the implementation size. For alignment
reasons make this field UINT64. Other protocols call such a field
"Revision". Please, provide a define for the current version. E.g.

#define EFI_RISCV_BOOT_PROTOCOL_REVISION 0x00010000
#define EFI_RISCV_BOOT_PROTOCOL_LATEST_VERSION \
EFI_RISCV_BOOT_PROTOCOL_REVISION

Function EFI_RISCV_BOOT_PROTOCOL.GetBootHartId() looks ok to me and is
well described.

Best regards

Heinrich


Re: Review request: New EFI_RISCV_BOOT_PROTOCOL

Heinrich Schuchardt
 

On 1/10/22 18:02, Sunil V L wrote:
Hi All,
As we discussed in the Platform HSC meeting today, here is the document which details a new RISC-V specific EFI protocol.
https://github.com/riscv-non-isa/riscv-uefi/releases/download/0.1/EFI_RISCV_BOOT_PROTOCOL.pdf
Currently, the main use case of this protocol is to pass the boot hartid to the OS. But this can be extended in future if required. A PoC has been developed using EDK2 and Linux.
More details of this requirement and alternatives discussed are available at http://lists.infradead.org/pipermail/linux-riscv/2021-December/010604.html.
I request your review, and it would be great if you could provide feedback by 01/17.
Thanks!
Sunil
Dear Sunil,

thank you for drafting the protocol specification.

The interface of a protocol may change from version to version. Therefore I understand why there must be a path to convey this information. But using a function like EFI_RISCV_BOOT_PROTOCOL.GetProtocolVersion() makes accessing this information unnecessarily complicated. Instead consider adding a version field as first element of the interface like many other UEFI protocols do. This will also decrease the implementation size. For alignment reasons make this field UINT64. Other protocols call such a field "Revision". Please, provide a define for the current version. E.g.

#define EFI_RISCV_BOOT_PROTOCOL_REVISION 0x00010000
#define EFI_RISCV_BOOT_PROTOCOL_LATEST_VERSION \
EFI_RISCV_BOOT_PROTOCOL_REVISION

Function EFI_RISCV_BOOT_PROTOCOL.GetBootHartId() looks ok to me and is well described.

Best regards

Heinrich


Review request: New EFI_RISCV_BOOT_PROTOCOL

Sunil V L
 

Hi All,

As we discussed in the Platform HSC meeting today, here is the document which details a new RISC-V specific EFI protocol.

https://github.com/riscv-non-isa/riscv-uefi/releases/download/0.1/EFI_RISCV_BOOT_PROTOCOL.pdf

Currently, the main use case of this protocol is to pass the boot hartid to the OS. But this can be extended in future if required. A PoC has been developed using EDK2 and Linux.

More details of this requirement and alternatives discussed are available at http://lists.infradead.org/pipermail/linux-riscv/2021-December/010604.html.

I request your review, and it would be great if you could provide feedback by 01/17.

Thanks!
Sunil


Next Platform HSC Meeting on Mon Jan 10th 2022 8AM PST

Kumar Sankaran
 

Hi All,
The next platform HSC meeting is scheduled on Mon Jan 10th 2022 at 8AM PST.

Here are the details:

Agenda and minutes kept on the github wiki:
https://github.com/riscv/riscv-platform-specs/wiki

Here are the slides:
https://docs.google.com/presentation/d/1SNKpnOshmPO-Ornj0PKACKfHGbXmS2WIHxkFYL7MO_s/edit#slide=id.gce1d523d02_0_0

Meeting info
Zoom meeting: https://zoom.us/j/2786028446
Passcode: 901897

Or iPhone one-tap :
US: +16465588656,,2786028446# or +16699006833,,2786028446# Or Telephone:
Dial(for higher quality, dial a number based on your current location):
US: +1 646 558 8656 or +1 669 900 6833
Meeting ID: 278 602 8446
International numbers available:
https://zoom.us/zoomconference?m=_R0jyyScMETN7-xDLLRkUFxRAP07A-_

Regards
Kumar


Re: OS-A PCIe Questions

Mayuresh Chitale
 





On Wed, Jan 5, 2022 at 6:43 AM Michael Klinglesmith <michael.klinglesmith@...> wrote:
Hi, thanks for the response.

My thinking on INTx is that if the HW can provide all the required MSI support, then SW won't need to fall back to INTx. The intent of the PCIe spec was to eventually deprecate INTx messages. For new RISC-V platforms it seems reasonable to build a forward-looking system that only supports MSI interrupts. For the topology of an RCiEP and RCEC that have MSI or MSI-X support, adding legacy INTx support as a requirement is just make-work, since all modern software has MSI support. For Root Ports it's a bit more complicated because the platform designer has less control over what gets plugged in; still, it seems reasonable at this time to make INTx support optional.

You are correct that RO and NS are separate bits and it's legal for a card to set one and not the other. My point is that in implementing the logic above the host bridge, if NS is set but RO is not, it doesn't provide much value, because the transaction must be strongly ordered with previous writes. In practice it's easiest to send such transactions (NS=1, RO=0) through the snooped path to maintain ordering.

The key point is that NS is a hint, so the statement “Memory that is cacheable by harts is not kept coherent by hardware…” may or may not be true in an implementation. So here's a smaller change proposal that would make the text more accurate:
“Memory that is cacheable by harts may not be kept coherent by hardware when PCIe transactions to that memory are marked with a No_Snoop bit of one. On systems that honor the No Snoop hint, software must manage coherence on such memory; otherwise, software coherence management is not required.”
Agreed! I will send a patch for this change.

At the very least the typo on the No Snoop polarity should be corrected.

My final point on topology was that the spec seems to make it illegal to implement a single host bridge that has both Root Ports and RCiEP attached. Maybe that wasn't the intent. It would be nice to see a third topology drawing with a single host bridge, an RCiEP, an RCEC, and a Root Port all on Bus 0. If we do add that topology, the final rule of section 4.7.3.5 becomes unnecessary:
Ok. Will do that.
  • If both the topologies mentioned above are supported then RCiEP and RCEC must be implemented in a separate PCIe domain and must be addressable via a separate ECAM I/O region.

Intel south bridges, as an example, mix RCiEP and root ports in the same PCIe domain addressable by a single ECAM I/O region.  What is the motivation for excluding this topology?

Cheers,
Michael.





Re: OS-A PCIe Questions

Michael Klinglesmith
 

I agree the comments about RO and NS interactions are non-normative clarifications (and not required to be added; it's covered in the PCIe spec). The key point is that the text today implies the host bridge MUST honor the No Snoop hint. So, are we saying RISC-V OS-A server platforms must honor the No Snoop hint? If not, I think the text as written is misleading.

Cheers,
Michael.

On Jan 4, 2022, at 5:14 PM, Greg Favor <gfavor@...> wrote:

On Tue, Jan 4, 2022 at 11:16 AM Mayuresh Chitale <mchitale@...> wrote:
On Thu, Dec 30, 2021 at 3:37 AM Michael Klinglesmith <michael.klinglesmith@...> wrote:
2) section 4.7.3.4: A discussion of No_snoop must also include Relaxed Order (RO). When a root port forwards a mix of traffic with NS=0, RO=0 and NS=1, RO=0, the requirement to enforce ordering tends to lead to ignoring the NS hint and snooping the traffic anyway.

Proposed text for 4.7.3.4:
A system is required to provide hardware managed coherence for PCIe traffic. A system may ignore the No_snoop hint bit and treat all PCIe traffic as HW coherent.  A system may choose to honor the No_snoop (NS) hint bit on incoming PCIe transactions.  In this case software must manage coherence for the memory used by the device issuing the transactions.  Such systems must also honor the relaxed order (RO) hint bit.  To take full advantage of SW coherence the device should set both NS and RO to 1.  Otherwise, when RO=0 the snooped and non-snooped traffic arriving through a host bridge must be kept ordered, eliminating the benefit of routing some traffic around the cache hierarchy to memory.
 
AFAIK, these are two separate attributes and a requester could set either or both of them. I am not sure if we can link the two together. Please correct me if I am wrong, but does it mean that, e.g., all the mem write requests which may have the NS attribute set must also have the RO attribute set?

Fwiw, through seven generations of ARM SBSA and then the BSA, they talk about No_snoop in a similar way to what we do, but make no corresponding mentions about Relaxed Order (other than mentions of Relaxed Order and No Snoop enable/disable control bits in Device Control Registers).  In essence they don't mandate or even suggest any particular connections between the two features.

So it wouldn't be unreasonable for us to not say much and essentially leave complete flexibility to the system designer.  Or we could include some non-normative commentary or suggestions.

Greg










Re: OS-A PCIe Questions

Greg Favor
 

On Tue, Jan 4, 2022 at 11:16 AM Mayuresh Chitale <mchitale@...> wrote:
On Thu, Dec 30, 2021 at 3:37 AM Michael Klinglesmith <michael.klinglesmith@...> wrote:
2) section 4.7.3.4: A discussion of No_snoop must also include Relaxed Order (RO). When a root port forwards a mix of traffic with NS=0, RO=0 and NS=1, RO=0, the requirement to enforce ordering tends to lead to ignoring the NS hint and snooping the traffic anyway.

Proposed text for 4.7.3.4:
A system is required to provide hardware managed coherence for PCIe traffic. A system may ignore the No_snoop hint bit and treat all PCIe traffic as HW coherent.  A system may choose to honor the No_snoop (NS) hint bit on incoming PCIe transactions.  In this case software must manage coherence for the memory used by the device issuing the transactions.  Such systems must also honor the relaxed order (RO) hint bit.  To take full advantage of SW coherence the device should set both NS and RO to 1.  Otherwise, when RO=0 the snooped and non-snooped traffic arriving through a host bridge must be kept ordered, eliminating the benefit of routing some traffic around the cache hierarchy to memory.
 
AFAIK, these are two separate attributes and a requester could set either or both of them. I am not sure if we can link the two together. Please correct me if I am wrong, but does it mean that, e.g., all the mem write requests which may have the NS attribute set must also have the RO attribute set?

Fwiw, through seven generations of ARM SBSA and then the BSA, they talk about No_snoop in a similar way to what we do, but make no corresponding mentions about Relaxed Order (other than mentions of Relaxed Order and No Snoop enable/disable control bits in Device Control Registers).  In essence they don't mandate or even suggest any particular connections between the two features.

So it wouldn't be unreasonable for us to not say much and essentially leave complete flexibility to the system designer.  Or we could include some non-normative commentary or suggestions.
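The per-device control bits mentioned above can be made concrete. In the PCIe Device Control register, Enable Relaxed Ordering is bit 4 and Enable No Snoop is bit 11; a hedged sketch of just the bit handling follows (the raw register value is a stand-in; how config space is accessed is platform-specific and elided):

```c
#include <stdint.h>

/* Per the PCIe Base Specification, in the Device Control register:
 * bit 4  = Enable Relaxed Ordering, bit 11 = Enable No Snoop.
 * Clearing them stops a function from setting the RO/NS attributes
 * in the requests it issues. */
#define PCIE_DEVCTL_RELAXED_ORDERING_EN (1u << 4)
#define PCIE_DEVCTL_NO_SNOOP_EN         (1u << 11)

/* Compute a Device Control value with both hint-enable bits cleared,
 * e.g. for an OS that wants all device traffic snooped and ordered. */
static inline uint16_t pcie_devctl_clear_hints(uint16_t devctl)
{
    return devctl & (uint16_t)~(PCIE_DEVCTL_RELAXED_ORDERING_EN |
                                PCIE_DEVCTL_NO_SNOOP_EN);
}
```

Note this only governs what the device is allowed to request; whether the host bridge honors NS/RO when set is the platform question being debated here.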

Greg






Re: OS-A PCIe Questions

Michael Klinglesmith
 

Hi, thanks for the response.

My thinking on INTx is that if the HW can provide all the required MSI support, then SW won't need to fall back to INTx. The intent of the PCIe spec was to eventually deprecate INTx messages. For new RISC-V platforms it seems reasonable to build a forward-looking system that only supports MSI interrupts. For the topology of an RCiEP and RCEC that have MSI or MSI-X support, adding legacy INTx support as a requirement is just make-work, since all modern software has MSI support. For Root Ports it's a bit more complicated because the platform designer has less control over what gets plugged in; still, it seems reasonable at this time to make INTx support optional.

You are correct that RO and NS are separate bits and it's legal for a card to set one and not the other. My point is that in implementing the logic above the host bridge, if NS is set but RO is not, it doesn't provide much value, because the transaction must be strongly ordered with previous writes. In practice it's easiest to send such transactions (NS=1, RO=0) through the snooped path to maintain ordering.

The key point is that NS is a hint, so the statement “Memory that is cacheable by harts is not kept coherent by hardware…” may or may not be true in an implementation. So here's a smaller change proposal that would make the text more accurate:
“Memory that is cacheable by harts may not be kept coherent by hardware when PCIe transactions to that memory are marked with a No_Snoop bit of one. On systems that honor the No Snoop hint, software must manage coherence on such memory; otherwise, software coherence management is not required.”

At the very least the typo on the No Snoop polarity should be corrected.

My final point on topology was that the spec seems to make it illegal to implement a single host bridge that has both Root Ports and RCiEP attached. Maybe that wasn't the intent. It would be nice to see a third topology drawing with a single host bridge, an RCiEP, an RCEC, and a Root Port all on Bus 0. If we do add that topology, the final rule of section 4.7.3.5 becomes unnecessary:
  • If both the topologies mentioned above are supported then RCiEP and RCEC must be implemented in a separate PCIe domain and must be addressable via a separate ECAM I/O region.

Intel south bridges, as an example, mix RCiEP and root ports in the same PCIe domain addressable by a single ECAM I/O region.  What is the motivation for excluding this topology?

Cheers,
Michael.





Re: OS-A PCIe Questions

Mayuresh Chitale
 

On Thu, Dec 30, 2021 at 3:37 AM Michael Klinglesmith <michael.klinglesmith@...> wrote:
Hello,

I’m new to participating in the platform WG.  I’m working at SiFive now.  I spent the last 20 years doing PCIe compliant IO fabrics for Intel chipsets.

I have a few comments / questions about the PCIe:

1) section 4.7.3.3: Why are we requiring INTx?  We should allow a modern system to be built without the legacy INTx requirements.
The intention behind requiring INTx was to ensure better compatibility. There are a few cases where we fall back to INTx if MSI is not available, e.g.:
But if such cases are considered rare and not required to be supported, then this requirement can be removed.

2) section 4.7.3.4: A discussion of No_snoop must also include Relaxed Order (RO). When a root port forwards a mix of traffic with NS=0, RO=0 and NS=1, RO=0, the requirement to enforce ordering tends to lead to ignoring the NS hint and snooping the traffic anyway.

Proposed text for 4.7.3.4:
A system is required to provide hardware managed coherence for PCIe traffic. A system may ignore the No_snoop hint bit and treat all PCIe traffic as HW coherent.  A system may choose to honor the No_snoop (NS) hint bit on incoming PCIe transactions.  In this case software must manage coherence for the memory used by the device issuing the transactions.  Such systems must also honor the relaxed order (RO) hint bit.  To take full advantage of SW coherence the device should set both NS and RO to 1.  Otherwise, when RO=0 the snooped and non-snooped traffic arriving through a host bridge must be kept ordered, eliminating the benefit of routing some traffic around the cache hierarchy to memory.
 
AFAIK, these are two separate attributes and a requester could set either or both of them. I am not sure if we can link the two together. Please correct me if I am wrong, but does it mean that, e.g., all the mem write requests which may have the NS attribute set must also have the RO attribute set?

 

3) section 4.7.3.5: Why do we exclude the possibility of a Host bridge that supports both RCiEP and Root Ports?
Actually the requirement is to support at least one of them, i.e., RCiEP (+RCEC) or Root Port, but an RC can certainly support both.


Cheers,
Michael.


Re: OS-A platform stoptime requirement

atishp@...
 



On Mon, Jan 3, 2022 at 9:43 AM Beeman Strong <beeman@...> wrote:
Back to the original topic, it seems there is broad agreement that stoptime=1 shouldn't be a requirement for the OS-A (or server) platform spec.  Is there a formal mechanism by which an issue should be filed to get that changed?

Send a patch :)


On Wed, Dec 29, 2021 at 9:44 AM Tim Newsome <tim@...> wrote:
On Tue, Dec 28, 2021 at 4:19 PM Ved Shanbhogue <ved@...> wrote:
On Tue, Dec 28, 2021 at 10:04:41AM -0800, Tim Newsome wrote:
>stoptime is nice on single-hart systems, but not really practical in
>multi-hart systems where one hart can be running while another is halted.
>The main benefit is that it allows you to single step through code with a
>timer interrupt enabled, without going into that timer interrupt every time
>you single step.

Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?
This seems like it would provide a similar benefit even for a multi-hart system...

It works fine, but it's not as nice as the system itself slowing down to your debugging speed. (Although slowing the system down will generally be imperfect in any case, because systems have other peripherals that will not stop generating interrupts/counting time/whatever.)

Tim 
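The STIE workaround discussed above can be sketched concretely. In the RISC-V privileged architecture, STIE is bit 5 of the sie CSR, so a debugger would clear it before single-stepping and restore the saved value on resume. The CSR access itself (csrrc/csrrs) happens on the target and is elided; this sketch covers only the bit arithmetic:

```c
#include <stdint.h>

/* Per the RISC-V privileged spec, STIE (supervisor timer interrupt
 * enable) is bit 5 of the sie CSR. */
#define SIE_STIE (1ul << 5)

/* Value to write back to sie with the timer interrupt masked,
 * e.g. around a debugger single step. */
static inline uint64_t sie_mask_timer(uint64_t sie)
{
    return sie & ~SIE_STIE;
}

/* Restore STIE to whatever it was before stepping, leaving the
 * other enable bits as currently set. */
static inline uint64_t sie_restore_timer(uint64_t sie, uint64_t saved)
{
    return (sie & ~SIE_STIE) | (saved & SIE_STIE);
}
```

As Tim notes, this masks the interrupt rather than stopping time, so mtime keeps advancing and other peripherals keep counting while the hart is stepped.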


Re: OS-A platform stoptime requirement

Beeman Strong
 

Back to the original topic, it seems there is broad agreement that stoptime=1 shouldn't be a requirement for the OS-A (or server) platform spec.  Is there a formal mechanism by which an issue should be filed to get that changed?


On Wed, Dec 29, 2021 at 9:44 AM Tim Newsome <tim@...> wrote:
On Tue, Dec 28, 2021 at 4:19 PM Ved Shanbhogue <ved@...> wrote:
On Tue, Dec 28, 2021 at 10:04:41AM -0800, Tim Newsome wrote:
>stoptime is nice on single-hart systems, but not really practical in
>multi-hart systems where one hart can be running while another is halted.
>The main benefit is that it allows you to single step through code with a
>timer interrupt enabled, without going into that timer interrupt every time
>you single step.

Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?
This seems like it would provide a similar benefit even for a multi-hart system...

It works fine, but it's not as nice as the system itself slowing down to your debugging speed. (Although slowing the system down will generally be imperfect in any case, because systems have other peripherals that will not stop generating interrupts/counting time/whatever.)

Tim 


OS-A PCIe Questions

Michael Klinglesmith
 

Hello,

I’m new to participating in the platform WG.  I’m working at SiFive now.  I spent the last 20 years doing PCIe compliant IO fabrics for Intel chipsets.

I have a few comments/questions about the PCIe requirements:

1) section 4.7.3.3: Why are we requiring INTx?  We should allow a modern system to be built without the legacy INTx requirements.

2) section 4.7.3.4: A discussion of No_snoop must also include Relaxed Ordering (RO). When a root port forwards a mix of traffic with NS=0/RO=0 and NS=1/RO=0, the requirement to enforce ordering tends to lead to ignoring the NS hint and snooping the traffic anyway.

Proposed text for 4.7.3.4:
A system is required to provide hardware managed coherence for PCIe traffic. A system may ignore the No_snoop hint bit and treat all PCIe traffic as HW coherent.  A system may choose to honor the No_snoop (NS) hint bit on incoming PCIe transactions.  In this case software must manage coherence for the memory used by the device issuing the transactions.  Such systems must also honor the relaxed order (RO) hint bit.  To take full advantage of SW coherence the device should set both NS and RO to 1.  Otherwise, when RO=0 the snooped and non-snooped traffic arriving through a host bridge must be kept ordered, eliminating the benefit of routing some traffic around the cache hierarchy to memory.
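The proposed policy can be restated as a small decision function. This is an illustrative sketch only (the variant names are mine); it captures the point that a transaction can bypass the coherent/ordered path only when both hints are set.

```rust
// How a host bridge could classify an inbound PCIe transaction under
// the proposed 4.7.3.4 text, based on its No_snoop (NS) and Relaxed
// Ordering (RO) attribute hints.
#[derive(Debug, PartialEq)]
enum Policy {
    // NS=0, or NS=1 with RO=0: must stay ordered with snooped traffic,
    // so in practice it gets snooped anyway.
    HwCoherentOrdered,
    // NS=1 and RO=1: may be routed around the cache hierarchy; software
    // manages coherence for the memory involved.
    SwCoherentRelaxed,
}

fn classify(ns: bool, ro: bool) -> Policy {
    if ns && ro {
        Policy::SwCoherentRelaxed
    } else {
        Policy::HwCoherentOrdered
    }
}
```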

3) section 4.7.3.5: Why do we exclude the possibility of a Host bridge that supports both RCiEP and Root Ports?


Cheers,
Michael.


Re: OS-A platform stoptime requirement

Tim Newsome
 

On Tue, Dec 28, 2021 at 4:19 PM Ved Shanbhogue <ved@...> wrote:
On Tue, Dec 28, 2021 at 10:04:41AM -0800, Tim Newsome wrote:
>stoptime is nice on single-hart systems, but not really practical in
>multi-hart systems where one hart can be running while another is halted.
>The main benefit is that it allows you to single step through code with a
>timer interrupt enabled, without going into that timer interrupt every time
>you single step.

Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?
This seems like it would provide a similar benefit even for a multi-hart system...

It works fine, but it's not as nice as the system itself slowing down to your debugging speed. (Although slowing the system down will generally be imperfect in any case, because systems have other peripherals that will not stop generating interrupts/counting time/whatever.)

Tim 


Re: OS-A platform stoptime requirement

Ved Shanbhogue
 

On Tue, Dec 28, 2021 at 10:04:41AM -0800, Tim Newsome wrote:
stoptime is nice on single-hart systems, but not really practical in
multi-hart systems where one hart can be running while another is halted.
The main benefit is that it allows you to single step through code with a
timer interrupt enabled, without going into that timer interrupt every time
you single step.
Are there downsides to the debugger inhibiting the timer interrupt by setting STIE to 0?
This seems like it would provide a similar benefit even for a multi-hart system...

regards
ved


Re: OS-A platform stoptime requirement

Tim Newsome
 

stoptime is nice on single-hart systems, but not really practical in multi-hart systems where one hart can be running while another is halted. The main benefit is that it allows you to single step through code with a timer interrupt enabled, without going into that timer interrupt every time you single step.

Tim

On Tue, Dec 21, 2021 at 9:57 AM Greg Favor <gfavor@...> wrote:
I'm cc'ing Tim Newsome and Paul Donahue (chairs of the Debug TG).  

Tim or Paul can comment on the debug value in sometimes being able to stop the local hart time CSR from advancing while in Debug mode (using dcsr.stoptime).

Also, Paul was involved with distilling out of the enormous amount of optionality in the Debug spec, what would be suitable to require in OS-A platforms.  So he can comment about the following debug-related OS-A platform requirement, and in particular the stoptime requirement:
       dcsr.stopcount and dcsr.stoptime must be supported and the reset value of each must be 1

Greg

On Mon, Dec 20, 2021 at 3:19 PM Beeman Strong <beeman@...> wrote:
Thanks, I definitely misunderstood the intent.  So the expectation is that, in Debug Mode, reads to mtime will see time continue to progress, but reads to the time CSR will see a frozen value.  Reads of the time CSR by software running outside debug mode should not be impacted, and will see a value synchronized with mtime.

I suppose I can imagine usages where keeping the time CSR frozen has value to a debugger, but it does add complexity and latency in requiring a resync with mtime on debug mode exit.  Does the value really rise to the level of being a platform requirement?  Is there some important debug functionality that breaks if we keep it simple and let the time CSR keep running in debug mode?

On Mon, Dec 20, 2021 at 2:05 PM Andrew Waterman <andrew@...> wrote:


On Mon, Dec 20, 2021 at 3:42 PM Greg Favor <gfavor@...> wrote:
I think there's a little bit of confusion going on.  The 'stoptime' bit is defined as "Don’t increment any hart-local timers while in Debug Mode."  I take this to clearly not be referring to MTIME, but to the local time CSR.

I fully agree that expecting a debug action on a core to reach out to wherever in the system MTIME may be is inappropriate. It would also affect other still-active harts, which is probably even more inappropriate (i.e. debugging just one hart shouldn't inherently affect the operation of all harts).

Oops, it has been a while since I've read this spec.  I withdraw my comment, if it's indeed the case that shared implementations of mtime need not be affected by stoptime.


Whereas stopping the local time CSR for the duration of being in Debug mode would be easy to implement, i.e. in_debug_mode inhibits the time CSR from advancing.  Presumably, once the hart exits Debug mode, the time CSR effectively immediately catches back up with the current time value that has been broadcast to it from MTIME.
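The behavior described here can be modeled in a few lines. This is a toy host-side model (not an implementation sketch of real hardware): while in Debug mode the hart-local time CSR returns a frozen value, and on exit it immediately reflects the broadcast MTIME again.

```rust
// Toy model of a hart-local time CSR with dcsr.stoptime behavior:
// frozen while in Debug mode, instantly caught up on Debug-mode exit.
struct HartTime {
    frozen_at: Option<u64>, // Some(value) while the hart is halted
}

impl HartTime {
    // What a read of the time CSR returns, given the current MTIME.
    fn read(&self, mtime: u64) -> u64 {
        self.frozen_at.unwrap_or(mtime)
    }
    // Entering Debug mode latches the current time value.
    fn enter_debug(&mut self, mtime: u64) {
        self.frozen_at = Some(mtime);
    }
    // Exiting Debug mode resumes tracking MTIME (the "catch back up").
    fn exit_debug(&mut self) {
        self.frozen_at = None;
    }
}
```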

Greg


On Mon, Dec 20, 2021 at 1:19 PM Andrew Waterman <andrew@...> wrote:


On Mon, Dec 20, 2021 at 12:11 PM Beeman Strong <beeman@...> wrote:
Hi there,

In the OS-A platform spec I see the following requirement:

• dcsr.stopcount and dcsr.stoptime must be supported and the reset value of each must be 1
◦ Rationale: The architecture has strict requirements on minstret which may be perturbed by an external debugger in a way that’s visible to software. The default should allow code that’s sensitive to these requirements to be debugged.

The rationale justifies the requirement for stopcount=1, but I don't see any rationale for stoptime=1.

The debug spec refers to stoptime=1 stopping "timers", which I interpret to mean the mtime counter.  This timer is expected to be synchronized across harts in a system ("The real-time clocks of all harts in a single user application should be synchronized to within one tick of the real-time clock.")  In a system with multiple harts, where a subset of harts may be halted at a given time, this stoptime=1 requirement risks violating this ISA requirement and confusing software by causing wall-clock time to get out of sync.

Can we remove "and dcsr.stoptime" from this platform requirement?

FWIW, although I appreciate the motivation behind this requirement, I also support removing it.  For the case that mtime is centrally implemented, this requirement is quite onerous to implement.  For the case that mtime is decentralized, this requirement is easy to satisfy, but is differently problematic, as the spec mentions ("risks violating this ISA requirement").  I dislike disadvantaging the centralized-mtime implementations for a feature we've already admitted is problematic at the ISA level.
 

thanks,
beeman


Re: Platform specification questions

Kumar Sankaran
 

Hi Ved,
Are we ready to finalize the changes after all the comments and
discussions on the list of questions you had on this thread? If yes,
can you send a PR for these changes please?
I see the PCIe INTx question is still open as per your last comment
below. If you wish, we can keep the PCIe INTx question open while we
resolve all the others.

Regards
Kumar

On Thu, Dec 16, 2021 at 1:23 PM Vedvyas Shanbhogue <ved@...> wrote:

Hi Greg -

On Tue, Dec 14, 2021 at 05:32:08PM -0800, Greg Favor wrote:
The following two items in Ved's email didn't get any response, so I offer
my own below ...

On Sun, Dec 12, 2021 at 4:15 PM Vedvyas Shanbhogue <ved@...> wrote:

Section 2.3.7.3.2 - PCIe memory space:
The requirement to not have any address translation for inbound accesses
to any component in system address space is restrictive. If direct
assignment of devices is supported then the IOMMU would be required to do
the address translation for inbound accesses. Further for hart originated
accesses where the PCIe memory is mapped into virtual address space there
needs to be a translation through the first and/or second level page
tables. Please help clarify why PCIe memory must not be mapped into
virtual address space and why use of IOMMU to do translation is disallowed
by the specification.
I think where this came from is learnings in the ARM "server" ecosystem (as
then got captured in SBSA). In particular, one wants devices and software
on harts to have the same view of system physical address space so that,
for example, pointers can be easily passed around. Which doesn't conflict
with having address translation by IOMMUs. Maybe the current text needs to
be better worded, but I think the ideas to be expressed are:

For inbound PCIe transactions:

- There should be no hardware modifications of PCIe addresses outside of an
IOMMU (as some vendors way back in early ARM SBSA days were wont to do).

- If there is not an IOMMU associated with the PCIe interface, then PCIe
devices will have the same view of PA space as the harts.

- If there is an IOMMU associated with the PCIe interface, then system
software can trust that all address modifications are under its control via
hart page tables and IOMMU page tables.

For outbound PCIe transactions, system software is free to set up VA-to-PA
translations in hart page tables. I think the mandate against outbound
address translation was unintentional. The key point is that there
is one common view of system physical address space. Hart and IOMMU page
tables may translate from hart VA's and device addresses to system physical
address space, but the above ensures that "standard" system software has
full control over this and doesn't have non-standard address
transformations happening that it isn't aware of and doesn't know how to
control.
Thanks. I think this is very clear.



Section 2.3.7.3.3 - PCIe interrupts:
It seems unnecessary to require platforms built for the '22 version of the
platform to have to support running software that is not MSI aware. Please
clarify why supporting the INTx emulation for legacy/pre-PCIe software
compatibility is a required and not an optional capability for RISC-V
platforms?

This one seems questionable to me as well, although I'm not the expert to
reliably proclaim that INTx support is no longer a necessity in some
server-class systems. I can imagine that back in earlier ARM "server" days
this legacy issue was a bigger deal and hence was mandated in SBSA. But
maybe that is no longer an issue? Or at least for 2022+ systems - to the
point where mandating this legacy support is an unnecessary burden on many
or the majority of such systems.

If this is a fair view going forward, then the INTx requirements should
just become recommendations for systems that do feel the need to care about
INTx support.
I think the recommendation could be changed to require MSI and make supporting INTx emulation optional. I am hoping to hear from BIOS and OS experts whether we would need to support OS/BIOS that are '22 platform compatible but are not MSI capable.

regards
ved




--
Regards
Kumar


Re: SBI: We can fast handle some SBI functions for extreme performance in assembly code implementation if SBI extension‘s FID equals to zero

Anup Patel
 

On Wed, Dec 22, 2021 at 9:55 AM 洛佳 Luo Jia <me@...> wrote:

[Edited Message Follows]
[Reason: Typo fix]

RISC-V SBI provides platform-agnostic functions for kernels. The normal handling procedure in an SBI implementation would include a context switch and a call into higher-level language code, e.g. Rust or C. However, when an SBI function has FID=0 (no matter what the extension EID is), we can provide such SBI functions without a context switch. The assembly code would read as follows:

#[naked]
unsafe fn trap_begin() {
asm!(
"bnez a6, 1f", // set_timer EID(a7): 0x54494D45, FID(a6): 0
"li a6, 0x54494D45",
"bne a6, a7, 1f",
"csrr a6, mcause",
"li a7, 9",
"bne a6, a7, 1f", // if mcause != supervisor ecall, jump to conventional way of handling
"li a6, 0x200bff8", // CLINT mtimecmp address (if device tree match this address, use trap_begin as mtvec, otherwise don't use it then it would be performance loss only but still correct)
This is platform specific address and it will break for platforms
having MTIME register at some other address. Such optimizations,
increasingly make a SBI implementation platform specific.

"sd a0, 0(a6)", // a0: stime_value
This is another place where this code is broken because it always
program's MTIMECMP register at offset 0x0.

On SMP systems, the mapping of MTIMECMP registers to HART could be
totally arbitrary. In fact, this mapping will not be related to
mhartid for systems with sparse hartids. The only source of truth for
MTIMECMP to HART mapping is the ACLINT/CLINT DT node.
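The mapping rule above reduces to simple arithmetic once the index is known. The sketch below is illustrative only: `dt_index` is the hart's position in the ACLINT MTIMER device's DT interrupt mapping, which is deliberately not derived from mhartid (sparse hartids make the two differ).

```rust
// Per-hart MTIMECMP register address for an ACLINT MTIMER device:
// each hart's 8-byte MTIMECMP slot follows the device base in the
// order given by the device-tree node, NOT in mhartid order.
fn mtimecmp_addr(mtimecmp_base: u64, dt_index: u64) -> u64 {
    mtimecmp_base + 8 * dt_index
}
```

Hard-coding offset 0 (as the quoted code does) is only correct for the one hart that happens to be first in that mapping.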

"mret", // return to supervisor without context restore
"1:",
"csrrw sp, mscratch, sp",
"sd ra, 0(sp)",
// ...context save...
"call {rust_trap}",
"ld ra, 0(sp)",
// ...context restore...
"csrrw sp, mscratch, sp",
"mret"
}
}

The core idea of this assembly code is that the entry condition for a given SBI function (here, set_timer) can be expressed as: mcause == 9 && EID == 0x54494D45 && FID == 0. Comparing a register against a non-zero constant requires another register to hold the constant, but comparing against zero does not, because RISC-V provides the `zero` register; `bnez` therefore needs no auxiliary register. Since `&&` is commutative, the order of the equality checks can be swapped: usually `mcause` would be compared first, but checking FID first is equally correct. After FID == 0 is established, EID and then mcause can be compared, reusing `a6` (which originally holds the FID) as a temporary register.

In this way such SBI calls are accelerated: only a few assembly instructions run, and no context switch or call into a higher-level language is required. However, this 'irregular' path only works when a comparison against zero suffices; in SBI that means FID == 0 (EID == 0 denotes a legacy extension).
Well, an SBI implementation can become more-n-more platform specific
and do more stuff in assembly but the platform vendor will end-up
maintaining their platform specific SBI implementation.

Instead of creating such platform specific SBI implementation for
optimizing SBI timer calls, platforms can instead go for Priv Sstc
extension using which SBI Timer calls can be avoided. Same thing can
be done by platforms to avoid SBI IPI calls using ACLINT SSWI or AIA
IMSIC devices.
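The contrast can be sketched abstractly. This is an illustrative model only: the real stimecmp CSR write and the real ecall are elided behind an enum so the logic stays host-testable; the constants follow the SBI TIME extension (EID 0x54494D45, FID 0 for set_timer).

```rust
// The two ways an S-mode kernel can arm a timer, per the point above:
// with the Sstc extension it writes stimecmp directly, with no trap
// into M-mode at all; without it, it must make an SBI set_timer ecall.
#[derive(Debug, PartialEq)]
enum TimerOp {
    Ecall { eid: u64, fid: u64, arg: u64 },   // SBI set_timer path
    CsrWrite { csr: &'static str, val: u64 }, // Sstc path
}

fn arm_timer(deadline: u64, has_sstc: bool) -> TimerOp {
    if has_sstc {
        TimerOp::CsrWrite { csr: "stimecmp", val: deadline }
    } else {
        TimerOp::Ecall { eid: 0x54494D45, fid: 0, arg: deadline }
    }
}
```

With Sstc there is nothing left to fast-path in the SBI implementation, which is Anup's argument for preferring the ISA extension over a platform-specific SBI shortcut.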


In future SBI extensions and vendor-defined extensions, it might be better to suggest that any function that requires extreme performance or is called frequently be assigned an FID of zero. The current SBI 1.0-rc extensions already define most performance-critical functions (FENCE.I IPI, set_timer, etc.) with FID == 0. If possible, we could set a rule or formal advice in the SBI standard that performance-critical functions in extensions should be defined with FID == 0; or, if an SBI extension includes only one function, that function's FID should be 0 to help implementations improve SBI call performance.
If an SBI extension is in hot-path for OSes then it's functionality
will be eventually replaced by some ISA or non-ISA specification. For
example, the SBI IPI and Timer extension functionality is already
replaced by AIA, ACLINT, and Priv Sstc specifications.

We only define a SBI extension for a functionality when there is no
other way left and corresponding ISA or non-ISA specifications are not
desired/possible in near future (1-2 years). Further, defining a new
SBI extension also requires a detailed proof-of-concept implementation
(such as QEMU, Linux, OpenSBI, etc) which allows us to further refine
the SBI extension based on real-world experience.

Regards,
Anup


SBI: We can fast handle some SBI functions for extreme performance in assembly code implementation if SBI extension‘s FID equals to zero

洛佳 Luo Jia
 
Edited

RISC-V SBI provides platform-agnostic functions for kernels. The normal handling procedure in an SBI implementation would include a context switch and a call into higher-level language code, e.g. Rust or C. However, when an SBI function has FID=0 (no matter what the extension EID is), we can provide such SBI functions without a context switch. The assembly code would read as follows:
#[naked]
unsafe fn trap_begin() {
    asm!(
        "bnez a6, 1f", // set_timer EID(a7): 0x54494D45, FID(a6): 0
        "li   a6, 0x54494D45",
        "bne  a6, a7, 1f",
        "csrr a6, mcause",
        "li   a7, 9",
        "bne  a6, a7, 1f", // if mcause != supervisor ecall, jump to conventional way of handling
        "li   a6, 0x200bff8", // CLINT mtimecmp address (if device tree match this address, use trap_begin as mtvec, otherwise don't use it then it would be performance loss only but still correct)
        "sd   a0, 0(a6)", // a0: stime_value
        "mret", // return to supervisor without context restore
        "1:", 
"csrrw sp, mscratch, sp",
"sd ra, 0(sp)", // ...context save... "call {rust_trap}",
"ld ra, 0(sp)", // ...context restore...
"csrrw sp, mscratch, sp", "mret" } }
The core idea of this assembly code is that the entry condition for a given SBI function (here, set_timer) can be expressed as: mcause == 9 && EID == 0x54494D45 && FID == 0. Comparing a register against a non-zero constant requires another register to hold the constant, but comparing against zero does not, because RISC-V provides the `zero` register; `bnez` therefore needs no auxiliary register. Since `&&` is commutative, the order of the equality checks can be swapped: usually `mcause` would be compared first, but checking FID first is equally correct. After FID == 0 is established, EID and then mcause can be compared, reusing `a6` (which originally holds the FID) as a temporary register.

In this way such SBI calls are accelerated: only a few assembly instructions run, and no context switch or call into a higher-level language is required. However, this 'irregular' path only works when a comparison against zero suffices; in SBI that means FID == 0 (EID == 0 denotes a legacy extension).

In future SBI extensions and vendor-defined extensions, it might be better to suggest that any function that requires extreme performance or is called frequently be assigned an FID of zero. The current SBI 1.0-rc extensions already define most performance-critical functions (FENCE.I IPI, set_timer, etc.) with FID == 0. If possible, we could set a rule or formal advice in the SBI standard that performance-critical functions in extensions should be defined with FID == 0; or, if an SBI extension includes only one function, that function's FID should be 0 to help implementations improve SBI call performance.
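The fast-path entry condition can be restated in plain Rust. This is an illustrative model of the assembly's checks, not additional SBI behavior: the trap takes the fast path only when all three equalities hold, and the FID check comes first because `bnez` needs no scratch register.

```rust
// mcause value for an environment call from S-mode.
const MCAUSE_ECALL_S: u64 = 9;
// SBI TIME extension ID ("TIME" in ASCII).
const EID_TIME: u64 = 0x54494D45;

// Model of the fast-path condition: mcause == 9 && a7 == EID_TIME
// && a6 == 0. Checked FID-first, mirroring the assembly's ordering.
fn takes_fast_path(mcause: u64, a7_eid: u64, a6_fid: u64) -> bool {
    a6_fid == 0 && a7_eid == EID_TIME && mcause == MCAUSE_ECALL_S
}
```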


Re: OS-A platform stoptime requirement

Paul Donahue
 

I agree that the stoptime=1 requirement should be removed.  I can think of a case where stoptime might be useful but it would be a contrived example that's not useful in the real world.

I should have proposed removing this from the platform spec when I made this comment about discouraging stoptime=1:


Thanks,

-Paul


On Tue, Dec 21, 2021 at 9:59 AM Greg Favor <gfavor@...> wrote:
On Tue, Dec 21, 2021 at 9:18 AM Vedvyas Shanbhogue <ved@...> wrote:
So there is real value to stopping time for debug and expectation is
that there will be a "synchronization"/"catch back up" action on MRET
from debug mode?

Btw, that would be a DRET to exit Debug mode (not an MRET).

Greg
 
