|
Re: VFRECIP/VFRSQRT instructions
Now annotated version --detail
https://github.com/David-Horner/recip/blob/master/vrecip.cc
For the 7x7 below notice the biased value does not exceed 21 for recip (5 of 7 bits) and 15
|
By
David Horner
·
#307
·
|
|
Re: VFRECIP/VFRSQRT instructions
I should have said that my results are for the 7/7 case. And it sounds like we're in agreement then. We probably have the same table.
Bill
On 8/2/20 9:50 AM, DSHORNER wrote:
|
By
Bill Huffman
·
#306
·
|
|
Re: VFRECIP/VFRSQRT instructions
This is the link to the revised code that does n by m LUT
https://github.com/David-Horner/recip/blob/master/vrecip.cc
On 2020-08-01 4:51 p.m., David Horner via
|
By
David Horner
·
#305
·
|
|
Re: VFRECIP/VFRSQRT instructions
What I initially posted was a compilation from days previously, and I pulled in some bogus test results.
Here is a fresh run:
./a.out 7 5 ;./a.out 7 6 ;./a.out 7 7 ;./a.out 7 8
|
By
David Horner
·
#304
·
|
|
Re: VFRECIP/VFRSQRT instructions
David,
Here are a series of statements leading to my worst case answer:
For the mantissa range 0xF5_0000 to 0xF5_FFFF, the reciprocal estimate is 0x85_0000
The largest error is for 0xF5_0000
The
|
By
Bill Huffman
·
#303
·
|
|
Re: VFRECIP/VFRSQRT instructions
This is the program Andrew wrote.
https://github.com/riscv/riscv-v-spec/blob/vfrecip/recip.cc
On 2020-07-31 4:46 p.m., Bill Huffman wrote:
That is correct,
|
By
David Horner
·
#302
·
|
|
Re: VFRECIP/VFRSQRT instructions
David,
Because of the errors you get, I'm assuming your "output width" and "input width" do not include the hidden bit. Right?
It's interesting. I did a similar exercise a number of years ago and
|
By
Bill Huffman
·
#301
·
|
|
Re: VFRECIP/VFRSQRT instructions
The error is relative error.
The calculation is unchanged from Andrew's original. (Although I explicitly force double even when it shouldn't matter).
The test range is from 0.5 to 1 inclusive.
Again I
|
By
David Horner
·
#300
·
|
|
Re: VFRECIP/VFRSQRT instructions
David,
Are the max errors absolute? Or relative to the recip or rsqrt, which
is presumably in the range (1.0, 2.0]?
That you use [0.5, 1] when you might have meant [0.5, 1) leaves some
question
|
By
Bill Huffman
·
#299
·
|
|
Re: VFRECIP/VFRSQRT instructions
The current LUT generator assumes N-by-N look up table.
I will load in my github Andrew's program modified to take input (index size) and output (estimate number of bits) arguments.
(--verilog
|
By
David Horner
·
#298
·
|
|
Vector TG Minutes for 2020/7/24 meeting
Date: 2020/7/24
Task Group: Vector Extension
Chair: Krste Asanovic
Co-Chair: Roger Espasa
Number of Attendees: ~18
Current issues on github: https://github.com/riscv/riscv-v-spec
Issues
|
By
Krste Asanovic
·
#297
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
To be clear, this is SEW/LMUL ratio change, correct?
All other values being valid and the "SEW and LMUL" combination itself being valid.
Providing an invalid SEW and LMUL combination will set vill
|
By
David Horner
·
#296
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
This is in the x0,x0 case? I see this as the only case that needs to be considered.
The EE does not have to both set vill and establish a saved vl value in the same instruction.
A sequence of vsetvl
|
By
David Horner
·
#295
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
| On Wed, Jul 22, 2020 at 11:42 PM <krste@...> wrote:
|||||| On Wed, 22 Jul 2020 23:37:02 -0700, Andrew Waterman <andrew@...> said:
| | On Wed, Jul 22, 2020 at 11:19 PM David
|
By
Krste Asanovic
·
#294
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
Not a hill for me to die on, but I believe vsetvli x0,x0 is sufficiently important that even this aspect should be fully vetted.
Other vsetvl[i] instructions are essentially different beasts than this
|
By
David Horner
·
#293
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
On 2020-07-23 6:27 a.m., Andrew Waterman wrote:
In my response to your prior post I stated that #1 and #3 are needed to guarantee vl invariance in speculative cases.
I agree
|
By
David Horner
·
#292
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
See the rest of the thread for more context.
On 2020-07-23 2:37 a.m., Andrew Waterman wrote:
To clarify for the list:
The RAW (Read after Write) hazard already exists
|
By
David Horner
·
#291
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
Disagreed. It’s fine to treat vsetvl instructions that set vill as pipeline flushes. Uarch can therefore assume vl isn’t changed.
|
By
andrew@...
·
#290
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
| On Wed, Jul 22, 2020 at 11:19 PM David Horner <ds2horner@...> wrote:
| #3) If vill is set should vl remain unchanged? (I vote for yes).
| Other vsetvl[i] instructions that set vill=1
|
By
Krste Asanovic
·
#289
·
|
|
Re: Issue #365 vsetvl{i} x0, x0 instruction forms
Other vsetvl[i] instructions that set vill=1 also set vl=0. Deviating from that course would be needlessly painful and not especially beneficial.
It's not just about the cost of the comparators; it's
|
By
andrew@...
·
#288
·
|