Re: VFRECIP/VFRSQRT instructions

Bill Huffman

The recip table matches mine, as does the worst-case error.

I have one different entry in the square root table: for entry 77, where you have 36, I have 37. I'm not sure whether it matters. Also, ages ago, I got a slightly different worst-case error of 2^-7.317, but I haven't gone back to trace that down.

Bill

On 8/3/20 11:38 AM, DSHORNER wrote:

An annotated version, with more detail, is now available:
https://github.com/David-Horner/recip/blob/master/vrecip.cc

In the 7x7 tables below, note that the biased value never exceeds 21 for recip (so it fits in 5 of the 7 bits) or 15 for rsqrt (4 of the 7 bits).
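
A minimal sketch of how the "biased" column in the listings appears to be derived. The formula biased = (127 - in) - out is an assumption inferred from the printed rows, not taken from vrecip.cc, and the helper name `biased` is hypothetical. The (in, out) pairs are copied from the two listings.

```python
# Sketch: recover the "biased" column from a few (in, out) rows.
# Assumption: biased = (127 - in) - out, inferred from the printed listings.

def biased(index: int, out: int, base_bias: int = 127) -> int:
    """Distance of a table output below the straight line base_bias - index."""
    return (base_bias - index) - out

# (in, out) pairs copied from the Recip7x7LUT listing.
recip_rows = [(0, 127), (5, 117), (40, 66), (66, 40), (127, 0)]
# (in, out) pairs copied from the RSqrt7x7LUT listing.
rsqrt_rows = [(0, 127), (32, 80), (64, 52), (77, 36), (127, 0)]

recip_biased = [biased(i, o) for i, o in recip_rows]
rsqrt_biased = [biased(i, o) for i, o in rsqrt_rows]

# bit_length() gives the number of bits needed for the largest sampled value.
print(recip_biased, max(recip_biased).bit_length())
print(rsqrt_biased, max(rsqrt_biased).bit_length())
```

For these sampled rows the largest biased values are 21 and 15, which need 5 and 4 bits respectively.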

ip 7 op 7 LUT #bits 896 verilog 0  test/test-long 1
Recip7x7LUT (input [6:0] in, output reg [6:0] out);
in[6:0]  corresponds to sig[S-1:S-6]
out[6:0] corresponds to sig[S-1:S-6]
biased : ((ipN-1) - in) << (op - ip) // or >> if neg
base bias 127  left-shift 0 right-shift 0
0: out = 127 biased 0; lerr 0.00390625 rerr 0.00387573 larg 0.5 rarg 0.503906
1: out = 125 biased 1; lerr 0.0039978 rerr 0.00372314 larg 0.503906 rarg 0.507812
2: out = 123 biased 2; lerr 0.00421143 rerr 0.00344849 larg 0.507812 rarg 0.511719
3: out = 121 biased 3; lerr 0.00454712 rerr 0.00305176 larg 0.511719 rarg 0.515625
4: out = 119 biased 4; lerr 0.00500488 rerr 0.00253296 larg 0.515625 rarg 0.519531
5: out = 117 biased 5; lerr 0.00558472 rerr 0.00189209 larg 0.519531 rarg 0.523438
6: out = 116 biased 5; lerr 0.00219727 rerr 0.00524902 larg 0.523438 rarg 0.527344
7: out = 114 biased 6; lerr 0.00299072 rerr 0.00439453 larg 0.527344 rarg 0.53125
8: out = 112 biased 7; lerr 0.00390625 rerr 0.00341797 larg 0.53125 rarg 0.535156
9: out = 110 biased 8; lerr 0.00494385 rerr 0.00231934 larg 0.535156 rarg 0.539062
10: out = 109 biased 8; lerr 0.00189209 rerr 0.00534058 larg 0.539062 rarg 0.542969
11: out = 107 biased 9; lerr 0.00314331 rerr 0.00402832 larg 0.542969 rarg 0.546875
12: out = 105 biased 10; lerr 0.0045166 rerr 0.00259399 larg 0.546875 rarg 0.550781
13: out = 104 biased 10; lerr 0.00170898 rerr 0.00537109 larg 0.550781 rarg 0.554688
14: out = 102 biased 11; lerr 0.0032959 rerr 0.00372314 larg 0.554688 rarg 0.558594
15: out = 100 biased 12; lerr 0.00500488 rerr 0.00195312 larg 0.558594 rarg 0.5625
16: out = 99 biased 12; lerr 0.00244141 rerr 0.00448608 larg 0.5625 rarg 0.566406
17: out = 97 biased 13; lerr 0.00436401 rerr 0.00250244 larg 0.566406 rarg 0.570312
18: out = 96 biased 13; lerr 0.00195312 rerr 0.00488281 larg 0.570312 rarg 0.574219
19: out = 94 biased 14; lerr 0.00408936 rerr 0.00268555 larg 0.574219 rarg 0.578125
20: out = 93 biased 14; lerr 0.00183105 rerr 0.00491333 larg 0.578125 rarg 0.582031
21: out = 91 biased 15; lerr 0.00418091 rerr 0.00250244 larg 0.582031 rarg 0.585938
22: out = 90 biased 15; lerr 0.0020752 rerr 0.00457764 larg 0.585938 rarg 0.589844
23: out = 88 biased 16; lerr 0.00463867 rerr 0.00195312 larg 0.589844 rarg 0.59375
24: out = 87 biased 16; lerr 0.00268555 rerr 0.00387573 larg 0.59375 rarg 0.597656
25: out = 85 biased 17; lerr 0.00546265 rerr 0.0010376 larg 0.597656 rarg 0.601562
26: out = 84 biased 17; lerr 0.00366211 rerr 0.00280762 larg 0.601562 rarg 0.605469
27: out = 83 biased 17; lerr 0.00192261 rerr 0.0045166 larg 0.605469 rarg 0.609375
28: out = 81 biased 18; lerr 0.00500488 rerr 0.00137329 larg 0.609375 rarg 0.613281
29: out = 80 biased 18; lerr 0.00341797 rerr 0.00292969 larg 0.613281 rarg 0.617188
30: out = 79 biased 18; lerr 0.00189209 rerr 0.00442505 larg 0.617188 rarg 0.621094
31: out = 77 biased 19; lerr 0.00527954 rerr 0.000976562 larg 0.621094 rarg 0.625
32: out = 76 biased 19; lerr 0.00390625 rerr 0.00231934 larg 0.625 rarg 0.628906
33: out = 75 biased 19; lerr 0.00259399 rerr 0.00360107 larg 0.628906 rarg 0.632812
34: out = 74 biased 19; lerr 0.00134277 rerr 0.00482178 larg 0.632812 rarg 0.636719
35: out = 72 biased 20; lerr 0.00512695 rerr 0.000976562 larg 0.636719 rarg 0.640625
36: out = 71 biased 20; lerr 0.00402832 rerr 0.00204468 larg 0.640625 rarg 0.644531
37: out = 70 biased 20; lerr 0.00299072 rerr 0.00305176 larg 0.644531 rarg 0.648438
38: out = 69 biased 20; lerr 0.00201416 rerr 0.0039978 larg 0.648438 rarg 0.652344
39: out = 68 biased 20; lerr 0.00109863 rerr 0.00488281 larg 0.652344 rarg 0.65625
40: out = 66 biased 21; lerr 0.00537109 rerr 0.000549316 larg 0.65625 rarg 0.660156
41: out = 65 biased 21; lerr 0.00460815 rerr 0.00128174 larg 0.660156 rarg 0.664062
42: out = 64 biased 21; lerr 0.00390625 rerr 0.00195312 larg 0.664062 rarg 0.667969
43: out = 63 biased 21; lerr 0.00326538 rerr 0.00256348 larg 0.667969 rarg 0.671875
44: out = 62 biased 21; lerr 0.00268555 rerr 0.00311279 larg 0.671875 rarg 0.675781
45: out = 61 biased 21; lerr 0.00216675 rerr 0.00360107 larg 0.675781 rarg 0.679688
46: out = 60 biased 21; lerr 0.00170898 rerr 0.00402832 larg 0.679688 rarg 0.683594
47: out = 59 biased 21; lerr 0.00131226 rerr 0.00439453 larg 0.683594 rarg 0.6875
48: out = 58 biased 21; lerr 0.000976562 rerr 0.00469971 larg 0.6875 rarg 0.691406
49: out = 57 biased 21; lerr 0.000701904 rerr 0.00494385 larg 0.691406 rarg 0.695312
50: out = 56 biased 21; lerr 0.000488281 rerr 0.00512695 larg 0.695312 rarg 0.699219
51: out = 55 biased 21; lerr 0.000335693 rerr 0.00524902 larg 0.699219 rarg 0.703125
52: out = 54 biased 21; lerr 0.000244141 rerr 0.00531006 larg 0.703125 rarg 0.707031
53: out = 53 biased 21; lerr 0.000213623 rerr 0.00531006 larg 0.707031 rarg 0.710938
54: out = 52 biased 21; lerr 0.000244141 rerr 0.00524902 larg 0.710938 rarg 0.714844
55: out = 51 biased 21; lerr 0.000335693 rerr 0.00512695 larg 0.714844 rarg 0.71875
56: out = 50 biased 21; lerr 0.000488281 rerr 0.00494385 larg 0.71875 rarg 0.722656
57: out = 49 biased 21; lerr 0.000701904 rerr 0.00469971 larg 0.722656 rarg 0.726562
58: out = 48 biased 21; lerr 0.000976562 rerr 0.00439453 larg 0.726562 rarg 0.730469
59: out = 47 biased 21; lerr 0.00131226 rerr 0.00402832 larg 0.730469 rarg 0.734375
60: out = 46 biased 21; lerr 0.00170898 rerr 0.00360107 larg 0.734375 rarg 0.738281
61: out = 45 biased 21; lerr 0.00216675 rerr 0.00311279 larg 0.738281 rarg 0.742188
62: out = 44 biased 21; lerr 0.00268555 rerr 0.00256348 larg 0.742188 rarg 0.746094
63: out = 43 biased 21; lerr 0.00326538 rerr 0.00195312 larg 0.746094 rarg 0.75
64: out = 42 biased 21; lerr 0.00390625 rerr 0.00128174 larg 0.75 rarg 0.753906
65: out = 41 biased 21; lerr 0.00460815 rerr 0.000549316 larg 0.753906 rarg 0.757812
66: out = 40 biased 21; lerr 0.00537109 rerr 0.000244141 larg 0.757812 rarg 0.761719
67: out = 40 biased 20; lerr 0.000244141 rerr 0.00488281 larg 0.761719 rarg 0.765625
68: out = 39 biased 20; lerr 0.00109863 rerr 0.0039978 larg 0.765625 rarg 0.769531
69: out = 38 biased 20; lerr 0.00201416 rerr 0.00305176 larg 0.769531 rarg 0.773438
70: out = 37 biased 20; lerr 0.00299072 rerr 0.00204468 larg 0.773438 rarg 0.777344
71: out = 36 biased 20; lerr 0.00402832 rerr 0.000976562 larg 0.777344 rarg 0.78125
72: out = 35 biased 20; lerr 0.00512695 rerr 0.000152588 larg 0.78125 rarg 0.785156
73: out = 35 biased 19; lerr 0.000152588 rerr 0.00482178 larg 0.785156 rarg 0.789062
74: out = 34 biased 19; lerr 0.00134277 rerr 0.00360107 larg 0.789062 rarg 0.792969
75: out = 33 biased 19; lerr 0.00259399 rerr 0.00231934 larg 0.792969 rarg 0.796875
76: out = 32 biased 19; lerr 0.00390625 rerr 0.000976562 larg 0.796875 rarg 0.800781
77: out = 31 biased 19; lerr 0.00527954 rerr 0.000427246 larg 0.800781 rarg 0.804688
78: out = 31 biased 18; lerr 0.000427246 rerr 0.00442505 larg 0.804688 rarg 0.808594
79: out = 30 biased 18; lerr 0.00189209 rerr 0.00292969 larg 0.808594 rarg 0.8125
80: out = 29 biased 18; lerr 0.00341797 rerr 0.00137329 larg 0.8125 rarg 0.816406
81: out = 28 biased 18; lerr 0.00500488 rerr 0.000244141 larg 0.816406 rarg 0.820312
82: out = 28 biased 17; lerr 0.000244141 rerr 0.0045166 larg 0.820312 rarg 0.824219
83: out = 27 biased 17; lerr 0.00192261 rerr 0.00280762 larg 0.824219 rarg 0.828125
84: out = 26 biased 17; lerr 0.00366211 rerr 0.0010376 larg 0.828125 rarg 0.832031
85: out = 25 biased 17; lerr 0.00546265 rerr 0.000793457 larg 0.832031 rarg 0.835938
86: out = 25 biased 16; lerr 0.000793457 rerr 0.00387573 larg 0.835938 rarg 0.839844
87: out = 24 biased 16; lerr 0.00268555 rerr 0.00195312 larg 0.839844 rarg 0.84375
88: out = 23 biased 16; lerr 0.00463867 rerr 3.05176E-05 larg 0.84375 rarg 0.847656
89: out = 23 biased 15; lerr 3.05176E-05 rerr 0.00457764 larg 0.847656 rarg 0.851562
90: out = 22 biased 15; lerr 0.0020752 rerr 0.00250244 larg 0.851562 rarg 0.855469
91: out = 21 biased 15; lerr 0.00418091 rerr 0.000366211 larg 0.855469 rarg 0.859375
92: out = 21 biased 14; lerr 0.000366211 rerr 0.00491333 larg 0.859375 rarg 0.863281
93: out = 20 biased 14; lerr 0.00183105 rerr 0.00268555 larg 0.863281 rarg 0.867188
94: out = 19 biased 14; lerr 0.00408936 rerr 0.000396729 larg 0.867188 rarg 0.871094
95: out = 19 biased 13; lerr 0.000396729 rerr 0.00488281 larg 0.871094 rarg 0.875
96: out = 18 biased 13; lerr 0.00195312 rerr 0.00250244 larg 0.875 rarg 0.878906
97: out = 17 biased 13; lerr 0.00436401 rerr 6.10352E-05 larg 0.878906 rarg 0.882812
98: out = 17 biased 12; lerr 6.10352E-05 rerr 0.00448608 larg 0.882812 rarg 0.886719
99: out = 16 biased 12; lerr 0.00244141 rerr 0.00195312 larg 0.886719 rarg 0.890625
100: out = 15 biased 12; lerr 0.00500488 rerr 0.000640869 larg 0.890625 rarg 0.894531
101: out = 15 biased 11; lerr 0.000640869 rerr 0.00372314 larg 0.894531 rarg 0.898438
102: out = 14 biased 11; lerr 0.0032959 rerr 0.0010376 larg 0.898438 rarg 0.902344
103: out = 14 biased 10; lerr 0.0010376 rerr 0.00537109 larg 0.902344 rarg 0.90625
104: out = 13 biased 10; lerr 0.00170898 rerr 0.00259399 larg 0.90625 rarg 0.910156
105: out = 12 biased 10; lerr 0.0045166 rerr 0.000244141 larg 0.910156 rarg 0.914062
106: out = 12 biased 9; lerr 0.000244141 rerr 0.00402832 larg 0.914062 rarg 0.917969
107: out = 11 biased 9; lerr 0.00314331 rerr 0.00109863 larg 0.917969 rarg 0.921875
108: out = 11 biased 8; lerr 0.00109863 rerr 0.00534058 larg 0.921875 rarg 0.925781
109: out = 10 biased 8; lerr 0.00189209 rerr 0.00231934 larg 0.925781 rarg 0.929688
110: out = 9 biased 8; lerr 0.00494385 rerr 0.000762939 larg 0.929688 rarg 0.933594
111: out = 9 biased 7; lerr 0.000762939 rerr 0.00341797 larg 0.933594 rarg 0.9375
112: out = 8 biased 7; lerr 0.00390625 rerr 0.000244141 larg 0.9375 rarg 0.941406
113: out = 8 biased 6; lerr 0.000244141 rerr 0.00439453 larg 0.941406 rarg 0.945312
114: out = 7 biased 6; lerr 0.00299072 rerr 0.00112915 larg 0.945312 rarg 0.949219
115: out = 7 biased 5; lerr 0.00112915 rerr 0.00524902 larg 0.949219 rarg 0.953125
116: out = 6 biased 5; lerr 0.00219727 rerr 0.00189209 larg 0.953125 rarg 0.957031
117: out = 5 biased 5; lerr 0.00558472 rerr 0.00152588 larg 0.957031 rarg 0.960938
118: out = 5 biased 4; lerr 0.00152588 rerr 0.00253296 larg 0.960938 rarg 0.964844
119: out = 4 biased 4; lerr 0.00500488 rerr 0.000976562 larg 0.964844 rarg 0.96875
120: out = 4 biased 3; lerr 0.000976562 rerr 0.00305176 larg 0.96875 rarg 0.972656
121: out = 3 biased 3; lerr 0.00454712 rerr 0.000549316 larg 0.972656 rarg 0.976562
122: out = 3 biased 2; lerr 0.000549316 rerr 0.00344849 larg 0.976562 rarg 0.980469
123: out = 2 biased 2; lerr 0.00421143 rerr 0.000244141 larg 0.980469 rarg 0.984375
124: out = 2 biased 1; lerr 0.000244141 rerr 0.00372314 larg 0.984375 rarg 0.988281
125: out = 1 biased 1; lerr 0.0039978 rerr 6.10352E-05 larg 0.988281 rarg 0.992188
126: out = 1 biased 0; lerr 6.10352E-05 rerr 0.00387573 larg 0.992188 rarg 0.996094
127: out = 0 biased 0; lerr 0.00390625 rerr 0 larg 0.996094 rarg 1
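
The lerr/rerr columns above are consistent with a relative error |est * x - 1|, where est = (128 + out) / 128 and x is the left or right end of the input interval. A minimal sketch under that reading, assuming each entry is chosen to minimise the larger of the two endpoint errors; the function names are hypothetical and the linked vrecip.cc may select entries differently.

```python
# Sketch: pick each 7-bit reciprocal entry by minimising the worst endpoint error.
# Assumptions: input interval is [(128+i)/256, (129+i)/256), estimate is
# (128+out)/128, and error is |est * x - 1|. These match the printed columns
# but are not taken from vrecip.cc.

def recip7_worst(index: int, out: int) -> float:
    """Larger of the left and right endpoint errors for one candidate output."""
    lo = (128 + index) / 256
    hi = (129 + index) / 256
    est = (128 + out) / 128
    # est * x - 1 is linear in x, so the extreme error sits at an endpoint.
    return max(abs(est * lo - 1.0), abs(est * hi - 1.0))

def recip7_entry(index: int) -> int:
    """7-bit output with the smallest worst-case endpoint error."""
    return min(range(128), key=lambda out: recip7_worst(index, out))

recip_table = [recip7_entry(i) for i in range(128)]
recip_max_err = max(recip7_worst(i, recip_table[i]) for i in range(128))

print(recip_table[:8])
print(recip_max_err)
```

Spot checks at entries 0, 5, 64 and 127 agree with the listing, and the worst error lands on the reported 0.00558472.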

... [removed hex data dumping]

RSqrt7x7LUT (input [6:0] in, output reg [6:0] out);
// in[6] corresponds to exp[0]
// in[5:0] corresponds to sig[S-1:S-5]
// out[6:0] corresponds to sig[S-1:S-6]
// biased : ((ipN-1) - in) << (op - ip)
0: out 127 biased 0; lerr 0.00390625 rerr 0.00384557 larg 0.25 rarg 0.253906
1: out 125 biased 1; lerr 0.00402773 rerr 0.00360435 larg 0.253906 rarg 0.257812
2: out 123 biased 2; lerr 0.00432928 rerr 0.00318533 larg 0.257812 rarg 0.261719
3: out 121 biased 3; lerr 0.00480818 rerr 0.00259111 larg 0.261719 rarg 0.265625
4: out 119 biased 4; lerr 0.00546183 rerr 0.00182426 larg 0.265625 rarg 0.269531
5: out 118 biased 4; lerr 0.0022317 rerr 0.00497249 larg 0.269531 rarg 0.273438
6: out 116 biased 5; lerr 0.00319802 rerr 0.00389675 larg 0.273438 rarg 0.277344
7: out 114 biased 6; lerr 0.00433191 rerr 0.00265532 larg 0.277344 rarg 0.28125
8: out 113 biased 6; lerr 0.00148789 rerr 0.00542232 larg 0.28125 rarg 0.285156
9: out 111 biased 7; lerr 0.00292144 rerr 0.00388464 larg 0.285156 rarg 0.289062
10: out 109 biased 8; lerr 0.00451607 rerr 0.0021876 larg 0.289062 rarg 0.292969
11: out 108 biased 8; lerr 0.00204104 rerr 0.00458999 larg 0.292969 rarg 0.296875
12: out 106 biased 9; lerr 0.00392348 rerr 0.00260824 larg 0.296875 rarg 0.300781
13: out 105 biased 9; lerr 0.00167641 rerr 0.00478529 larg 0.300781 rarg 0.304688
14: out 103 biased 10; lerr 0.00383947 rerr 0.00252584 larg 0.304688 rarg 0.308594
15: out 102 biased 10; lerr 0.0018141 rerr 0.00448366 larg 0.308594 rarg 0.3125
16: out 100 biased 11; lerr 0.00425098 rerr 0.00195312 larg 0.3125 rarg 0.316406
17: out 99 biased 11; lerr 0.00244141 rerr 0.00369747 larg 0.316406 rarg 0.320312
18: out 97 biased 12; lerr 0.00514568 rerr 0.000902127 larg 0.320312 rarg 0.324219
19: out 96 biased 12; lerr 0.00354633 rerr 0.00243843 larg 0.324219 rarg 0.328125
20: out 95 biased 12; lerr 0.00203674 rerr 0.00388594 larg 0.328125 rarg 0.332031
21: out 93 biased 13; lerr 0.00511752 rerr 0.000717621 larg 0.332031 rarg 0.335938
22: out 92 biased 13; lerr 0.00381051 rerr 0.00196455 larg 0.335938 rarg 0.339844
23: out 91 biased 13; lerr 0.00258984 rerr 0.00312603 larg 0.339844 rarg 0.34375
24: out 90 biased 13; lerr 0.00145446 rerr 0.00420307 larg 0.34375 rarg 0.347656
25: out 88 biased 14; lerr 0.0050098 rerr 0.000564416 larg 0.347656 rarg 0.351562
26: out 87 biased 14; lerr 0.00406783 rerr 0.00144985 larg 0.351562 rarg 0.355469
27: out 86 biased 14; lerr 0.00320806 rerr 0.00225385 larg 0.355469 rarg 0.359375
28: out 85 biased 14; lerr 0.00242958 rerr 0.00297735 larg 0.359375 rarg 0.363281
29: out 84 biased 14; lerr 0.00173146 rerr 0.00362122 larg 0.363281 rarg 0.367188
30: out 83 biased 14; lerr 0.00111284 rerr 0.00418633 larg 0.367188 rarg 0.371094
31: out 82 biased 14; lerr 0.000572846 rerr 0.00467353 larg 0.371094 rarg 0.375
32: out 80 biased 15; lerr 0.00489479 rerr 0.00027462 larg 0.375 rarg 0.378906
33: out 79 biased 15; lerr 0.00453439 rerr 0.000583717 larg 0.378906 rarg 0.382812
34: out 78 biased 15; lerr 0.00425002 rerr 0.000817442 larg 0.382812 rarg 0.386719
35: out 77 biased 15; lerr 0.0040409 rerr 0.000976562 larg 0.386719 rarg 0.390625
36: out 76 biased 15; lerr 0.00390625 rerr 0.00106183 larg 0.390625 rarg 0.394531
37: out 75 biased 15; lerr 0.00384534 rerr 0.00107398 larg 0.394531 rarg 0.398438
38: out 74 biased 15; lerr 0.00385742 rerr 0.00101372 larg 0.398438 rarg 0.402344
39: out 73 biased 15; lerr 0.00394179 rerr 0.00088176 larg 0.402344 rarg 0.40625
40: out 72 biased 15; lerr 0.00409775 rerr 0.000678786 larg 0.40625 rarg 0.410156
41: out 71 biased 15; lerr 0.00432461 rerr 0.000405468 larg 0.410156 rarg 0.414062
42: out 70 biased 15; lerr 0.0046217 rerr 6.24637E-05 larg 0.414062 rarg 0.417969
43: out 70 biased 14; lerr 6.24637E-05 rerr 0.00472478 larg 0.417969 rarg 0.421875
44: out 69 biased 14; lerr 0.000349583 rerr 0.00426776 larg 0.421875 rarg 0.425781
45: out 68 biased 14; lerr 0.000830041 rerr 0.00374284 larg 0.425781 rarg 0.429688
46: out 67 biased 14; lerr 0.00137829 rerr 0.00315063 larg 0.429688 rarg 0.433594
47: out 66 biased 14; lerr 0.00199374 rerr 0.00249171 larg 0.433594 rarg 0.4375
48: out 65 biased 14; lerr 0.00267578 rerr 0.00176667 larg 0.4375 rarg 0.441406
49: out 64 biased 14; lerr 0.00342383 rerr 0.000976086 larg 0.441406 rarg 0.445312
50: out 63 biased 14; lerr 0.00423733 rerr 0.000120513 larg 0.445312 rarg 0.449219
51: out 63 biased 13; lerr 0.000120513 rerr 0.00445945 larg 0.449219 rarg 0.453125
52: out 62 biased 13; lerr 0.000799499 rerr 0.00349816 larg 0.453125 rarg 0.457031
53: out 61 biased 13; lerr 0.00178341 rerr 0.00247339 larg 0.457031 rarg 0.460938
54: out 60 biased 13; lerr 0.0028307 rerr 0.00138568 larg 0.460938 rarg 0.464844
55: out 59 biased 13; lerr 0.00394084 rerr 0.00023553 larg 0.464844 rarg 0.46875
56: out 59 biased 12; lerr 0.00023553 rerr 0.00439453 larg 0.46875 rarg 0.472656
57: out 58 biased 12; lerr 0.000976562 rerr 0.00314314 larg 0.472656 rarg 0.476562
58: out 57 biased 12; lerr 0.0022501 rerr 0.00183069 larg 0.476562 rarg 0.480469
59: out 56 biased 12; lerr 0.00358461 rerr 0.000457659 larg 0.480469 rarg 0.484375
60: out 56 biased 11; lerr 0.000457659 rerr 0.00448366 larg 0.484375 rarg 0.488281
61: out 55 biased 11; lerr 0.000975489 rerr 0.00301265 larg 0.488281 rarg 0.492188
62: out 54 biased 11; lerr 0.00246829 rerr 0.00148234 larg 0.492188 rarg 0.496094
63: out 53 biased 11; lerr 0.00402031 rerr 0.000106817 larg 0.496094 rarg 0.5
64: out 52 biased 11; lerr 0.00563109 rerr 0.00210731 larg 0.5 rarg 0.507812
65: out 51 biased 11; lerr 0.00345996 rerr 0.00417648 larg 0.507812 rarg 0.515625
66: out 50 biased 11; lerr 0.00143345 rerr 0.00610301 larg 0.515625 rarg 0.523438
67: out 48 biased 12; lerr 0.00520152 rerr 0.00219486 larg 0.523438 rarg 0.53125
68: out 47 biased 12; lerr 0.00349943 rerr 0.00380104 larg 0.53125 rarg 0.539062
69: out 46 biased 12; lerr 0.00193497 rerr 0.00527137 larg 0.539062 rarg 0.546875
70: out 44 biased 13; lerr 0.00628347 rerr 0.000789331 larg 0.546875 rarg 0.554688
71: out 43 biased 13; lerr 0.00502921 rerr 0.00195312 larg 0.554688 rarg 0.5625
72: out 42 biased 13; lerr 0.00390625 rerr 0.00298721 larg 0.5625 rarg 0.570312
73: out 41 biased 13; lerr 0.00291271 rerr 0.00389343 larg 0.570312 rarg 0.578125
74: out 40 biased 13; lerr 0.00204677 rerr 0.00467353 larg 0.578125 rarg 0.585938
75: out 39 biased 13; lerr 0.00130667 rerr 0.00532924 larg 0.585938 rarg 0.59375
76: out 38 biased 13; lerr 0.000690699 rerr 0.00586222 larg 0.59375 rarg 0.601562
77: out 36 biased 14; lerr 0.0062566 rerr 0.000175461 larg 0.601562 rarg 0.609375
78: out 35 biased 14; lerr 0.00592317 rerr 0.000428823 larg 0.609375 rarg 0.617188
79: out 34 biased 14; lerr 0.00570878 rerr 0.000564416 larg 0.617188 rarg 0.625
80: out 33 biased 14; lerr 0.00561191 rerr 0.000583717 larg 0.625 rarg 0.632812
81: out 32 biased 14; lerr 0.00563109 rerr 0.000488162 larg 0.632812 rarg 0.640625
82: out 31 biased 14; lerr 0.00576489 rerr 0.000279149 larg 0.640625 rarg 0.648438
83: out 30 biased 14; lerr 0.00601191 rerr 4.19626E-05 larg 0.648438 rarg 0.65625
84: out 30 biased 13; lerr 4.19626E-05 rerr 0.00589256 larg 0.65625 rarg 0.664062
85: out 29 biased 13; lerr 0.00047385 rerr 0.00538852 larg 0.664062 rarg 0.671875
86: out 28 biased 13; lerr 0.00101522 rerr 0.00477604 larg 0.671875 rarg 0.679688
87: out 27 biased 13; lerr 0.00166483 rerr 0.00405633 larg 0.679688 rarg 0.6875
88: out 26 biased 13; lerr 0.00242145 rerr 0.0032306 larg 0.6875 rarg 0.695312
89: out 25 biased 13; lerr 0.00328389 rerr 0.0023 larg 0.695312 rarg 0.703125
90: out 24 biased 13; lerr 0.00425098 rerr 0.00126568 larg 0.703125 rarg 0.710938
91: out 23 biased 13; lerr 0.0053216 rerr 0.000128738 larg 0.710938 rarg 0.71875
92: out 23 biased 12; lerr 0.000128738 rerr 0.00554953 larg 0.71875 rarg 0.726562
93: out 22 biased 12; lerr 0.00110974 rerr 0.00424628 larg 0.726562 rarg 0.734375
94: out 21 biased 12; lerr 0.0024487 rerr 0.00284339 larg 0.734375 rarg 0.742188
95: out 20 biased 12; lerr 0.0038871 rerr 0.00134187 larg 0.742188 rarg 0.75
96: out 19 biased 12; lerr 0.00542395 rerr 0.000257287 larg 0.75 rarg 0.757812
97: out 19 biased 11; lerr 0.000257287 rerr 0.00488281 larg 0.757812 rarg 0.765625
98: out 18 biased 11; lerr 0.00195312 rerr 0.00312603 larg 0.765625 rarg 0.773438
99: out 17 biased 11; lerr 0.0037447 rerr 0.00127425 larg 0.773438 rarg 0.78125
100: out 16 biased 11; lerr 0.00563109 rerr 0.000671612 larg 0.78125 rarg 0.789062
101: out 16 biased 10; lerr 0.000671612 rerr 0.00426337 larg 0.789062 rarg 0.796875
102: out 15 biased 10; lerr 0.00271068 rerr 0.00216607 larg 0.796875 rarg 0.804688
103: out 14 biased 10; lerr 0.00484208 rerr 2.28884E-05 larg 0.804688 rarg 0.8125
104: out 14 biased 9; lerr 2.28884E-05 rerr 0.00477319 larg 0.8125 rarg 0.820312
105: out 13 biased 9; lerr 0.00230268 rerr 0.00243701 larg 0.820312 rarg 0.828125
106: out 12 biased 9; lerr 0.00467248 rerr 1.1444E-05 larg 0.828125 rarg 0.835938
107: out 12 biased 8; lerr 1.1444E-05 rerr 0.00467353 larg 0.835938 rarg 0.84375
108: out 11 biased 8; lerr 0.00250271 rerr 0.00210469 larg 0.84375 rarg 0.851562
109: out 10 biased 8; lerr 0.0051047 rerr 0.000551376 larg 0.851562 rarg 0.859375
110: out 10 biased 7; lerr 0.000551376 rerr 0.00398129 larg 0.859375 rarg 0.867188
111: out 9 biased 7; lerr 0.00329393 rerr 0.00118567 larg 0.867188 rarg 0.875
112: out 9 biased 6; lerr 0.00118567 rerr 0.00564531 larg 0.875 rarg 0.882812
113: out 8 biased 6; lerr 0.00169516 rerr 0.00271239 larg 0.882812 rarg 0.890625
114: out 7 biased 6; lerr 0.0046605 rerr 0.000304507 larg 0.890625 rarg 0.898438
115: out 7 biased 5; lerr 0.000304507 rerr 0.00403259 larg 0.898438 rarg 0.90625
116: out 6 biased 5; lerr 0.00340469 rerr 0.00088176 larg 0.90625 rarg 0.914062
117: out 6 biased 4; lerr 0.00088176 rerr 0.00514993 larg 0.914062 rarg 0.921875
118: out 5 biased 4; lerr 0.00235119 rerr 0.00186722 larg 0.921875 rarg 0.929688
119: out 4 biased 4; lerr 0.00566562 rerr 0.00149648 larg 0.929688 rarg 0.9375
120: out 4 biased 3; lerr 0.00149648 rerr 0.00265532 larg 0.9375 rarg 0.945312
121: out 3 biased 3; lerr 0.00494055 rerr 0.0008372 larg 0.945312 rarg 0.953125
122: out 3 biased 2; lerr 0.0008372 rerr 0.00324937 larg 0.953125 rarg 0.960938
123: out 2 biased 2; lerr 0.00440902 rerr 0.000370094 larg 0.960938 rarg 0.96875
124: out 2 biased 1; lerr 0.000370094 rerr 0.00365258 larg 0.96875 rarg 0.976562
125: out 1 biased 1; lerr 0.00406783 rerr 9.20338E-05 larg 0.976562 rarg 0.984375
126: out 1 biased 0; lerr 9.20338E-05 rerr 0.00386801 larg 0.984375 rarg 0.992188
127: out 0 biased 0; lerr 0.00391391 rerr 0 larg 0.992188 rarg 1
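
Entry 77 is the one where the two tables differ (36 here, 37 in Bill's). A minimal sketch comparing the two candidates, assuming the same error measure the listing seems to use, |est * sqrt(x) - 1| at the interval endpoints, with est = (128 + out) / 128. The function name `rsqrt7_worst` is hypothetical.

```python
import math

# Sketch: compare candidate outputs 36 and 37 for rsqrt entry 77.
# Assumptions: entries 0..63 cover [0.25, 0.5) in steps of 1/256,
# entries 64..127 cover [0.5, 1) in steps of 1/128, and the error is
# |est * sqrt(x) - 1| at the two interval endpoints.

def rsqrt7_worst(index: int, out: int) -> float:
    """Larger of the left and right endpoint errors for one candidate output."""
    if index < 64:
        lo = 0.25 + index / 256
        hi = lo + 1 / 256
    else:
        lo = 0.5 + (index - 64) / 128
        hi = lo + 1 / 128
    est = (128 + out) / 128
    # est * sqrt(x) - 1 is monotonic in x, so the extreme sits at an endpoint.
    return max(abs(est * math.sqrt(lo) - 1.0), abs(est * math.sqrt(hi) - 1.0))

err_36 = rsqrt7_worst(77, 36)
err_37 = rsqrt7_worst(77, 37)
err_table_max = rsqrt7_worst(70, 44)  # entry with the largest error in the listing

print(err_36, err_37, err_table_max)
```

Under this measure the two candidates are nearly tied, with 36 slightly ahead, and both stay below the error at entry 70, so either choice would leave the overall worst case of the table unchanged.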

... [removed hex data dumping]

max recip 7x7 error at 0.519531: 0.00558472 or 2^-7.4843
max rsqrt 7x7 error at 0.546875: 0.00628347 or 2^-7.31422
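
The power-of-two form is just the base-2 logarithm of the decimal error. A short check, using the two decimal values printed above:

```python
import math

# Convert the reported worst-case errors to exponents of two.
recip_bits = math.log2(0.00558472)
rsqrt_bits = math.log2(0.00628347)

print(f"recip: 2^{recip_bits:.4f}")
print(f"rsqrt: 2^{rsqrt_bits:.5f}")
```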

On 2020-08-03 1:17 p.m., Bill Huffman wrote:

I should have said that my results are for the 7/7 case.  And it sounds like we're in agreement then.  We probably have the same table.

Bill

On 8/2/20 9:50 AM, DSHORNER wrote:

Here is the link to the revised code, which generates an n-by-m LUT:

https://github.com/David-Horner/recip/blob/master/vrecip.cc

On 2020-08-01 4:51 p.m., David Horner via lists.riscv.org wrote:
