Re: VFRECIP/VFRSQRT instructions


David Horner
 

Annotated version now available (--detail):
https://github.com/David-Horner/recip/blob/master/vrecip.cc

For the 7x7 tables below, notice that the biased value never exceeds 21 for recip (it fits in 5 of the 7 bits) and never exceeds 15 for rsqrt (it fits in 4 of the 7 bits).
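As a rough illustration of the biased column, here is a minimal sketch. It assumes the header line "biased : ((ipN-1) - in) << (op - ip)" means that, for the 7x7 case (ipN = 128, no shift), the listed value is (127 - in) minus the table output. The names biased7x7 and bits_needed are hypothetical and do not come from vrecip.cc.

```cpp
// Assumed reading of the "biased" column for the 7x7 tables:
// base = (ipN - 1) - in = 127 - in, and biased = base - out.
constexpr int biased7x7(int in, int out) {
    return (127 - in) - out;
}

// Number of bits needed to store a non-negative value.
constexpr int bits_needed(int v) {
    int bits = 0;
    while (v > 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
}

// Rows copied from the listings in this message.
static_assert(biased7x7(66, 40) == 21, "recip row 66: out = 40, biased 21");
static_assert(biased7x7(42, 70) == 15, "rsqrt row 42: out 70, biased 15");
static_assert(bits_needed(21) == 5, "largest recip bias fits in 5 bits");
static_assert(bits_needed(15) == 4, "largest rsqrt bias fits in 4 bits");
```

Under this reading, a table that stores only the biased value would need 5 bits per recip entry and 4 bits per rsqrt entry, with the output recovered as (127 - in) - biased.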

ip 7 op 7 LUT #bits 896 verilog 0  test/test-long 1
Recip7x7LUT (input [6:0] in, output reg [6:0] out);
 in[6:0]  corresponds to sig[S-1:S-6]
 out[6:0] corresponds to sig[S-1:S-6]
 biased : ((ipN-1) - in) << (op - ip) // or >> if neg
 base bias 127  left-shift 0 right-shift 0
 0: out = 127 biased 0; lerr 0.00390625 rerr 0.00387573 larg 0.5 rarg 0.503906
 1: out = 125 biased 1; lerr 0.0039978 rerr 0.00372314 larg 0.503906 rarg 0.507812
 2: out = 123 biased 2; lerr 0.00421143 rerr 0.00344849 larg 0.507812 rarg 0.511719
 3: out = 121 biased 3; lerr 0.00454712 rerr 0.00305176 larg 0.511719 rarg 0.515625
 4: out = 119 biased 4; lerr 0.00500488 rerr 0.00253296 larg 0.515625 rarg 0.519531
 5: out = 117 biased 5; lerr 0.00558472 rerr 0.00189209 larg 0.519531 rarg 0.523438
 6: out = 116 biased 5; lerr 0.00219727 rerr 0.00524902 larg 0.523438 rarg 0.527344
 7: out = 114 biased 6; lerr 0.00299072 rerr 0.00439453 larg 0.527344 rarg 0.53125
 8: out = 112 biased 7; lerr 0.00390625 rerr 0.00341797 larg 0.53125 rarg 0.535156
 9: out = 110 biased 8; lerr 0.00494385 rerr 0.00231934 larg 0.535156 rarg 0.539062
 10: out = 109 biased 8; lerr 0.00189209 rerr 0.00534058 larg 0.539062 rarg 0.542969
 11: out = 107 biased 9; lerr 0.00314331 rerr 0.00402832 larg 0.542969 rarg 0.546875
 12: out = 105 biased 10; lerr 0.0045166 rerr 0.00259399 larg 0.546875 rarg 0.550781
 13: out = 104 biased 10; lerr 0.00170898 rerr 0.00537109 larg 0.550781 rarg 0.554688
 14: out = 102 biased 11; lerr 0.0032959 rerr 0.00372314 larg 0.554688 rarg 0.558594
 15: out = 100 biased 12; lerr 0.00500488 rerr 0.00195312 larg 0.558594 rarg 0.5625
 16: out = 99 biased 12; lerr 0.00244141 rerr 0.00448608 larg 0.5625 rarg 0.566406
 17: out = 97 biased 13; lerr 0.00436401 rerr 0.00250244 larg 0.566406 rarg 0.570312
 18: out = 96 biased 13; lerr 0.00195312 rerr 0.00488281 larg 0.570312 rarg 0.574219
 19: out = 94 biased 14; lerr 0.00408936 rerr 0.00268555 larg 0.574219 rarg 0.578125
 20: out = 93 biased 14; lerr 0.00183105 rerr 0.00491333 larg 0.578125 rarg 0.582031
 21: out = 91 biased 15; lerr 0.00418091 rerr 0.00250244 larg 0.582031 rarg 0.585938
 22: out = 90 biased 15; lerr 0.0020752 rerr 0.00457764 larg 0.585938 rarg 0.589844
 23: out = 88 biased 16; lerr 0.00463867 rerr 0.00195312 larg 0.589844 rarg 0.59375
 24: out = 87 biased 16; lerr 0.00268555 rerr 0.00387573 larg 0.59375 rarg 0.597656
 25: out = 85 biased 17; lerr 0.00546265 rerr 0.0010376 larg 0.597656 rarg 0.601562
 26: out = 84 biased 17; lerr 0.00366211 rerr 0.00280762 larg 0.601562 rarg 0.605469
 27: out = 83 biased 17; lerr 0.00192261 rerr 0.0045166 larg 0.605469 rarg 0.609375
 28: out = 81 biased 18; lerr 0.00500488 rerr 0.00137329 larg 0.609375 rarg 0.613281
 29: out = 80 biased 18; lerr 0.00341797 rerr 0.00292969 larg 0.613281 rarg 0.617188
 30: out = 79 biased 18; lerr 0.00189209 rerr 0.00442505 larg 0.617188 rarg 0.621094
 31: out = 77 biased 19; lerr 0.00527954 rerr 0.000976562 larg 0.621094 rarg 0.625
 32: out = 76 biased 19; lerr 0.00390625 rerr 0.00231934 larg 0.625 rarg 0.628906
 33: out = 75 biased 19; lerr 0.00259399 rerr 0.00360107 larg 0.628906 rarg 0.632812
 34: out = 74 biased 19; lerr 0.00134277 rerr 0.00482178 larg 0.632812 rarg 0.636719
 35: out = 72 biased 20; lerr 0.00512695 rerr 0.000976562 larg 0.636719 rarg 0.640625
 36: out = 71 biased 20; lerr 0.00402832 rerr 0.00204468 larg 0.640625 rarg 0.644531
 37: out = 70 biased 20; lerr 0.00299072 rerr 0.00305176 larg 0.644531 rarg 0.648438
 38: out = 69 biased 20; lerr 0.00201416 rerr 0.0039978 larg 0.648438 rarg 0.652344
 39: out = 68 biased 20; lerr 0.00109863 rerr 0.00488281 larg 0.652344 rarg 0.65625
 40: out = 66 biased 21; lerr 0.00537109 rerr 0.000549316 larg 0.65625 rarg 0.660156
 41: out = 65 biased 21; lerr 0.00460815 rerr 0.00128174 larg 0.660156 rarg 0.664062
 42: out = 64 biased 21; lerr 0.00390625 rerr 0.00195312 larg 0.664062 rarg 0.667969
 43: out = 63 biased 21; lerr 0.00326538 rerr 0.00256348 larg 0.667969 rarg 0.671875
 44: out = 62 biased 21; lerr 0.00268555 rerr 0.00311279 larg 0.671875 rarg 0.675781
 45: out = 61 biased 21; lerr 0.00216675 rerr 0.00360107 larg 0.675781 rarg 0.679688
 46: out = 60 biased 21; lerr 0.00170898 rerr 0.00402832 larg 0.679688 rarg 0.683594
 47: out = 59 biased 21; lerr 0.00131226 rerr 0.00439453 larg 0.683594 rarg 0.6875
 48: out = 58 biased 21; lerr 0.000976562 rerr 0.00469971 larg 0.6875 rarg 0.691406
 49: out = 57 biased 21; lerr 0.000701904 rerr 0.00494385 larg 0.691406 rarg 0.695312
 50: out = 56 biased 21; lerr 0.000488281 rerr 0.00512695 larg 0.695312 rarg 0.699219
 51: out = 55 biased 21; lerr 0.000335693 rerr 0.00524902 larg 0.699219 rarg 0.703125
 52: out = 54 biased 21; lerr 0.000244141 rerr 0.00531006 larg 0.703125 rarg 0.707031
 53: out = 53 biased 21; lerr 0.000213623 rerr 0.00531006 larg 0.707031 rarg 0.710938
 54: out = 52 biased 21; lerr 0.000244141 rerr 0.00524902 larg 0.710938 rarg 0.714844
 55: out = 51 biased 21; lerr 0.000335693 rerr 0.00512695 larg 0.714844 rarg 0.71875
 56: out = 50 biased 21; lerr 0.000488281 rerr 0.00494385 larg 0.71875 rarg 0.722656
 57: out = 49 biased 21; lerr 0.000701904 rerr 0.00469971 larg 0.722656 rarg 0.726562
 58: out = 48 biased 21; lerr 0.000976562 rerr 0.00439453 larg 0.726562 rarg 0.730469
 59: out = 47 biased 21; lerr 0.00131226 rerr 0.00402832 larg 0.730469 rarg 0.734375
 60: out = 46 biased 21; lerr 0.00170898 rerr 0.00360107 larg 0.734375 rarg 0.738281
 61: out = 45 biased 21; lerr 0.00216675 rerr 0.00311279 larg 0.738281 rarg 0.742188
 62: out = 44 biased 21; lerr 0.00268555 rerr 0.00256348 larg 0.742188 rarg 0.746094
 63: out = 43 biased 21; lerr 0.00326538 rerr 0.00195312 larg 0.746094 rarg 0.75
 64: out = 42 biased 21; lerr 0.00390625 rerr 0.00128174 larg 0.75 rarg 0.753906
 65: out = 41 biased 21; lerr 0.00460815 rerr 0.000549316 larg 0.753906 rarg 0.757812
 66: out = 40 biased 21; lerr 0.00537109 rerr 0.000244141 larg 0.757812 rarg 0.761719
 67: out = 40 biased 20; lerr 0.000244141 rerr 0.00488281 larg 0.761719 rarg 0.765625
 68: out = 39 biased 20; lerr 0.00109863 rerr 0.0039978 larg 0.765625 rarg 0.769531
 69: out = 38 biased 20; lerr 0.00201416 rerr 0.00305176 larg 0.769531 rarg 0.773438
 70: out = 37 biased 20; lerr 0.00299072 rerr 0.00204468 larg 0.773438 rarg 0.777344
 71: out = 36 biased 20; lerr 0.00402832 rerr 0.000976562 larg 0.777344 rarg 0.78125
 72: out = 35 biased 20; lerr 0.00512695 rerr 0.000152588 larg 0.78125 rarg 0.785156
 73: out = 35 biased 19; lerr 0.000152588 rerr 0.00482178 larg 0.785156 rarg 0.789062
 74: out = 34 biased 19; lerr 0.00134277 rerr 0.00360107 larg 0.789062 rarg 0.792969
 75: out = 33 biased 19; lerr 0.00259399 rerr 0.00231934 larg 0.792969 rarg 0.796875
 76: out = 32 biased 19; lerr 0.00390625 rerr 0.000976562 larg 0.796875 rarg 0.800781
 77: out = 31 biased 19; lerr 0.00527954 rerr 0.000427246 larg 0.800781 rarg 0.804688
 78: out = 31 biased 18; lerr 0.000427246 rerr 0.00442505 larg 0.804688 rarg 0.808594
 79: out = 30 biased 18; lerr 0.00189209 rerr 0.00292969 larg 0.808594 rarg 0.8125
 80: out = 29 biased 18; lerr 0.00341797 rerr 0.00137329 larg 0.8125 rarg 0.816406
 81: out = 28 biased 18; lerr 0.00500488 rerr 0.000244141 larg 0.816406 rarg 0.820312
 82: out = 28 biased 17; lerr 0.000244141 rerr 0.0045166 larg 0.820312 rarg 0.824219
 83: out = 27 biased 17; lerr 0.00192261 rerr 0.00280762 larg 0.824219 rarg 0.828125
 84: out = 26 biased 17; lerr 0.00366211 rerr 0.0010376 larg 0.828125 rarg 0.832031
 85: out = 25 biased 17; lerr 0.00546265 rerr 0.000793457 larg 0.832031 rarg 0.835938
 86: out = 25 biased 16; lerr 0.000793457 rerr 0.00387573 larg 0.835938 rarg 0.839844
 87: out = 24 biased 16; lerr 0.00268555 rerr 0.00195312 larg 0.839844 rarg 0.84375
 88: out = 23 biased 16; lerr 0.00463867 rerr 3.05176E-05 larg 0.84375 rarg 0.847656
 89: out = 23 biased 15; lerr 3.05176E-05 rerr 0.00457764 larg 0.847656 rarg 0.851562
 90: out = 22 biased 15; lerr 0.0020752 rerr 0.00250244 larg 0.851562 rarg 0.855469
 91: out = 21 biased 15; lerr 0.00418091 rerr 0.000366211 larg 0.855469 rarg 0.859375
 92: out = 21 biased 14; lerr 0.000366211 rerr 0.00491333 larg 0.859375 rarg 0.863281
 93: out = 20 biased 14; lerr 0.00183105 rerr 0.00268555 larg 0.863281 rarg 0.867188
 94: out = 19 biased 14; lerr 0.00408936 rerr 0.000396729 larg 0.867188 rarg 0.871094
 95: out = 19 biased 13; lerr 0.000396729 rerr 0.00488281 larg 0.871094 rarg 0.875
 96: out = 18 biased 13; lerr 0.00195312 rerr 0.00250244 larg 0.875 rarg 0.878906
 97: out = 17 biased 13; lerr 0.00436401 rerr 6.10352E-05 larg 0.878906 rarg 0.882812
 98: out = 17 biased 12; lerr 6.10352E-05 rerr 0.00448608 larg 0.882812 rarg 0.886719
 99: out = 16 biased 12; lerr 0.00244141 rerr 0.00195312 larg 0.886719 rarg 0.890625
 100: out = 15 biased 12; lerr 0.00500488 rerr 0.000640869 larg 0.890625 rarg 0.894531
 101: out = 15 biased 11; lerr 0.000640869 rerr 0.00372314 larg 0.894531 rarg 0.898438
 102: out = 14 biased 11; lerr 0.0032959 rerr 0.0010376 larg 0.898438 rarg 0.902344
 103: out = 14 biased 10; lerr 0.0010376 rerr 0.00537109 larg 0.902344 rarg 0.90625
 104: out = 13 biased 10; lerr 0.00170898 rerr 0.00259399 larg 0.90625 rarg 0.910156
 105: out = 12 biased 10; lerr 0.0045166 rerr 0.000244141 larg 0.910156 rarg 0.914062
 106: out = 12 biased 9; lerr 0.000244141 rerr 0.00402832 larg 0.914062 rarg 0.917969
 107: out = 11 biased 9; lerr 0.00314331 rerr 0.00109863 larg 0.917969 rarg 0.921875
 108: out = 11 biased 8; lerr 0.00109863 rerr 0.00534058 larg 0.921875 rarg 0.925781
 109: out = 10 biased 8; lerr 0.00189209 rerr 0.00231934 larg 0.925781 rarg 0.929688
 110: out = 9 biased 8; lerr 0.00494385 rerr 0.000762939 larg 0.929688 rarg 0.933594
 111: out = 9 biased 7; lerr 0.000762939 rerr 0.00341797 larg 0.933594 rarg 0.9375
 112: out = 8 biased 7; lerr 0.00390625 rerr 0.000244141 larg 0.9375 rarg 0.941406
 113: out = 8 biased 6; lerr 0.000244141 rerr 0.00439453 larg 0.941406 rarg 0.945312
 114: out = 7 biased 6; lerr 0.00299072 rerr 0.00112915 larg 0.945312 rarg 0.949219
 115: out = 7 biased 5; lerr 0.00112915 rerr 0.00524902 larg 0.949219 rarg 0.953125
 116: out = 6 biased 5; lerr 0.00219727 rerr 0.00189209 larg 0.953125 rarg 0.957031
 117: out = 5 biased 5; lerr 0.00558472 rerr 0.00152588 larg 0.957031 rarg 0.960938
 118: out = 5 biased 4; lerr 0.00152588 rerr 0.00253296 larg 0.960938 rarg 0.964844
 119: out = 4 biased 4; lerr 0.00500488 rerr 0.000976562 larg 0.964844 rarg 0.96875
 120: out = 4 biased 3; lerr 0.000976562 rerr 0.00305176 larg 0.96875 rarg 0.972656
 121: out = 3 biased 3; lerr 0.00454712 rerr 0.000549316 larg 0.972656 rarg 0.976562
 122: out = 3 biased 2; lerr 0.000549316 rerr 0.00344849 larg 0.976562 rarg 0.980469
 123: out = 2 biased 2; lerr 0.00421143 rerr 0.000244141 larg 0.980469 rarg 0.984375
 124: out = 2 biased 1; lerr 0.000244141 rerr 0.00372314 larg 0.984375 rarg 0.988281
 125: out = 1 biased 1; lerr 0.0039978 rerr 6.10352E-05 larg 0.988281 rarg 0.992188
 126: out = 1 biased 0; lerr 6.10352E-05 rerr 0.00387573 larg 0.992188 rarg 0.996094
 127: out = 0 biased 0; lerr 0.00390625 rerr 0 larg 0.996094 rarg 1
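The lerr and rerr columns above look like relative errors at the two ends of each input interval. A minimal sketch under that assumption: x = (128 + in) / 256 at larg and (129 + in) / 256 at rarg, the estimate is (128 + out) / 128, and the error is |x * estimate - 1|. The minimax search in recip_best_out is a guess at how an entry might be chosen, not a description of vrecip.cc, and all function names are hypothetical.

```cpp
// Error numerator in units of 2^-15, computed with exact integer arithmetic.
// side = 0 selects the left edge (larg), side = 1 the right edge (rarg).
constexpr int recip_err_num(int in, int out, int side) {
    int prod = (128 + in + side) * (128 + out);  // x * estimate, scaled by 2^15
    return prod > 32768 ? prod - 32768 : 32768 - prod;
}

// Worst error over the interval. |x * estimate - 1| is convex in x,
// so its maximum lies at one of the two endpoints.
constexpr int recip_max_err_num(int in, int out) {
    int l = recip_err_num(in, out, 0);
    int r = recip_err_num(in, out, 1);
    return l > r ? l : r;
}

// Assumed selection rule: the output that minimises the worst endpoint error.
constexpr int recip_best_out(int in) {
    int best = 0;
    for (int out = 1; out < 128; ++out) {
        if (recip_max_err_num(in, out) < recip_max_err_num(in, best)) {
            best = out;
        }
    }
    return best;
}

// Row 5 of the listing: out = 117, lerr 0.00558472 = 183 / 32768.
static_assert(recip_err_num(5, 117, 0) == 183, "row 5 lerr");
static_assert(recip_best_out(5) == 117, "row 5 output");
```

Dividing the numerator by 32768.0 gives the decimal values printed in the listing; for example 183 / 32768 is about 0.00558472, the largest recip error in the table.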

 ... [removed hex data dumping]

RSqrt7x7LUT (input [6:0] in, output reg [6:0] out);
  // in[6] corresponds to exp[0]
  // in[5:0] corresponds to sig[S-1:S-5]
  // out[6:0] corresponds to sig[S-1:S-6]
  // biased : ((ipN-1) - in) << (op - ip)
 0: out 127 biased 0; lerr 0.00390625 rerr 0.00384557 larg 0.25 rarg 0.253906
 1: out 125 biased 1; lerr 0.00402773 rerr 0.00360435 larg 0.253906 rarg 0.257812
 2: out 123 biased 2; lerr 0.00432928 rerr 0.00318533 larg 0.257812 rarg 0.261719
 3: out 121 biased 3; lerr 0.00480818 rerr 0.00259111 larg 0.261719 rarg 0.265625
 4: out 119 biased 4; lerr 0.00546183 rerr 0.00182426 larg 0.265625 rarg 0.269531
 5: out 118 biased 4; lerr 0.0022317 rerr 0.00497249 larg 0.269531 rarg 0.273438
 6: out 116 biased 5; lerr 0.00319802 rerr 0.00389675 larg 0.273438 rarg 0.277344
 7: out 114 biased 6; lerr 0.00433191 rerr 0.00265532 larg 0.277344 rarg 0.28125
 8: out 113 biased 6; lerr 0.00148789 rerr 0.00542232 larg 0.28125 rarg 0.285156
 9: out 111 biased 7; lerr 0.00292144 rerr 0.00388464 larg 0.285156 rarg 0.289062
 10: out 109 biased 8; lerr 0.00451607 rerr 0.0021876 larg 0.289062 rarg 0.292969
 11: out 108 biased 8; lerr 0.00204104 rerr 0.00458999 larg 0.292969 rarg 0.296875
 12: out 106 biased 9; lerr 0.00392348 rerr 0.00260824 larg 0.296875 rarg 0.300781
 13: out 105 biased 9; lerr 0.00167641 rerr 0.00478529 larg 0.300781 rarg 0.304688
 14: out 103 biased 10; lerr 0.00383947 rerr 0.00252584 larg 0.304688 rarg 0.308594
 15: out 102 biased 10; lerr 0.0018141 rerr 0.00448366 larg 0.308594 rarg 0.3125
 16: out 100 biased 11; lerr 0.00425098 rerr 0.00195312 larg 0.3125 rarg 0.316406
 17: out 99 biased 11; lerr 0.00244141 rerr 0.00369747 larg 0.316406 rarg 0.320312
 18: out 97 biased 12; lerr 0.00514568 rerr 0.000902127 larg 0.320312 rarg 0.324219
 19: out 96 biased 12; lerr 0.00354633 rerr 0.00243843 larg 0.324219 rarg 0.328125
 20: out 95 biased 12; lerr 0.00203674 rerr 0.00388594 larg 0.328125 rarg 0.332031
 21: out 93 biased 13; lerr 0.00511752 rerr 0.000717621 larg 0.332031 rarg 0.335938
 22: out 92 biased 13; lerr 0.00381051 rerr 0.00196455 larg 0.335938 rarg 0.339844
 23: out 91 biased 13; lerr 0.00258984 rerr 0.00312603 larg 0.339844 rarg 0.34375
 24: out 90 biased 13; lerr 0.00145446 rerr 0.00420307 larg 0.34375 rarg 0.347656
 25: out 88 biased 14; lerr 0.0050098 rerr 0.000564416 larg 0.347656 rarg 0.351562
 26: out 87 biased 14; lerr 0.00406783 rerr 0.00144985 larg 0.351562 rarg 0.355469
 27: out 86 biased 14; lerr 0.00320806 rerr 0.00225385 larg 0.355469 rarg 0.359375
 28: out 85 biased 14; lerr 0.00242958 rerr 0.00297735 larg 0.359375 rarg 0.363281
 29: out 84 biased 14; lerr 0.00173146 rerr 0.00362122 larg 0.363281 rarg 0.367188
 30: out 83 biased 14; lerr 0.00111284 rerr 0.00418633 larg 0.367188 rarg 0.371094
 31: out 82 biased 14; lerr 0.000572846 rerr 0.00467353 larg 0.371094 rarg 0.375
 32: out 80 biased 15; lerr 0.00489479 rerr 0.00027462 larg 0.375 rarg 0.378906
 33: out 79 biased 15; lerr 0.00453439 rerr 0.000583717 larg 0.378906 rarg 0.382812
 34: out 78 biased 15; lerr 0.00425002 rerr 0.000817442 larg 0.382812 rarg 0.386719
 35: out 77 biased 15; lerr 0.0040409 rerr 0.000976562 larg 0.386719 rarg 0.390625
 36: out 76 biased 15; lerr 0.00390625 rerr 0.00106183 larg 0.390625 rarg 0.394531
 37: out 75 biased 15; lerr 0.00384534 rerr 0.00107398 larg 0.394531 rarg 0.398438
 38: out 74 biased 15; lerr 0.00385742 rerr 0.00101372 larg 0.398438 rarg 0.402344
 39: out 73 biased 15; lerr 0.00394179 rerr 0.00088176 larg 0.402344 rarg 0.40625
 40: out 72 biased 15; lerr 0.00409775 rerr 0.000678786 larg 0.40625 rarg 0.410156
 41: out 71 biased 15; lerr 0.00432461 rerr 0.000405468 larg 0.410156 rarg 0.414062
 42: out 70 biased 15; lerr 0.0046217 rerr 6.24637E-05 larg 0.414062 rarg 0.417969
 43: out 70 biased 14; lerr 6.24637E-05 rerr 0.00472478 larg 0.417969 rarg 0.421875
 44: out 69 biased 14; lerr 0.000349583 rerr 0.00426776 larg 0.421875 rarg 0.425781
 45: out 68 biased 14; lerr 0.000830041 rerr 0.00374284 larg 0.425781 rarg 0.429688
 46: out 67 biased 14; lerr 0.00137829 rerr 0.00315063 larg 0.429688 rarg 0.433594
 47: out 66 biased 14; lerr 0.00199374 rerr 0.00249171 larg 0.433594 rarg 0.4375
 48: out 65 biased 14; lerr 0.00267578 rerr 0.00176667 larg 0.4375 rarg 0.441406
 49: out 64 biased 14; lerr 0.00342383 rerr 0.000976086 larg 0.441406 rarg 0.445312
 50: out 63 biased 14; lerr 0.00423733 rerr 0.000120513 larg 0.445312 rarg 0.449219
 51: out 63 biased 13; lerr 0.000120513 rerr 0.00445945 larg 0.449219 rarg 0.453125
 52: out 62 biased 13; lerr 0.000799499 rerr 0.00349816 larg 0.453125 rarg 0.457031
 53: out 61 biased 13; lerr 0.00178341 rerr 0.00247339 larg 0.457031 rarg 0.460938
 54: out 60 biased 13; lerr 0.0028307 rerr 0.00138568 larg 0.460938 rarg 0.464844
 55: out 59 biased 13; lerr 0.00394084 rerr 0.00023553 larg 0.464844 rarg 0.46875
 56: out 59 biased 12; lerr 0.00023553 rerr 0.00439453 larg 0.46875 rarg 0.472656
 57: out 58 biased 12; lerr 0.000976562 rerr 0.00314314 larg 0.472656 rarg 0.476562
 58: out 57 biased 12; lerr 0.0022501 rerr 0.00183069 larg 0.476562 rarg 0.480469
 59: out 56 biased 12; lerr 0.00358461 rerr 0.000457659 larg 0.480469 rarg 0.484375
 60: out 56 biased 11; lerr 0.000457659 rerr 0.00448366 larg 0.484375 rarg 0.488281
 61: out 55 biased 11; lerr 0.000975489 rerr 0.00301265 larg 0.488281 rarg 0.492188
 62: out 54 biased 11; lerr 0.00246829 rerr 0.00148234 larg 0.492188 rarg 0.496094
 63: out 53 biased 11; lerr 0.00402031 rerr 0.000106817 larg 0.496094 rarg 0.5
 64: out 52 biased 11; lerr 0.00563109 rerr 0.00210731 larg 0.5 rarg 0.507812
 65: out 51 biased 11; lerr 0.00345996 rerr 0.00417648 larg 0.507812 rarg 0.515625
 66: out 50 biased 11; lerr 0.00143345 rerr 0.00610301 larg 0.515625 rarg 0.523438
 67: out 48 biased 12; lerr 0.00520152 rerr 0.00219486 larg 0.523438 rarg 0.53125
 68: out 47 biased 12; lerr 0.00349943 rerr 0.00380104 larg 0.53125 rarg 0.539062
 69: out 46 biased 12; lerr 0.00193497 rerr 0.00527137 larg 0.539062 rarg 0.546875
 70: out 44 biased 13; lerr 0.00628347 rerr 0.000789331 larg 0.546875 rarg 0.554688
 71: out 43 biased 13; lerr 0.00502921 rerr 0.00195312 larg 0.554688 rarg 0.5625
 72: out 42 biased 13; lerr 0.00390625 rerr 0.00298721 larg 0.5625 rarg 0.570312
 73: out 41 biased 13; lerr 0.00291271 rerr 0.00389343 larg 0.570312 rarg 0.578125
 74: out 40 biased 13; lerr 0.00204677 rerr 0.00467353 larg 0.578125 rarg 0.585938
 75: out 39 biased 13; lerr 0.00130667 rerr 0.00532924 larg 0.585938 rarg 0.59375
 76: out 38 biased 13; lerr 0.000690699 rerr 0.00586222 larg 0.59375 rarg 0.601562
 77: out 36 biased 14; lerr 0.0062566 rerr 0.000175461 larg 0.601562 rarg 0.609375
 78: out 35 biased 14; lerr 0.00592317 rerr 0.000428823 larg 0.609375 rarg 0.617188
 79: out 34 biased 14; lerr 0.00570878 rerr 0.000564416 larg 0.617188 rarg 0.625
 80: out 33 biased 14; lerr 0.00561191 rerr 0.000583717 larg 0.625 rarg 0.632812
 81: out 32 biased 14; lerr 0.00563109 rerr 0.000488162 larg 0.632812 rarg 0.640625
 82: out 31 biased 14; lerr 0.00576489 rerr 0.000279149 larg 0.640625 rarg 0.648438
 83: out 30 biased 14; lerr 0.00601191 rerr 4.19626E-05 larg 0.648438 rarg 0.65625
 84: out 30 biased 13; lerr 4.19626E-05 rerr 0.00589256 larg 0.65625 rarg 0.664062
 85: out 29 biased 13; lerr 0.00047385 rerr 0.00538852 larg 0.664062 rarg 0.671875
 86: out 28 biased 13; lerr 0.00101522 rerr 0.00477604 larg 0.671875 rarg 0.679688
 87: out 27 biased 13; lerr 0.00166483 rerr 0.00405633 larg 0.679688 rarg 0.6875
 88: out 26 biased 13; lerr 0.00242145 rerr 0.0032306 larg 0.6875 rarg 0.695312
 89: out 25 biased 13; lerr 0.00328389 rerr 0.0023 larg 0.695312 rarg 0.703125
 90: out 24 biased 13; lerr 0.00425098 rerr 0.00126568 larg 0.703125 rarg 0.710938
 91: out 23 biased 13; lerr 0.0053216 rerr 0.000128738 larg 0.710938 rarg 0.71875
 92: out 23 biased 12; lerr 0.000128738 rerr 0.00554953 larg 0.71875 rarg 0.726562
 93: out 22 biased 12; lerr 0.00110974 rerr 0.00424628 larg 0.726562 rarg 0.734375
 94: out 21 biased 12; lerr 0.0024487 rerr 0.00284339 larg 0.734375 rarg 0.742188
 95: out 20 biased 12; lerr 0.0038871 rerr 0.00134187 larg 0.742188 rarg 0.75
 96: out 19 biased 12; lerr 0.00542395 rerr 0.000257287 larg 0.75 rarg 0.757812
 97: out 19 biased 11; lerr 0.000257287 rerr 0.00488281 larg 0.757812 rarg 0.765625
 98: out 18 biased 11; lerr 0.00195312 rerr 0.00312603 larg 0.765625 rarg 0.773438
 99: out 17 biased 11; lerr 0.0037447 rerr 0.00127425 larg 0.773438 rarg 0.78125
 100: out 16 biased 11; lerr 0.00563109 rerr 0.000671612 larg 0.78125 rarg 0.789062
 101: out 16 biased 10; lerr 0.000671612 rerr 0.00426337 larg 0.789062 rarg 0.796875
 102: out 15 biased 10; lerr 0.00271068 rerr 0.00216607 larg 0.796875 rarg 0.804688
 103: out 14 biased 10; lerr 0.00484208 rerr 2.28884E-05 larg 0.804688 rarg 0.8125
 104: out 14 biased 9; lerr 2.28884E-05 rerr 0.00477319 larg 0.8125 rarg 0.820312
 105: out 13 biased 9; lerr 0.00230268 rerr 0.00243701 larg 0.820312 rarg 0.828125
 106: out 12 biased 9; lerr 0.00467248 rerr 1.1444E-05 larg 0.828125 rarg 0.835938
 107: out 12 biased 8; lerr 1.1444E-05 rerr 0.00467353 larg 0.835938 rarg 0.84375
 108: out 11 biased 8; lerr 0.00250271 rerr 0.00210469 larg 0.84375 rarg 0.851562
 109: out 10 biased 8; lerr 0.0051047 rerr 0.000551376 larg 0.851562 rarg 0.859375
 110: out 10 biased 7; lerr 0.000551376 rerr 0.00398129 larg 0.859375 rarg 0.867188
 111: out 9 biased 7; lerr 0.00329393 rerr 0.00118567 larg 0.867188 rarg 0.875
 112: out 9 biased 6; lerr 0.00118567 rerr 0.00564531 larg 0.875 rarg 0.882812
 113: out 8 biased 6; lerr 0.00169516 rerr 0.00271239 larg 0.882812 rarg 0.890625
 114: out 7 biased 6; lerr 0.0046605 rerr 0.000304507 larg 0.890625 rarg 0.898438
 115: out 7 biased 5; lerr 0.000304507 rerr 0.00403259 larg 0.898438 rarg 0.90625
 116: out 6 biased 5; lerr 0.00340469 rerr 0.00088176 larg 0.90625 rarg 0.914062
 117: out 6 biased 4; lerr 0.00088176 rerr 0.00514993 larg 0.914062 rarg 0.921875
 118: out 5 biased 4; lerr 0.00235119 rerr 0.00186722 larg 0.921875 rarg 0.929688
 119: out 4 biased 4; lerr 0.00566562 rerr 0.00149648 larg 0.929688 rarg 0.9375
 120: out 4 biased 3; lerr 0.00149648 rerr 0.00265532 larg 0.9375 rarg 0.945312
 121: out 3 biased 3; lerr 0.00494055 rerr 0.0008372 larg 0.945312 rarg 0.953125
 122: out 3 biased 2; lerr 0.0008372 rerr 0.00324937 larg 0.953125 rarg 0.960938
 123: out 2 biased 2; lerr 0.00440902 rerr 0.000370094 larg 0.960938 rarg 0.96875
 124: out 2 biased 1; lerr 0.000370094 rerr 0.00365258 larg 0.96875 rarg 0.976562
 125: out 1 biased 1; lerr 0.00406783 rerr 9.20338E-05 larg 0.976562 rarg 0.984375
 126: out 1 biased 0; lerr 9.20338E-05 rerr 0.00386801 larg 0.984375 rarg 0.992188
 127: out 0 biased 0; lerr 0.00391391 rerr 0 larg 0.992188 rarg 1
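The rsqrt error columns can be spot-checked in the same way. This sketch assumes that in[6] = exp[0] splits the index range so that entries 0..63 cover [0.25, 0.5) in steps of 1/256 and entries 64..127 cover [0.5, 1) in steps of 1/128, that the estimate is (128 + out) / 128, and that the error is |estimate * sqrt(x) - 1|. The function names are hypothetical.

```cpp
#include <cmath>

// Left edge (larg) of the input interval for table index `in` (0..127),
// under the assumed split on in[6].
double rsqrt_larg(int in) {
    return in < 64 ? 0.25 + in / 256.0 : 0.5 + (in - 64) / 128.0;
}

// Relative error of the estimate 1 + out/128 against 1/sqrt(x).
double rsqrt_err(double x, int out) {
    double est = (128 + out) / 128.0;
    return std::fabs(est * std::sqrt(x) - 1.0);
}
```

For row 70 (out 44), rsqrt_err(rsqrt_larg(70), 44) evaluates to roughly 0.0062835, in line with the lerr printed for that row.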

 ... [removed hex data dumping]

max recip 7x7 error at 0.519531: 0.00558472 or 2^-7.4843
max rsqrt 7x7 error at 0.546875: 0.00628347 or 2^-7.31422
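The power-of-two form of each maximum is just the base-2 logarithm of the error. A small sketch, with error_log2 as a hypothetical helper name:

```cpp
#include <cmath>

// Exponent e such that err == 2^e; for errors below 1 the result is negative.
double error_log2(double err) {
    return std::log2(err);
}
```

Calling error_log2 on 0.00558472 and 0.00628347 gives approximately -7.4843 and -7.3142, so both tables are accurate to a little better than 7 bits.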


On 2020-08-03 1:17 p.m., Bill Huffman wrote:

I should have said that my results are for the 7/7 case.  And it sounds like we're in agreement then.  We probably have the same table.

      Bill

On 8/2/20 9:50 AM, DSHORNER wrote:

This is the link to the revised code, which generates an n-by-m LUT:


https://github.com/David-Horner/recip/blob/master/vrecip.cc

On 2020-08-01 4:51 p.m., David Horner via lists.riscv.org wrote:

