Brief inquiry about BFloat16
[Re-sending; I accidentally sent this from an email address not recognized by the group.]
Subject: Re: [RISC-V] [tech] Brief inquiry about BFloat16
Date: September 1, 2020 at 8:46:01 PM GMT+8
To: Anastasia Butko <abutko@...>
Another article providing an independent assessment of 16-bit posits comes from Oxford University; it found that they do a much better job than 16-bit floats for weather and climate models:
An article on a posit hardware generator that seems to be attracting citations, and successfully applies posits to FIR filters, is here:
A fairly recent study of posits for deep neural networks that found them advantageous is here:
Another one about 16-bit posits for deep neural networks was done at University of Pisa:
Screen shot of the summary of that last one:
At last count, I knew of 9 projects worldwide to incorporate posits into RISC-V. One of the better recent efforts is CLARINET:
Another is startup Esperanto:
The latter tends to play down their use of posits "under the hood", but I see this on the Twitter page of one of their senior people:
Maybe I shouldn't leave out ThoughtWorks, with its "Posit Enhanced Rocket Chip":
I think you get the idea. I had nothing to do with any of these papers or efforts. People are noticing that posits and RISC-V and neural networks are a natural fit.
David Patterson mentioned that tapered precision had been tried in the past. Actually, it was tried only once, by Morris, and the Morris approach was analyzed a couple of years ago to show why it failed. See pages 50 to 57 of
The Morris approach produced many redundant representations of the same quantity (even more than IEEE 754), and worse, those representations are chaotically ordered.
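To see where that redundancy comes from, here is a toy sketch of a Morris-style tapered format. The layout below (a 12-bit word, a 2-bit field G giving an exponent width of G+1 bits, and a width-dependent bias) is my own simplification for illustration, not Morris's exact 1971 encoding:

```python
# Toy Morris-style tapered float: a small field G selects the exponent
# width, and the fraction gets whatever bits are left over.
def decode(word: int, width: int = 12, gbits: int = 2) -> float:
    g = (word >> (width - gbits)) & ((1 << gbits) - 1)
    e_width = g + 1                       # exponent field width
    f_width = width - gbits - e_width     # fraction field width
    exponent = (word >> f_width) & ((1 << e_width) - 1)
    fraction = word & ((1 << f_width) - 1)
    bias = 1 << (e_width - 1)             # bias depends on the chosen width
    return (1 + fraction / (1 << f_width)) * 2.0 ** (exponent - bias)

# Because every choice of G can represent an unbiased exponent of 0,
# the value 1.0 has one distinct encoding per G value.
encodings_of_one = [w for w in range(1 << 12) if decode(w) == 1.0]
print(len(encodings_of_one))   # 4 redundant encodings of 1.0
```

Posits avoid this by making the regime a self-delimiting unary field, so every value has exactly one encoding and the encodings sort like signed integers.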
The paper "Posits: The Good, The Bad, and The Ugly" was authored by Florent de Dinechin and his students; Florent is a member of the IEEE 754 committee. He is a brilliant guy and I have a lot of respect for him, but his desire to preserve the legacy of IEEE 754 tends to color his comparisons. For example, there are bounds on the relative error of multiplication for floats, and he sees no equivalent for posits. A more careful reading shows that the float bounds apply only to floats that cannot overflow or underflow when multiplied, which restricts you to half of all floats. If posits are similarly restricted to half of all possible values, those values have more significant digits than floats and also have a bound on their relative error. And it's a smaller bound (16 times smaller for 32-bit posits versus 32-bit floats).
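For what it's worth, here is the back-of-envelope arithmetic behind that factor of 16, assuming a posit⟨32,2⟩ and values near 1.0 (my sketch of the bit accounting, not taken from Florent's paper):

```python
# binary32: 1 sign + 8 exponent + 23 stored fraction bits
#   -> 24 significant bits, so the worst-case relative rounding
#      error (half an ulp) is 2**-24.
# posit<32,2> near 1.0: 1 sign + 2 regime + 2 exponent + 27 fraction
#   -> 28 significant bits, so the half-ulp bound is 2**-28.
float32_bound = 2.0 ** -24
posit32_bound = 2.0 ** -28
print(float32_bound / posit32_bound)   # 16.0: the posit bound is 16x smaller
```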
There is a classic paradigm change going on here, where venerable octogenarians and their students (and institutional colleagues) try to suppress new ideas. I went through this in the 1980s when I said that parallel processing was the way forward, and venerable experts told me that Amdahl's law showed that parallel processing can't work. The posit revolution reminds me of the parallel processing revolution, except… posits caught fire and spread without my help, whereas parallel programming still meets with resistance by many, decades later.