Brief inquiry about BFloat16


John Gustafson
 

[Re-sending. Accidentally sent it from an email not recognized by the group.]

Subject: Re: [RISC-V] [tech] Brief inquiry about BFloat16
Date: September 1, 2020 at 8:46:01 PM GMT+8
To: Anastasia Butko <abutko@...>

Thanks, Anastasia,

Another article providing an independent assessment of 16-bit posits comes from Oxford University; it found that they do a much better job than 16-bit floats for weather and climate models:


An article on a posit hardware generator that seems to be attracting citations, and that successfully applies posits to FIR filters, is here:


A fairly recent study of posits for deep neural networks that found them advantageous is here:


Another one, about 16-bit posits for deep neural networks, was done at the University of Pisa:

Screen shot of the summary of that last one:


At last count, I knew of 9 projects worldwide to incorporate posits into RISC-V. One of the better recent efforts is CLARINET:


Another is the startup Esperanto:


The latter tends to play down their use of posits "under the hood", but I see this on the Twitter page of one of their senior people:


Maybe I shouldn't leave out ThoughtWorks, with its "Posit Enhanced Rocket Chip":


I think you get the idea. I had nothing to do with any of these papers or efforts. People are noticing that posits and RISC-V and neural networks are a natural fit.

David Patterson mentioned that tapered precision had been tried in the past. Actually it was tried only once, by Morris, and the Morris approach was analyzed a couple of years ago to show why it failed. See pages 50 to 57 of


The Morris approach produced many redundant representations of the same quantity, even more than IEEE 754 does, and to make matters worse, they are chaotically ordered.
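
To make the redundancy concrete, here is a toy sketch in Python (my own illustration of the general idea, not Morris's exact bit layout): a tapered format with an explicit field saying how many of the remaining bits are exponent lets the same value decode from several different bit patterns.

def decode(exp_width, bits, total=8):
    # 'bits' holds exp_width exponent bits followed by (total - exp_width) fraction bits
    frac_bits = total - exp_width
    exponent = bits >> frac_bits
    fraction = bits & ((1 << frac_bits) - 1)
    return (1 + fraction / (1 << frac_bits)) * 2.0 ** exponent

# 1.5 encoded with a 1-bit exponent field (exponent 0, fraction 1000000b) ...
print(decode(1, 0b0_1000000))   # 1.5
# ... and again with a 3-bit exponent field (exponent 000, fraction 10000b):
# same value, different bit pattern, so encodings are wasted on duplicates.
print(decode(3, 0b000_10000))   # 1.5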

The paper "Posits: The Good, The Bad, and The Ugly" was authored by Florent De Dinechin and his students, and Florent is one of the members of the IEEE 754 committee. He is a brilliant guy and I have a lot of respect for him, but his desire to preserve the legacy of IEEE 754 tends to color his comparisons. As an example, there are bounds on the relative error of multiplication for floats, and he does not see the equivalent for posits. A more careful reading shows the float bounds only apply to floats that cannot overflow or underflow when multiplied, which restricts you to half of all floats. If posits also are similarly restricted to half of all possible values, those values have more significant digits than floats and also have a bound on their numerical error. And it's a smaller bound (16 times smaller for 32-bit posits versus 32-bit floats).

There is a classic paradigm change going on here, where venerable octogenarians and their students (and institutional colleagues) try to suppress new ideas. I went through this in the 1980s, when I said that parallel processing was the way forward and the experts told me that Amdahl's law showed parallel processing couldn't work. The posit revolution reminds me of the parallel processing revolution, except that posits caught fire and spread without my help, whereas parallel programming still meets with resistance from many, decades later.

John


On Sep 1, 2020, at 2:23 AM, Anastasia Butko <abutko@...> wrote:

Hi All,

Here is a recent study on the numerical stability of posit arithmetic. It shows "that Posits do not consistently improve stability across a wide range of matrices, but demonstrate that a simple rescaling of the underlying matrix improves convergence rates for Conjugate Gradient Method and reduces backward error for Cholesky Factorization. We also demonstrate that 16-bit Posit outperforms Float16 for mixed precision iterative refinement".

John Shalf is the corresponding author in case you want to know more about this study.


Best,

Anastasiia Butko
Lawrence Berkeley National Laboratory
Computer Science Department
+1 510 486 5227

On Aug 26, 2020, at 11:33 PM, gouyue via lists.riscv.org <gouyue=huawei.com@...> wrote:

Hi All!

Bfloat16 is a custom 16-bit floating-point format for machine learning, widely used in AI scenarios. We hope to implement BF16 operations on RISC-V; are there any plans to propose instructions that support BF16?

Thanks
GouYue

<PositPaper.pdf>





