I always found the Wikipedia examples for 16-bit floating point helpful since the numbers are smaller. You can really see how the exponent and fraction affect each other in a very simple way.
BFloat16 [1] is an interesting tweak on the original 16-bit format, built with more modern ML requirements in mind.
Particularly because BFloat16 is easy to think of as 32-bit IEEE-754 with fewer bits to think about (and you can probably enumerate the entire range on a laptop to test something like "will this function work for all values?").
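A rough sketch of what that exhaustive test could look like in Python. It assumes BFloat16 is treated as the top 16 bits of a float32 and that conversion simply truncates (real hardware usually rounds to nearest even). The helper names are made up for illustration and do not come from any library.

```python
import math
import struct

def bf16_bits_to_float(bits: int) -> float:
    # A bfloat16 pattern is the upper half of a float32 pattern,
    # so pad it with 16 zero bits and reinterpret.
    return struct.unpack(">f", struct.pack(">I", (bits & 0xFFFF) << 16))[0]

def float_to_bf16_bits(x: float) -> int:
    # Assumption: plain truncation of the low 16 bits of the float32 encoding.
    return struct.unpack(">I", struct.pack(">f", x))[0] >> 16

def failing_patterns(prop):
    # Try prop on every non-NaN bfloat16 value; return the bit patterns that fail.
    bad = []
    for bits in range(1 << 16):
        x = bf16_bits_to_float(bits)
        if math.isnan(x):
            continue
        if not prop(x):
            bad.append(bits)
    return bad

# Example property: converting to bfloat16 and back changes nothing.
round_trip_failures = failing_patterns(
    lambda x: bf16_bits_to_float(float_to_bf16_bits(x)) == x
)
```

The loop only has 65,536 patterns to visit, so it finishes almost instantly; the same approach on float32 would need about 4.3 billion iterations.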
8-bit floating point is even easier to understand, since you can list all the values on a single page, and even visualize the entire addition and multiplication tables.
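To make that concrete, here is a small sketch that lists every 8-bit float. The layout is an assumption (1 sign bit, 3 exponent bits, 4 fraction bits, bias 3, IEEE-style infinities, NaNs and subnormals); actual 8-bit formats differ in how they split the bits and whether they keep infinities at all.

```python
import math

EXP_BITS, MAN_BITS = 3, 4           # assumed layout: 1 sign + 3 exponent + 4 fraction
BIAS = (1 << (EXP_BITS - 1)) - 1    # 3, following the IEEE-754 convention

def decode_fp8(bits: int) -> float:
    sign = -1.0 if (bits >> (EXP_BITS + MAN_BITS)) & 1 else 1.0
    exp = (bits >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    man = bits & ((1 << MAN_BITS) - 1)
    if exp == (1 << EXP_BITS) - 1:          # all-ones exponent: infinity or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                            # zero and subnormals (no implicit 1)
        return sign * (man / (1 << MAN_BITS)) * 2.0 ** (1 - BIAS)
    return sign * (1 + man / (1 << MAN_BITS)) * 2.0 ** (exp - BIAS)

values = [decode_fp8(b) for b in range(256)]

# One page: 16 rows of 16 values.
for row in range(16):
    print(" ".join(f"{v:>9.5g}" for v in values[row * 16:(row + 1) * 16]))

# Full 256 x 256 addition table, indexed by bit pattern
# (sums are exact Python floats, not rounded back to 8 bits).
add_table = [[a + b for b in values] for a in values]
```

The multiplication table is the same comprehension with `a * b`, and either table is small enough to feed straight into a heatmap.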
When talking about tiny 8-bit floats, reserving the all-ones exponent for infinities and NaNs does waste a lot: if your exponent is only 3 bits, you've "wasted" 1/8 of all 256 possible values. With normal-sized floats, it's much less of an issue: 1/256 of the billions of possible 32-bit values, and 1/2048 of all possible 64-bit values.
(Also, the real "waste" is only on the multiple NaN values, since the zeros always "waste" only a single value for the "negative zero", and the infinities always "waste" only two values; AFAIK, both negative zero and the infinities are necessary for stability of some calculations.)
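The arithmetic behind those fractions can be checked with a few lines of Python. This is a sketch that assumes an IEEE-style layout (1 sign bit, all-ones exponent reserved for infinities and NaNs); `reserved_patterns` is a hypothetical helper name.

```python
from fractions import Fraction

def reserved_patterns(total_bits: int, exp_bits: int) -> dict:
    man_bits = total_bits - 1 - exp_bits          # 1 bit goes to the sign
    reserved = 2 * (1 << man_bits)                # both signs, every fraction value
    return {
        "fraction_of_all": Fraction(reserved, 1 << total_bits),
        "infinities": 2,                          # fraction field all zeros
        "nans": reserved - 2,                     # every other reserved pattern
    }

formats = [
    ("8-bit, 3-bit exponent", 8, 3),
    ("bfloat16", 16, 8),
    ("float32", 32, 8),
    ("float64", 64, 11),
]
for name, total, exp in formats:
    print(name, reserved_patterns(total, exp))
```

The fraction depends only on the exponent width (1 / 2^exp_bits), and the two infinities are a fixed cost, so nearly all of the reserved patterns end up as NaNs.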
[1] - https://en.wikipedia.org/wiki/Bfloat16_floating-point_format...