by ErsatzVerkehr 4490 days ago
As a "specification" this document is laughable. For example, rounding modes and overflow behavior are not addressed. The comment that object pointers can be stuffed into the coefficient field (usually called 'mantissa') is completely non-sequitur. Frankly I am surprised to see such a big name behind it.

I imagine this project is inspired by the sad state of numerical computing in Javascript, but this proposal will surely only make it worse. The world certainly doesn't need a new, incompatible, poorly-thought-out floating point format.

Compare the level of thought and detail in this "specification" to the level of thought and detail in this famous summary overview of floating point issues: https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numer... ("What every computer scientist should know...")

> DEC64 is intended to be the only number type in the next generation of application programming languages.

Jesus, I certainly hope not.


I would hate if this became the only numeric type available in JS (in fact, I don't want it at all), but it's dishonest to quote "specification" as you have and then dismiss it as such, as it doesn't claim to be one, the word doesn't appear a single time in it, and if you go through to the github repo, the readme explicitly refers to the page as a "descriptive web page".

In fact, the only line of actual substance in your post is "For example, rounding modes and overflow behavior are not addressed", and it turns out that's only true for the descriptive web page, not the reference implementation.

> I imagine this project is inspired by the sad state of numerical computing in Javascript, but this proposal will surely only make it worse. The world certainly doesn't need a new, incompatible, poorly-thought-out floating point format.

I'll just quote pg here:

> Yeah, we know that. But is that the most interesting thing one can say about this article? Is it not at least a source of ideas for things to investigate further?

> The problem with the middlebrow dismissal is that it's a magnet for upvotes. The "U R a fag"s get downvoted and end up at the bottom of the page where they cause little trouble. But this sort of comment rises to the top. Things have now gotten to the stage where I flinch slightly as I click on the "comments" link, bracing myself for the dismissive comment I know will be waiting for me at the top of the page.

This format is suicide from a numerical stability POV. It will lose 3 to 4 bits of precision for an operation with mixed exponents, whereas double will only lose one. Any kind of marginally stable problem, say eigenvalues with poorly conditioned data, will yield crap.
Can you explain this point in more detail? And maybe give an example? As someone who is not an expert in this I can't follow your argument.
I'd like an explanation also.

From my own understanding, an operation will lose precision iff (*) the result cannot be represented with 52 significant coefficient bits. Logically, the same happens in IEEE754; the difference is that the loss of precision in DEC64 is always a multiple of about 3.3 bits (one decimal digit), whereas IEEE754 can lose precision in decrements of one bit.

(*) Maybe excluding over/underflow scenarios.

To clarify, what you say sounds like (I'm pretty sure that's not what you meant) every operation with mixed exponents will lead to loss of precision. From my understanding, that is not the case. It also sounds like every IEEE754 operation with mixed exponents only leads to one bit of precision loss, while it could lead to much more (in fact, up to <total number of fraction bits between the two IEEE754 doubles> - 52).
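A minimal sketch of the point being argued here, assuming a decimal format with a 56-bit signed coefficient. It is not taken from dec64.asm; the helper names `add` and `div_half_away` are made up for illustration. It shows that a mixed-exponent addition is exact when the scaled coefficient still fits, and otherwise discards whole decimal digits (about 3.32 bits each).

```python
import math

COEF_MAX = 2**55 - 1  # largest positive value of a 56-bit signed coefficient


def div_half_away(n, d):
    """Divide n by d, rounding to nearest with ties away from zero."""
    q, r = divmod(abs(n), d)
    if 2 * r >= d:
        q += 1
    return q if n >= 0 else -q


def add(c1, e1, c2, e2):
    """Return (coefficient, exponent, digits_dropped) for c1*10**e1 + c2*10**e2.

    Illustrative only: a real implementation may align operands differently.
    """
    if e1 < e2:
        c1, e1, c2, e2 = c2, e2, c1, e1
    # Exact step: scale the larger-exponent operand while it still fits.
    while e1 > e2 and abs(c1) * 10 <= COEF_MAX:
        c1 *= 10
        e1 -= 1
    # Lossy step: discard low digit positions of the other operand.
    dropped = e1 - e2
    if dropped:
        c2 = div_half_away(c2, 10**dropped)
    total = c1 + c2
    # If the sum itself no longer fits, one more digit position goes.
    if abs(total) > COEF_MAX:
        total = div_half_away(total, 10)
        e1 += 1
        dropped += 1
    return total, e1, dropped


BITS_PER_DIGIT = math.log2(10)  # roughly 3.32 bits per discarded digit

print(add(1, 0, 5, -1))            # mixed exponents, nothing discarded
print(add(10**16, 0, 12345, -3))   # three digit positions discarded
```

In this sketch the granularity of any loss is one decimal digit, which is where the "3 to 4 bits" figure in the thread comes from; whether a given operation loses anything depends on the operands.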

OK. It looks like the proposed representation represents numbers as a signed 56 bit integer times ten to a signed 8-bit number. That does avoid fiddling with BCD, but it's VERY different from usual floating point. Look at the neighborhood of zero. As with usual floating point there's a gap between zero and the smallest nonzero number... but unlike floating point numbers, the next larger nonzero number is twice the smallest nonzero number. The behavior of numerical methods in DEC64 will, I suspect, be quite different from that of floating point. It would be very interesting to know what the difference is, and I'd hope that proponents of it as "the only number type in the next generation of application programming languages" would exercise due diligence in that and other regards.
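A rough sketch of the layout described above, assuming the coefficient occupies the high 56 bits and the exponent the low 8 bits of a 64-bit word, with exponent -128 reserved for nan. The function names are hypothetical. It compares the gap just above the smallest positive DEC64 value with the gap just above the smallest normal IEEE 754 double (subnormal doubles are deliberately left out of the comparison).

```python
import sys
from fractions import Fraction

MASK64 = (1 << 64) - 1


def dec64_decode(word):
    """Split a 64-bit word into (coefficient, exponent), both signed."""
    word &= MASK64
    exponent = word & 0xFF
    if exponent >= 0x80:
        exponent -= 0x100
    coefficient = word >> 8
    if coefficient >= 1 << 55:
        coefficient -= 1 << 56
    return coefficient, exponent


def dec64_value(word):
    """Exact value of a word as a Fraction, or None for nan."""
    coefficient, exponent = dec64_decode(word)
    if exponent == -128:
        return None
    return Fraction(coefficient) * Fraction(10) ** exponent


# Smallest positive value and its successor: coefficients 1 and 2, exponent -127.
smallest = (1 << 8) | (-127 & 0xFF)
successor = (2 << 8) | (-127 & 0xFF)
dec64_ratio = dec64_value(successor) / dec64_value(smallest)

# Smallest normal double and the next double above it.
dbl_min = sys.float_info.min
dbl_next = dbl_min * (1 + sys.float_info.epsilon)
double_ratio = Fraction(dbl_next) / Fraction(dbl_min)

print(dec64_ratio)          # successor is twice the smallest value
print(float(double_ratio))  # successor is only marginally larger
```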
So the takeaway is that "one type to rule them all" in a scripting language won't really work until 128 bit machines are mainstream?
I completely accept and agree with your criticism of my criticism.

My gut sense is that this proposal is simply too amateurish to be worth the effort of thoroughly debunking. Floating point is kind of like cryptography: the pitfalls are subtle and the consequences of getting it wrong are severe (rockets crashing, etc). Leave it to the experts. This is not a domain where you want to "roll your own."

> rounding modes and overflow behavior are not addressed

He provides a reference implementation. That means this and many other details are defined by code. Quoting dec64.asm: "Rounding is to the nearest value. Ties are rounded away from zero. Integer division is floored."
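For readers unfamiliar with the quoted rules, here is a small illustration using Python's standard `decimal` module rather than DEC64 itself. The helper `round_ties_away` is a name invented for this example. It contrasts "nearest, ties away from zero" with Python's default ties-to-even rounding, and floored division with truncating division.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_ties_away(text):
    """Round a decimal string to an integer: nearest, ties away from zero."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Ties away from zero versus Python's built-in ties-to-even.
away_pos = round_ties_away("2.5")    # 3
away_neg = round_ties_away("-2.5")   # -3
even_pos = round(2.5)                # 2

# Floored integer division versus truncation toward zero.
floored = -7 // 2                    # -4
truncated = int(-7 / 2)              # -3

print(away_pos, away_neg, even_pos, floored, truncated)
```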

> Compare the level of thought and detail in this "specification" to ... [Goldberg]

I don't think this comparison is fair. David Goldberg's text is an introduction to the topic. Douglas Crockford describes an idea and gives you a reference implementation.

Said reference implementation is written in x86 assembly. 1261 lines, 650 lines sans comments and blanks.

To be fair it is extensively commented, but the comments describe what it does, not why. And for fuck's sake, hundreds of lines of assembly is not a spec, even if it is the most readable code in the world.

> He provides a reference implementation. That means this and many other details are defined by code.

I dream of a day when we stop thinking of "reference implementations" as proper specifications. The whole concept of a "reference implementation" leads to an entire class of nightmarish problems.

What happens if there is an unintentional bug in the reference implementation that causes valid (but incorrect) output for certain inputs? What if feeding it certain input reliably produces a segfault? Does that mean that other implementations should mimic that behavior? What if a patch is issued to the reference implementation that changes the behavior of certain inputs? Is this the same as issuing an amendment to the specification? Does this mean that other implementations now need to change their behavior as well?

Or, worse, what if there is certain behavior that should be explicitly left undefined in a proper specification? A reference implementation cannot express this concept - by definition, everything is defined based on the output that the reference implementation provides.

Finally, there's the fact that it takes time and effort to produce a proper specification, and this process usually reveals (to the author) complexities about the problem and edge cases that may not become apparent simply by providing a reference implementation.

Bitcoin suffers from this problem. In fact, some people in the community frown upon attempts to make mining software or clients when bitcoin-qt/bitcoind can just be used instead, for the exact reasons you mentioned.

The spec for a valid transaction in Bitcoin can currently only be defined as "a transaction that bitcoin-qt accepts." The problem is magnified by how disorganized the source code is.

I suppose that Crockford is just tossing around this idea to test the waters. If he were serious, he would have to deliver a written spec of course!
On the other hand, if you don't have a reference specification you shouldn't declare something standard.

(re: all the web standards that had very wide "interpretation" by different browser efforts, leading to chaos and a whole industry based on fear and uncertainty)

But nobody did, you're the first person in the thread to mention the word "standard". You might be confusing specification with standard.
And what if the code has a bug? Code is also difficult to analyze. The people who do numerical computing need to prove theorems about what their algorithms produce.
> The BASIC language eliminated much of the complexity of FORTRAN by having a single number type. This simplified the programming model and avoided a class of errors caused by selection of the wrong type. The efficiencies that could have gained from having numerous number types proved to be insignificant.

I don't agree with that, and I don't think BASIC has much (if anything) to offer in terms of good language design.

I tried finding out whether that "having a single number type" was actually true (the microcomputer versions used a percentage sign suffix (I%, J%) to denote integers).

From http://bitsavers.trailing-edge.com/pdf/dartmouth/BASIC_Oct64..., that appears to be the case.

Off topic: in that PDF (page 4) the letter "Oh" is distinguished from the numeral "Zero" by having a diagonal slash through the "Oh". Yes, that program printed "NØ UNIQUE SØLUTIØN".

That made me think of the periodic rants here on HN about the supposedly neigh insurmountable inconsistencies in mathematical notation.

Not all microcomputer versions of BASIC used the percent sign to designate integers. The one I grew up using (Microsoft BASIC on the TRS-80 Color Computer) used only one representation for numbers, floating point (5 byte value, not IEEE 754 based).
Probably mean 'nigh'. Neigh is a sound horses make.
Yep, let's see whether I can blame iOS autocorrect: nigh. No. Thanks for the feedback.
or nay...
> The comment that object pointers can be stuffed into the coefficient field (usually called 'mantissa') is completely non-sequitur.

Why? A similar thing is being used by some JavaScript and Lua implementations. It's called NaN boxing.
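A minimal sketch of the NaN boxing idea for IEEE 754 doubles, working on raw 64-bit words as an interpreter might. The 48-bit payload width and the names `box`, `is_boxed` and `unbox` are assumptions made for this example, not the scheme of any particular JavaScript or Lua engine.

```python
import math
import struct

QNAN_BITS = 0x7FF8_0000_0000_0000   # exponent all ones plus the quiet bit
PAYLOAD_MASK = (1 << 48) - 1        # assumed payload width for this sketch


def box(payload):
    """Hide a small integer (for example a pointer) in a quiet NaN bit pattern."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 48 bits")
    return QNAN_BITS | payload


def is_boxed(word):
    """True if the word carries the quiet-NaN tag used by box()."""
    return word & QNAN_BITS == QNAN_BITS


def unbox(word):
    """Recover the payload from a boxed word."""
    return word & PAYLOAD_MASK


def as_float(word):
    """Reinterpret a 64-bit word as a double."""
    return struct.unpack("<d", struct.pack("<Q", word))[0]


word = box(0x1234_5678)
print(math.isnan(as_float(word)))   # the hardware just sees a NaN
print(hex(unbox(word)))             # the runtime can still get the payload back

ordinary = struct.unpack("<Q", struct.pack("<d", 1.5))[0]
print(is_boxed(ordinary))           # an ordinary double is not tagged
```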

At the least, overflow is addressed:

> nan is also the result of operations that produce results that are too large to be represented.

It's addressed, but in the wrong way. IEEE 754 has positive and negative infinity for a reason. Why do there have to be 255 zero values? Also, what about rounding modes? Floating point math is really, really hard and this specification makes it look too easy.
They want to allow fast "addition with equal exponents" in a single cycle and that requires a zero value for each possible exponent.
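A sketch of why a zero per exponent helps, assuming the layout of coefficient in the high 56 bits and exponent in the low 8 bits. The names `encode`, `decode` and `fast_add` are hypothetical, and the overflow check (which per the quoted text would yield nan) is left out. When both exponent bytes match, the sum is one integer addition, and a zero carrying that same exponent takes the same path with no special case.

```python
MASK64 = (1 << 64) - 1


def encode(coefficient, exponent):
    """Pack a signed coefficient and signed exponent into a 64-bit word."""
    return ((coefficient << 8) | (exponent & 0xFF)) & MASK64


def decode(word):
    """Unpack a 64-bit word into (coefficient, exponent), both signed."""
    exponent = word & 0xFF
    if exponent >= 0x80:
        exponent -= 0x100
    coefficient = word >> 8
    if coefficient >= 1 << 55:
        coefficient -= 1 << 56
    return coefficient, exponent


def fast_add(x, y):
    """Add two words whose exponents match; return None otherwise.

    Clearing the exponent byte of x lets a plain integer add sum the
    coefficients while y's exponent byte passes through unchanged.
    Coefficient overflow is not checked in this sketch.
    """
    if (x ^ y) & 0xFF:
        return None  # exponents differ: the slower alignment path is needed
    return ((x & ~0xFF) + y) & MASK64


# One zero for each usable exponent (-128 is reserved for nan).
zeros = [encode(0, e) for e in range(-127, 128)]

print(len(zeros))                                       # number of distinct zeros
print(decode(fast_add(encode(0, -2), encode(125, -2)))) # zero plus 1.25
print(decode(fast_add(encode(-5, 3), encode(7, 3))))    # -5000 plus 7000
```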
I see, that makes sense.
Where's the spec? I can't find it. Certainly the linked web page isn't the spec...
Not a spec, but the reference implementation here: https://github.com/douglascrockford/DEC64
Thanks, I saw that. I thought maybe Ersatz was complaining about a spec I couldn't find. It seems he may be addressing the linked description instead.
This isn't a specification, and it's laughable that you think so.

It's a polemic, designed to try and propagate an idea and change minds.

It didn't change my mind much, but I found your comment more affecting - in the humorous!