From the comment in protobuf source (which does the same thing as Python), mentioned in the Twitter thread:
(...) An arguably better strategy would be to use the algorithm described in "How to Print Floating-Point Numbers Accurately" by Steele & White, e.g. as implemented by David M. Gay's dtoa(). It turns out, however, that the following implementation is about as fast as DMG's code. Furthermore, DMG's code locks mutexes, which means it will not scale well on multi-core machines. DMG's code is slightly more accurate (in that it will never use more digits than necessary), but this is probably irrelevant for most users.
Rob Pike and Ken Thompson also have an implementation of dtoa() in third_party/fmt/fltfmt.cc. Their implementation is similar to this one in that it makes guesses and then uses strtod() to check them. (...)
The C/C++ standards do not require formatting to round correctly or even be portable. I recently had an issue where a developer used this method to round floats for display, and there were differences on PC and on Mac. It literally rounded something like 18.25 to 18.2 on one platform and 18.3 on the other. This led to all sorts of other bugs as some parts of the program used text to transmit data, which ended up in weird states.
The culprit was this terrible method. If you want anything approaching consistency or predictability, do not use formatting to round floating point numbers. Pick a numerically stable method, which will be much faster of done correctly.
Coincidentally, C/C++ do not require any of their formatting and parsing routines to round-trip floating point values correctly (except the newly added hex formatted floats which are a direct binary representation, and some newly added function allowing an obscure trick I do not recall at the moment... )
"""PyOS_snprintf() and PyOS_vsnprintf() wrap the Standard C library functions snprintf() and vsnprintf(). Their purpose is to guarantee consistent behavior in corner cases, which the Standard C functions do not."""
The python wrapper also does not specify in this case, so you should not use them for rounding, or you will have the same problem. No where on that page does it specify proper rounding will be cross platform.
Simply do it with floats. There are perfectly good, numerically stable, fast rounding methods, that avoid all this nonsense.
Sure, PyOS_snprintf isn't a standard library function, but it's a thin wrapper to snprintf, which is. Python/mysnprintf.c is the location of the:
snprintf() wrappers. If the platform has vsnprintf, we use it, else we
emulate it in a half-hearted way. Even if the platform has it, we wrap
it because platforms differ in what vsnprintf does in case the buffer
is too small:
It mentions that one corner cases what happens when the buffer is too small. Not rounding issues.
The "the best ways to go about it" comment links to the protobuf code, which also uses snprintf.
> For e, E, f, F, g, and G conversions, if the number of significant decimal digits is at most DECIMAL_DIG, then the result should be correctly rounded. If the number of significant decimal digits is more than DECIMAL_DIG but the source value is exactly representable with DECIMAL_DIG digits, then the result should be an exact representation with trailing zeros. Otherwise, the source value is bounded by two adjacent decimal strings L<U, both having DECIMAL_DIG significant digits; the value of the resultant decimal string D should satisfy L ≤ D ≤ U, with the extra stipulation that the error should have a correct sign for the current rounding direction.
So either 1) "The C/C++ standards do not require formatting to round correctly or even be portable.", in which case Python and protobuf are doing it wrong and somehow this issue was never detected, or 2) The C/C++ standards do require correct rounding, but the case described by ChrisLomont didn't quite meet the spec requirements to get precision and rounding modes to match across platforms. Or 3), I don't know what I'm talking about.
"correctly rounded" is implementation defined is the problem. You cannot do it portably, and you cannot query it portably. As such, different platforms, compilers, etc do it differently. Thus using formatting for rounding is inconsistent.
Here's [1] where you can query the current floating-point environment in C: "Specifics about this type depend on the library implementation".
Here's [2] where you can set some rounding modes in C++: "Additional rounding modes may be supported by an implementation.". Note this does not have by default bankers rounding which is used to make many scientific calculations more stable (lowers errors and drift in accumulated calculations). Many platforms do this by default, but it's not in the standard.
You can chase down this rabbit hole. I (and several others) did during the issue on the last project, and got to where it was well-known in numerics circles that this is not a well-defined process in C/C++. If it were, printing and parsing should round-trip, and it does not before a recent C++ addition, and now it only is guaranteed in a special case.
They document their goal is correctness in edge cases that other standard C functions don’t guarantee. Seems obvious to say this is “one of the best” ways to go about it. It’s absolutely true given this stated and documented goal of correctness, which would be a very commonly needed property.
Other good ways could trade-off edge case comprehensiveness for performance or whatever. That doesn’t make this way less good.
IEEE754 defines 5 rounding modes. This one sounds like nearest, ties to even. Not all decimal values are representable as float. Depending on if you compile for x87 (80-bit internal repreaentation) or SSE (64-bit) you might get slightly different results.
I'm well aware of that, having written at length about floating-point tricks, numerical issues, etc.
The issue here is you don't know what a library that formats a float does, and is the function is not specified clearly (as in C/C++), you have zero way of knowing what you will get.
Thus I said to do it yourself, using proper numerics.
They already had the same float, though. That's not very likely if the rounding modes were different.
For what it's worth, it looks like different standard libraries make different choices on whether float->string conversion cares about the current rounding mode.
At some point in 3.x Python moved to "bankers rounding" which is slightly less biased than the one we learn in school, perhaps C++ did the same. Might be a factor in the discrepancy.
Yep, I'd be curious what any better alternative is.
Consider that a float's internal representation is in base 2, and you're trying to round in base 10. Even if you didn't use a string, I'd assume you'd have to create an array of ints that contain base 10 digits in order to do the rounding, unless there are some weird math tricks that can be employed that can avoid you having to process all base 2 digits. And an array of ints isn't all that computationally different from a string.
In fact, I don't know what cdecimal now does but back when decimal.Decimal was pure python it would store the "decimal number" as a string and manipulate that.
I thought this is what the Unix philosophy is supposed to be all about.
(Realistically, calling wordexp should just abort the program. Now I actually want to make a hacked up musl that aborts in all the various "libc functions no one should ever use" and see how far I get into a Ubuntu boot..)
/* wordexp is also rife with security "challenges", unless you pass it
WRDE_NOCMD it must support subshell expansion, and even if you
don't beause it has to support so much of the standard shell (all
the odd little variable expansion options for example) it is hard
to do without a subshell). It is probbably just plan a Bad Idea
to call in anything setuid, or executing remotely. */
$ otool -L /usr/bin/perl
/usr/bin/perl:
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1663.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)
Well, the problem is precisely that rounding as it is generally conceived, is expressed in base 10 - as we generally conceive numbers including floating point ones in base 10. Yet at the lowest level, the representation of numbers is in base 2, including floating point ones. It is imaginable, would be more correct and efficient to perform rounding (or flooring or ceiling, for that matter) in base 2, but it would be that more difficult to comprehend when dealing with non integers in code.
Rounding in base 10 needs some form of conversion anyway, going for the string is one way that is, at least, readable (pun intended).
In my experience there are few things slower that float to string and string to float. And it seems so unnecessary.
I always implemented round to a specific digit based on the built-in roundss/roundsd functions which are native x86-64 assembler instructions (i.e. https://www.felixcloutier.com/x86/roundsd).
I do not understand why this would not be preferable to the string method.
float round( float x, int digits, int base) {
float factor = pow( base, digits );
return roundss( x * factor ) / factor;
}
I guess this has the effect of not working for numbers near the edge of it's range.
One could check this and fall back to the string method. Or alternatively use higher precision doubles internally:
float round( float x, int digits, int base ) {
double factor = pow( base, digits );
return (float)( roundsd( x * factor ) / factor );
}
But then what do you do if you have a double rounded and want to maintain all precision? I think there is likely some way to do that by somehow unpacking the double into a manual mantissa and exponent each of which are doubles and doing this manually - or maybe using some type of float128 library (https://www.boost.org/doc/libs/1_63_0/libs/multiprecision/do...)...
But changing this implementation now could cause slight differences and if someone was rounding then hashing this type of changes could be horrible if not behind some type of opt-in.
float to string is incredibly fast now - look at Ulf Adams’ Ryu and Ryu Printf algorithms, which I’ve used to implement C++17 <charconv> in Visual Studio 2019 (16.2 has everything implemented except general precision; the upcoming 16.4 release adds that).
I don’t know of truly fast algorithms for string to float, although I improved upon our CRT’s performance by 40%.
I don't think converting is slow by itself depending on what you need done. I have a function in my code I wrote to convert strings of floats/doubles to a rounded string of however many digits you want(for storing financial data that was muxed with multiple streams), and converting 1.59688139452f to a string, and then rounding that string to the 5th decimal place took 8.759 seconds for 10 million iterations (87.5 nanoseconds/iteration). Granted, it was written for my specific use case, so I don't need to handle the various edge cases/formats. But, little is inherently slow if you take the time to write a solution for your own need.
I have encountered situations where irregular rounding became solvable but annoyingly problematic to detect / calculate, in the LANL Earthquake dataset on Kaggle, it had a column with samples and a column with (incorrectly incrementing) sample times that were rounded. In order to create a corrected column, I noticed quite a lot of irregularities in the python rounding (or the underlying mechanism)
I also consider simultaneously deterministic, fast, portable rounding or binary formats to be important for decentralized delegation of computation, say a TrueBit style platform:
It's because Python is slow everywhere, so it's hard to find bottlenecks like this. This method is easily 100x slower than need be. If everything else were well written, this would stand out. Since lots of Python is written like this, none stand out.
I just ran into a similar booby trap the other day: whereas BigDecimal::BigDecimal(double) does the full decimal expansion, BigDecimal::valueOf(double) goes through Double::toString(double), which is generally a lot fewer digits.
Somewhat offtopic, but is there a reason some many explanations of this issue lump together the fundamental principle of how numbers are represented (integers vs. fractions vs. exact reals (technically impossible) vs. IEEE 754) and the base (decimal vs. binary)? Every time I read something like the explanation on that site, I wonder if I would understand it if I didn't knew it already.
The "why does this happen" on that page hits on it, at least tangentially. A lot of what's confusing about IEEE floats isn't the inability to represent all rationals in and of itself, it's more that the particular patterns of inaccuracy end up being different between the computer approximation and the approximations we'd make on paper, because of the different numeric bases.
Related to that problem, a major source of confusion with IEEE floats is that languages go to great lengths[1] to present them as decimals and hide the underlying binary denominator. Even when you know they're binary internally, it throws you off.
High level languages are also annoying in that they don't provide great support for working with them as binary denominated rationals, e.g. there's no round_base2 in python, and there's no hex float representation in printf.
The difference between decimal and binary is essential to understanding the problem. Just as there's no elegant way to represent 1/3 in base 10, there's no elegant way to represent 1/10 in base 2.
Whenever I've helped older people with technology they've never used before (a new tablet or similar), if I started off with any suggestion that it's less than simple, they'll almost certainly frame the problem scope in their mind as difficult, and give up, because they're already exhibiting some animosity toward learning a new thing.
If instead you phrase it as "this is really easy, let me show you how...", you short-circuit this process by framing their expectations differently, and that little bit of extra confidence ("this is easy") can help them through the learning process.
I've found you can't simply show them, either. It's almost better to say "It's easy" and then go through the process, because it's absolutely necessary to establish expectations first. They're already afraid of it (it's new), so doing something to get their guard down can go a long way toward helping them explore on their own. I tried this experiment with my mother, and some weeks later she'd have a problem and discover the solution herself specifically because she was convinced it was easy to do. This can (and will) backfire if you're not careful about how you do it, but I've had far more success using this tool than other techniques individually (e.g. writing down instructions).
This doesn't broadly apply to areas outside education and support (or even to all areas in education), but for simple things that people may express an irrational fear over, it works and it works well. A good teacher will use this technique successfully with their students, so if you're teaching someone, use it!
If someone has some anxiety about not understanding something, telling them it's actually pretty simple can just reinforce the framing they already have going in that maybe they're too dumb to get it.
I've found it's usually better to acknowledge that it's a little difficult or otherwise totally normal not to already know / have grasped the thing in question.
I think the intent is in the right place when saying "it's actually pretty simple" -- you want to provide optimism. The approach I like is along the lines of "this part is a little tricky, so let's break it down."
I punched "It's actually pretty simple" into http://talktotransformer.com (which generates nonsense text from a seed using an OpenAI language model) and after a couple of tries it gave me this:
> It's actually pretty simple. We'll be looking at something called the "D-Wave P-500", which is a version of the P500 chip for quantum computers.
> It's basically a single bit computer, but with more than 500 qubits. Which means that our "real number" will have more numbers than the number of qubits that are available. That's really important.
> Quantum computers are theoretically able to do more things than just solve equations. For example, the way that a quantum computer uses energy from an electron to solve a classical math problem, or the way that it can break a complex calculation into smaller bits of information that each can solve on its own, is very different from how computers currently work.
> But I am not suggesting that a quantum computer can be used to solve more abstract problems. Because that would be crazy.
> But to give an example of what it could do, imagine doing a number crunching function that was 10× faster than a classical chip, and that had some really useful, and practical things that would be interesting to try.
Because Talk to Transformer is trained on real-world data, this supports the hypothesis that the phrase "It's actually pretty simple" is often followed by an unintelligible and highly technical explanation.
The intent of that sentence is a lot clearer if you also consider the subsequent text, which expands on the idea quite a bit:
> Computers can only natively store integers, so they need some way of representing decimal numbers. This representation comes with some degree of inaccuracy. . . Why does this happen? It's actually pretty simple. When you have a base 10 system (like ours), it can only express fractions that use a prime factor of the base. . .
Perhaps a more formally correct way to put it is that integers (and natural numbers) are the only numbers that a computer can manage in a way that behaves reasonably similarly to the corresponding mathematic set. Specifically, as long as you stick to their numeric range, computer ints behave like a group, just like real integers do. But IEEE floats break the definition of a field in every which way, so they're really not a great match for the rationals.
That said, you could represent the rationals as a pair of integers, and that would be better-behaved, and some programming languages do do that. But I'm not aware of an ISA that supports it directly.
I believe that's where "natively" comes in. If you store it in binary, you can operate on the values as numbers. If you do some decimal conversion, it's a blob of data. (Yes, there are BCD instructions but they're massively limited)
Maybe you need to parse "decimal numbers" as "fractional numbers" rather than "expressed using a radix of 10".
Perhaps it would be better phrased as "computers can only store whole numbers, so they need some way to represent other information, including integers, rational numbers and approximations to complex numbers."
Whoa. In FF, I just see a screenfull of blank boxes. I scroll down and see content wrongly rendered. On Chrome its a dated design with an uncomfortable text size, and the narrow column creates boxes where I have to use horizontal scrollbars. The Motherfucking Website revolution cant come too soon enough!
My quick impression is that the choice of a rounding algorithm is relative to the purpose that it serves. For instance, floor(x + 0.5) is good enough in many applications.
In some cases, rounding is performed for the primary purpose of displaying a number as a string, in which case it can't be any less complicated than the string conversion function itself.
Fun fact: floor(x + 0.5) rounds 0.49999997 to 1.0 (this is 32 bit floats, the same principle applies to 64). Most libraries have slower than ideal round conversion because of historical dross; modern chips have a very fast SIMD round instruction but its behavior doesn't exactly match libc round. See https://github.com/rust-lang/rust/issues/55107 for a deeper discussion.
Maybe the parent wanted to say that the result is detetministic and not far too off from the true value (in fact, as accurate as possible given the format). In the other words FP is exact but abides by a slight different calculation rule involving rounding.
Good call. I usually only use this conversion when the input is approximate to begin with, e.g., feeding a floating point "signal" to a DAC, or computing the contents of a lookup table to be coded into an arduino. To put it another way, I can live with some uncertainty as to the precise threshold going from one output value to the next.
Is there a phrase for the ratio between the frequency of an apparent archetype of a bug/feature and the real-world occurrences of said bug/feature? If not then perhaps the "Fudderson-Hypeman ratio" in honor of its namesakes.
For example, I'm sure every C programmer on here has their favored way to quickly demo what bugs may come from C's null-delimited strings. But even though C programmers are quick to cite that deficiency, I'd bet there's a greater occurrence of C string bugs in the wild. Thus we get a relatively low Fudderson-Hypeman ratio.
On the other hand: "0.1 + 0.2 != 0.3"? I'm just thinking back through the mailing list and issue tracker for a realtime DSP environment that uses single-precision floats exclusively as the numeric data type. My first approximation is that there are significantly more didactic quotes of that example than reports of problems due to the class of bugs that archetype represents.
Does anyone have some real-world data to trump my rank speculation? (Keep in mind that simply replying with more didactic examples will raise the Fudderson-Hypeman ratio.)
I was looking once at Python and Redis and how numbers get stored. I remember Python would in the end send Redis some strings. I dove pretty deep and found that Python floats when turned into a string and then back are exactly the same float.
I remember even writing a program that tested every possible floating point number (must have only been 32 bit). I think I used ctypes and interpreted every binary combination of 32 bits as a float, turned it into a string, then back and checked equality. A lot of them were NaN.
A NaN is any value where the 7-bit (for 32-bit floats) exponent is all 1s, except for +/-inf. So a quick approximation is that 1/128 ~= 0.78% of the space is NaN.
That means there's 25 bits that we can change while still being either a NaN or an inf. But two of those values are infs, so we need to remove them. Divide that by the entire range and we have (2^25 - 2) / 2^32 = 16777215/2147483648, or about 0.78124995%.
`blob/master` isn't a suitable permalink. Use the first few letters of the commit hash so the line numbers and code are still relevant when this file inevitably gets modified.
Rounding a number is, in the common case, multiplying it by some base, truncating to an integer, and dividing by the base. You do have to handle extremely high exponents, but even the logic for that is not complex.
Every step of this function is complex and expensive, especially printing a float as a decimal is very complex. And round is routinely used in a tight loop.
The numpy approach sacrifices correctness for speed (you sometimes get unexpected results in some corner cases, see below), the cpython way sacrifices speed for correctness.
I don't know that it is "wrong", just unexpected. I suspect most people expect all math functions to be purely implemented in numerical terms, so finding string manipulation is surprising/interesting.
> I don't know that it is "wrong", just unexpected. I suspect most people expect all math functions to be purely implemented in numerical terms, so finding string manipulation is surprising/interesting.
You kind of got me thinking now. The decimal representation of a number is really a string representation (in the sense of a certain sequence of characters). Hence rounding to a certain decimal is essentially a string operation. You can of course do it by (say) dividing by 10^whatever or something else in some numerical fashion, but the more I think about it, the more natural it is to just think of the whole thing as a string.
Or you could flip it around and consider that the string manipulation can also be described numerically so whether you consider the operation as a string operation or a numerical operation is sort irrelevant. It's just a point of view.
I think the best way to think about it is as a symbolic representation. We have processes for manipulating symbols to achieve the correct results. Purely numeric (binary based) operations just happen to allow for some quicker shortcuts but sometimes lead go lost information.
>The decimal representation of a number is really a string representation
It is incorrect to speak of "the" decimal representation of a number, as many numbers have non-unique decimal representations, the most famous example being 1.000...=0.999...
The definition that makes it unique is the shortest representation where trailing zeros are bout included. In your example that would be 1 note that this definition comes up in binary floating point where there are infinite decimal representations that will round back to a given binary64 float but the decimal representation chosen is the shortest (and closest to the binary64 float in the case of ties for length).
> It is incorrect to speak of "the" decimal representation of a number, as many numbers have non-unique decimal representations, the most famous example being 1.000...=0.999...
That's true and I'll concede that point, but it's not really relevant to what I said. That just means some numbers have different string representations that represent the same object. That doesn't really contradict anything in my post except the use of the definite article.
There are many corner cases involved with rounding, and the folks who did the string conversion had to put a lot of effort into handling all of them. It makes sense to piggyback on their efforts, even if it isn't the most efficient way 99% of the time.
The two concerns I have are performance and correctness. I don’t know enough about the implementation of round(3) to know... perhaps someone else does?
This approach is used specifically because of correctness. Doing things the 'obvious' way with round(3) or truncation introduces precision problems in corner cases.
round(3) can only round to an integer. Python's round works to an arbitrary decimal position.
It would previously scale up, round (ceil/floor really) then scale down. That turned out to induce severe precision issues: https://bugs.python.org/issue1869
This is where decimal floating point really shines. Since the exponential portion is base 10, it's trivially easy to round the mantissa.
The only silly part of ieee754 2008 is the fact that they specified two representations (DPD, championed by IBM, and BID, championed by Intel) with no way to tell them apart.
Sure, but their approach is on the other hand more correct. All numerical code reaches a point where you have to balance performance vs. correctness, and here cpython has chosen correctness over speed.
They're creating a sequence of digits and then truncating. If you want to replicate that precisely, you could use an accumulator and a loop to do the same thing. At least then you could break early.
Well obviously the approach round() takes is slower than the naive approach. The whole point of doing it the way round() does it is that it gives the correct answer in cases where the naive approach fails.
From the comment in protobuf source (which does the same thing as Python), mentioned in the Twitter thread:
(...) An arguably better strategy would be to use the algorithm described in "How to Print Floating-Point Numbers Accurately" by Steele & White, e.g. as implemented by David M. Gay's dtoa(). It turns out, however, that the following implementation is about as fast as DMG's code. Furthermore, DMG's code locks mutexes, which means it will not scale well on multi-core machines. DMG's code is slightly more accurate (in that it will never use more digits than necessary), but this is probably irrelevant for most users.
Rob Pike and Ken Thompson also have an implementation of dtoa() in third_party/fmt/fltfmt.cc. Their implementation is similar to this one in that it makes guesses and then uses strtod() to check them. (...)
https://github.com/protocolbuffers/protobuf/blob/ed4321d1cb3...