


	by tomrod 2651 days ago
	This article highlights an issue with floating-point numbers, a substantial use case for data scientists (and as such, I value the input). How do REPLs and databases handle this edge case?

4 comments

pjc50 2651 days ago

Almost always they ignore this kind of issue; the best you're likely to get is a mean() function that remembers to sort the input first. Most numbers are in a "human" range far from the limits.
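
To illustrate the "sort the input first" idea, here is a minimal sketch in Python. The function names `naive_mean` and `sorted_mean` are made up for this example and do not come from any particular REPL or database; the sketch only assumes that adding values in order of increasing magnitude loses less precision than adding them in arbitrary order.

```python
import math

def naive_mean(values):
    """Left-to-right accumulation, with no reordering (hypothetical baseline)."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

def sorted_mean(values):
    """Hypothetical mean() that sorts by magnitude before accumulating."""
    total = 0.0
    for v in sorted(values, key=abs):  # small values are added first
        total += v
    return total / len(values)

# One huge value followed by ten small ones.
data = [1e16] + [1.0] * 10

print(naive_mean(data))             # each 1.0 is rounded away against 1e16
print(sorted_mean(data))            # the ten 1.0s are summed before meeting 1e16
print(math.fsum(data) / len(data))  # reference using the correctly rounded sum
```

Sorting helps precision here, but it does nothing about overflow: the running total is still a single double.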


rightbyte 2651 days ago

If, when calculating an average, you actually reach overflow in a double without first losing so much precision that the calculation is worthless, then some of the numbers in the list are bogusly big anyway.


Ragib_Zaman 2651 days ago

Or the list is just very long.


gubbrora 2651 days ago

Not really. The time required to overflow that way is unrealistic. Also, I think you'll run into S + x = S; at that point your sum will stop climbing towards overflow.
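
The S + x = S effect is easy to reproduce. A small sketch, assuming IEEE 754 doubles and a list where every element is 1.0 (a simplification chosen for the example; larger elements stall at a proportionally larger sum):

```python
import sys

x = 1.0
s = 2.0 ** 53  # 9007199254740992.0, beyond which doubles cannot represent odd integers

# The addend is absorbed by rounding, so the running sum stops growing.
stalled = (s + x == s)

# Ratio between the largest finite double and the point where the sum stalls.
headroom = sys.float_info.max / s

print(stalled)           # True
print(headroom > 1e290)  # True
```

So a running sum of ones stalls around 9e15, hundreds of orders of magnitude short of overflow.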


nwatson 2651 days ago

Data science is squishy to begin with. Those wanting high performance are heading to fixed-point, hardware-accelerated solutions where loss of precision is a given, so not having a fully accurate answer from high-precision floating point at every step of solving the problem doesn't seem like a big deal.


dsr_ 2651 days ago

One option is to convert to a bignum representation, which is slow but works.
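
As a rough sketch of the bignum approach in Python, `fractions.Fraction` can stand in for the bignum representation, since it is backed by arbitrary-precision integers and converts a float exactly. The function name `exact_mean` is made up for this example.

```python
import sys
from fractions import Fraction

def exact_mean(values):
    """Hypothetical mean that accumulates in exact rational arithmetic."""
    total = Fraction(0)
    for v in values:
        total += Fraction(v)  # exact conversion, so no rounding and no overflow
    return float(total / len(values))  # round once, at the very end

big = sys.float_info.max  # largest finite double
data = [big, big]

print((big + big) / 2)   # inf: the intermediate sum overflows
print(exact_mean(data))  # the exact mean, which is big itself
```

The cost is speed and memory: every addition works on big integers rather than a single hardware double.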
