| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by rogerbinns 4149 days ago

Note that the Python code is not limited to 32 bits and will automatically promote to bignum (aka long in py27).

    >>> l=[0xffffffff, 17]
    >>> sum(l) 
    4294967312

C based code will silently overflow/truncate, giving 16 in the example above. The author handwaved all this away, but it does show the usual tradeoffs between right/robust answers and fast answers.

2 comments

legulere 4149 days ago

Signed integer behaviour is undefined on overflow in C.

link

anewhnaccount 4148 days ago

But not on specific compilers with specific settings on specific machines like were used for the benchmarks results. Execution time is also undefined by the C standard.

link

MaulingMonkey 4148 days ago

"Undefined behavior" is a specific term used by the C and C++ standards. What you're describing sounds more like "unspecified behavior", something entirely different.

Undefined behavior allows for aggressive optimizations where the compiler assumes the integer overflow can never happen, leading to such wonderful bugs as this:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475

Some compilers may offer specific settings, such as fwrapv, which give you a way to define the behavior of integer overflow. (Although last I read fwrapv was broken.) I see no mention of any such specific setting in the article. If you've spotted one I've missed, please call it out more specifically.

link

ebbv 4149 days ago

Obviously if this weren't a trivial example and were real code he'd account for this in his C code, and it would still be much faster than the Python code.

He hand waved it away because it's not really relevant to the point.

link

rogerbinns 4149 days ago

The Python code was trivial to write and immediately correct and robust. Trying to get C code to do the right thing is considerably more difficult - for example even trying to detect overflow can result in code the compiler then removes - http://lwn.net/Articles/278137/

Robust, correct and fast are a hard combination to do. I do acknowledge the article is about the latter.

link

haberman 4149 days ago

It's not hard if you use GMP.

link

lumpypua 4149 days ago

The fancy library you're using in that other part of your project doesn't use GMP.

link

haberman 4149 days ago

It likely doesn't need arbitrary precision arithmetic either. A lot of algorithms truly want modular arithmetic (hashes, checksums), many are dealing in domains that can't practically exceed a machine integer size (length of a string, number of vertices in a graph, etc). There is no reason that every library needs to deal in arbitrary precision.

link

_delirium 4149 days ago

> domains that can't practically exceed a machine integer size

This assumption is common among authors of C code, but is sometimes also exploitably incorrect. Even if you really don't care about supporting things larger than a certain size, you do still have to correctly account for overflow, not just ignore it.

link