Hacker News
by data-cat 4574 days ago
Wouldn't a 64-bit processor be able to perform more accurate floating-point operations faster? I definitely don't believe the only benefit of a 64-bit architecture is being able to address more memory.

Depends highly on the architecture. Both floating-point and vector operations are often special-cased in the pipeline (e.g. on x86), so 64-bit floating-point operations on a particular 32-bit processor may not perform any worse than they would if that processor had 64-bit registers.

I'm not familiar enough with the particulars of ARM to answer confidently for floating-point operations. But to take an example that's not usually special-cased, say bit-vector arithmetic: yes, those operations can execute up to twice as quickly when each instruction handles 64 bits instead of 32.
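To make the bit-vector case concrete, here is a minimal sketch in C. The function names `bitvec_and32` and `bitvec_and64` are made up for illustration. Both do the same work, but the 64-bit version needs half as many loop iterations for the same number of bits. That only pays off if the target really has 64-bit integer registers: on a 32-bit target each `uint64_t` operation is typically split into two 32-bit ones, and an autovectorizing compiler may turn either loop into SIMD code anyway.

```c
#include <stddef.h>
#include <stdint.h>

/* AND two bit vectors stored as 32-bit words: 32 bits per iteration. */
void bitvec_and32(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                  size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = a[i] & b[i];
}

/* Same operation on 64-bit words: 64 bits per iteration, so the same
   bit vector takes half as many iterations. */
void bitvec_and64(uint64_t *dst, const uint64_t *a, const uint64_t *b,
                  size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = a[i] & b[i];
}
```

A 128-bit vector is four words for `bitvec_and32` and two for `bitvec_and64`.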

ARM 32-bit did not have SIMD (vector) double precision, while ARM 64-bit does, so here it's definitely a win.

On x86 though, both 32-bit and 64-bit did double precision vectors just fine, so it didn't really apply there (except that the fp register count was doubled).

Only if your code were SIMD aligned. But that is not code that your typical compiler outputs.

Most SIMD code is heavy number-crunching stuff like multimedia or GPU shaders. But much of that low-level work is handled off-CPU on phone platforms. It is simply more power-efficient to have a hardware multimedia decoder.

> Only if your code were SIMD aligned. But that is not code that your typical compiler outputs.

GCC has supported autovectorization for a while now.

"Unoptimized code will be slow" isn't a great argument anyway. There's not much a processor can do to help that.

Only with proper pragmas and loop constructs.

Besides, most ARM chips supported vectorized code anyway. You know, NEON? http://www.arm.com/products/processors/technologies/neon.php

ARM64 does NOT grant you the vectorized instruction advantage. Qualcomm's Snapdragon Krait cores have supported NEON for some time already.

http://www.anandtech.com/show/5559/qualcomm-snapdragon-s4-kr...

So many people are arguing here, but clearly few of you people have even worked with ARM chips at the assembly level.

> Only with proper pragmas and loop constructs.

(1) it's enabled by default at -O3; (2) the loop constructs needed are fairly simple; (3) arguing that unoptimized code will be slow is still a poor argument.
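For a sense of how simple the loop can be, here is a sketch of the kind of code GCC's autovectorizer is designed to handle at -O3, with no pragmas. The function name `daxpy` is just illustrative. Whether it actually gets vectorized depends on the GCC version and the target: since this loop works on doubles, it can only become SIMD code where the hardware has double-precision vector instructions. On GCC versions that support it, `-fopt-info-vec` reports which loops were vectorized.

```c
#include <stddef.h>

/* y[i] += a * x[i] over doubles.
   `restrict` promises that x and y do not overlap, so the compiler does
   not need a runtime aliasing check before vectorizing the loop. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

The loop has a countable trip count, unit-stride accesses, and no dependency between iterations, which are the usual conditions an autovectorizer looks for.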

> ARM64 does NOT grant you the vectorized instruction advantage.

32-bit ARM NEON does not support vectorized doubles. 64-bit ARM NEON does. Source: http://en.wikipedia.org/wiki/ARM_NEON#Advanced_SIMD_.28NEON....

> So many people are arguing here, but clearly few of you people have even worked with ARM chips at the assembly level.

Yep. Thankfully I can back up my arguments with quoted facts.

EDIT: and I already granted that vectorized and floating-point operations don't necessarily benefit from larger register widths, so I don't know why you're even arguing. Not to mention that the OP wasn't even asking specifically about ARM!