by colanderman
4573 days ago
> Only with proper pragmas and loop constructs.

(1) Vectorization is enabled by default at -O3; (2) the loop constructs needed are fairly simple; (3) arguing that unoptimized code will be slow is still a poor argument.

> ARM64 does NOT grant you the vectorized instruction advantage.

32-bit ARM NEON does not support vectorized doubles. 64-bit ARM NEON does. Source: http://en.wikipedia.org/wiki/ARM_NEON#Advanced_SIMD_.28NEON...

> So many people are arguing here, but clearly few of you people have even worked with ARM chips at the assembly level.

Yep. Thankfully I can back up my arguments with quoted facts.

EDIT: I already granted that vectorized and floating-point operations don't necessarily benefit from larger register widths, so I don't know why you're even arguing. Besides, the OP wasn't even asking specifically about ARM!
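To make the point about simple loop constructs concrete, here is a minimal sketch of the kind of loop a compiler such as GCC can typically auto-vectorize at -O3 without any pragmas. The function name `add_scaled` and its parameters are hypothetical, chosen only for illustration; whether the loop is actually vectorized depends on the compiler version and the target (for instance, on doubles it would need a SIMD unit that supports double-precision lanes).

```c
#include <stddef.h>

/* Hypothetical example, not taken from the thread.
 * A plain counted loop over contiguous arrays of doubles.
 * `restrict` tells the compiler the arrays do not overlap,
 * which removes the need for runtime aliasing checks. */
void add_scaled(double *restrict out, const double *restrict a,
                const double *restrict b, double k, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + k * b[i];
    }
}
```

With GCC, compiling with something like `gcc -O3 -fopt-info-vec -c file.c` should report whether the loop was vectorized on a given target.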