Hacker News new | ask | show | jobs
by IshKebab 1028 days ago
Probably SIMD support and constant time support. Crypto libraries tend to use SIMD a lot to be fast.

You can write constant time code in Rust by carefully making sure your code only compiles to constant time instructions without branches, but you'd really want some kind of annotation on the code to enforce that.

That's mostly a guess though.

2 comments

Rust has simd support, so maybe the latter is the issue.
not in stable yet
This is incorrect. Rust has had x86_64 (up to and including AVX2) intrinsics stable since Rust 1.26. Wasm32 simd128 and aarch64 neon intrinsics are also stable.
core::arch::x86_64::_mm256_shuffle_epi8 and such seem to be in stable.
AFAIK intrinsics are stable. Idk about auto-vectorizers and the like.
other commenters, call me when this issue is closed.

https://github.com/rust-lang/rust/issues/48556

isnt constant time orthogonal to whether there are branches?
No. Constant wall clock time involves not using branches. Maybe you're thinking of "asymptotic constant time" or "runtime bounded above by a constant." These are not what is needed, because what is needed is to not expose any information via timing.
What if there are branches but both paths result in the same number of cycles being required to execute the instructions?

Is it correct to say: "all branchless code runs in constant time, but not all constant time code is branchless"?

The subtlety is that eliminating branching isn't sufficient to have constant time code. A simple example is using trigonometric and transcendental opcodes. They don't branch (at the assembly level), but on x86 take variable amounts of time depending on the input operand. Very few algorithms actually use these opcodes though, so a more relevant concern is memory access due to variable latency. Even if you have that nailed down, integer operations like multiplication and especially division can take variable amounts of time depending on the input.

Writing truly constant time code on modern processors ranges is difficult at best, and usually less efficient than variable-time code.

Really interesting information, thank you.
Add cache misses, bus interference, SMT woes and it quickly becomes harder and harder to write (and check) constant-time (or WCET) properties. Even modern micro-benchmarks are a huge labyrinth of architectural traps.
Because of speculative execution, branchy code with equal-runtime branches will still take different amounts of time if it is called repeatedly, usually in ways that reveal information about the input.
Timing attacks are common everywhere, by the way. Simplest example, perhaps a bit too contrived:

I'm an attacker doing targeted research. I want to see if a multi-auth system has an association between two email addresses tied to the same account.

Pulling a database record or in-memory record (e.g. via LFU/LRU cache) in some cases may cache the account record, which means a subsequent record might be warm when fetched with the second email.

I run a time analysis against the endpoint with garbage addresses, known addresses (that I've set up) and the two target addresses to check subsequent fetch speeds.

In some cases, this will cause enough of a time difference to tell me if there's a connection.

Timing attacks are hard, and even a well-architected system can expose information indirectly. Encryption is a bit one if the inputs are static (e.g. keys or the like) and are a common way to target endpoints.

The issue is speculative execution. Whenever there is a branch, the CPU makes a guess. If it guesses wrong, it has to go back to the correct path which introduces a delay. So any branching code has the possibility of revealing information through the branch predictor.
> all branchless code runs in constant time

No - e.g. division is not constant time.

You have to have branchless code and only use certain instructions.

E.g. here is the list for RISC-V.

https://github.com/rvkrypto/riscv-zkt-list/blob/main/zkt-lis...

Most things except div/rem, branches and floating point are ok. Oh and obviously store/load.

Thanks, this is really interesting.
Because branch prediction exists: sometimes yes, often no. Among other reasons.