by UnquietTinkerer 2804 days ago
For anyone interested, here are links to the slides and the accompanying white paper.

[Slides] https://millcomputing.com/blog/wp-content/uploads/2018/04/20...

[White Paper] https://millcomputing.com/blog/wp-content/uploads/2018/01/Sp...

I haven't read the paper yet; hopefully it offers more detail than the talk does, because I am still confused about how the Mill avoids cache pollution from speculative loads.

EDIT: Here is my attempt at a summary of the relevant bits of the white paper:

The Mill is immune to Meltdown for the same reason AMD et al. are: it does the permission check before the load rather than in parallel with it, so a forbidden load faults before it ever goes to memory.
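
To make the ordering difference concrete, here is a toy model in Python. It is only a sketch of the idea as I understand it, not the Mill's or any vendor's real pipeline: the addresses, `MEMORY`, `USER_READABLE`, and `probe_lines_after_load` are all made up for illustration, and the "cache" is just a set of values that a dependent load would have encoded.

```python
# Toy model (hypothetical names and addresses), not a real pipeline.
MEMORY = {0x1000: 7, 0xF000: 42}   # 0xF000 stands in for a kernel-only address
USER_READABLE = {0x1000}           # addresses the user-mode code may read


def probe_lines_after_load(addr, check_before_load):
    """Return the probe-array lines left in the cache after a user load.

    check_before_load=True  models checking permissions first.
    check_before_load=False models checking in parallel with the fetch.
    """
    touched = set()
    allowed = addr in USER_READABLE

    if check_before_load and not allowed:
        # The load faults first, so memory is never read.
        return touched

    # The data is fetched (in the parallel design, while the check is pending).
    value = MEMORY[addr]

    # A dependent load indexes a probe array by the value, pulling a line
    # into the cache. In the parallel design the fault arrives after this
    # point; this model assumes the touched line is not undone.
    touched.add(value)
    return touched


print(probe_lines_after_load(0xF000, check_before_load=True))
print(probe_lines_after_load(0xF000, check_before_load=False))
```

With the check done first, the forbidden load leaves nothing behind; with the check in parallel, the model leaves a line indexed by the secret value, which is what a timing probe would look for.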

The Mill is immune to Spectre because "Current Mill configurations will [speculatively] issue, and revoke, a maximum of two instructions. Revocation includes all cache and other micro-architectural side effects."

Neither of those points is covered in the talk. I don't know enough about the subject to judge, but the arguments in the paper seem a bit glib. I'd like to hear from an expert on the subject.

1 comments

I’d be pretty surprised if they don’t leave speculatively loaded (and still correct) data in the cache. My understanding of speculation is that this was sort of the point: often you won’t compute the right value (because you have to be right in every instance), but you will have loaded nearly all of the relevant data into the cache, so it’s comparatively fast the second time around.
This argument holds better for an OoO CPU that is speculating 100 instructions ahead, where significant work gets done in that window. When your speculative execution is only 2 cycles ahead, you aren't throwing away much work; you'd be lucky to even have work to throw away by that point, at least as it applies to cache misses.
I'd be very surprised if they didn't, too. But Spectre isn't just about what's in cache: you have to load secret data and then do another load, at a location based on that secret data, before the mispredicted branch is caught. The number of clock cycles from branch prediction to branch resolution on the Mill is just too short for you to do all of that, just as it is on most in-order architectures. Just loading the secret data into cache isn't enough to be a problem; you already knew its address if the attack is going to work.
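
A back-of-the-envelope sketch of that timing argument, in Python. The latencies are assumptions picked for illustration (a 3-cycle cache hit and a 1-cycle index computation), not measured Mill numbers, and `dependent_load_issues` is a made-up helper. The window sizes borrow the 2 and 100 figures from this thread, treating both as cycles for simplicity.

```python
def dependent_load_issues(window_cycles, load_latency=3, alu_latency=1):
    """Toy check: can the second, secret-dependent load issue before the
    mispredicted branch resolves?

    Gadget being modelled (Spectre-style bounds check bypass):
        secret = array1[x]           # first load, issues at cycle 0
        index  = secret * 64         # needs the first load's result
        _      = array2[index]       # second load: this is what leaks

    All latencies are illustrative assumptions, in cycles.
    """
    first_load_done = load_latency                 # secret becomes usable
    second_load_issue = first_load_done + alu_latency
    # The leak only happens if the second load issues inside the window.
    return second_load_issue < window_cycles


for window in (2, 100):
    print(window, dependent_load_issues(window))
```

Under these assumptions a 2-cycle window closes before the second load can even issue, while a 100-cycle window leaves plenty of room. Note that the first load on its own only caches the secret's own line, at an address the attacker already had to know.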