by gpderetta 342 days ago
Yes, preserve_none would be exactly what I want, except that I also want to avoid the call instruction in the final asm stream. Since the call would not be paired with a ret, the return stack predictor will mispredict on every context switch, while an indirect jmp has a much better chance of being predicted when two coroutines call each other in a tight loop (consider generators, for example).

Ideally I think that a ctx_t* __builtin_context_switch(ctx_t* to) would need to be provided by the compiler.
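To make the shape of that interface concrete, here is a rough sketch that emulates it on top of POSIX ucontext (assuming Linux/glibc). The names context_switch, run_generator, current and previous are made up for illustration, and no such builtin exists today. swapcontext is an ordinary function call that also saves the signal mask, so this shows only the semantics (switch to `to`, return whoever later switches back), not the call-free codegen a real builtin would be for.

```c
#include <assert.h>
#include <ucontext.h>

/* Hypothetical context type; a real builtin would likely store far less. */
typedef struct ctx { ucontext_t uc; } ctx_t;

static ctx_t *current;   /* context that is running right now */
static ctx_t *previous;  /* context that most recently switched away */

/* Emulation of the proposed interface: switch to `to`, and when some
 * context later switches back to us, return a pointer to that context. */
static ctx_t *context_switch(ctx_t *to) {
    ctx_t *self = current;
    previous = self;
    current = to;
    swapcontext(&self->uc, &to->uc);
    return previous;  /* set by whoever resumed us */
}

static int gen_limit;  /* how many values the generator yields */
static int yielded;    /* slot used to pass a value to the consumer */
static int done;       /* set once the generator is exhausted */

/* Generator body: yields 0, 1, ..., gen_limit - 1, then reports completion. */
static void generator(void) {
    ctx_t *caller = previous;  /* the context that first switched to us */
    for (int i = 0; i < gen_limit; i++) {
        yielded = i;
        caller = context_switch(caller);
    }
    done = 1;
    context_switch(caller);  /* never resumed after this point */
}

static char gen_stack[64 * 1024];

/* Consumer: ping-pongs with the generator in a tight loop and sums the values. */
int run_generator(int n) {
    ctx_t main_ctx, gen_ctx;

    int rc = getcontext(&gen_ctx.uc);
    assert(rc == 0);
    (void)rc;
    gen_ctx.uc.uc_stack.ss_sp = gen_stack;
    gen_ctx.uc.uc_stack.ss_size = sizeof gen_stack;
    gen_ctx.uc.uc_link = &main_ctx.uc;  /* fallback if generator ever returns */
    makecontext(&gen_ctx.uc, generator, 0);

    gen_limit = n;
    done = 0;
    current = &main_ctx;

    int sum = 0;
    for (;;) {
        context_switch(&gen_ctx);
        if (done)
            break;
        sum += yielded;
    }
    return sum;
}
```

Calling run_generator(4) switches into the generator five times (four yields plus the final completion signal). Every one of those round trips is the tight-loop pattern above, where the switch ideally compiles down to a register swap plus an indirect jmp.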

Re thread_local, I believe at least MSVC has (had?) a fiber-safe flag (/GT) that handles thread_locals correctly by not caching their addresses across function calls.