I'm not sure I understand the p90 latency problem. The CPU is being used somewhere anyway, so even if you use another language, you won't be able to serve a request while doing intense CPU work, will you?
The OS will preempt it so that every thread gets some CPU time. The difference is who does the work of saving and restoring state on a switch: the OS between threads, or the Go runtime between goroutines. Keeping it all in Go is faster, but the Go scheduler can't pause, clean up, and prepare for re-execution in the middle of a block of code the way the OS can (at least before Go 1.14 added asynchronous preemption).