When you allocate against a running GC it will penalize you for this (literally sleep your thread) - hence garbage tail latency. The solution is to not do allocation which is what gogo library strives for
Most of the phases of GC can be done parallel, so allocation should stop only for a really small amount of time. For comparison, Java's low-latency GC (which optimize for latency but may reduce throughput), ZGC can do worst-case <1ms pause times, that is the OS scheduler will cost more than GC.
I’m talking about Go garbage collector specifically which does mark and sweep and will slow down functions doing frequent allocations on purpose in order to catch up. This is different from stw kind of scenario e.g shenandoah