What the OP is referring to requires overprovisioning of the high priority traffic and the sine-like utilization (without it, the benefits of the "batch" tier is close to zero -- the preemption is too high for any meaningful work when you are close to the top of the utilization hill).
You get that organically when you are serving lots of users. And, there's not much GPUs etc. used for that. Training LLMs gives you a different utilization pattern. The "best effort" resources aren't as useful in that setup.
Because accelerators (tpus, gpus) unlike ram/cpu are notoriously hard to timeshare and vitrualize. So if you get evicted in an environment like that, you have to reload your entire experiment state from a model checkpoint. With giant models like that, it might take dozens of minutes. As a result, I doubt that these experiments are done using "spare" resources - in that case, constant interruptions and reloading would result in these experiments finishing sometime around the heat death of the universe :)