by a2128 265 days ago
Yep, those are exactly the same considerations. LLM providers will have inconsistent latency and throughput because requests are batched across many users, while training on cloud GPU servers can mean inconsistent bandwidth and delays when uploading large amounts of training data. LLM providers are also always limited in how you can use them (often no LoRAs or finetuned models, plus prompt restrictions).