Hacker News
by happyPersonR 404 days ago
Pretty sure llama.cpp can already do that

I forgot to clarify the part about dealing with the network bottleneck.
Just my two cents from experience: any sufficiently advanced LLM training or inference pipeline eventually finds that the real bottleneck is the network!