Hacker News
Pipeline-parallel LLM inference across GPUs on separate machines (github.com)
5 points by ngaut 6 days ago