by habosa 39 days ago
Can you ELI5 why this is so slow for local inference but so fast with hosted models?