Hacker News
by mistercheese 78 days ago
Is it feasible to run LLM inference at comparable performance without CUDA or ROCm? How much of the cost-performance advantage goes away?