Hacker News
by petra 551 days ago
But given inference-time compute, giving a strong reply reasonably fast takes a lot of compute, and on a local machine that compute would sit idle most of the time.

Economically this fits the cloud much better.
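
A rough back-of-the-envelope sketch of the utilization argument, in case it helps. Every number here (hardware price, lifetime, both utilization figures) and the function name `cost_per_busy_hour` are made up for illustration, not measurements; the only point is that the hardware cost per hour of actual use scales inversely with utilization.

```python
def cost_per_busy_hour(hardware_cost, lifetime_hours, utilization):
    """Hardware cost attributed to each hour the machine is actually busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hardware_cost / (lifetime_hours * utilization)


# Hypothetical numbers, for illustration only.
HARDWARE_COST = 10_000.0        # assumed price of one inference box
LIFETIME_HOURS = 3 * 365 * 24   # assumed 3-year useful life

# Assumed: a personal machine is busy 1% of the time,
# a shared cloud machine 50% of the time.
local = cost_per_busy_hour(HARDWARE_COST, LIFETIME_HOURS, utilization=0.01)
shared = cost_per_busy_hour(HARDWARE_COST, LIFETIME_HOURS, utilization=0.50)

print(f"local:  {local:.2f} per busy hour")
print(f"shared: {shared:.2f} per busy hour")
print(f"ratio:  {local / shared:.1f}x")
```

This ignores power, networking, and the cloud provider's margin, so treat the ratio as a direction rather than a figure.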