by verdverm 1163 days ago
These still don't seem unique to LLMs.

1. There are many GPU-based applications already in production. I've seen work queues in them, and work queues are used in any system where load exceeds capacity, GPU or not.

2. Content moderation is not unique to LLMs.

3. Training a model and serving users at inference time are different beasts.