by rolisz 1077 days ago
FastChat-T5 can work for such a use case, and it runs on (beefy) CPUs. A $700/month instance can handle 4 simultaneous conversations, with no GPUs needed.

The instant a company has sensitive data, this becomes very viable.
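
For a rough sense of what those figures mean per user, here is a back-of-the-envelope sketch. It uses only the two numbers above ($700/month, 4 simultaneous conversations); the function name and the 730 hours/month figure are my own assumptions for illustration, not anything measured.

```python
# Back-of-the-envelope cost split for a CPU-only instance.
# Inputs come from the comment above; HOURS_PER_MONTH is an assumed average.

HOURS_PER_MONTH = 730  # assumption: ~365 * 24 / 12


def cost_per_slot(monthly_cost_usd: float, concurrent_conversations: int) -> float:
    """Monthly cost attributed to each simultaneous conversation slot."""
    if concurrent_conversations <= 0:
        raise ValueError("concurrent_conversations must be positive")
    return monthly_cost_usd / concurrent_conversations


monthly_cost = 700.0  # USD per month, from the comment
slots = 4             # simultaneous conversations, from the comment

per_slot_month = cost_per_slot(monthly_cost, slots)
per_slot_hour = per_slot_month / HOURS_PER_MONTH

print(f"${per_slot_month:.2f} per conversation slot per month")
print(f"${per_slot_hour:.2f} per conversation slot per hour")
```

That works out to $175 per slot per month. Whether that is cheap depends on how many users share a slot over the day, which the numbers above don't tell you.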