by wesleyyue 842 days ago
It's an interesting idea. We asked our users about this as well, but at least for those we talked to, running their own model wasn't a big priority. What actually mattered to them was being able to try different (but high-performance) models, privacy (their code not being trained on), and latency. We have some optimizations around time-to-first-token latency that would be difficult to do if we didn't have information about the model and their servers.

I see. Thanks for sharing, Wesley, and great to know it is not a priority. Also, the Mistral situation kinda makes me feel that big corps will want to host models.

That said, I feel Apple will break this trend and bring models to their chips rather than run them in the cloud. "Privacy first" will simply be a selling point for them, but generally speaking, cloud is not a big sell for them.

I am not at the level to do many optimizations, plus my product is a little more generic. To get to an MVP, prompt engineering will probably be my sole focus.