by wesleyyue
842 days ago
It's an interesting idea. We asked our users about this as well, but at least for those we talked to, running their own model wasn't a big priority. What actually mattered to them was being able to try different (but high-performance) models, privacy (their code not being trained on), and latency. We have some optimizations around time-to-first-token latency that would be difficult to do if we didn't have information about the model and its servers.
That said, I feel Apple will break this trend and bring models to its chips rather than run them in the cloud. "Privacy first" will simply be a selling point for them, but generally speaking the cloud is not a big sell for them.
I am not at the level to do many optimizations, plus my product is a little more generic. To get to an MVP, prompt engineering will probably be my sole focus.