I've always thought there was room in the market for some kind of unified AI reseller. I have about five different AI accounts just to test capabilities, all holding tokens that I'll probably never use.
The same fragmentation happens when running your own models on localhost too. I have ollama and all the models it supports, but there are some on HuggingFace that I run through llama.cpp inside apps where I won't have ollama installed. Replicate also has Stable Diffusion models, not just chat ones, and then there's OpenAI, which is its own thing. So it could potentially all be unified under a provider like that.
Haven't actually tried Replicate because I'm just running locally for free, but I'd probably look for a single cloud provider for all deployments, like a Heroku of LLMs.
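Roughly what I'm picturing, as a minimal sketch and not anyone's real API: one client that routes a "provider/model" string to whichever backend owns it. `UnifiedClient`, `register`, `generate`, and the fake backends are all names I made up for illustration; real backends would wrap ollama, llama.cpp, Replicate, or OpenAI behind the same call shape.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Assumption: a backend is any callable taking (model, prompt) and returning text.
Backend = Callable[[str, str], str]


@dataclass
class UnifiedClient:
    """Hypothetical router: one entry point, many providers."""
    backends: Dict[str, Backend] = field(default_factory=dict)

    def register(self, name: str, backend: Backend) -> None:
        self.backends[name] = backend

    def generate(self, target: str, prompt: str) -> str:
        # target looks like "ollama/llama3": provider first, model after the first slash.
        provider, _, model = target.partition("/")
        if provider not in self.backends or not model:
            raise ValueError(f"unknown target: {target!r}")
        return self.backends[provider](model, prompt)


# Stand-in backends that only echo; they make no network calls.
def fake_ollama(model: str, prompt: str) -> str:
    return f"[ollama:{model}] {prompt}"


def fake_replicate(model: str, prompt: str) -> str:
    return f"[replicate:{model}] {prompt}"


client = UnifiedClient()
client.register("ollama", fake_ollama)
client.register("replicate", fake_replicate)

print(client.generate("ollama/llama3", "hello"))
print(client.generate("replicate/stability-ai/sdxl", "a red bicycle"))
```

The point is just that the app only ever sees `generate`, so swapping a local model for a hosted one is a string change instead of a new account and SDK.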