Ollama.ai focuses on making it as easy as possible to run models locally. We aim to provide a seamless experience that feels the same whether you're developing locally or deploying remotely for production.
I've been using a remote Ollama server with a local Jupyter notebook. The LangChain configuration lets me specify the Ollama host, so I can develop locally against remote models. I guess I still don't see the difference. Does Lepton decouple the HTTP server from the model backend?
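For anyone curious what "specifying the host" amounts to, here is a rough sketch that talks to Ollama's HTTP API directly instead of going through LangChain. It assumes the server exposes the standard `/api/generate` endpoint on port 11434. The host `gpu-box.example`, the model name, and the helper names are made up for illustration.

```python
import json
import urllib.request


def build_generate_request(host, model, prompt):
    """Build a POST request for Ollama's /api/generate endpoint.

    `host` is the base URL of the remote server (hypothetical here),
    e.g. "http://gpu-box.example:11434".
    """
    url = host.rstrip("/") + "/api/generate"
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(host, model, prompt, timeout=60):
    """Send the request and return the generated text.

    With "stream": False the server replies with a single JSON object
    whose "response" field holds the completion.
    """
    req = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call from a local notebook (needs a reachable server):
# print(generate("http://gpu-box.example:11434", "llama2", "Why is the sky blue?"))
```

Only the base URL changes between a local and a remote setup, which is why the notebook side looks the same either way.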
Disclaimer: I work at Lepton AI.