Hacker News new | ask | show | jobs
by marcopicentini 1042 days ago
No idea yet. Which you recommend to start? It will be hosted on a Ubuntu server (Digital Ocean, Linode etc..)
2 comments

Any reason you're doing that vs. using Lambda Labs / Replicate / together.ai / Banana.dev, etc.

There's a lot of good model deployment platforms that would make it easy to call your model behind a hosted endpoint

-- If you do want to self-host - there's some great libraries like https://github.com/lm-sys/FastChat and https://github.com/ggerganov/llama.cpp that might be helpful

If none of these really solve your issue - feel free to email me and I'm happy to help you figure something out - krrish@berri.ai

You can definitely start by checking out ollama, it was super helpful for me
It’s only for MacOSX. I expect to load the model on a Ubuntu server, not on my local dev machine.
You have to build it if you want it for Ubuntu, Windows, or anything else. Just build Go on your machine and have at it.