by runjake 949 days ago
+1. The APIs and programmability are there on open models. Heck, my M3 Max is churning out answers far quicker than GPT-4.

But the quality isn’t nearly as good as GPT. The “hallucinations” from the open models are often annoyingly inferior.


Absolutely interesting. Would you mind sharing which models and tools you are using on your M3?
Sure. I’m not a computer scientist, so I’m using ollama[1] along with a number of the most popular models[2].

It exposes a REST API on localhost, and you can use “Modelfiles”, which are modeled on Dockerfiles, to create customized models (à la GPTs).[3]
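
For anyone wondering what talking to that local API looks like, here is a minimal Python sketch. It assumes ollama's default port (11434), the /api/generate endpoint, and a model named llama2 that has already been pulled. The helper names are my own for illustration, not part of ollama.

```python
import json
import urllib.request

# Assumption: ollama is listening on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt):
    """Send the request and return the generated text.

    Needs a running ollama server with the model already downloaded.
    """
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Example call (hypothetical prompt; "llama2" is an assumed model name):
#   print(generate("llama2", "Why is the sky blue?"))
```

Trying a different model should only mean passing a different name as the first argument.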

It took 10 minutes to get all this working, most of it spent waiting for model downloads. I had working code and Modelfiles in another 20 minutes after that. I also have it (with the llama uncensored model) hooked into Raycast.
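
A Modelfile is short enough to generate from a script. The sketch below renders one using the FROM, PARAMETER, and SYSTEM instructions. The base model name, the temperature value, the system prompt, and the helper name are all illustrative assumptions rather than my actual setup.

```python
def render_modelfile(base_model, system_prompt, temperature=0.7):
    """Return Modelfile text for a customized model (hypothetical helper)."""
    lines = [
        f"FROM {base_model}",                      # base model to build on
        f"PARAMETER temperature {temperature}",    # sampling temperature
        f'SYSTEM """{system_prompt}"""',           # system prompt baked into the model
    ]
    return "\n".join(lines) + "\n"


modelfile_text = render_modelfile(
    "llama2-uncensored",
    "You are a terse assistant that answers in one sentence.",
)
print(modelfile_text)

# Save the printed text as a file named Modelfile, then build it from a shell:
#   ollama create terse -f Modelfile
# ("terse" is a made-up name for the customized model.)
```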

Switching models on the fly is pretty seamless.

If for some reason my links go over anyone’s head, there are a number of thorough and approachable ollama walkthroughs on YouTube.

1. https://ollama.ai/

2. https://ollama.ai/library

3. https://github.com/jmorganca/ollama