Hacker News new | ask | show | jobs
by NKosmatos 649 days ago
It would be good if LLMs were somehow packaged in an easy way/format for us "novice" (ok I mean lazy) users to try them out.

I'm not interested so much with the response time (anyone has a couple of spare A100s?), but it would be good to be able to try out different LLMs locally.

6 comments

I understand your situation. It sounds super simple to me now but I remember having to spend at least a week trying to get the concepts and figuring out what prerequisite knowledge I would need between a continium of just using chatgpt and learning relevant vector math etc. It is much closer to the chatgpt side fortunately. I don't like ollama per se (because i can't reuse its models with other frontends due to it compressing them in its own format) but it's still a very good place to start. Any interface that lets you download models as gguf from huggingface will do just fine. Don't be turned off by the roleplaying/waifu sounding frontend names. They are all fine. This is what I mostly prefer: https://github.com/oobabooga/text-generation-webui
With Mozilla's llamafile you can run LLMs locally without installing anything: https://github.com/Mozilla-Ocho/llamafile
LM Studio is pretty good: https://lmstudio.ai/
One Docker command if you don't mind waiting minutes for CPU-bound replies:

https://localai.io/

You can also use several GPU options, but they are not as easy to get working.

You should try GPT4all. It seems to be exactly what you’re asking for.
This is already possible. There are various tools online you can find and use.