HN Mirror


	by mentos 781 days ago
	This is awesome. I have been using ChatGPT4 for almost a year and haven't really experimented with locally running LLMs because I assumed generation would be too slow per token. This demo has shown me that my RTX 2080 running Llama 3 can compete with ChatGPT4 for a lot of my prompts. This has sparked a curiosity in me to play with more LLMs locally, thank you!

3 comments

bastawhiz 781 days ago

My Pixel 6 was able to run TinyLlama and answer questions with alarming accuracy. I'm honestly blown away.

abi 781 days ago

This is amazing. Thanks both for sharing your stories. Made my day.

navigate8310 780 days ago

Try https://lmstudio.ai/

moffkalast 780 days ago

Uh oh, I had that same moment a bit over a year ago with MLC's old WebLLM. Take a deep breath before you jump into this rabbit hole because once you're in there's no escape :)

New models just keep rolling in day after day on r/LocalLLaMA: tunes for this or that, new prompt formats, new quantization types, people doing all kinds of tests and analyses, new arXiv papers on some breakthrough and llama.cpp implementing it 3 days later. Every few weeks a new base model drops from somebody. So many things to try that nobody has tried before. It's genuinely like crack.