| HN Mirror



	by NikhilVerma 779 days ago
	This is absolutely wonderful. I am a HUGE fan of local-first apps. Running models locally is such a powerful thing; I wish more companies would leverage it to build smarter apps that can run offline. I tried this on my M1 and ran Llama 3, I think the quantized 7B version. It ran at around 4-5 tokens per second, which was way faster than I expected in my browser.

1 comment

Appreciate the kind words :)