|
|
|
|
|
by NikhilVerma
779 days ago
|
|
This is absolutely wonderful, I am a HUGE fan of local first apps. Running models locally is such a powerful thing I wish more companies could leverage it to build smarter apps which can run offline. I tried this on my M1 and ran LLama3, I think it's the quantized 7B version. It ran with around 4-5 tokens per second which was way faster than I expected on my browser. |
|