Hacker News
by DarthNebo 979 days ago
I'm hitting 3.9 tok/s with a context of 300 tokens on Android (Snapdragon 778G) via UserLAnd, and that's with an older, unoptimized build of llama.cpp.