by DarthNebo 979 days ago
I'm hitting 3.9 tok/s with a context of 300 tokens on Android (Snapdragon 778G) via UserLAnd, and that's with an older, unoptimized build of llama.cpp.