by guywhocodes 974 days ago
Seems to be running on llama.cpp, so it's going to be a question of performance. I don't have an M-series CPU, but on my 13th-gen i5 I can run Mistral at about 6.5 tokens per second, which seems comparable to what this gets.