by mft_ 23 days ago
Thanks - I’m in the process. I’ve tried briefly, but so far it appears marginally slower. (Note that llama-bench doesn’t support MTP yet, so you’re reduced to running different prompts and eyeballing the log.)

So I’m assuming I’ve done something wrong along the way, but I’ve not had time yet to explore it.