by dofm
4 days ago
Yeah. I have not really tinkered much with parameter optimisation for the 35B model with MTP. Would be interested to see what you've found. I'm using the GGUF too; it currently appears slightly faster in llama.cpp than in the current LM Studio, but it's not clear to me whether that is down to LM Studio having a little more code overhead, an older llama.cpp under the hood, or just parameter differences.