| HN Mirror



	by dofm 4 days ago
	Yeah. I haven't really tinkered much with parameter optimisation for the 35B model with MTP, so I'd be interested to see what you've found. I'm using the GGUF too. It currently appears slightly faster in llama.cpp than in the current LM Studio, but it's not clear to me whether that is down to LM Studio having a little more code overhead, an older llama.cpp under the hood, or just parameter differences.