| HN Mirror

	by dev_tools_lab 128 days ago
	Thanks for this project. Prioritizing MoE models and adding an intelligent NVMe cache could improve efficiency, especially on the M4 Max, where the available bandwidth makes this kind of use more realistic.
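
	To make the caching idea concrete, here is a rough sketch of one way it could work. It is not the project's code. It assumes that expert weights live on NVMe and that only a fixed number of experts fit in RAM at once. `ExpertCache`, `load_from_disk`, and the expert ids are hypothetical names made up for illustration, and plain LRU eviction stands in for whatever "intelligent" policy would actually be used.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class ExpertCache:
    """Keep the most recently used MoE experts in RAM; load the rest on demand."""

    def __init__(self, capacity: int, load_from_disk: Callable[[Hashable], bytes]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Hypothetical loader: stands in for reading one expert's weights from NVMe.
        self.load_from_disk = load_from_disk
        self._resident: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, expert_id: Hashable) -> bytes:
        if expert_id in self._resident:
            # Cache hit: mark this expert as most recently used.
            self._resident.move_to_end(expert_id)
            self.hits += 1
            return self._resident[expert_id]
        # Cache miss: pay the disk read, then evict the least recently used expert
        # if the cache is over capacity.
        self.misses += 1
        weights = self.load_from_disk(expert_id)
        self._resident[expert_id] = weights
        if len(self._resident) > self.capacity:
            self._resident.popitem(last=False)
        return weights


# Toy usage: a fake loader and a short routing trace of expert ids.
cache = ExpertCache(capacity=2, load_from_disk=lambda eid: f"weights-{eid}".encode())
for eid in [0, 1, 0, 2, 1]:
    cache.get(eid)
print(cache.hits, cache.misses)
```

	MoE models suit this pattern because each token activates only a few experts, so a small resident set can absorb many of the lookups. Whether the hit rate is high enough in practice depends on the model's routing behavior, which this sketch does not measure.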