Hacker News
by dev_tools_lab, 82 days ago
Thanks for this project. Prioritizing MoE models and adding an intelligent NVMe cache could improve efficiency, especially on the M4 Max, where the available bandwidth makes this kind of usage more realistic.