Hacker News
by lostmsu 89 days ago
> MoE models via expert sharding with zero cross-node inference traffic

This makes the whole project questionable.