Hacker News new | ask | show | jobs
by zozbot234 3 hours ago
Single user scenarios can also use MTP to make auto-regressive inference more compute-intensive with no loss of quality.