Hacker News
by strangecasts 304 days ago
Should the title be the other way around? It starts with the 20B model and prunes it down to different parameter counts.