Hacker News
by SparkyMcUnicorn 189 days ago
If the pretraining rumors are true, they're probably using continued pretraining on the older weights. Right?