Hacker News
by cs702 269 days ago
I don't think it's been tried at scale, with large models.

It remains to be seen if it works better than conventional training schemes.