| HN Mirror


	by byyoung3 656 days ago
	Yes, it seems obvious now, but beforehand it wasn't clear that this would speed things up, since the pretrained model was trained on a separate objective. It's a brilliant idea that works amazingly well.

2 comments

psb217 655 days ago

It's a classic "Will it work? IDK, maybe. Let's try it and find out..." paper.

byyoung3 653 days ago

haha yeah I mean I think they are all like that to a certain extent

fxtentacle 655 days ago

To me, it seemed that the technique presented here was just a logical continuation of methods that OpenAI used when they trained the Dota agents:

https://arxiv.org/pdf/1912.06719v1

And, arguably, Facebook's unsupervised pre-training for their multi-modal speech-to-text models is kind of the same idea as unsupervised pre-training for a multi-modal text-to-image diffuser.

https://ai.meta.com/research/publications/wav2vec-2.0-a-fram...
