by ricardobeat, 1988 days ago
In the linked CLIP paper, the authors say the model was trained on 256 GPUs for 2 weeks. There is no mention of the size of the resulting trained model.