by lxe 1180 days ago
Ah yes, that's right. Well, they technically do use a visual transformer for the CLIP text encoder, as I understand it.