by markisus
1260 days ago
The letter frames recent advances like ChatGPT as positive indicators for Scale's future business. However, task-specific, bespoke human data annotation may give way to datasets generated by massive self-supervised multi-modal models. The other day, I saw that one of these image generation models can be prompted to create image/semantic-mask pairs. And I wouldn't be surprised if ChatGPT could generate text alongside named entity labels for specialized domains. The holy grail for AI has always been eliminating the need for manual annotation and making more efficient use of the data we do have. Scale's data annotation business seems at odds with the core goals of AI researchers, and may dry up as those goals are increasingly realized.
>Each of the big AI breakthroughs (ChatGPT, GPT-3, DALL-E) have been made possible through extensive data collection and annotation infrastructure.
Maybe I'd count the RLHF stuff from ChatGPT as requiring a form of annotation. But DALL-E was trained on existing captions and alt text. No one is sitting in a room annotating images for DALL-E. Where would a company like Scale fit into this?