by csande17
2474 days ago
I've never really gotten why AI types are so concerned about text-generation models. Like, sure, I can kind of see why you wouldn't want to make the Deepfakes program public; it currently takes a lot of time, effort, and expertise to swap faces realistically in a video, and maybe we don't want to give every average Joe the ability to do that. But pretty much everyone in the world can already write text pretty trivially. (I'm doing it right now!) And the "typical" output from these programs usually isn't very good (OpenAI had to try like thirty times for each of the prompts in their PR materials), so it usually ends up being less work to just write the fake news yourself instead of using the software. My personal conspiracy theory is that all this talk of "the model is too dangerous to release" really boils down to "if we let people test out the model, they'll find it doesn't work as well as our PR team wants them to think it does".
My guess is that they will perfect the Transformer and its training process, curate the dataset, and make the method really easy to use. Maybe it will be able to do translation, math, even code auto-completion. All of that could come just from iterating on the current formulation of the Transformer.
But it is also possible that it gets surpassed by something even better. A new language model could replace the inductive bias specific to the Transformer (the ability to "attend" to any part of the input text) with something more efficient, because Transformers are quite hard and expensive to train right now. Maybe the Transformer's inductive bias is too general (like a fully connected network) and needs too much data; with a slightly different idea, the model could be made much more efficient and probably more convincing.
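
To make the "attend to any part of the input" point concrete, here is a rough sketch of single-head scaled dot-product self-attention in plain Python. It is only an illustration under simplifying assumptions: there are no learned projection matrices, the toy `tokens` vectors are made up, and the function names are my own rather than any library's API. The thing to notice is that every token computes a weight for every other token, so the weight table has n x n entries for n tokens, which is where a lot of the cost comes from.

```python
import math


def softmax(xs):
    """Turn a list of scores into positive weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (a simplification).

    Returns the output vectors and the full table of attention weights.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # One score per key: every position looks at every other position.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy "embeddings" for a 3-token input.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)

# The weight table is n x n: doubling the input length quadruples its size.
print(len(weights), "x", len(weights[0]))
```

Since no weight is ever exactly zero, nothing in the architecture itself says which positions matter; the model has to learn that from data, which is roughly what "too general" means here.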