by InfiniteLoup
85 days ago
I was always curious about how Tay worked technically, since it was built before the Transformer era. Was it based on a specific scientific paper or line of research? The controversy surrounding it seems to have polluted any search for a technical breakdown, a discussion, or the insights gained from it.
|
There seems to have been interest at the time in models that would pick up the language and style of their conversations (not actually learning information or looking up facts). If you haven't trained an LSTM model before: you could train one on Shakespeare's plays and get out ye olde English in a screenplay format, but from line to line there was no consistency in plot, characters, entrances and exits, etc., of the kind you'd expect after GPT-2. Twitter would have been a good fit, since the conversations are short-form. So I believe Tay and the Watson that appeared on Jeopardy! come more from this 'classical NLP' thinking and are not proto-LLMs, if that makes sense.
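
To make the "style without consistency" point concrete, here is a rough sketch. Note that it is not an LSTM and has nothing to do with how Tay actually worked: it swaps in a much simpler character n-gram model so it runs with only the standard library. The function names, the `order` parameter, and the tiny three-line Hamlet corpus are all assumptions chosen for illustration. The effect is similar in kind, though: the output looks locally like the training text, but nothing tracks who is speaking or what happened earlier.

```python
import random
from collections import defaultdict


def train_char_ngram(text, order=4):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(model, seed, length=200, rng=None):
    """Extend `seed` one character at a time; len(seed) must equal the order."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen with a follower: stop early
            break
        out += rng.choice(followers)
    return out


# Tiny stand-in corpus (assumption); a real experiment would use full plays.
corpus = (
    "HAMLET.\nTo be, or not to be, that is the question.\n"
    "OPHELIA.\nGood my lord, how does your honour for this many a day?\n"
    "HAMLET.\nI humbly thank you; well, well, well.\n"
)

model = train_char_ngram(corpus, order=4)
sample = generate(model, seed="HAML", length=120)
print(sample)
```

The model only ever looks at the last four characters, so it can reproduce speaker headings and phrasing but has no memory of anything further back. An LSTM carries more context than this, but in practice it showed the same kind of drift over longer passages.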