by dwringer 938 days ago
One can fine-tune a smaller-parameter model like GPT-NeoX on a home GPU pretty readily, and it's absolutely capable of doing what you specified. Teach it with a bunch of example sentences whose parts of speech (verbs, nouns, and so on) follow a simple grammar, and afterward you will see it generate sentences that combine those parts of speech grammatically in novel ways: same grammatical structures, but productions that did not appear in the training set.

Depending on settings, it is also capable of producing a lot of ungrammatical nonsense, but training shifts the odds of what it produces considerably.
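For anyone who wants to measure this rather than eyeball it, here is a minimal sketch of the bookkeeping side. It does not train anything. The toy grammar, the word lists, and the helper names (`all_sentences`, `split_corpus`, `classify`) are all made up for illustration: you would fine-tune on the `train` half, sample from the model, and run each sample through `classify` to count how many outputs are grammatical but never seen in training.

```python
import itertools
import random

# Hypothetical toy grammar: DET NOUN VERB DET NOUN
DETS = ["the", "a"]
NOUNS = ["dog", "cat", "bird"]
VERBS = ["sees", "chases", "likes"]
SLOTS = [DETS, NOUNS, VERBS, DETS, NOUNS]

def all_sentences():
    """Enumerate every sentence the toy grammar can produce."""
    return [" ".join(words) for words in itertools.product(*SLOTS)]

def split_corpus(sentences, held_out_fraction=0.5, seed=0):
    """Shuffle, then split into a training set and a held-out set."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - held_out_fraction))
    return set(shuffled[:cut]), set(shuffled[cut:])

def is_grammatical(sentence):
    """True if each word belongs to the part of speech its slot requires."""
    words = sentence.split()
    return len(words) == len(SLOTS) and all(
        word in slot for word, slot in zip(words, SLOTS)
    )

def classify(sentence, train):
    """Label a generated sentence as 'ungrammatical', 'seen', or 'novel'."""
    if not is_grammatical(sentence):
        return "ungrammatical"
    return "seen" if sentence in train else "novel"

train, held_out = split_corpus(all_sentences())
```

The fraction of samples labeled "novel" is the number to watch; "ungrammatical" covers the nonsense case mentioned above.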

1 comment

No, I mean creating 50-grams that appear in the dataset created by the paper linked by OP but are not present in the actual dataset the model was trained on. Of course, the model would be able to output 50-grams that were not present in either.
As I understand you, what you state is exactly what I meant. If you train on a bunch of text containing substrings of those 50-grams, but not the full 50-grams themselves [or expose it to the same vocabulary used in the same parts of speech as in the full 50], the model will pretty readily produce the full 50-grams despite never having seen them in their entirety. Try it out: it's pretty easy to do on a modern GPU and can be done in less than an hour.
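To make the "substrings but not the whole thing" condition concrete, here is a rough sketch of how one might check it on a tokenized corpus. The function names and the whitespace tokenization are my own assumptions, and the 6-token target stands in for a 50-gram just to keep the example readable.

```python
def ngrams(tokens, n):
    """Return the set of all contiguous n-grams (as tuples) in tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def unseen_but_covered(target, corpus, k):
    """True if `target` never occurs contiguously in `corpus`,
    yet every length-k window of `target` does occur there."""
    full_seen = tuple(target) in ngrams(corpus, len(target))
    parts_seen = ngrams(target, k) <= ngrams(corpus, k)
    return (not full_seen) and parts_seen

# Toy stand-ins: a tiny training corpus and a target sequence
# (assumed whitespace tokenization, purely for illustration).
corpus = "the dog chases a cat . a cat sees the dog".split()
target = "the dog chases a cat sees".split()

# Every trigram of the target is in the corpus, but the full 6-gram is not.
covered = unseen_but_covered(target, corpus, k=3)
```

If a fine-tuned model then emits `target` verbatim, that is an output it could not have copied whole from the training text, which is the case being argued about here.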