by frisco
3938 days ago
Fun hack. If anything, it highlights how compelling deep learning and RNNs are: no messing with NLP, no messing with building other features or adding up classifiers, etc. The manual feature engineering means it might work better on a smaller dataset, but even then probably not.
For comparison, training Andrej Karpathy's RNN code (http://karpathy.github.io/2015/05/21/rnn-effectiveness/) on the "HarryPotter(xxlarge).txt" (76K) file with the default hyperparameters and a batch size of 25 gets me:
> But Atfa the loom proset! No contarin — mibll,’s just pucking to live
> note left them hard and fitther, clooked of course little happered to
> trige on the fistpened. Their knew Harry mear from the shind-beas
> eveided, at Uncle Vernon’s thepped to spept were pelled and beadn
> Harry, distine dy use. Harry had in a amalout, into the fish sfary door.
The difference here is tokenizing on words vs letters: the RNN code is trying to learn the structure of English from scratch, whereas the code here gets to work with well-formed words from the beginning. But otherwise, the results in the linked post are about as silly semantically:
> Input: "Harry don't look"
> Output: "Harry don't look , incredibly that a year for been parents in .
> followers , Harry , and Potter was been curse . Harry was up a year ,
> Harry was been curse "
EDIT: Updated the RNN output text. I was sampling from a checkpoint file for a different input corpus; I got confused by the long, similar-looking filenames. It doesn't change the overall point though.
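To make the words-vs-letters point concrete, here is a minimal sketch of the two tokenizations. The function names and the regex are my own assumptions for illustration; neither is taken from Karpathy's code or from the linked post.

```python
import re

def char_tokens(text):
    """Character-level: every letter, space, and punctuation mark is a token."""
    return list(text)

def word_tokens(text):
    """Word-level: whole words (keeping contractions like "don't" together)
    plus standalone punctuation marks. The regex is an assumed, simplistic
    tokenizer, not the one used in the linked post."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", text)

sentence = "Harry don't look, now."
print(char_tokens(sentence)[:8])
print(word_tokens(sentence))
```

A character-level model has a tiny vocabulary but has to learn spelling before it can learn anything else; a word-level model starts with correctly spelled words and only has to learn how to order them.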
As I've posted here before, people have been training character n-gram models, without using neural networks, and getting language modeling performance comparable to that of word-based models for at least a decade. That it works with RNNs is no surprise, because it already worked just fine with the much more constrained predecessor technology.
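For anyone who hasn't seen one, a character n-gram model can be sketched in a few lines. This is a bare-bones, unsmoothed version under my own assumptions (the function names, the default n=4, and n >= 2 are all hypothetical choices); the models referred to above would add smoothing and backoff.

```python
import random
from collections import Counter, defaultdict

def train_char_ngram(text, n=4):
    """Count which character follows each (n-1)-character context. Assumes n >= 2."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context = text[i:i + n - 1]
        next_char = text[i + n - 1]
        counts[context][next_char] += 1
    return counts

def sample(counts, seed, length=200, rng=None):
    """Extend `seed` one character at a time, drawing each character in
    proportion to how often it followed the current context in training.
    `seed` must be exactly n-1 characters long."""
    rng = rng or random.Random(0)
    k = len(seed)
    out = seed
    for _ in range(length):
        followers = counts.get(out[-k:])
        if not followers:
            break  # unseen context: no smoothing or backoff in this sketch
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

# Usage, with a hypothetical corpus path:
# model = train_char_ngram(open("corpus.txt").read(), n=6)
# print(sample(model, seed="Harry"))
```

With a large enough n and corpus, the output looks a lot like the RNN samples: mostly plausible spelling, little long-range sense.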