by kastnerkyle 2811 days ago
I don't think this is necessarily true. CNNs can produce incredible results; see for example [0]. A snippet from the paper (their code is also available):

""" Prompt: The Mage, the Warrior, and the Priest

Story: A light breeze swept the ground, and carried with it still the distant scents of dust and time-worn stone. The Warrior led the way, heaving her mass of armour and muscle over the uneven terrain. She soon crested the last of the low embankments, which still bore the unmistakable fingerprints of haste and fear. She lifted herself up onto the top the rise, and looked out at the scene before her. [...] """

RNN/LSTM-type models can also produce really coherent results on the scales shown in the blog.

Markov chains can be quite useful when you can manually create conditioning, but some of the key power of neural methods is learning to process conditioning and blend in ways that may not be obvious (or require significant domain expertise to grok) when handcrafting. YeGoblynQueenne below discusses a lot of the core issues for Markov chain based approaches.

In particular, plagiaristic sequences are incredibly common in Markovian generation, and though there are some papers on how to deal with this [1], it is not straightforward to decide what plagiarism really means in many contexts. This problem also arises in neural models, but it isn't nearly as extreme, due to the nature of both the learning and the sampling process.
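To make "plagiaristic sequences" concrete, here is a rough sketch of one way to measure it: find the longest run of tokens in a generated sequence that appears verbatim in the training corpus. This is only a brute-force check after the fact, not the Max Order method from [1], and the function names (`longest_copied_run`, `_contains`) are made up for illustration.

```python
def _contains(seq, sub):
    """True if the list `sub` occurs as a contiguous slice of the list `seq`."""
    k = len(sub)
    return any(seq[i:i + k] == sub for i in range(len(seq) - k + 1))


def longest_copied_run(generated, corpus):
    """Length of the longest contiguous token run in `generated`
    that also occurs verbatim in `corpus` (both are lists of tokens)."""
    best = 0
    n = len(generated)
    for start in range(n):
        # Only look for runs longer than the best found so far. If a run of
        # length best + 1 does not start here, no longer run starts here either.
        length = best + 1
        while start + length <= n and _contains(corpus, generated[start:start + length]):
            best = length
            length += 1
    return best
```

What threshold counts as plagiarism is still a judgment call: a copied run of three words is ordinary language, a copied run of thirty probably is not, and the right cutoff depends on the corpus and the domain.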

One huge bonus of Markov chains is that controlling exactly what you want or using hard rules is pretty easy, which is definitely NOT the case with neural generation...
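As a minimal sketch of what "hard rules" can look like in practice, assume a first-order, word-level chain and a user-supplied predicate that vetoes candidate next tokens. All names here (`build_chain`, `generate`, `allowed`) are hypothetical and not taken from the blog post.

```python
import random
from collections import defaultdict


def build_chain(tokens):
    """Map each token to the list of tokens that follow it in the corpus."""
    chain = defaultdict(list)
    for cur, nxt in zip(tokens, tokens[1:]):
        chain[cur].append(nxt)
    return chain


def generate(chain, start, length, allowed, rng):
    """Sample up to `length` tokens, keeping only successors for which
    allowed(history, candidate) is true. Stops early at a dead end."""
    out = [start]
    while len(out) < length:
        candidates = [t for t in chain.get(out[-1], []) if allowed(out, t)]
        if not candidates:
            break  # the rule ruled out every continuation
        out.append(rng.choice(candidates))
    return out
```

For example, `generate(chain, "the", 8, lambda hist, t: t != "warrior", random.Random(0))` can never emit "warrior", because the rule is applied before sampling rather than hoped for after it. The same kind of guarantee is much harder to get from a neural sampler.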

This blog post is a really nice run-through of how to get decent results from Markov chains, but don't discount neural methods either. They are getting better all the time, if you are willing to deal with the headaches.

[0] Hierarchical Neural Story Generation, https://arxiv.org/abs/1805.04833

[1] Max Order, http://www.flow-machines.com/maxorder/