by ElHacker 3630 days ago
This is a cool project. I did something similar with an LSTM in TensorFlow to generate Reddit comments. The biggest problem I found in my experiment was that the generated comments had no real meaning and didn't even try to deliver a message. Real Reddit users spotted this right away, which got my bot banned. I think the author faced the same issue here.

Does anyone have any suggestions for papers that try to solve that problem?

2 comments

The problem is that the latent space of a model trained on all comments is still massive. For any sort of coherence, you need to condition the probabilities on smaller spaces, or enforce structural bounds.

Here's an example of a paper where they simultaneously optimize high-level semantics with token-level sequence probabilities: http://arxiv.org/pdf/1606.00776v2.pdf.
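
To make the "condition on smaller spaces" point concrete, here's a toy sketch. It is not the method from the linked paper, just a word-level bigram counter, and the corpus and topic labels are made up for illustration. The idea is only that the next-word distribution gets much sharper once you restrict the training data to one topic:

```python
from collections import Counter, defaultdict

def bigram_counts(comments):
    """Count next-word frequencies for each word across the given comments."""
    counts = defaultdict(Counter)
    for text in comments:
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Return P(next | word) as a dict, or {} if the word was never seen."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()} if total else {}

# Hypothetical toy corpus of (topic, comment) pairs; not real data.
corpus = [
    ("python", "the model is slow on cpu"),
    ("python", "the model is easy to train"),
    ("cooking", "the sauce is too salty"),
    ("cooking", "the sauce is best served warm"),
]

# Unconditioned: trained on every comment, so "the" can go anywhere.
global_counts = bigram_counts(text for _, text in corpus)
# Conditioned: trained only on comments from one topic.
cooking_counts = bigram_counts(text for topic, text in corpus if topic == "cooking")

global_probs = next_word_probs(global_counts, "the")
cooking_probs = next_word_probs(cooking_counts, "the")
print(global_probs)
print(cooking_probs)
```

With the unconditioned counts, the probability mass after "the" is split across topics; with the topic-restricted counts it collapses onto words that actually belong in that thread. A neural model would condition through an extra input rather than by filtering data, but the effect on the distribution is the same kind of narrowing.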

I don't have any papers, but I guess you could try scraping other short comments on the same article and basing the generated comment on those?