Hacker News new | ask | show | jobs
by transcranial 3630 days ago
The problem is that the latent space of a model trained on all comments is still massive. For any sort of coherence, you need to condition the probabilities on smaller spaces, or enforce structural bounds.

Here's an example of a paper where they simultaneously optimize high-level semantics with token-level sequence probabilities: http://arxiv.org/pdf/1606.00776v2.pdf.