| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by transcranial 3630 days ago
	The problem is that the latent space of a model trained on all comments is still massive. For any sort of coherence, you need to condition the probabilities on smaller spaces, or enforce structural bounds. Here's an example of a paper where they simultaneously optimize high-level semantics with token-level sequence probabilities: http://arxiv.org/pdf/1606.00776v2.pdf.