Hacker News
by crazygringo 2814 days ago
POS tagging is a nice touch, but I've long wondered if there's a way to combine Markov sentence generation with actual grammatical rules -- that an opening parenthesis must have a closing one, rules for dependent clauses, etc. It would need to both grammatically parse the inputs and produce grammar trees for the outputs, which would then be filled in by Markov chains that fit the trees...

Heck, even do it at a level larger than sentences too, so questions are followed by answers, a line of dialog is followed by a response, paragraphs follow a realistic distribution of lengths...

1 comments

Hmm. A Markov chain where you actually do some kind of search? You build up a chain, then backtrack and retry as necessary until you get something that meets the requirements.
See [0] for a method combining branch and prune with Markovian probabilities. I did a hacky version of my interpretation of this work at [1].

[0] https://link.springer.com/article/10.1007/s10601-010-9101-4

[1] https://github.com/kastnerkyle/pachet_experiments
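
For anyone who wants to see the backtrack-and-retry idea concretely, here is a minimal sketch. It is not the method from [0] or the code in [1], just plain depth-first search over Markov successors with two made-up requirements: parentheses must balance and the sentence must fit in max_len tokens. The toy corpus and all function names are assumptions for illustration.

```python
import random
from collections import defaultdict

def build_chain(tokens):
    """Map each token to the list of tokens observed right after it."""
    chain = defaultdict(list)
    for cur, nxt in zip(tokens, tokens[1:]):
        chain[cur].append(nxt)
    return chain

def prefix_ok(seq, max_len):
    """Prune a prefix that is too long or closes a ')' that was never opened."""
    if len(seq) > max_len:
        return False
    depth = 0
    for tok in seq:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
    return True

def generate(chain, start, max_len=12, rng=random):
    """Depth-first search over Markov successors, backtracking on dead ends."""
    def extend(seq):
        if not prefix_ok(seq, max_len):
            return None
        if seq[-1] == ".":
            # A sentence ends here; accept only if every '(' was closed.
            return seq if seq.count("(") == seq.count(")") else None
        successors = list(chain.get(seq[-1], []))
        rng.shuffle(successors)  # duplicates mean frequent successors tend to come first
        tried = set()
        for nxt in successors:
            if nxt in tried:
                continue
            tried.add(nxt)
            result = extend(seq + [nxt])
            if result is not None:
                return result
        return None  # dead end: the caller moves on to its next successor
    return extend([start])

# Toy corpus (an assumption); the last sentence leaves a '(' open on purpose.
corpus = ("the cat ( a tabby ) sat . the dog ( a stray ) ran . "
          "the cat sat on ( the mat .").split()
chain = build_chain(corpus)
print(" ".join(generate(chain, "the", rng=random.Random(0))))
```

The search is exhaustive in the worst case, so this only stays cheap because max_len is small. Any other hard requirement can be dropped into prefix_ok as long as it can reject a bad prefix early.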

Sounds like getting some sort of context-free grammar out of it. There is an LR parsing algorithm that is very fast; you could use a Markov chain to generate and then a CFG to verify.
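
A rough sketch of the generate-then-verify variant: sample from a word-level Markov chain and keep the first sample a grammar accepts. To keep it short, the verifier here is a CYK recognizer over a grammar in Chomsky normal form, swapped in for the LR parser mentioned above. The grammar, lexicon and corpus are toy assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (B, C) -> A means A -> B C.
BINARY = {("NP", "VP"): "S", ("Det", "N"): "NP", ("V", "NP"): "VP"}
# Hypothetical lexicon: word -> part-of-speech tag.
LEXICON = {"the": "Det", "a": "Det", "dog": "N", "cat": "N",
           "sees": "V", "likes": "V"}

def cyk_accepts(words):
    """Return True if the toy grammar derives the word sequence from S."""
    n = len(words)
    if n == 0 or any(w not in LEXICON for w in words):
        return False
    # table[i][j] holds the symbols that derive words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i + 1].add(LEXICON[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b in table[i][k]:
                    for c in table[k][j]:
                        if (b, c) in BINARY:
                            table[i][j].add(BINARY[(b, c)])
    return "S" in table[0][n]

def build_chain(sentences):
    """First-order chain over words, with <s> and </s> boundary markers."""
    chain = defaultdict(list)
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            chain[a].append(b)
    return chain

def sample(chain, rng, max_len=10):
    """Random walk from <s> until </s> or max_len words."""
    out, cur = [], "<s>"
    while len(out) < max_len:
        cur = rng.choice(chain[cur])
        if cur == "</s>":
            break
        out.append(cur)
    return out

def generate_verified(chain, rng, tries=1000):
    """Rejection sampling: return the first sample the grammar accepts."""
    for _ in range(tries):
        words = sample(chain, rng)
        if cyk_accepts(words):
            return words
    return None

chain = build_chain(["the dog sees a cat", "the cat likes the dog",
                     "a dog likes a cat"])
print(" ".join(generate_verified(chain, random.Random(0))))
```

Rejection sampling like this gets slow when the grammar accepts only a small fraction of what the chain produces, which is where the backtracking search comes back in.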
Or a CFG to generate at a high level and a Markov chain to fill in the details.
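
The other direction (CFG on top, Markov underneath) might look something like this sketch: expand a small grammar into a sequence of POS tags, then fill each slot with a word of that tag, preferring words that followed the previous word in the corpus. The grammar, the hand-tagged corpus and the fallback rule are all assumptions for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy grammar: nonterminal -> alternative right-hand sides.
# Symbols without an entry (Det, Adj, N, V) are treated as POS tags.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Hypothetical hand-tagged corpus: one list of (word, tag) pairs per sentence.
TAGGED = [
    [("the", "Det"), ("quick", "Adj"), ("fox", "N"), ("sees", "V"),
     ("a", "Det"), ("dog", "N")],
    [("a", "Det"), ("lazy", "Adj"), ("dog", "N"), ("sleeps", "V")],
    [("the", "Det"), ("fox", "N"), ("chases", "V"), ("the", "Det"),
     ("cat", "N")],
]

def expand(symbol, rng):
    """Expand a grammar symbol into a flat list of POS tags."""
    if symbol not in GRAMMAR:
        return [symbol]
    rhs = rng.choice(GRAMMAR[symbol])
    return [tag for part in rhs for tag in expand(part, rng)]

def build_models(tagged_sentences):
    """Collect words per tag, and words per (previous word, tag) pair."""
    by_tag = defaultdict(list)
    follows = defaultdict(list)
    for sent in tagged_sentences:
        prev = None  # None marks the start of a sentence
        for word, tag in sent:
            by_tag[tag].append(word)
            follows[(prev, tag)].append(word)
            prev = word
    return by_tag, follows

def fill(tags, by_tag, follows, rng):
    """Pick a word per tag, conditioned on the previous word when possible."""
    words, prev = [], None
    for tag in tags:
        # Fall back to any word with this tag if the bigram was never seen.
        candidates = follows.get((prev, tag)) or by_tag[tag]
        prev = rng.choice(candidates)
        words.append(prev)
    return words

rng = random.Random(0)
by_tag, follows = build_models(TAGGED)
tags = expand("S", rng)
print(tags, " ".join(fill(tags, by_tag, follows, rng)))
```

The grammar here has no recursion, so expand always terminates; a recursive grammar would need a depth limit or rule probabilities that make long expansions unlikely.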
Excellent idea. Then it would actually read like language! But, uh, how do you choose the sentence length and which grammars you want?