Author of the post here. I'd say most of the examples generated by the best model were good. However, we chose examples that were not too gruesome, as news can be :)
We encourage you to try the code and see for yourself.
How does the model deal with dangling anaphora[1]? I wrote a summarizer for Spanish following a recent paper as a side project, and it looks as if I'll need a month of work to solve the issue.
[1] That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.
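To make the footnote concrete, here is a rough sketch of how one might flag such sentences in an extractive summary. It is only a heuristic under stated assumptions: the pronoun list is hand-picked, the function name `flag_dangling` is made up for illustration, and "the previous source sentence was not picked" stands in for "the antecedent is missing", which is not always true.

```python
import re

# Assumption: a small hand-picked pronoun list; a real system would use a tagger.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def flag_dangling(sentences, picked):
    """Return the picked indices whose sentence contains a pronoun
    while the immediately preceding source sentence was not picked."""
    picked_set = set(picked)
    flagged = []
    for i in picked:
        tokens = re.findall(r"[a-z']+", sentences[i].lower())
        if PRONOUNS.intersection(tokens) and (i - 1) not in picked_set:
            flagged.append(i)
    return flagged

sentences = [
    "The mayor spoke on Monday.",
    "He approved the motion.",
    "The council adjourned.",
]
print(flag_dangling(sentences, [1, 2]))  # [1]
```

Picking sentences 0 and 1 together returns an empty list, since the likely antecedent travels with the pronoun.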
Wouldn't it suffice to do a coreference pass before extracting sentences? Obviously you'll compound coref errors with the errors in your main logic, but that seems somewhat unavoidable.
I am working on this in my kbsportal.com NLP demo. With accurate coreference substitutions (e.g., substituting a previous NP like 'San Francisco' for 'there' in a later sentence, substituting full previously mentioned names for pronouns, etc.), extractive summarization should provide better results, and my intuition is that this preprocessing should help abstractive summarization as well.
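A minimal sketch of the substitution step, assuming the coreference links have already been produced by some resolver. The `links` format and the name `substitute_mentions` are hypothetical, not taken from any particular library, and the antecedent text is assumed to already be phrased so that it reads grammatically in place (here "in San Francisco" rather than the bare NP).

```python
import re

def substitute_mentions(sentences, links):
    """Replace each mention with its antecedent.

    links: list of (sentence_index, mention, antecedent) tuples,
    assumed to come from an upstream coreference resolver.
    """
    resolved = list(sentences)
    for idx, mention, antecedent in links:
        pattern = r"\b" + re.escape(mention) + r"\b"
        # Replace only the first occurrence; the lambda avoids
        # backslash interpretation in the replacement text.
        resolved[idx] = re.sub(pattern, lambda _m: antecedent, resolved[idx], count=1)
    return resolved

sentences = [
    "The mayor visited San Francisco.",
    "He approved the motion there.",
]
links = [(1, "He", "The mayor"), (1, "there", "in San Francisco")]
resolved = substitute_mentions(sentences, links)
print(resolved[1])  # The mayor approved the motion in San Francisco.
```

The extractor would then pick from `resolved` instead of the raw sentences, so any coref mistake is carried straight into the summary, which is the compounding-error trade-off mentioned above.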
>>"In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline."
Can you elaborate a little on that? Is the training the problem or is the model just not good at longer texts?