by peterjliu 3589 days ago
Author of post here. I'd say most of the examples generated by the best model were good. However, we chose examples that were not too gruesome, as news can be :)

We encourage you to try the code and see for yourself.

5 comments

How does the model deal with dangling anaphora[1]? I wrote a summarizer for Spanish following a recent paper as a side project, and it looks as if I'll need a month of work to solve the issue.

[1] That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.

We're not "selecting" sentences as an extractive summarizer might. The sentences are generated.

As for how the model deals with co-reference: there's no special logic for that.

Wouldn't it suffice to do a coreference pass before extracting sentences? Obviously you'll compound coref errors with the errors in your main logic, but that seems somewhat unavoidable.
I am working on this in my kbsportal.com NLP demo. With accurate coreference substitutions (e.g., substituting a previous NP like 'San Francisco' for 'there' in a later sentence, substituting full previously mentioned names for pronouns, etc.), extractive summarization should provide better results, and my intuition is that this preprocessing should help abstractive summarization also.
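
To make the substitution step concrete, here is a minimal sketch of what such a preprocessing pass might look like. It assumes the coreference clusters already come from some external resolver; the `substitute_mentions` function, the cluster format, and the example sentences are all hypothetical and are not taken from the kbsportal.com demo or from the model in the post. It also handles single-token mentions only.

```python
def substitute_mentions(sentences, clusters):
    """Replace each resolved mention with its antecedent text.

    sentences: list of token lists, one per sentence.
    clusters:  list of dicts (hypothetical format) with
               "antecedent": replacement string, and
               "mentions": list of (sentence_index, token_index) pairs.
    Returns one whitespace-joined string per sentence.
    """
    # Copy the token lists so the caller's input is not modified.
    resolved = [list(tokens) for tokens in sentences]
    for cluster in clusters:
        for sent_idx, tok_idx in cluster["mentions"]:
            resolved[sent_idx][tok_idx] = cluster["antecedent"]
    return [" ".join(tokens) for tokens in resolved]


# Hand-written input; a real pipeline would get clusters from a resolver.
sentences = [
    ["The", "mayor", "visited", "San", "Francisco", "."],
    ["He", "approved", "the", "motion", "there", "."],
]
clusters = [
    {"antecedent": "The mayor", "mentions": [(1, 0)]},
    {"antecedent": "in San Francisco", "mentions": [(1, 4)]},
]

resolved = substitute_mentions(sentences, clusters)
print(resolved[1])
```

After this pass, the second sentence can be extracted on its own without leaving "He" or "there" undefined, although any mistakes made by the resolver are carried into the summary.
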
So that is inter-sentence logic? Even humans have trouble with such ambiguity in certain cases.
In the post you mentioned that

>>"In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline."

Can you elaborate a little on that? Is the training the problem or is the model just not good at longer texts?

Any chance some trained model will be released?
Any hints on how to integrate the whole document for summarization? ;)

I've seen CopyNet, where you do seq2seq but also have a copy mechanism to copy rare words from the source sentence to the target sentence.
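
For readers unfamiliar with the idea, the following is a simplified sketch of how a copy mechanism can let rare source words compete with vocabulary words at a single decoding step. It is written in the spirit of CopyNet (generate-mode and copy-mode scores sharing one softmax) but is not the paper's exact formulation and not part of the model in the post; the function name, the score values, and the tiny vocabulary are all made up for illustration.

```python
import math
from collections import defaultdict


def copy_generate_distribution(vocab_scores, source_tokens, copy_scores):
    """Combine generate-mode and copy-mode scores under one shared softmax.

    vocab_scores:  dict mapping vocabulary word -> generate-mode score.
    source_tokens: list of source-sentence tokens (may be out of vocabulary).
    copy_scores:   list of copy-mode scores, one per source position.
    Returns a dict mapping word -> probability for this decoding step.
    """
    if len(source_tokens) != len(copy_scores):
        raise ValueError("need exactly one copy score per source token")

    # Subtract the maximum score before exponentiating, for numerical stability.
    max_score = max(list(vocab_scores.values()) + list(copy_scores))

    mass = defaultdict(float)
    for word, score in vocab_scores.items():
        mass[word] += math.exp(score - max_score)
    # A word that is both in the vocabulary and in the source gets both masses.
    for token, score in zip(source_tokens, copy_scores):
        mass[token] += math.exp(score - max_score)

    total = sum(mass.values())
    return {word: value / total for word, value in mass.items()}


# Made-up scores: "Zuckerberg" is not in the vocabulary but appears in the source.
dist = copy_generate_distribution(
    vocab_scores={"the": 1.0, "<unk>": 0.0},
    source_tokens=["Zuckerberg", "the"],
    copy_scores=[2.0, 0.0],
)
print(max(dist, key=dist.get))
```

In a real model the two sets of scores would come from the decoder state and the encoder states; here they are plain numbers so the mixing step can be read in isolation.
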

Is it hard to get the code up and running on Google Cloud? Does TensorFlow come as a service?