| HN Mirror


by nl 2936 days ago

This is a really interesting find.

To be clear on what is happening here:

Method 1 (Information Retrieval): Aristo generates candidate answers (essentially by substituting each possible answer into the question). It then runs information retrieval (i.e. search) over a set of pre-validated legitimate sources, finds the sentence that aligns most closely with each candidate answer, and scores the candidates based on that alignment.
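
To make Method 1 concrete, here is a minimal sketch of the idea, not Aristo's actual code. The `___` blank convention, the function names, and the use of Jaccard token overlap as the "alignment" score are all assumptions made for illustration; the real system presumably uses a proper search index and a richer alignment measure.

```python
import re

def tokens(text):
    """Lowercase word tokens; a crude stand-in for real linguistic alignment."""
    return set(re.findall(r"[a-z]+", text.lower()))

def candidate_statements(question_template, options):
    """Substitute each option into the question's blank (hypothetical '___' marker)."""
    return {opt: question_template.replace("___", opt) for opt in options}

def best_alignment(candidate, corpus):
    """Return (score, sentence) for the corpus sentence with the highest Jaccard overlap."""
    cand = tokens(candidate)
    best = (0.0, None)
    for sentence in corpus:
        sent = tokens(sentence)
        union = cand | sent
        score = len(cand & sent) / len(union) if union else 0.0
        if score > best[0]:
            best = (score, sentence)
    return best

def rank_options(question_template, options, corpus):
    """Rank answer options by how well their candidate statement aligns with the corpus."""
    scored = {
        opt: best_alignment(cand, corpus)
        for opt, cand in candidate_statements(question_template, options).items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1][0], reverse=True)

if __name__ == "__main__":
    template = "Plants make their food using ___."
    options = ["sunlight", "moonlight"]
    # A sentence that debunks a claim still overlaps heavily with that claim.
    debunking_only = ["It is a myth that plants make their food using moonlight."]
    for option, (score, sentence) in rank_options(template, options, debunking_only):
        print(option, round(score, 3), sentence)
```

Note that when the only relevant source sentence is a debunking one, the debunked option still scores highest, because the overlap measure cannot see "It is a myth that". That is the failure mode described below.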

Method 2 (Topic Matching): I haven't studied this enough to understand it.

Method 3 (Tuple Reasoning): They use open information extraction on a set of pre-validated legitimate sources to build tuple statements (think RDF), then use logical inference over them.
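
As a rough illustration of Method 3, the sketch below stores (subject, relation, object) tuples and applies one inference rule, transitivity over an `is_a` relation. The relation name, the example facts, and the rule are assumptions chosen for illustration; they are not taken from Aristo.

```python
def infer_is_a(facts):
    """Forward-chain transitivity over (x, 'is_a', y) triples until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        is_a_pairs = [(s, o) for (s, r, o) in known if r == "is_a"]
        for a, b in is_a_pairs:
            for c, d in is_a_pairs:
                if b == c and (a, "is_a", d) not in known:
                    known.add((a, "is_a", d))
                    changed = True
    return known

if __name__ == "__main__":
    extracted = {
        ("sparrow", "is_a", "bird"),
        ("bird", "is_a", "animal"),
    }
    derived = infer_is_a(extracted)
    print(("sparrow", "is_a", "animal") in derived)
```

The inference step trusts whatever tuples it is given. If the extraction step lifts a tuple out of a clause that the surrounding sentence is rejecting, nothing in the tuple records that, and the reasoning proceeds as if it were a fact.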

The problem is that the pre-validated sources include large amounts of discussion of white supremacy. Someone debunking it (as Ravi Gandhi did in his statement "History is full of such prejudices paraded as iron laws that men are superior to women; that the white races are superior to the colored") uses a phrase which causes problems in all three of these methods.

It's really hard to know what to do here. I think if I was building the system I'd try to detect that kind of pseudo-science question and refuse to answer it.

1 comments

tom_mellior 2935 days ago

> It's really hard to know what to do here.

Is it? It looks like the natural language processing part is simply not very good. Improve that.

> I'd try to detect that kind of pseudo-science question

That wouldn't fix the general problem that this system seems to treat sentences of the form "some people incorrectly claim X" as an assertion that X is a fact.
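
One way to picture the "some people incorrectly claim X" problem is a check that separates a reported clause from the frame that reports it. The sketch below is only an assumption about how such a check might start: the verb list and the `embedded_claim` helper are hypothetical, and a regex like this will miss most real-world phrasings.

```python
import re

# Hypothetical, deliberately small list of reporting verbs.
REPORTING_VERBS = ("claim", "claims", "believe", "believes",
                   "say", "says", "argue", "argues")

_PATTERN = re.compile(
    r"\b(?:%s)\b(?:\s+that)?\s+(.*)" % "|".join(REPORTING_VERBS),
    flags=re.IGNORECASE,
)

def embedded_claim(sentence):
    """Return (is_reported, clause).

    is_reported is True when the clause follows a reporting verb, meaning the
    sentence attributes the clause to someone rather than asserting it directly.
    """
    match = _PATTERN.search(sentence)
    if match:
        return True, match.group(1).rstrip(". ")
    return False, sentence.rstrip(". ")

if __name__ == "__main__":
    print(embedded_claim("Some people incorrectly claim the moon is made of cheese."))
    print(embedded_claim("The moon orbits the Earth."))
```

A system could at least decline to treat the clause as a fact whenever `is_reported` is true, rather than trying to work out whether the writer agrees with it.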


nl 2935 days ago

> Is it? It looks like the natural language processing part is simply not very good. Improve that.

It’s really hard to avoid a sarcastic reply here.

The AllenAI institute probably has the 3rd best-known NLP team in the world, after Google and Facebook. They basically have the Washington State NLP group.

Given that, and their impressive record of publications (e.g. ELMo), I think it’s fair to say that they are trying.


tom_mellior 2935 days ago

I'm sure they are very good on some things, and I'll believe you when you say that they are the 3rd best in the world in relative terms.

But let's look at absolute terms. In the example above, "History is full of such prejudices paraded as iron laws that men are superior to women; that the white races are superior to the colored", it takes a part of the sentence and treats it as a fact, disregarding the context that just happens to claim the opposite. In my example in https://news.ycombinator.com/item?id=17301383 it treats a question as an assertion of a fact.

I'm not an expert on NLP, but I have played with it just enough to confidently claim that this is not very impressive performance.

If you claim that detecting "pseudo-science questions" is within reach, surely you must agree that "not mistaking questions for assertions of fact" and "not ripping parts of sentences out of context" must be within reach as well?


nl 2935 days ago

Detecting pseudo-science questions is just topic detection. That's easy.

"Not mistaking questions for assertions of fact" is basically claim verification. That's pretty much beyond the reach of NLP systems at the moment. It's an active area of research, but if this system doesn't impress you then current claim verification systems most definitely won't either.

Trying to understand the context of sentences might be possible. I think that sentence would challenge that approach for a while: "prejudices" implies bias, but doesn't necessarily imply disagreement.


tom_mellior 2935 days ago

> not mistaking questions for assertions of fact is basically claim verification. That's pretty much beyond the reach of NLP systems at the moment.

Ah, OK. I guess you are one of those people for whom NLP is only the newfangled statistical stuff, not the old-school NLP that looks at grammar and such things to (surprisingly) find that "X is a Y ." and "is X a Y ?" are not the same sequence of tokens.

> Trying to understand the context of sentences might be possible.

I didn't say they must understand the context. I said that if they don't understand it, they shouldn't choose a substring out of that sentence and claim that it is an assertion of fact on its own.


nl 2934 days ago

> not the old-school NLP that looks at grammar and such things to (surprisingly) find that "X is a Y ." and "is X a Y ?" are not the same sequence of tokens

I do that too. It works great - for easy cases. But it fails very quickly on just normal texts.
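
For reference, the "easy case" check looks something like the sketch below. The word lists and the `looks_like_question` helper are assumptions made up for illustration, not anything from Aristo or CoreNLP.

```python
# Hypothetical cue lists; real grammars are far larger.
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "can",
               "could", "will", "would", "should", "has", "have", "had"}
WH_WORDS = {"who", "what", "when", "where", "why", "how", "which"}

def looks_like_question(sentence):
    """Surface-level test: ends with '?' or opens with an auxiliary or wh-word."""
    toks = sentence.strip().lower().split()
    if not toks:
        return False
    if toks[-1].endswith("?"):
        return True
    return toks[0] in AUXILIARIES | WH_WORDS

if __name__ == "__main__":
    print(looks_like_question("is X a Y ?"))            # question
    print(looks_like_question("X is a Y ."))            # statement
    print(looks_like_question("What he said is true.")) # statement, but flagged
```

The first two inputs are handled correctly. The third is a plain statement that happens to open with a wh-word, so the check flags it as a question, which is the kind of thing that goes wrong on normal text.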

So something like Stanford's CoreNLP Open Information Extraction splits "History is full of such prejudices paraded as iron laws that men are superior to women; that the white races are superior to the colored" into two claims[1].

There's no useful dependency between the two clauses.

OpenIE 5[2] (no relationship with the Stanford project) generally outperforms CoreNLP for open information extraction. In this case I'm doubtful it would do any better. Ironically, OpenIE is now run by AllenAI, and has exactly this problem![3]

Even worse, it has determined that "No white person" is a synonym for "white person"! That should be well within the state of the art to avoid.

But generally, I'm not saying it is correct: I'm saying it's hard.

[1] http://corenlp.run/

[2] https://github.com/dair-iitd/OpenIE-standalone

[3] http://openie.allenai.org/search?arg1=White&rel=superior&arg...
