


	by kolinko 322 days ago
	"Yet, every time I tried to get LLMs to perform novel research, they fail because they don’t have access to existing literature on the topic. Whereas, humans, on the other hand, discovered everything humanity knows." Just because the author was unable to wrangle an LLM into doing novel research doesn't mean that it's impossible. We already have examples of LLMs either doing novel research or aiding significantly with it.


tovej 322 days ago

This comment would be more useful if you actually provided those examples.

I'm also a researcher and agree wholeheartedly with the article. LLMs can maybe help you sift through existing literature or help with creative writing. At most, they can be used for background research in hypothesis generation, by finding pairs of related terms in the literature which can be put together into a network of relationships. They can help with a few tasks suitable for an undergrad research assistant.
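
To make the "pairs of related terms" idea concrete, here is a minimal sketch of a term co-occurrence network. It is only an illustration under loud assumptions: the three "abstracts" and the term list are hypothetical toy data, matching is plain substring search, and the function names `cooccurrence_edges` and `candidate_links` are made up for this example. It is not the method of the article or of any paper cited in this thread.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "abstracts"; a real pipeline would use actual literature
# and proper term extraction instead of substring matching.
abstracts = [
    "fish oil reduces blood viscosity",
    "blood viscosity is elevated in raynaud syndrome",
    "fish oil lowers platelet aggregation",
]

# Hypothetical vocabulary of terms of interest.
terms = ["fish oil", "blood viscosity", "raynaud syndrome", "platelet aggregation"]


def cooccurrence_edges(docs, vocab):
    """Count how many documents mention each unordered pair of terms."""
    edges = Counter()
    for doc in docs:
        present = sorted(t for t in vocab if t in doc)
        edges.update(combinations(present, 2))
    return edges


def candidate_links(edges):
    """Return pairs never seen together that share at least one neighbour."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    found = set()
    for linked in neighbours.values():
        for a, c in combinations(sorted(linked), 2):
            if c not in neighbours[a]:
                found.add((a, c))
    return found


edges = cooccurrence_edges(abstracts, terms)
candidates = candidate_links(edges)
for pair in sorted(candidates):
    print(pair)
```

The output is only a list of untested candidate pairs; whether any of them is a meaningful hypothesis still has to be judged and checked by a researcher.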


viraptor 322 days ago

https://arxiv.org/abs/2409.04109

> we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas while being judged slightly weaker on feasibility.

It's a bit better than just finding related pairs. And that's with Sonnet 3.5, which is basically ancient at this point.


tovej 322 days ago

This paper centers "novelty" but also finds that human ideas are more feasible, and that LLM-generated ideas are not diverse and that LLMs cannot reliably evaluate ideas. None of the ideas were actually evaluated by performing experiments either.

Pretty much what I would expect. The paper also seems to be doing exactly what I described, so I don't understand how the technique is better than that.


JimDabell 322 days ago

> I'm also a researcher and agree wholeheartedly with the article.

The article says:

> Yet, every time I tried to get LLMs to perform novel research, they fail because they don’t have access to existing literature on the topic.

You say:

> LLMs can maybe help you sift through existing literature

> they can be used for background research in hypothesis generation by finding pairs of related terms in the literature

As far as I can see, these two positions are mutually exclusive. Aren’t you disagreeing with the article?


tovej 322 days ago

No. Helping with chores is not the same as "performing research". It has limited utility for minor tasks, it is not essential, and it does not even necessarily have a positive productivity impact. To illustrate the point: when I use vim to write LaTeX, would you say that vim is "performing research"?


JimDabell 322 days ago

The relevant part of what I quoted was not “performing research”; it was “they don’t have access to existing literature”.


tovej 322 days ago

That part we also agree on: generating hypotheses is done on the basis of existing literature, and lit review (& related reasoning) is done on the basis of existing literature. The blog author is talking about reasoning in the context of a new research topic.


queenkjuul 322 days ago

Well hold on now, are the LLMs doing novel research or not? "Aiding significantly" (as in, humans doing novel research are using LLMs to aid their process) is not remotely the same -- can you show us examples of LLMs doing novel research?

Researchers using GPT to summarize papers may be helping humans create novel research, but it certainly isn't GPT doing any such thing itself.


viraptor 322 days ago

It's a really weird claim too, because we don't magically communicate the research between minds either. You need a person to find, read, and process the new research, and the same applies to the LLM: either use RAG or provide the whole relevant thing in context.


benterix 322 days ago

I don't follow. SOTA models have access to tons of existing research. Basically orders of magnitude more than a human could read during a lifetime. And yet, they fail to produce anything new. Hey, even a simple "generate me an innovative list of ten marketing ideas for X, things that nobody has ever done before" gives ridiculous results.
