This only works if the article isn't covering anything very novel. If there isn't a fairly wide base of explanations on the topics within, it struggles and hallucinates.
ChatGPT is the ultimate non-human bullshit artist so far developed. It's scary to think LLMs will improve and will be used unethically by students, professionals, and state actor boiler rooms, and for targeted harassment and manipulation of individuals. For proving original work, it's worth considering an "anti-cheat" app that records chain-of-custody during data capture and writing: recording specific data about a work's creation may become necessary to prove it was generated or directed by a human being, establishing zero-knowledge, zero-reputation provenance.
The next question becomes: When automation and AI are used in drug discovery and STEM ever more autonomously (they are mostly narrow AI initiated by a human with specific goals today), what's the protocol for distributing credit? It seems plausible in the near future, with large piles of computing power, to train LLMs against journals, give them access to data and drug/protein/ligand databases, and let them find and screen beneficial candidate molecules for critical-need medications such as new classes of antibiotics and antimycotics, and treatments for rare degenerative diseases, perhaps including gene edits. Efficient, progressive layers of screening methodology seem important. IIRC, 15 years ago (c. 2007) biomedical informatics people at Stanford (SAIL -> SMI -> BMIR) were using NLP heuristics against the literature for meta-analyses. I assume things have progressed by several lightyears since.
Linguistic transformations are the feature that I trust LLMs the most with. They require relatively little encoding of knowledge outside of language. Parsing jargon to convey novel ideas is absolutely a doable task for a language model.
Yes. If you can prompt an LLM with the necessary and sufficient authoritative facts, it can usually generate reasonable copy around them. For example, if you were a PwC or McKinsey consultant tasked with rapidly generating a voluminous, glossy deliverable in a consistent, formal style, LLMs offer an Easy Button starting point for content generation prior to human editing (perhaps assisted by an on-prem "grammarly" app) and pasting into ye olde templates.
That's not going to help you understand the subject material better. It's just going to assemble a facsimile of the lowest common denominator of other people on the internet talking about it. If anything, it will hurt your understanding and give you a false sense of knowledge.
Given how well ChatGPT can explain code, I'm reasonably confident it can do much better than the lowest common denominator of people on the internet talking about any given expert document.
May well still give a false sense of knowledge, but that's a very common problem for all of us.
An LLM can reasonably explain code because code is both a closed, finite set of instructions and heavily explained in the training data that contains a vast number of learning materials.
A novel scientific paper is much, much different than code.
> A novel scientific paper is much, much different than code.
Sure, just not in a way that matters.
Human language is a "closed, finite set" of symbols (to the same extent as code, at least), all of which are heavily explained in the training data that contains a vast number of learning materials.
Science is about the novel things; but when the things are so novel that the researchers have to invent new words to discuss them, those new words come with explanations of what they mean.
It is clear that you do not understand how an LLM works if you believe it can explain a novel scientific paper to you with any reasonable degree of accuracy.
It doesn't look up or use existing explanations of papers like some kind of internet or database search. You have a fundamental misunderstanding of how it all works.
You don't know how it works. The fact that it is a "novel" scientific paper is completely irrelevant. Try it on a new paper just released today, that is definitely not in the training set.