by ar813
378 days ago
If I take a step back and think about where things stood a few (say 5) years ago, what LLMs can do is amazing. One has to acknowledge that (or at least, I do). But as a scientist, it's been rather interesting to probe the jagged edge and the unreliability, including with deep research tools, on any topic I know well. When I read through the reports and summaries they generate, they seem correct at first glance: the jargon is used correctly, and physical phenomena are referred to mostly accurately. But very quickly I realize that, even with the deep research features and citations, the model is making a bunch of incorrect inferences that likely arise from certain concepts (words, really) co-occurring in documents while not actually being causally linked or otherwise fundamentally connected in the physical sense. In addition to some strange leading sentences and arguments, this often produces entirely inappropriate topic headings/sections that connect things that really shouldn't be together. That's one small example, of course, but this type of error (usually multiple errors) shows up in both Gemini and OpenAI models, even with some very specific prompts and multiple turns, and it keeps happening for topics in the fields I work in, in the physical sciences and engineering. I'm not sure one could RL hard enough to correct this sort of thing (and it is likely not worth the time and money), but perhaps my imagination is limited.
They fail to understand that documentation and process in other engineering fields are awful. Not that computer science is good; it is even less rigorous.
The difference is that other fields don’t log every single change they make into source control, and they don’t have millions of open source projects to pull from. There aren’t billions of books on engineering to pull from, as there are for language. The information is siloed, and those with the keys now know what it’s worth.