by troupo
420 days ago
> We have very different ideas of what "literal" means. You _interpreted_ what I wrote as "just Google it". I didn't say those words verbatim _nor_ do I mean that. Use a search engine if you want to find some high-quality papers. Or use Google Scholar. Or go straight to Arxiv. Or ask people on a forum.

Aka "I will make some vague references to some literature, go Google it"

> Instead, you could do more reading and research.

Instead of vague "just google it", and vague ad hominems you could actually provide constructive feedback.

My disagreement with the claim "AIs are overeager junior developers at best" rests largely on both an understanding of what is happening under the hood and personal experience. Like many people, I have interacted for thousands of hours with ChatGPT, Claude, Gemini, and others, though my interaction patterns may be unusual -- not sure -- which I would characterize as (a) set expectations with a detailed prelude; (b) frame problems carefully; (c) trust nothing; (d) push back relentlessly; (e) require 'thinking out loud'; (f) resist bundled solutions; (g) actively guide design and problem-solving dialogues; (h) actively mitigate sycophancy, overconfidence, and hallucination.
I've guided some junior / less experienced developers using many of the same patterns above. More or less, they can be summarized as "be more methodical". While I've found considerable variation in the quality of responses from LLMs, I would not characterize this variation as being anywhere close to that of a junior developer. I grant that I have adjusted my interaction patterns considerably to improve the quality of the experience.
LLMs vary across dimensions of intelligence and capability. Here's my current assessment -- somewhat off the cuff, but I have put thought into it -- (1) LLM recall is superhuman. (2) Contextual awareness is mixed, sometimes unpredictably bad. Getting sufficient context is hard, but IMO this is less of a failure of the LLM or RAG and more about its lack of embodiment in a particular work setting. (3) Speed is generally superhuman. (4) Synthesis is often superhuman. (5) Ready-to-go high-quality all-in-one software solutions are not there yet. (6) Failure modes are painful; e.g. going in circles or waffling.
I should also ask: what do you mean by "overeager"? I would guess you are referring to the tendency of many LLMs to offer solutions to problems despite lacking a way to validate their answers, perhaps even hallucinating API calls that don't exist?