|
|
|
|
|
by sysmax
353 days ago
|
|
I am working on a GUI for delegating coding tasks to LLMs, so I routinely experiment with a bunch of models doing all kinds of things. In this case, Claude Sonnet 3.7 handled it just fine, while Llama-3.3-70B just couldn't get it. But that is literally the simplest example that illustrates the problem. When I tried giving top-notch LLMs harder tasks (scan an abstract syntax tree coming from a parser in a particular way, and generate nodes for particular things) they completely blew it. Didn't even compile, let alone logical errors and missed points. But once I broke down the problem to making lists of relevant parsing contexts, and generating one wrapper class at a time, it saved me a whole ton of work. It took me a day to accomplish what would normally take a week. Maybe they will figure it out eventually, maybe not. The point is, right now the technology has fundamental limitations, and you are better off knowing how to work around them, rather than blindly trusting the black box. |
|
I think it's a combination of
1) wrong level of granularity in prompting
2) lack of engineering experience
3) autistic rigidity regarding a single hallucination throwing the whole experience off
4) subconscious anxiety over the threat to their jerbs
5) unnecessary guilt over going against the tide; anything pro AI gets heavily downvoted on Reddit and is, at best, controversial as hell here
I, for one, have shipped like literally a product per day for the last month and it's amazing. Literally 2,000,000+ impressions, paying users, almost 100 sign ups across the various products. I am fucking flying. Hit the front page of Reddit and HN countless times in the last month.
Idk if I break down the prompts better or what. But this is production grade shit and I don't even remember the last time I wrote more than two consecutive lines of code.