|
|
|
|
|
by hn_acc1
386 days ago
|
|
Just curious what area you work in? Python or some kind of web service / Jscript? I'm sure the LLMs are reasonably good for that - or for updating .csv files (you mention spreadsheets). I write code to drive hardware, in an unusual programming style. The company pays for Augment (which is now based on o4, which is supposed to be really good?!?). It's great at me typing: print_debug( at which point it often guesses right as to which local variables or parameters I want to debug - but not always. And it can often get the loop iteration part correct if I need to, for example, loop through a vector. The couple of times I asked it to write a unit test? Sure, it got a the basic function call / lambda setup correct, but the test itself was useless. And a bunch of times, it brings back code I was experimenting with 3 months ago and never kept / committed, just because I'm at the same spot in the same file.. I do believe that some people are having reasonable outcomes, but it's not "out of the box" - and it's faster for me to write the code I need to write than to try 25 different prompt variations. |
|
Thanks for sharing your perspective with ACTUAL details unlike most people that have gotten bad results.
Sadly hardware programming is probably going to lag or never be figured out because there's just not enough info to train on. This might change in the future when/if reasoning models get better but there's no guarantee of that.
> which is now based on o4
based on o4 or is o4, those are two different things. augment says this: https://support.augmentcode.com/articles/5949245054-what-mod...
Which IMO is....a cop out, a terrible take, and just...slimey. I would not trust a company like this with my money. For all you know they are running your prompts against a shitty open source model running on a 3090 in their closet. The lack of transparency here is concerning.You might be getting bad results for a few reasons:
Note: your company is paying a subscription to a service that isn't allowing you to bring your own keys. they have an incentive to optimize and make sure you're not costing them a lot of money. This could lead to worse results.see here for Cline team's perspective on this topic: https://www.reddit.com/r/ChatGPTCoding/comments/1kymhkt/clin...
I suggest this as the bare minimum for the HN community when discussing their bad results with LLMs and coding:
I'm genuinely surprised when someone complaining about LLM results provides even 2 of those things in their comment.Most of the cynics would not provide even half of this because it'd be embarrassing and reveal that they have no idea what they are talking about.