|
> With all due respect, with a response like "What AI coding tools/models have you been using?" to a complaint that AI tools just don't seem to be effective, what difference does a reply to that even make?

"Damn, these relational databases really suck, I don't know why anyone would use them, some of the data from my users had emojis in it and that totally broke it! Furthermore, I have some bits of data that have about 100-200 columns and the database doesn't work well at all, that's horrible!"

In some cases knowing more details could help. In the database example, a person who has historically used MySQL 5.5 could have had a pretty bad experience, in which case telling them to use something more recent or PostgreSQL would be pretty good advice. In other cases, they're literally just holding it wrong, for example trying to use an RDBMS for something where a column store would be a bit better.

Replace the DB example with AI, same principles are at play. Hearing people blame all of the tools when some are clearly better/worse than others, or make broad statements that can't really be proven or disproven with the given information, is just as annoying as people always asking for more details. I honestly believe that all of these AI discussions should be had with as much data present as possible - both the bad and good experiences.

> If your experience makes you believe that certain tools are particularly good--or particularly bad--for the tasks at hand, you can just volunteer those specifics.

My personal experience:

* most self-hosted models kind of suck, use cloud ones unless you can get really beefy hardware (i.e. waste a lot of money on it)
* most free models also aren't very good, nor have that much context space
* some paid models also suck, the likes of Mistral (I like what they're doing, they're just not very good at it), or most mini/flash models
* around Gemini 2.5 Pro and Claude Sonnet 4 they start getting somewhat decent; GPT-5 feels a bit slow, like it "thinks" too much
* regardless of what you do, you still have to babysit them a lot of the time; they might take some of the cognitive load off, but usually won't make you 10x faster. There are definitely some gains from reduced development friction though (esp. when starting new work items)
* regardless of what you do, they will still screw up quite a bit, much like a lot of human devs do out there - having a loop of tests will be pretty much mandatory, e.g. scripts that run the test suite and also the compilation
* agentic tools like RooCode make the models feel less useless, as do good descriptions of what you want done: references to existing files and patterns etc. Normally throwing some developer documentation and ADRs at them should be enough, but most places straight up don't have any of that, so feeding in a bunch of code is a must
* expect to spend around 100-200 USD per month on API calls if the rate limits of regular subscriptions are too restrictive
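
For what it's worth, the test loop from the list above can be pretty small. Here's a rough Python sketch, not tied to any particular agent tool. The step names and the commands in `DEFAULT_STEPS` are placeholders I made up (they just byte-compile the current directory and run `unittest` discovery), so swap in your project's real build and test commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class StepResult:
    name: str
    ok: bool
    output: str


# Placeholder steps (assumption): replace with your real build/test commands.
DEFAULT_STEPS = [
    ("compile", [sys.executable, "-m", "compileall", "-q", "."]),
    ("tests", [sys.executable, "-m", "unittest", "discover"]),
]


def run_step(name: str, command: list[str], timeout: int = 600) -> StepResult:
    """Run one command and capture its combined stdout/stderr."""
    try:
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError as exc:
        return StepResult(name, False, f"command not found: {exc}")
    except subprocess.TimeoutExpired:
        return StepResult(name, False, f"timed out after {timeout}s")
    return StepResult(name, proc.returncode == 0, proc.stdout)


def run_checks(steps: list[tuple[str, list[str]]]) -> list[StepResult]:
    """Run the steps in order, stopping at the first failure."""
    results = []
    for name, command in steps:
        result = run_step(name, command)
        results.append(result)
        if not result.ok:
            break
    return results


def format_report(results: list[StepResult], max_chars: int = 4000) -> str:
    """Build a short report; only the tail of a failing step's output is kept."""
    lines = []
    for r in results:
        lines.append(f"[{'PASS' if r.ok else 'FAIL'}] {r.name}")
        if not r.ok:
            lines.append(r.output[-max_chars:])
    return "\n".join(lines)
```

The idea would be to call `run_checks(DEFAULT_STEPS)` after each change the model makes and feed `format_report(...)` back into the conversation. Stopping at the first failure and trimming the output keeps the report short, which matters when context space is limited.
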
Are they worth it? Depends. The more boilerplate and boring bullshit code you have to write, the better they'll do. Go off the beaten path (e.g. something other than your typical CRUD webapp) and they'll make a mess more often. That said, I still find them useful for the reduced boilerplate and reduced cognitive load, as well as for being able to ingest and process information more quickly than I can, since they have more working memory and can spot patterns when working on a change that impacts 20-30 files. All in all, the SOTA models are... kinda okay. |