Someone responded to a previous comment of mine [0] positing a Peter principle [1] of slopcoding — it will always be easier to tack on a new feature than to understand a whole system and clean it up. The equilibrium will remain at the point of near, but not total, codebase incomprehensibility.
People are often skeptical when I say this, but there's simply no guarantee that it's possible in principle to clean up a bad architecture. If your system is "overfitted" to 10,000 requirements from 1,000 customers, it may be impossible to satisfy requirements 10,001 through 10,100 without starting over from scratch.
It's really not that big of a word. The CAP theorem shows that as few as three reasonable-sounding requirements with no obvious conflicts can be impossible to satisfy simultaneously. (User needs will start more flexible than strict mathematical requirements, of course, but once people start to build production workloads on top of your systems that flexibility is radically reduced.)
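To make the "no obvious conflicts" point concrete, here is a toy sketch. It is not the CAP theorem itself, just a brute-force check over three made-up requirements on three hypothetical on/off design choices (`a`, `b`, `c`). Every pair of requirements can be met, but all three together cannot.

```python
from itertools import product

# Three hypothetical requirements over three on/off design choices (a, b, c).
# The names and predicates are invented for illustration only.
requirements = {
    "a_differs_from_b": lambda a, b, c: a != b,
    "b_differs_from_c": lambda a, b, c: b != c,
    "a_differs_from_c": lambda a, b, c: a != c,
}

def satisfiable(names):
    """Return True if some assignment of (a, b, c) meets every named requirement."""
    return any(
        all(requirements[n](a, b, c) for n in names)
        for a, b, c in product([False, True], repeat=3)
    )

names = list(requirements)

# Every pair of requirements has at least one design that satisfies it...
pairwise_ok = all(
    satisfiable([x, y]) for i, x in enumerate(names) for y in names[i + 1:]
)
# ...but no design satisfies all three at once (three values, only two states).
all_three_ok = satisfiable(names)

print(pairwise_ok, all_three_ok)  # prints: True False
```

A real system has far more than eight possible designs, so nobody gets to enumerate them, which is roughly why the conflict only shows up once requirement 10,001 arrives.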
I really am surprised that people on a heavily CS-themed forum still have trouble grasping this.
Imagine the year is 1995. C exists, but some guy out there is working on essentially what modern Python is. He says to you, "Check out this language: you can just import stuff, use it, and dynamically modify anything at run time." You can probably come up with hundreds of arguments about things that could go wrong, like memory cleanup, threading, etc., but it turns out that, incrementally, they were all solved, and we have the modern Python that is basically good enough to build these large language models.
Now imagine modern programming and computing is what C was back in 1995, and AI use is that guy building the Python code.
You can imagine anything you want, but it’s not an argument - you could apply this to anything. “Python was successful after a dubious beginning so NFTs will be successful”
Also, Python does not build or run large language models. It orchestrates C code that does that, and it was probably good enough to do that in 1998.
Highly dynamic languages existed for decades prior to 1995, and Python was not particularly innovative in its features at the time. There were also countless languages more feature-rich than C in use for development back then.
The biggest change that happened was that hardware kept getting better and it became feasible to use garbage-collected languages everywhere including really inefficient implementations like CPython.
That being said, 30 years later Python is still slow as shit even compared to other dynamic languages and runs into all kinds of scaling issues when used for anything serious. And everywhere that performance matters, software continues to be written in typed, compiled languages including C (but also C++, Rust, Go, etc.). Even in ML, Python chiefly acts as a thin wrapper and glue language for high performance CUDA libraries (aka C and C++).
So your historical analogy is mostly anachronistic.
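A rough way to see the "glue language" point on your own machine, using only the standard library: the same summation written as an interpreted Python loop and as a call to the built-in `sum()`, which CPython implements in C. The function names and the value of `n` are arbitrary choices for this sketch, and the timings will vary by machine and interpreter version, so no numbers are claimed here.

```python
import timeit

def python_loop_sum(n):
    """Add 0..n-1 with an explicit loop executed by the bytecode interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n):
    """Same result, but the loop runs inside CPython's C implementation of sum()."""
    return sum(range(n))

n = 200_000  # arbitrary size, chosen only to make the difference measurable

# Both versions must agree before comparing their speed.
assert python_loop_sum(n) == builtin_sum(n)

loop_time = timeit.timeit(lambda: python_loop_sum(n), number=5)
builtin_time = timeit.timeit(lambda: builtin_sum(n), number=5)
print(f"python loop: {loop_time:.3f}s  builtin sum: {builtin_time:.3f}s")
```

The ML stack applies the same idea at a much larger scale: the Python layer decides what to compute, and compiled code does the computing.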
No, you just don't have a grasp on reality. For example, you claim that Python runs into scaling issues for anything serious, but you are blissfully unaware that YouTube and Uber both run Python backends. Nobody cares that it's "slow" by whatever metric you consider. It's fast enough. The metric that matters is developer time, not compute time, because the former is vastly more expensive. Python and Node are the number one languages on GitHub for a reason. And you are vastly deluded about how many jobs there are for C++ and Rust devs lol.
In the future, you won't be dealing with strings, JSON, or APIs. You will be importing agents and giving them brief instructions, either in plain English or in some terser intermediate language at a higher level than Python. Wanna deal with database reliability? Import a database agent and give it brief instructions on what you want to manage. Just like you mention, right now Python is the wrapper for low-level libraries, because nobody doing ML work wants to waste time making sure their CUDA C kernels compile. In the same way, nobody is going to care whether they get the API headers right, or whether their strings are parsed correctly, when they can just invoke a dedicated LLM (which will likely be a highly specialized small model able to run on local hardware) to do all that.
You can scream and cry as much as you want about how that is bad, how it's slow, but nobody is going to care because shit is going to get built faster. Ever notice how despite the massive layoffs across tech, there isn't service degradation in any sector? Good luck trying to sell your Rust skills in the future lol.
The point is that in the future, AI will be able to handle things like missing databases, just like modern high-level dynamic languages can import a library to handle whatever you want.
I can't tell if you're being facetious, but a future AI really may be able to fill in a missing database. Like, if it knew some of the entries, it could infer the rest.
Wow - imagine being able to infill a geophysical database with the dullest possible milquetoast, totally expected signal derived from the most common NASVD eigenvectors.
The infill will look seamless.
And entirely lack any actual strikes of interest - the outliers are exceptional signal and the entire raison d'etre for building such a database.
Jeez, if AI can just infill where the gold is, why even bother to look in the first place.
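A tiny sketch of that objection, with plain linear interpolation swapped in for NASVD-style reconstruction (which this does not implement). The survey values and the `infill_linear` helper are made up for illustration. The gap gets filled with exactly what the neighbours predict, and the one reading that mattered is gone.

```python
def infill_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours.

    Gaps at the very start or end of the list have only one neighbour and are
    left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled

# Hypothetical survey line: flat background with one strong anomaly at index 3.
truth = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]

# Simulate losing the record that held the anomaly.
observed = [v if i != 3 else None for i, v in enumerate(truth)]

restored = infill_linear(observed)
print(restored[3], truth[3])  # prints: 1.0 9.0
```

The restored line looks seamless, and it tells you nothing you did not already assume.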
>"clean up" dropped databases, compromised computers or leaked personal data?
For each of those things, you can right now build an agent that handles all of that. Or use a large frontier model with enough context to build code that ensures all of those edge cases are handled.
Future coding will essentially be like this. The concepts of dynamic vs compiled language will shift towards having frontier edge models put together code versus small runtime edge models dynamically processing data.
Frankly this is what everyone is counting on whether they know it or not. The question though is not “will the models get good enough?”. The question is does the repo even contain enough accurate information content to determine what the system is even supposed to be doing.
Yes. And as the models get better, it works better. But at some point you do have to understand the code, because the model is also just guessing at what your actual intentions are.
It doesn't know what mess you want to clean up. A lot of times AI just starts making up new patterns on top of other patterns and having backwards compatibility between the two. How does it know which one you actually like?
Every frontier model from each major US lab is cheaper than that lab's frontier model was this time a year ago, with the exception of Anthropic, whose pricing has remained exactly the same.
[0] https://news.ycombinator.com/item?id=48037128#48038639
[1] https://en.wikipedia.org/wiki/Peter_principle