by somat
56 days ago
"When the software is being written by agents as much as by humans, the familiar-language argument is the weakest it has ever been - an LLM does not care whether your codebase is Java or Clojure. It cares about the token efficiency of the code, the structural regularity of the data, the stability of the language's semantics across releases." Isn't familiarity with the language even more important with an LLM? The language they do best with is the one with the largest corpus in the training set.
|
Stability, consistency and simplicity are much more important than this notion of familiarity (meaning there's lots of code to train on), as long as the corpus is sufficiently large. Another important factor is how clear and accessible the libraries, especially the standard library, are.
Take Zig for example. It's a very explicit and clear language with easy access to the std lib. For a young language it is consistent in its style. An agent can write reasonable Zig code and debug issues from tests. However, it is still unstable and its APIs change, so LLMs regularly get confused.
Languages and ecosystems that are more mature and take stability very seriously, like Go or Clojure, don't have the problem of "LLM hallucinates APIs" nearly as much.
The thing with Clojure is also that it's a very expressive and very dynamic language. You can hook an agent up to the REPL and it can very quickly validate or explore things. With most other languages it needs to change a file (which takes multiple, more complex operations), then write an explicit test, then run that test to get the same result as "defn this function and run some invocations".
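
To make the "defn this function and run some invocations" loop concrete, here's a rough sketch in Python rather than Clojure. It is only an analogy: `LiveSession` and `evaluate` are names I made up for illustration, and a real Clojure setup would talk to an actual REPL server instead. The point is just that definitions persist in one live namespace, so each check is a single call that returns a value or an error, with no file edit or test runner involved.

```python
import contextlib
import io


class LiveSession:
    """Hypothetical stand-in for a REPL connection: one persistent namespace."""

    def __init__(self):
        self.namespace = {}

    def evaluate(self, source):
        """Run one snippet; return (value, captured stdout, error message or None)."""
        out = io.StringIO()
        try:
            try:
                # Expressions produce a value, like a REPL result.
                code = compile(source, "<session>", "eval")
            except SyntaxError:
                # Statements (def, assignment) only update the namespace.
                code = compile(source, "<session>", "exec")
            with contextlib.redirect_stdout(out):
                value = eval(code, self.namespace)
        except Exception as exc:
            # Errors come back as data the agent can read and react to.
            return None, out.getvalue(), f"{type(exc).__name__}: {exc}"
        return value, out.getvalue(), None


session = LiveSession()

# "defn this function"
session.evaluate("def slugify(s):\n    return '-'.join(s.lower().split())")

# "and run some invocations"
good = session.evaluate("slugify('Hello REPL World')")
bad = session.evaluate("slugify(None)")

# Redefine in place after seeing the error, then retry immediately.
session.evaluate(
    "def slugify(s):\n    return '-'.join((s or '').lower().split())"
)
fixed = session.evaluate("slugify(None)")
```

Each of the calls above is one round trip for the agent. The file-based equivalent is edit the source, write a test, run the test, parse the output, for every one of those steps.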