Hacker News
by londons_explore 1120 days ago
Not "close to" as in "it can nearly write it correctly", but "close to" as in "I believe that within a small number of years, GPT-4-like tools will be able to write you a Unix-like kernel from scratch and have it actually work, with no human input".

I think we already have most of the pieces in place:

* big language models that sometimes get the right answer.

* language models with the ability to write instructions for other language models (i.e., writing a project plan, then completing each item of the plan, then putting the results together).

* language models with the ability to use tools (e.g., 'run valgrind, tell the model what it says, and then the model will modify the code to fix the valgrind error')

* language models with the ability to summarize large things to small.

* language models with the ability to review existing work and see what needs changing to meet a goal, including chucking out work that isn't right/fit for purpose.
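To make the tool-use bullet concrete, here is a minimal sketch of the "run the tool, show the model the output, let it patch the code" loop. Everything named here is hypothetical: `repair_loop`, `fake_checker` and `fake_model` are stand-ins I made up so the snippet runs without a real model or a real valgrind install. In practice the checker would shell out to the tool and the fixer would call a language model.

```python
from typing import Callable, Optional, Tuple


def repair_loop(
    code: str,
    run_tool: Callable[[str], Optional[str]],
    propose_fix: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Tuple[str, bool]:
    """Run a tool on the code, hand any complaint to the model, repeat.

    Returns the final code and whether the tool ended up with no complaints.
    """
    for _ in range(max_rounds):
        report = run_tool(code)
        if report is None:
            # The tool is happy, so stop early.
            return code, True
        # Otherwise let the "model" rewrite the code using the report.
        code = propose_fix(code, report)
    # Out of budget: check one last time so the flag reflects the final code.
    return code, run_tool(code) is None


# Hypothetical stand-in for a memory checker: flags a malloc with no free.
def fake_checker(code: str) -> Optional[str]:
    if "malloc" in code and "free" not in code:
        return "definitely lost: 16 bytes in 1 blocks"
    return None


# Hypothetical stand-in for a model: always "fixes" by appending a free().
def fake_model(code: str, report: str) -> str:
    return code + "\nfree(p);"


if __name__ == "__main__":
    fixed, ok = repair_loop("p = malloc(16);", fake_checker, fake_model)
    print(ok)
    print(fixed)
```

The `max_rounds` cap is the part that matters: without a budget, a model that keeps proposing non-fixes would loop forever.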

With all these pieces, it really seems that with enough compute/budget, we are awfully close...

3 comments

It could also be a case like Tesla's full self-driving, which is perpetually 2 years away. Progress in AI isn't linear, so you can't extrapolate from historical data. We really don't know if we're just a small step away from AGI or if we'll be stuck with the current crop of LLMs, with only incremental improvements over the next decade.
That also seems to miss the intentionality that goes into some things in the kernel… I understand now that you mean a feedback loop in which LLMs improve on earlier output. Fair enough there, I guess; no idea if that will work until we see it. However, I think a Unix-like kernel is a much less trivial problem, because of the human intentionality that goes into some choices as well as the bit-banging optimization.
> human intentionality that goes into some choices

Many choices are made at design time to make the right tradeoffs between complexity, speed, etc.

But with AI-designed things, complexity is no longer an issue as long as the AI understands it, and you no longer need to think too much about speed - just implement 100 different designs and pick the one which does best on a set of benchmarks also designed by the AI.
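As a rough sketch of "implement many designs and keep the one that benchmarks best", something like the following would do. All names here (`select_best`, `wall_time`, the toy sort candidates) are hypothetical, and I'm assuming a correctness gate before the speed comparison, since a fast design that gives wrong answers shouldn't win. Sorting stands in for whatever the real designs would be.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple

Candidate = Callable[[List[int]], List[int]]


def select_best(
    candidates: Sequence[Tuple[str, Candidate]],
    workloads: Sequence[List[int]],
    cost: Callable[[Candidate, List[int]], float],
) -> str:
    """Return the name of the cheapest candidate that passes every workload."""
    best_name: Optional[str] = None
    best_cost = float("inf")
    for name, impl in candidates:
        # Assumed gate: throw out designs that give wrong answers.
        if any(impl(list(w)) != sorted(w) for w in workloads):
            continue
        total = sum(cost(impl, w) for w in workloads)
        if total < best_cost:
            best_name, best_cost = name, total
    if best_name is None:
        raise ValueError("no candidate passed the correctness check")
    return best_name


def wall_time(impl: Candidate, workload: List[int]) -> float:
    """One possible cost function: elapsed seconds for a single run."""
    start = time.perf_counter()
    impl(list(workload))
    return time.perf_counter() - start


# Toy "designs": one correct and fast, one correct and slow, one wrong.
def builtin_sort(xs: List[int]) -> List[int]:
    return sorted(xs)


def bubble_sort(xs: List[int]) -> List[int]:
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs


def broken_sort(xs: List[int]) -> List[int]:
    return list(xs)  # returns the input unsorted


if __name__ == "__main__":
    designs = [("broken", broken_sort), ("bubble", bubble_sort), ("builtin", builtin_sort)]
    loads = [list(range(500, 0, -1)), [3, 1, 2]]
    print(select_best(designs, loads, wall_time))
```

Note that the winner is only as good as the workloads and the cost function, so whoever (or whatever) designs the benchmarks is effectively making the tradeoff decisions.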

Current AIs can't reason about trivial things that a 2-year-old human handles cognitively, let alone multi-million-line codebases.
This is a misguided conclusion, as they are entirely different systems. Don't compare apples and oranges like this or you'll be very disappointed.
Can any of these do any real reasoning? I feel this would just result in some Rube Goldberg machine that produces terrible shit the vast majority of the time, but may be impressive in some edge cases when the stars align.