by YeGoblynQueenne 2227 days ago
So that's basically program synthesis from natural language (ish) specifications (i.e. the comments).

I can see this being a useful tool [1]. However, I don't expect any ability for innovation. At best this is like having an exceptionally smart autocomplete function that can look up code snippets on SO for you (provided those code snippets are no longer than one line).

That's not to say that it can't write new code, that nobody has quite written before in the same way. But in order for a tool like this to be useful it must stick as close as possible to what is expected- or it will slow development down rather than helping it. Which means it can only do what has already been done before.

For instance- don't expect this to come up with a new sorting algorithm, out of the blue, or to be able to write good code to solve a certain problem when the majority of code solving that problem on github happens to be pretty bad.

In other words: everyone can relax. This will not take your job. Or mine.

____________

[1] I apologise to the people who know me and who will now be falling off their chairs. OK down there?


I think you are underselling the potential of a model that deeply understands programming. Imagine combining such a model with something like AutoML-Zero: https://arxiv.org/abs/2003.03384 It may not be 'creative', but used as tab-completion, it's not being rewarded or incentivized or used in any way which would expose its abilities towards creating a new sort algorithm.
I agree on the tab-completion part. Something like Gmail's smart-compose could have potentially huge benefits here.

But I'm not sure about the "deeply understand programming" part. Language modelling and "AI", in its current form, uncovers only statistical correlations and barely scratches the surface of what "understanding" is. This has restricted the deployment of the majority of academic research in the real world, and this work, I believe, is no different and will work only in constrained settings.

Edit: typo

It would be nice to have an AI that could write unit tests, or look over your code and understand and explain where you might have bugs.
>> It would be nice to have an AI that could [write unit tests, or] look over your code and understand and explain where you might have bugs.

What you're describing (outside of the square braces) is algorithmic debugging:

https://en.wikipedia.org/wiki/Algorithmic_program_debugging

It was introduced in the PhD thesis of Ehud Shapiro. There's been a steady trickle of research work since then but it's never formed into a strong current, if I may. One reason for that is of course that Shapiro's thesis was published in 1983. So it's one of the research directions that was cut short by the last AI winter. Lessons to be learned.

Shapiro's thesis is one of two doctoral theses that became the precursors to Inductive Logic Programming, a field at the intersection of logic programming and machine learning. ILP algorithms learn programs from examples and "background knowledge" (i.e. a library of existing programs used as building blocks for new, learned programs).

The way that algorithmic debugging works is that it finds differences between the intended "model" (in logical terms: the consequences) of a program and its actual model. An algorithm that can do that can also walk back up the AST of a program the other way and produce a correct program from examples of its intended inputs and outputs.
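
As a rough illustration of that idea (not Shapiro's actual algorithm, which works on logic programs and queries the programmer), here is a minimal Python sketch. It records the call tree of a deliberately buggy insertion sort, then looks for a call whose result disagrees with the intended model while all of its sub-calls agree. All names (`Call`, `insert`, `isort`, `intended`, `find_buggy_call`) are made up for this example, and the oracle is assumed to be a reference implementation standing in for the programmer's answers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Call:
    """One node of the computation tree: a call, its result and its sub-calls."""
    name: str
    args: tuple
    result: object = None
    children: List["Call"] = field(default_factory=list)


def insert(x, xs, parent):
    """Insert x into the sorted list xs, recording the call under parent."""
    node = Call("insert", (x, tuple(xs)))
    parent.children.append(node)
    if not xs:
        node.result = [x]
    elif x <= xs[0]:
        node.result = [xs[0], x] + xs[1:]  # deliberate bug: should be [x] + xs
    else:
        node.result = [xs[0]] + insert(x, xs[1:], node)
    return node.result


def isort(xs, parent):
    """Insertion sort, recording the call under parent."""
    node = Call("isort", (tuple(xs),))
    parent.children.append(node)
    node.result = insert(xs[0], isort(xs[1:], node), node) if xs else []
    return node.result


def intended(call):
    """Assumed oracle: the intended result of a call (the 'intended model')."""
    if call.name == "isort":
        return sorted(call.args[0])
    x, xs = call.args
    return sorted(list(xs) + [x])


def find_buggy_call(node):
    """Given a call with a wrong result, return a call that is wrong
    although every one of its own sub-calls returned the intended result."""
    for child in node.children:
        if child.result != intended(child):
            return find_buggy_call(child)
    return node


root = Call("root", ())
isort([2, 1, 3], root)
top = root.children[0]
culprit = find_buggy_call(top)
print(culprit.name, culprit.args, culprit.result)  # insert (1, (3,)) [3, 1]
```

The search blames `insert(1, [3])`: its sub-calls (there are none) are all fine, yet its own result is wrong, so the fault must be in the clause that handled it.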

That's the kind of stuff I study. Hence my comment above about lack of innovation etc. It's possible to automatically create novel programs with complex structures (recursion and invented sub-programs) and even discover new algorithms in the process and so on -and we know ways to do that right now. But the way to do it is not with a language model trained to predict the next character in a sequence.

As to writing unit tests, the way that most ILP algorithms work is that you give them a set of examples of the inputs and outputs of the program you want to write (e.g. "droplast([alice,and,bob,sitting,on,the,tree], [alice,and,bob,sitting,on,the])") and they write the program for you. I like to think of it as a kind of automatic TDD.
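
To give a flavour of "examples in, program out", here is a toy Python sketch. It is not an ILP algorithm: it simply enumerates pipelines of functions from a small, made-up library (playing the role of background knowledge) and returns the shortest one consistent with all the examples. The primitive set, `run` and `synthesise` are assumptions for illustration only.

```python
from itertools import product

# Hypothetical "background knowledge": a tiny library of list functions.
PRIMITIVES = {
    "reverse": lambda xs: xs[::-1],
    "tail": lambda xs: xs[1:],
    "dup_head": lambda xs: xs[:1] + xs,
}


def run(program, xs):
    """Apply the named primitives to xs, left to right."""
    for name in program:
        xs = PRIMITIVES[name](xs)
    return xs


def synthesise(examples, max_len=4):
    """Return the shortest pipeline that maps every input to its output."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None


# The examples double as the unit tests the learned program must pass.
examples = [
    (["alice", "and", "bob", "sitting", "on", "the", "tree"],
     ["alice", "and", "bob", "sitting", "on", "the"]),
    ([1, 2], [1]),
]
print(synthesise(examples))  # ('reverse', 'tail', 'reverse')
```

Brute-force enumeration like this blows up quickly as the library grows. It cannot invent recursion or sub-programs either, which is the part the comment above says ILP systems can do.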

> look over your code and understand and explain where you might have bugs.

This would certainly be interesting. I'm not aware of active research going on in this area (any pointers would be helpful!).

This would require an agent to have a thorough understanding of the logic you're trying to implement, and to locate the piece of code where it silently fails. For this you'd again need a training dataset where the input is a piece of code and the supervision signal (the output) is the location of the bug. I could imagine some sort of self-supervision to tackle this initially, where you'd intentionally introduce bugs in your code to generate training data. But I'm not sure how far this can go!
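
A minimal sketch of that self-supervision idea, assuming Python 3.9+ (for `ast.unparse`): flip one comparison operator at a time in known-good code and record the line where the flip happened as the label. The sample function `in_range`, the helper `_sites` and the choice of mutation are all hypothetical. A real dataset would need many more kinds of bug, and a way to filter out mutants that happen to behave identically.

```python
import ast

# Comparison operators and the off-by-one variant each one is swapped for.
SWAPS = {ast.Lt: ast.LtE, ast.LtE: ast.Lt, ast.Gt: ast.GtE, ast.GtE: ast.Gt}


def _sites(tree):
    """Comparison nodes whose first operator we know how to flip."""
    return [n for n in ast.walk(tree)
            if isinstance(n, ast.Compare) and type(n.ops[0]) in SWAPS]


def inject_bugs(source):
    """Yield (buggy_source, line_number) pairs, one flipped comparison each."""
    for i in range(len(_sites(ast.parse(source)))):
        tree = ast.parse(source)  # fresh tree, so mutations do not pile up
        node = _sites(tree)[i]
        node.ops[0] = SWAPS[type(node.ops[0])]()
        # The label is the line in the original source. ast.unparse normalises
        # formatting, so this sample is written in already-normalised form.
        yield ast.unparse(tree), node.lineno


CLEAN = (
    "def in_range(x, lo, hi):\n"
    "    if x < lo:\n"
    "        return False\n"
    "    if x > hi:\n"
    "        return False\n"
    "    return True\n"
)

for buggy, line in inject_bugs(CLEAN):
    print(line, buggy.splitlines()[line - 1].strip())
# 2 if x <= lo:
# 4 if x >= hi:
```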

1. Generate test cases from function/class/method definitions.

2. Generate test cases from fuzz results.

3. Run tests and walk outward from symbols around relevant stacktrace frames (line numbers,).

4. Mutate and run the test again.

...

Model-based Testing (MBT) https://en.wikipedia.org/wiki/Model-based_testing

> Models can also be constructed from completed systems
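
Step 2 of the list above could look roughly like the following Python sketch: throw random inputs at a function, check a property of the result, and turn the shortest failing inputs into assertions for a test file. The function under test (`dedupe`, with a deliberate bug), the property `holds` and the helper names are all invented for this example.

```python
import random


def dedupe(xs):
    """Hypothetical function under test: meant to drop every repeated item.
    Deliberate bug: a repeat is only dropped when it directly follows its twin."""
    out = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out


def holds(xs, out):
    """Property the result should satisfy: same items, each exactly once."""
    return set(out) == set(xs) and len(out) == len(set(out))


def fuzz(fn, trials=200, seed=0):
    """Return the distinct random inputs on which fn violates the property,
    shortest first."""
    rng = random.Random(seed)
    failures = set()
    for _ in range(trials):
        xs = [rng.randint(0, 3) for _ in range(rng.randint(0, 5))]
        if not holds(xs, fn(list(xs))):
            failures.add(tuple(xs))
    return sorted(failures, key=lambda case: (len(case), case))


def as_test_cases(failures, limit=3):
    """Render the shortest failing inputs as assert statements."""
    return [f"assert holds({list(c)!r}, dedupe({list(c)!r}))"
            for c in failures[:limit]]


for line in as_test_cases(fuzz(dedupe)):
    print(line)
```

The generated assertions fail against the buggy `dedupe` and would pass once it is fixed, so they can be kept as regression tests.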

> I'm not aware of active research going on in this area (any pointers would be helpful!).

Look at the static analysis tool in clang. Xcode uses it well.

> Language modelling and "AI", in its current form, uncovers only statistical correlations and barely scratches the surface of what "understanding" is

This is a recurring claim and somewhat unfair. Current architectures have long been known to be universal, capable of reproducing any computational structure (of finite depth for NNs, and Turing complete for RNNs); they have significant structural flexibility and in principle their learning can converge to "ideal processing structures" (which supposedly our brains also approach) given good enough training conditions (data, regimen, etc.). The network scale, timescales and dataset scale needed to achieve comparable human function are debatable and unknown, but I believe it's very safe to judge them on function (this particular example is indeed quite impressive), because given their performance it's likely a powerful structure has emerged under the hood -- you can think of it emerging similarly to how intelligence emerges from evolution (and of course human learning). Internal recurrent evaluations of logic and representations of language can all emerge.

I wouldn't describe this process as simply statistical inference, since it has complex computational priors and structure involved. It's really algorithmic learning.

Of course, you can bake in structure to accelerate this process, and we've been discovering very useful structures (such as CNNs, LSTMs, Transformer arch) which bias the models in the desired direction but still have internal flexibility.

Bert is a language model. It's trained to predict the next character in a sequence. It does not have any capacity to "understand" programming, or anything at all. It can also not produce outputs that are not similar to the examples it's been trained on. Like all neural net models it can interpolate between its examples, but it can't extrapolate to regions of the sample space it's never seen. This is why I say it lacks the ability to innovate.

I'm not sure how you would combine AutoML-Zero with Bert. How do you mean?

What do you think is a more productive path leading to "AutoCode" ?!

A. Add external definitions or reward formalism to make the code-space easier to search?

OR

B. Keep adding code trees, execution traces, comments, memory dumps and learn from those?

My own instinct is that AlphaZero was a lot more convincing than AlphaStar, so lots of (A) is definitely needed

> In other words: everyone can relax. This will not take your job. Or mine

Of course not. This technology converts writing code into bug hunting in pre-written code. Finding bugs in code that you did not write is way harder than writing the code yourself.

So if anything, this makes programming harder, not easier, and we will need more programmers, not fewer.

Oh dear.

And then the model trains itself on the buggy code written and poorly debugged by these extra coders and then so on and so forth.

Codepocalypse.

Kill it with fire!

> At best this is like having an exceptionally smart autocomplete function that can look up code snippets on SO for you (provided those code snippets are no longer than one line).

Yeah, all it could do for you is autocomplete around what it thinks the specification might be at that point in time.

> But what if Andy gets another dinosaur, a mean one? -- Toy Story (1995)

I agree completely with your expectation of the abilities of such a system.

However, I think very little programming labor is employed in the construction of new algorithms or even most business logic; even a casual stroll through GitHub reveals a staggering amount of reimplementation.

I think the promise here is the ability to code in a more conceptual way with less fiddling with the finicky details.

> I think the promise here is the ability to code in a more conceptual way with less fiddling with the finicky details.

This is basically how product managers code. Or former engineers turned engineering managers. Or even team leads. Hell, maybe like an architect?

You come up with a rough sketch, design the system, think through a couple edge cases, tell the computer what you need, and the computer figures out the details for you. Similar to being a high level engineer that designs/defines/codes the broad strokes of something and then lets the lower level minions handle details.

We made a similar leap when compilers were invented.

> Similar to being a high level engineer that designs/defines/codes the broad strokes of something and then lets the lower level minions handle details

and then the impl. turns out to have a bunch of details wrong that you didn't catch initially. And you wonder why there's so many bugs in software these days!

I think the AI model is helpful, but the specification being ambiguous or under-specified is the problem, and the effort to sort that out is hard. I'm not sure an AI can help in that aspect, and that's where most of the value of programming comes from.

I'm sure in the early days of compilers (I wasn't around back then, so I'm just assuming) they could also be fairly unreliable. Maybe they translated something in a completely idiotic way, lots of bugs, etc... But over time they improved and improved to the point that 99% of programmers never worry about anything except high level abstractions. This could be signalling the beginning of another such paradigm shift to a higher level abstraction.
Would that that were the case (compilers getting better). See http://embed.cs.utah.edu/csmith/

C compilers for the PDP-11 were pretty good, if only because they were simple and single-threaded, and C was essentially a verbose version of the PDP-11 instruction set. Languages got more abstract and instruction sets (including their models of execution) got more complex. Optimizing the compilation of an abstraction with a faulty understanding of either the abstraction or the instruction set (or both) begets bugs you can't see, as reported in the reference above.

OB: I personally would like to see these folks point their AI code machine at netlib, including the Collected Algorithms of the ACM. Generating numerical methods code is, in my experience, not the same as generating much of what is found on GitHub.
I agree- that's why I think such a system is not capable of innovation.

In the same way, that's why I think it would be a useful tool: it promises to automate away the kind of coding that most programmers can do with eyes closed and that's the most boring and repetitive part of the job.

Like, without trying to demean it, it sounds like a great boilerplate generator.

I'd put it differently. This is going to take your job, just like an assembly programmer from the 70s might consider Python to have basically taken their job. In software, the job is constantly eating itself and transforming.

It's part of the job to continually incorporate new capabilities and lever yourself up.

I agree. While this is well done, it seems to be copying human programming techniques rather than allowing the AI to create code that it thinks is optimal. I think there is the potential to evolve efficient and secure code that is free from the constraints we impose on it due to the way our minds work. Such code may not be intelligible to us but could very well be much better than what we could write.
An AI like this can hold a hell of a lot more information in its head at one point than a human. Each decision it makes is based on way more context, it can manipulate the problem using much more information, much faster. The problem is that it can't think in abstractions.

If AI gets to the point where it has a reasonable understanding of the shape of the data & the basic spatial manipulations being applied (not far off IMO), I'd expect it to be waaaaaay better at discovering certain types of new algorithms than humans. It can handle thinking about algorithms that have millions of independently moving parts in a way a human can't.

Humans have the edge deriving algorithms that require a sequence of high-level steps on an abstraction. "Do this, then we get a thing, then we do some stuff to the thing, stretch it, squash it, massage it." AI sucks at that, it doesn't think in the same kind of flexible abstractions.

But imagine if you build an understanding of how the code will be compiled & how that will interact with the cache into the AI. That's very difficult for humans because we can't think about all those mechanics at once; we have to focus on one at a time. An AI that really gets it? I could see it writing a better sorting algorithm for a specific, complex datatype than a human could, or at the very least having the competitive edge because it can do it basically instantly.

How often does the average programmer come up with a new sorting algorithm?
Yeah I'm thinking it would be more useful to have a really well indexed library of functions accessible by search.
AlphaGo and AlphaStar were certainly creative. This project in its current state may not have that capacity, but it also may not be a huge leap to get there.