| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by skybrian 719 days ago

This article doesn't talk much about testing or getting training data. It seems like that part is key.

For code that you think you understand, it's because you've informally proven to yourself that it has some properties that generalize to all inputs. For example, a sort algorithm will sort any list, not just the ones you tested.

The thing we're uncertain about for a neural network is that we don't know how it will generalize; there are no properties that we think are guaranteed for unseen input, even if it's slightly different input. It might be because we have an ill-specified problem and we don't know how to mathematically specify what properties we want.

If you can actually specify a property well enough to write a property-based test (like QuickCheck) then you can generate large amounts of tests / training data though randomization. Start with one example of what you want, then write tests that generate every possible version of both positive and negative examples.

It's not a proof, but it's a start. At least you know what you would prove, if you could.

If you have such a thing, relying on spaghetti code or a neural network seem kind of similar? If you want another property to hold, you can write another property-based test for it. I suppose with the neural network you can train it instead of doing the edits yourself, but then again we have AI assistance for code fixes.

I think I'd still trust code more. At least you can debug it.