Hacker News new | ask | show | jobs
by desdenova 387 days ago
What makes models non-deterministic isn't the training algorithm, but the initial weights being random.

Training is reproducible only if, besides the pipeline and data, you also start from the same random weights.

2 comments

That would fall under "Feed the same data in and you'll get the same weights out." Lots of deterministic algorithms use a random seed.
So is there no “introduce randomness” at some step afterwards? If not, I would guess these models would be getting stuck in a local maxima
> If not, I would guess these models would be getting stuck in a local maxima

It sounds like you're referring to something like simulated annealing. Using that as an example, the fundamental requirement is to introduce arbitrary, uncorrelated steps -- there's no requirement that the steps be random, and the only potential advantage of using a random source is that it provides independence (lack of correlation) inherently; but in exchange, it makes testing and reproduction much harder. Basically every use of simulated annealing or similar I've run into uses pseudorandom numbers for this reason.