Hacker News new | ask | show | jobs
by fc417fc802 312 days ago
A similar phenomenon was demonstrated with deep neural networks nearly a decade ago. You optimize the architecture using randomized weights instead of optimizing the weights. You can still optimize the weights in a separate additional step to improve performance.