by heavenlyblue 761 days ago
They don't optimise all layers jointly; instead, each layer is trained independently of the others.
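To make the distinction concrete, here is a rough sketch of layer-by-layer training. This is not their actual method: the local objective (a tied-weight linear autoencoder trained on reconstruction error), the greedy stacking order, and the names `recon_loss`, `train_layer` and `train_layerwise` are all assumptions picked purely for illustration. The point is only that each layer's update uses that layer's own loss, and no gradient flows between layers.

```python
import numpy as np

def recon_loss(x, w):
    """Local objective (assumed): 0.5 * mean squared reconstruction error
    of a tied-weight linear autoencoder with weights w."""
    err = x @ w @ w.T - x
    return 0.5 * np.sum(err ** 2) / len(x)

def train_layer(x, hidden, steps=300, lr=0.05, seed=0):
    """Fit one layer by gradient descent on its own local loss only."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=(x.shape[1], hidden))
    for _ in range(steps):
        err = x @ w @ w.T - x
        # Gradient of recon_loss with respect to w (encoder and decoder share w).
        grad = (x.T @ err @ w + err.T @ x @ w) / len(x)
        w -= lr * grad
    return w

def train_layerwise(x, sizes):
    """Train layers one at a time; earlier layers are frozen once trained."""
    weights, h = [], x
    for i, hidden in enumerate(sizes):
        w = train_layer(h, hidden, seed=i)
        weights.append(w)
        # The frozen output feeds the next layer; nothing is backpropagated
        # through it, so no layer ever sees a global loss.
        h = h @ w
    return weights

if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(256, 8))
    layers = train_layerwise(data, [4, 2])
    print([w.shape for w in layers])
```

Global optimisation would instead define a single loss on the final output and update every layer's weights from the gradient of that one loss in the same step.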