Hacker News
by imtringued 1037 days ago
They are inefficient by design. Gradient descent and backpropagation scale poorly, but they work and GPUs are cheap, so here we are.