by gdiamos
240 days ago
The problem is that it doesn’t always work, and when it does fail, it fails silently. Debugging requires knowing some small detail about your data distribution or how you did gradient clipping, which takes time and painstakingly detailed experiments to uncover.
|
Right, but why does that mean you need more employees? You need to figure out how to surface failures, rather than just adding more meat to the problem.