Hacker News
by charleshmartin 138 days ago
Right. If the dynamics of training are governed by renormalization group (RG) flow, then the best optimization path should remove redundant directions, as specified by the RG operator(s).