by noahgolmant 37 days ago
There is some literature on student-teacher training with Jacobian matching: https://arxiv.org/abs/1803.00443. This might be in the direction you're looking for. I believe that in the cross-entropy case, matching the Jacobian (together with the output probabilities) should imply a matching Fisher information matrix.
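
To make the Fisher claim concrete, here is a rough NumPy sketch of one way to read it. The assumptions are mine, not the paper's: the Fisher information is taken with respect to the input x (student and teacher parameters generally differ in shape), the model is a softmax classifier, and the tiny tanh MLP and all function names are hypothetical. Under those assumptions the Fisher matrix is J^T (diag(p) - p p^T) J, where J is the Jacobian of the logits with respect to x and p is the softmax output, so two models that agree on p and J at a point would also agree on this matrix there. The code checks that closed form against the definition (the expected outer product of score vectors).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, n_classes = 4, 8, 3

# Hypothetical toy model: one hidden tanh layer followed by a linear logit layer.
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(n_classes, d_hidden))
b2 = rng.normal(size=n_classes)


def logits(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def logit_jacobian(x):
    """Analytic Jacobian of the logits w.r.t. the input, shape (n_classes, d_in)."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ (W1 * (1.0 - h**2)[:, None])


def fisher_from_jacobian(x):
    """Closed form: uses only the output probabilities p and the Jacobian J."""
    p = softmax(logits(x))
    J = logit_jacobian(x)
    return J.T @ (np.diag(p) - np.outer(p, p)) @ J


def fisher_from_scores(x, eps=1e-5):
    """Definition: E_{y ~ p}[s_y s_y^T], with s_y = d/dx log p(y | x).

    The scores come from central finite differences, so this path never
    touches the analytic Jacobian.
    """
    p = softmax(logits(x))
    F = np.zeros((x.size, x.size))
    for y in range(p.size):
        score = np.zeros(x.size)
        for i in range(x.size):
            step = np.zeros(x.size)
            step[i] = eps
            lp_plus = np.log(softmax(logits(x + step))[y])
            lp_minus = np.log(softmax(logits(x - step))[y])
            score[i] = (lp_plus - lp_minus) / (2.0 * eps)
        F += p[y] * np.outer(score, score)
    return F


x = rng.normal(size=d_in)
F_closed = fisher_from_jacobian(x)
F_scores = fisher_from_scores(x)
fisher_gap = np.max(np.abs(F_closed - F_scores))
print("max abs difference between the two Fisher estimates:", fisher_gap)
```

Note that the closed form needs p as well as J, so under this reading Jacobian matching alone is not quite enough; the student also has to match the teacher's output probabilities, which is what the usual distillation loss pushes toward anyway.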