by gmueckl 530 days ago
Intuitively, I wouldn't expect a wrong answer to show up that easily if the network were overfitted to that particular input token sequence.

The question, as I understand it, is whether the network learned enough of a simulacrum of the concept of weight to answer similar questions correctly.