Hacker News
by tbalsam 889 days ago
Hi gwern. The other issue at hand is that information is inherently not always two-way between distributions, so an implicit bias towards reversal can actually cause quite a few issues of its own (though I'm unfortunately still in the 'development stage' of potentially-to-be-published work on this one, so I don't have a ton of details to provide yet).
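
As a rough toy illustration of the "not always two-way" point (the data and the `conditional_entropy` helper are made up for this sketch, and it is not taken from the unpublished work mentioned above): with a many-to-one relation, the forward direction can be fully determined by the data while the reverse direction is genuinely ambiguous.

```python
from collections import Counter
from math import log2

# Hypothetical toy facts: (person, birthplace). Several people share one city,
# so the relation is many-to-one.
pairs = [
    ("alice", "paris"),
    ("bob", "paris"),
    ("carol", "paris"),
    ("dave", "oslo"),
]

def conditional_entropy(pairs):
    """H(second | first) in bits under the empirical distribution of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)                # counts of (first, second)
    first = Counter(a for a, _ in pairs)  # counts of first alone
    # sum over p(a, b) * log2(p(a) / p(a, b)), written with raw counts
    return sum((c / n) * log2(first[a] / c) for (a, _), c in joint.items())

forward = conditional_entropy(pairs)                         # H(city | person)
backward = conditional_entropy([(b, a) for a, b in pairs])   # H(person | city)

print(f"H(city | person) = {forward:.3f} bits")
print(f"H(person | city) = {backward:.3f} bits")
```

Here the forward direction has zero conditional entropy, while the reverse does not, so a model pushed to treat the two directions symmetrically is being asked to predict something the data does not pin down.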

I don't think what a lot of people call the reversal curse is so much an inherent problem as it is an issue of data coverage and assumptions. Reversibility is certainly more "general" in some contexts, but it will also reduce performance in others, at least w.r.t. the source data the model is trained on (if that makes sense).

It's sorta similar to how grokking is a bit of a fad topic: it is technically unique enough to be identifiable, but at the same time it's just a straightforward 'failure mode' of a relatively general process, with a somewhat soft definitional barrier to it.