Hacker News
by spwa4 617 days ago
Uh, this is extracting a LOT from very little data. I don't understand where it's coming from, but its explanation just keeps going into more and more detail ... that doesn't seem to follow from the data it's got.

I just don't see how you could answer these questions without trying it out. And ChatGPT DEFINITELY isn't doing that.

Plus, the obvious question I'd pose is not in there: what's the difference in performance between this trick and just "softmax() - 0.5 * 2"? That seems very relevant.
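
To make the baseline concrete, here is a rough sketch of what that quoted expression computes. Under normal operator precedence it is just softmax minus 1; the rescaling to (-1, 1) needs parentheses, which may be what was meant. The function names and the example scores are made up for illustration, and this says nothing about performance, which would still need to be measured.

```python
import math

def softmax(xs):
    # Standard softmax; subtracting the max keeps exp() from overflowing.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def as_written(xs):
    # Literal precedence: softmax(x) - (0.5 * 2), i.e. softmax(x) - 1.
    return [p - 0.5 * 2 for p in softmax(xs)]

def parenthesized(xs):
    # Assumed alternative reading: (softmax(x) - 0.5) * 2, values in (-1, 1).
    return [(p - 0.5) * 2 for p in softmax(xs)]

scores = [2.0, 1.0, 0.1]  # hypothetical input scores
print(as_written(scores))
print(parenthesized(scores))
```

Either way this is a fixed shift or rescale applied after the normalization, so the outputs no longer sum to 1.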