Hacker News new | ask | show | jobs
by glenstein 254 days ago
I've seen it several times. He's one where I ask it about baseball teams that don't yet have a hall of fame player:

"Arizona Diamondbacks No inductees wearing their cap.

Randy Johnson is in, but as a Diamondback — so Arizona actually does have one (inducted 2015)."

I asked it to explain what happened and got this, which could be true to how it pulls up info or could be nonsense, not sure:

"Here’s the likely internal sequence:

When you first mentioned the Diamondbacks, my initial retrieval pattern surfaced the “teams without a Hall of Famer” mental list. That list, in my training data, often appears outdated (pre-2015) and still shows Arizona as having none.

A few sentences later, my context pulled up the other knowledge set—Randy Johnson’s induction in 2015 wearing a Diamondbacks cap. That newer, correct fact then overrode the earlier incorrect claim, but because generation is sequential, the earlier error had already been committed to “paper” before the override happened. I corrected mid-stream, but in human terms, it looked like I contradicted myself inside the same paragraph.

This is essentially a knowledge conflict resolution timing issue. I can retrieve and reconcile facts during a single answer, but if the resolution happens after part of the text is already generated, you see the “flip.” It’s not like I “knew it all along but withheld it,” it’s more that multiple memory traces compete, and the most context-relevant one wins—sometimes too late."

1 comments

Whats fascinating is that these models have excellent knowledge about AI/transformers/LLMs (the labs have clearly been specifically training them in hopes of an automated breakthrough), so they can reason really well about what probably happened.

But it's also just that, what probably happened. They still have no real insight into their own minds, they too are also just victims of whatever it outputs.