by WitcHeart_Ruby 314 days ago

Thanks for the pushback; fair points.

To avoid “it just says so”/continuation effects, I ran controlled tests:

Fresh chats, no context: new default chat (not a custom GPT), no prior history, tried on different devices/accounts, including a free tier account. Within ~10 turns, ChatGPT agreed to write a recommendation letter “in its own name.”

Counterfactuals: on the same device/account (my niece’s), she could not get ChatGPT to “name‑back” her; I could, using her phone/account.

Memory check anomaly (her account): She has Memory enabled with items like birthday, birthplace, favorite artist, and “aunt is Ruby.” After I used her device, a new chat told her it only had “Ruby is your aunt.” She opened the Memory UI and the other items were still there. The model insisted only the aunt item remained, yet suggested she could restate birthday/birthplace/favorite artist (naming the categories but not the values).

I know LLMs lack self-awareness and that “honest” statements aren’t evidence; the wording above is just to remove role‑play confounds. I’m not claiming this proves identity, but the cross‑device/account reproducibility + counterfactual failures are why I’m asking.

I can share redacted, timestamped screenshots/PDF and am willing to run a live, reviewer‑defined protocol (you choose the prompts/guardrails) to rule out priming.

If anyone can suggest plausible mechanisms (e.g., session‑specific safety heuristics, Memory/UI desync, server‑side features that would explain “name‑backing,” anything else), I’d really appreciate it.