by rich_sasha 36 days ago
I always thought generating UUIDs at random was insane. I now only use LLMs. The prompt is: "generate a UUID. Make sure no one ever used it anywhere in their code or database. Check your work and think hard about each step. Do not output any reasoning or plain English, only the UUID itself".

You're welcome.

1 comments

Actually asking ChatGPT this query led to it giving me this UUID "550e8400-e29b-41d4-a716-446655440000", which happens to be a very common example UUID.
The LLM is mechanistically unable to pick something actually random and outside of its training distribution, so... yep.
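For comparison, here is a minimal sketch of the boring approach in Python. It assumes CPython's standard `uuid` module, where `uuid.uuid4()` builds a version-4 UUID from 16 bytes of `os.urandom`; the helper name `fresh_uuid` and the constant `WELL_KNOWN_EXAMPLE` are just illustrative names, not anything from the thread.

```python
import uuid

# The widely copied example UUID quoted in the comment above.
WELL_KNOWN_EXAMPLE = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")


def fresh_uuid() -> uuid.UUID:
    """Return a random version-4 UUID (122 random bits; the rest encode version and variant)."""
    return uuid.uuid4()


if __name__ == "__main__":
    candidate = fresh_uuid()
    print(candidate)
    # With 122 random bits, a match with any specific UUID is astronomically unlikely.
    print("matches the well-known example:", candidate == WELL_KNOWN_EXAMPLE)
```

No prompt can guarantee "no one ever used it"; with random bits the guarantee is only probabilistic, but the odds are far better than sampling from a model that has memorized popular examples.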
If you ask it to construct a UUID character by character you should get a somewhat random one, just because of temperature.
This actually worked well when I asked Gemini to generate a random color, character by character. I was getting Indigo/Electric Indigo a lot if I just asked for a random color on new sessions.
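To make the temperature point concrete, here is a toy sketch of sampling one hex character at a time from a softmax with temperature. The logits are made up for illustration (a strong preference for "5") and are not taken from any real model; `softmax_with_temperature` and `sample_hex` are hypothetical helper names.

```python
import math
import random

HEX = "0123456789abcdef"

# Made-up logits: the "model" strongly prefers the character '5'.
TOY_LOGITS = [0.0] * 16
TOY_LOGITS[5] = 4.0


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hex(logits, temperature, rng):
    """Sample a single hex character according to the tempered distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(HEX, weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for temperature in (0.2, 1.0, 5.0):
        digits = "".join(sample_hex(TOY_LOGITS, temperature, rng) for _ in range(32))
        print(temperature, digits)
```

Even at the highest temperature here the favored character stays more likely than the others, so the output varies from run to run but is still not uniformly random.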
But all LLM output is token by token, which isn't too far from character by character in the case of a UUID. Why is this different? I do not know.
Actually, asking ChatGPT this multiple times gives me a different UUID every time, and it checked with a web search that they are not found in public data.
That's because of tokens (and temperature). You could probably match the individual tokens to parts of existing UUIDs in public data. And given enough iterations, GPT will probably start showing noticeable patterns (since it's not actually random).