Hacker News
by rafram 479 days ago
True, but LLM token probability doesn't map nearly as cleanly to "how readable was the text".
1 comment

Why not, though? Both kinds of models jumble around the data and spit out a probability distribution. Why is the Tesseract distribution inherently more explainable (aside from the UI/UX problem of the uncertainty being per-token instead of per-character)?
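
To make the per-token vs. per-character point concrete, here is a rough sketch of one naive way to project token-level probabilities onto characters: every character simply inherits the probability of the token it belongs to. The function name `per_char_confidence` and the token/log-probability values are made up for illustration and do not come from any real model or OCR engine. Whether the resulting number actually means "how readable was the text" is exactly what's being debated above.

```python
import math


def per_char_confidence(tokens, logprobs):
    """Give each character the probability of the token that contains it.

    Hypothetical helper: this is one naive projection, not how any
    particular LLM or OCR engine reports confidence.
    """
    if len(tokens) != len(logprobs):
        raise ValueError("tokens and logprobs must have the same length")
    confidences = []
    for token, logprob in zip(tokens, logprobs):
        prob = math.exp(logprob)  # convert log-probability back to probability
        confidences.extend((ch, prob) for ch in token)
    return confidences


# Made-up tokens and probabilities, purely for illustration.
tokens = ["Hel", "lo", " wor", "ld"]
logprobs = [math.log(0.9), math.log(0.5), math.log(0.8), math.log(0.99)]

result = per_char_confidence(tokens, logprobs)
for ch, prob in result:
    print(repr(ch), round(prob, 2))
```

Note that a low token probability here gets smeared across every character in the token, so it can't say which character was the uncertain one.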