by hansvm 483 days ago
Kind of. Tesseract's confidence is just a raw model probability output. You could easily use the entropy associated with each token coming out of an LLM to do the same thing.
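For concreteness, here is a rough sketch of what "entropy per token" could look like. It assumes you can get the raw logits for each generated token out of whatever LLM you are running; the function names and the toy logits are made up for illustration and do not belong to any particular library.

```python
import math

def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    """Shannon entropy in bits of the next-token distribution.

    Low entropy: the model put nearly all its mass on one token.
    High entropy: the model was spread across many candidates.
    """
    probs = softmax(logits)
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy 4-token vocabulary (hypothetical values, not real model output).
confident = token_entropy([10.0, 0.0, 0.0, 0.0])  # one token dominates
uncertain = token_entropy([1.0, 1.0, 1.0, 1.0])   # uniform over 4 tokens
print(confident, uncertain)
```

A uniform distribution over 4 tokens gives the maximum of 2 bits, while the peaked one comes out close to 0. Flagging tokens above some entropy threshold would be the rough analogue of flagging low-confidence characters in Tesseract, though the threshold itself would need tuning.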

True, but LLM token probability doesn't map nearly as cleanly to "how readable was the text".
Why not, though? Both kinds of models jumble around the data and spit out a probability distribution. Why is the Tesseract distribution inherently more explainable (aside from the UI/UX problem of the uncertainty being per-token instead of per-character)?