HN Mirror



	by rafram 480 days ago
	Tesseract doesn’t use an LLM. LLMs don’t know how confident they are; Tesseract’s model does.

2 comments

touisteur 480 days ago

With most machine learning algorithms I used to get Shapley values or other 'explainable AI' metrics (at a large cost compared to simple inference, yes). It's very unsettling and frustrating to work without them now on LLMs.


hansvm 480 days ago

Kind of. Tesseract's confidence is just a raw model probability output. You could easily use the entropy associated with each token coming out of an LLM to do the same thing.
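
A minimal sketch of the per-token entropy idea, assuming you can get the raw next-token logits out of the model at each step. The function names and the toy logits here are hypothetical, not any particular library's API, and the rescaling to [0, 1] is just one possible choice.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_entropy(logits):
    """Shannon entropy, in bits, of the next-token distribution."""
    probs = softmax(logits)
    return -sum(p * math.log2(p) for p in probs if p > 0)

def token_confidence(logits):
    """Entropy rescaled to [0, 1]: near 1.0 when one token dominates,
    0.0 when the distribution is uniform. (Assumed scoring rule.)"""
    n = len(logits)
    if n < 2:
        return 1.0
    return 1.0 - token_entropy(logits) / math.log2(n)

# Toy 4-token vocabulary: one peaked step and one maximally uncertain step.
peaked = [10.0, 0.0, 0.0, 0.0]
flat = [1.0, 1.0, 1.0, 1.0]
```

With the uniform logits the entropy is exactly log2(4) = 2 bits, so the rescaled confidence is 0. The peaked logits give a confidence close to 1.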


rafram 480 days ago

True, but LLM token probability doesn't map nearly as cleanly to "how readable was the text".


hansvm 479 days ago

Why not though? Both kinds of models jumble around the data and spit out a probability distribution. Why is the Tesseract distribution inherently more explainable (aside from the UI/UX problem of the uncertainty being per-token instead of per-character)?
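
For the per-token vs. per-character mismatch, a rough sketch of one workaround: let every character inherit the probability of the token that produced it, and flag low-probability spans by character offset. The `(text, probability)` input shape, the function names, the threshold, and the sample values are all assumptions for illustration. This is a crude approximation and says nothing about whether the number reflects "how readable was the text".

```python
def char_confidences(tokens):
    """tokens: list of (text, probability) pairs in output order.
    Returns one (char, probability) pair per character; each character
    simply inherits its token's probability."""
    out = []
    for text, prob in tokens:
        out.extend((ch, prob) for ch in text)
    return out

def low_confidence_spans(tokens, threshold=0.5):
    """Return (start, end, text) character spans whose token
    probability falls below the threshold."""
    spans = []
    pos = 0
    for text, prob in tokens:
        end = pos + len(text)
        if prob < threshold:
            spans.append((pos, end, text))
        pos = end
    return spans

# Made-up token probabilities for the string "Invoice #1047".
tokens = [("Inv", 0.98), ("oice", 0.95), (" #", 0.90), ("10", 0.42), ("47", 0.88)]
per_char = char_confidences(tokens)
flagged = low_confidence_spans(tokens)
```

Here `flagged` picks out the "10" token at character offsets 9 to 11, which a UI could highlight the same way an OCR tool highlights low-confidence characters.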
