| HN Mirror

	by MrSkelter 199 days ago
	I have a personal corpus of letters between my grandparents from WW2, written while my grandfather was fighting in Europe and my grandmother was in England. The ability of Claude and ChatGPT to transcribe them is extremely impressive, though I haven’t worked on them in months, so this was with older models. At that time neither system could properly organize pages, and ChatGPT would sometimes skip a paragraph.

2 comments

vertnerd 199 days ago

I've also been working on half a dozen crates of old family letters. ChatGPT does well with them and is especially good at summarizing the letters. Unfortunately, all the output still has to be verified because it hallucinates words and phrases and drops lines here and there. So at this point, I still transcribe them by hand, because the verification process is actually more tiresome than just typing them up in the first place. Maybe I should just have ChatGPT verify MY transcriptions instead.

embedding-shape 199 days ago

It helps when you can see the confidence of each token, which models with downloadable weights usually give you. Then, whenever you (or your software) detect a low-confidence token, run over that section multiple times to generate alternatives, and either go with the highest-confidence one or manually review the suggestions. Easier than having to manually transcribe those parts, at least.
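
A rough sketch of that loop, assuming your runtime already hands back each token together with its natural-log probability. The function names, the `(text, logprob)` pair layout, and the 0.5 threshold are all made up for illustration and not taken from any particular library:

```python
import math

def flag_low_confidence(tokens, threshold=0.5):
    """Return indices of tokens whose probability is below `threshold`.

    tokens: list of (text, logprob) pairs, logprob in natural log.
    """
    return [i for i, (_, lp) in enumerate(tokens) if math.exp(lp) < threshold]

def group_spans(indices, gap=1):
    """Merge sorted flagged indices at most `gap` apart into (start, end) spans."""
    spans = []
    for i in indices:
        if spans and i - spans[-1][1] <= gap:
            spans[-1] = (spans[-1][0], i)   # extend the current span
        else:
            spans.append((i, i))            # start a new span
    return spans

def pick_best(candidates):
    """Pick the re-run with the highest mean token logprob.

    candidates: non-empty list of non-empty token lists, each (text, logprob).
    """
    return max(candidates, key=lambda c: sum(lp for _, lp in c) / len(c))

# Hypothetical values, only to show the shapes involved.
tokens = [("Dear", -0.01), (" Mar", -1.6), ("ge", -1.2), (",", -0.02)]
spans = group_spans(flag_low_confidence(tokens))   # spans to re-run or review
```

Averaging logprobs keeps longer alternatives from being penalized just for having more tokens; whether that is the right score for handwriting is a judgment call.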

seidleroni 198 days ago

Is there any way to do this with the frontier LLMs?

red75prime 198 days ago

Ask them to mark low-confidence words.
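
For example, you could ask the model to wrap any word it is unsure of in double square brackets, then pull those out for review. The `[[word]]` convention and the `extract_marked` helper are hypothetical choices for this sketch, not something the models do on their own:

```python
import re

# Hypothetical marker convention: the prompt asks for [[word]] around unsure words.
MARK = re.compile(r"\[\[(.+?)\]\]")

def extract_marked(text):
    """Strip the markers and return (clean_text, flagged).

    flagged is a list of (offset, word) pairs, where offset is the position
    of the word in clean_text.
    """
    parts, flagged = [], []
    pos = 0       # read position in the marked-up text
    out_len = 0   # length of the clean text built so far
    for m in MARK.finditer(text):
        before = text[pos:m.start()]
        parts.append(before)
        out_len += len(before)
        word = m.group(1)
        flagged.append((out_len, word))
        parts.append(word)
        out_len += len(word)
        pos = m.end()
    parts.append(text[pos:])
    return "".join(parts), flagged

clean, flagged = extract_marked("We left [[Calais]] on the [[14th]].")
```

The offsets let you jump straight to the doubtful words in the clean transcription instead of rereading the whole page.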

akoboldfrying 198 days ago

Do they actually have access to that info "in-band"? I would guess not. OTOH it should be straightforward for the LLM program to report this -- someone else commented that you can do this when running your own LLM locally, but I guess commercial providers have incentives not to make this info available.

red75prime 197 days ago

Naturally, their "confidence" is represented as activations in layers close to the output, so they might be able to use it. Research ([0], [1], [2], [3]) shows that the results of prompting LLMs to express their confidence correlate with their accuracy. The models tend to be overconfident, but in my anecdotal experience the latest models are passably good at judging their own confidence.

[0] https://ieeexplore.ieee.org/abstract/document/10832237

[1] https://arxiv.org/abs/2412.14737

[2] https://arxiv.org/abs/2509.25532

[3] https://arxiv.org/abs/2510.10913
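
If you want to check the overconfidence on your own material, one simple approach is to hand-verify a sample and compare the stated confidence with how often the model was actually right. This is only a sketch: the record layout `(stated_confidence, was_correct)` and both function names are assumptions for illustration, and the sample values are invented.

```python
def overconfidence(records):
    """Mean stated confidence minus observed accuracy.

    records: non-empty list of (stated_confidence in [0, 1], was_correct bool).
    A positive result means the model claimed more than it delivered.
    """
    mean_conf = sum(c for c, _ in records) / len(records)
    accuracy = sum(1 for _, ok in records if ok) / len(records)
    return mean_conf - accuracy

def calibration_table(records, n_bins=5):
    """Group records by stated confidence and report accuracy per bin.

    Returns (bin_low, bin_high, count, mean_confidence, accuracy) for each
    non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in records:
        idx = min(int(conf * n_bins), n_bins - 1)   # conf == 1.0 goes in the top bin
        bins[idx].append((conf, ok))
    rows = []
    for i, b in enumerate(bins):
        if not b:
            continue
        mean_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        rows.append((i / n_bins, (i + 1) / n_bins, len(b), mean_conf, accuracy))
    return rows

# Invented sample: four hand-checked words with the confidence the model stated.
sample = [(0.9, True), (0.9, False), (0.5, True), (0.5, False)]
gap = overconfidence(sample)
rows = calibration_table(sample)
```

A well-calibrated model would show accuracy close to mean confidence in every bin; a large positive gap in the high bins is the overconfidence described above.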

seidleroni 198 days ago

interesting... I'll give that a shot

criemen 198 days ago

It used to be that the answer was logprobs, but it seems they are no longer available.

SoftTalker 198 days ago

Always seemed strange to me that personal correspondence between two now-dead people is interesting. But I guess that is just my point of view. You could say the same thing about reading fiction, I guess.

suddenlybananas 198 days ago

Why on earth wouldn't it be interesting? Do you only care about your own life?
