by koito17 173 days ago
This reminds me of a story where an OCR error[1] likely contaminated training data (and the English language) with the term "vegetative electron microscopy". The article I linked also shows that some journals defended the legitimacy of the terminology.

I'm not sure if this class of error really counts as a hallucination, but it nonetheless has similar consequences when people fail to validate model outputs.

[1] https://news.ycombinator.com/item?id=43858655

1 comment

I think the same will happen over time with the AI voice-over slop that people don't bother correcting. This includes weird pronunciations, missing punctuation that leads to weirdly intonated run-on sentences, abbreviations pronounced as words, like "ickbmm" instead of "icbm", or the opposite, abbreviations spelled out letter by letter, like "kay emm ess" instead of "kilometers", and so on.