by lovasoa 1036 days ago
Not sure whether that was sarcastic, but ISO-8859-1 (Latin 1) encodes most European languages, not just Latin.

https://en.wikipedia.org/wiki/ISO/IEC_8859-1

But where do you find it? Almost the entirety of the internet is UTF-8. You can always transcode to Latin 1 for testing purposes, but that raises the question of what practical benefit this algorithm has.
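For reference, transcoding for a test is short. This is a minimal Python sketch, assuming the input is valid UTF-8; the helper name `utf8_to_latin1` and the sample string are made up for illustration. Characters that Latin 1 cannot represent (the euro sign, for example) raise an error unless you pass a lossy error handler.

```python
def utf8_to_latin1(data: bytes, errors: str = "strict") -> bytes:
    """Decode UTF-8 bytes and re-encode them as ISO-8859-1 (Latin 1).

    Hypothetical helper for illustration. With errors="strict", any
    character outside Latin 1 raises UnicodeEncodeError; with
    errors="replace", it is substituted with "?".
    """
    return data.decode("utf-8").encode("latin-1", errors=errors)


# Sample text with three accented characters (illustrative only).
sample = "café déjà vu".encode("utf-8")
latin1 = utf8_to_latin1(sample)

# Each accented character takes two bytes in UTF-8 but one in Latin 1.
print(len(sample), len(latin1))
```

Whether a strict or lossy conversion is appropriate depends on the test; strict mode at least tells you when the corpus is not representable in Latin 1 at all.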
Older corpora are probably still in Latin-1 or some variant. That could include decades of newspaper publications.
All of Europe wrote in Latin 1 for a decade. There are billions of files encoded in Latin 1 everywhere.
Where?