by ko27 1036 days ago
But where do you find it? Almost the entirety of the internet is UTF-8. You can always transcode to Latin-1 for testing purposes, but that raises the question of the practical benefits of this algorithm.
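For anyone who wants to try the transcoding route, here is a rough sketch in Python. The function name `utf8_to_latin1` is made up for illustration, and replacing characters that Latin-1 cannot represent with `?` is just one possible choice, not something the algorithm under discussion requires.

```python
def utf8_to_latin1(data: bytes, errors: str = "replace") -> bytes:
    """Transcode UTF-8 bytes to Latin-1 (ISO-8859-1).

    Hypothetical helper for building test inputs. With errors="replace",
    any character outside the Latin-1 range becomes "?"; pass
    errors="strict" to raise UnicodeEncodeError instead.
    """
    text = data.decode("utf-8")                   # bytes -> str
    return text.encode("latin-1", errors=errors)  # str -> Latin-1 bytes


sample = "café".encode("utf-8")         # 5 bytes in UTF-8
latin1_sample = utf8_to_latin1(sample)  # 4 bytes in Latin-1
```

Text that is mostly ASCII shrinks very little this way, so a transcoded corpus only tells you something if the original has a fair share of accented characters.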
2 comments

Older corpora are probably still in Latin-1 or some variant. That could include decades of newspaper publications.
All of Europe has written in Latin-1 for a decade. There are billions of files encoded in Latin-1 everywhere.
Where?