I'm curious whether you're aware of some papers from around 2005 on using contextual entropy for unsupervised word segmentation of Chinese and other languages that don't mark word boundaries with spaces.
Your approach is also quite similar to Carl de Marcken's work on segmenting text and speech. He phrased everything in terms of minimum description length (MDL), but that is trivially the same thing as local entropy.
I, at least, wasn't aware of this work, but thanks for the refs! I'm always curious to read papers from 10-20+ years ago that explore ideas in a similar spirit. If it makes sense, we'll mention them in the next update to the related work section.
https://dspace.mit.edu/handle/1721.1/7191?show=full