For example, an English word is typically ~7 ASCII characters long, whereas a Chinese, Japanese, or Korean word is typically only ~2 Unicode characters.
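
The gap can be checked on a few sample words. The following is a minimal sketch, not a measurement over a real corpus: the helper name `char_and_byte_counts` and the sample words are illustrative assumptions, and the UTF-8 byte count is included only to show that fewer characters does not mean fewer bytes.

```python
def char_and_byte_counts(word):
    """Return (Unicode code points, UTF-8 bytes) for a word.

    Hypothetical helper for illustration; not part of any library.
    """
    return len(word), len(word.encode("utf-8"))


# Sample words chosen by hand (an assumption, not corpus statistics).
samples = ["example", "例子", "言葉", "단어"]

for word in samples:
    chars, utf8_bytes = char_and_byte_counts(word)
    print(f"{word}: {chars} chars, {utf8_bytes} UTF-8 bytes")
```

Here `len()` counts code points, so the CJK samples come out far shorter than the English one even though each of their characters takes several bytes in UTF-8.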