by joelburget
1274 days ago
It works on all human languages, just inefficiently. I ran it over a sample I found on Wikipedia:

    sample = "ฟองมันฟันหนู, ฟันหนูฟองมัน, ฝนทองฟองมัน"
    len(sample), len(enc.encode(sample))

This returns `39, 40`, so it's spending roughly one token per character. It's probably like this for almost all non-English text.
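To put that ratio in context, here is a rough stdlib-only sketch that compares the sample's code-point count with its UTF-8 byte count. `text_stats` is a hypothetical helper written for illustration, and the token count of 40 is simply the number reported above, passed in by hand; it is not recomputed, because the snippet doesn't show which tokenizer `enc` is.

```python
def text_stats(text: str, n_tokens: int) -> dict:
    """Compare code points, UTF-8 bytes, and an externally reported token count.

    Hypothetical helper for illustration; n_tokens is supplied by the caller.
    """
    n_chars = len(text)                  # len() on a str counts Unicode code points
    n_bytes = len(text.encode("utf-8"))  # Thai letters take 3 bytes each in UTF-8
    return {
        "chars": n_chars,
        "utf8_bytes": n_bytes,
        "tokens_per_char": n_tokens / n_chars,
        "bytes_per_token": n_bytes / n_tokens,
    }


sample = "ฟองมันฟันหนู, ฟันหนูฟองมัน, ฝนทองฟองมัน"
stats = text_stats(sample, n_tokens=40)  # 40 is the count reported in the comment
print(stats)
```

The sample is 39 code points but 109 UTF-8 bytes, so 40 tokens works out to a bit under 3 bytes per token. If `enc` is a byte-level tokenizer (an assumption), that would mean it is merging the bytes of each Thai character but rarely merging across characters.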