by Flimm
4787 days ago
Not true. Take this string: d͊ It is composed of two code points, U+0064 and U+034A. The second is a combining character, and the two code points together form one glyph. The term "character" is confusing because people use different definitions for it, so I avoid it; the term "Unicode code point", by contrast, is unambiguous. A Python 3 string is a sequence of code points. The above string is represented like this:

    >>> print("d\u034A")
    d͊
    >>> len("d\u034A")
    2

Indexing or truncating between the base and combining code points works as expected, because each index refers to exactly one code point:

    >>> "d\u034a"[0]
    'd'
    >>> "d\u034a"[1]
    '͊'
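
If you want to avoid cutting a glyph in half, one rough approach is to keep each combining code point attached to the base code point before it. The sketch below is only an approximation: the helper name `split_clusters` is made up for illustration, and it relies on `unicodedata.combining()` rather than the full Unicode grapheme cluster rules, so it will not handle things like emoji sequences.

```python
import unicodedata


def split_clusters(text):
    """Group each code point with the combining marks that follow it.

    Hypothetical helper for illustration; a simplification of real
    grapheme cluster segmentation.
    """
    clusters = []
    for ch in text:
        # combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


s = "d\u034Ab"
print(len(s))                  # number of code points
print(len(split_clusters(s)))  # number of base-plus-marks groups
```

Truncating on the list returned by `split_clusters` instead of on the string itself keeps U+0064 and U+034A together.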