Not true. Take this string:

d͊

It is composed of two code points: U+0064 and U+034A. The second is a combining character, and the two code points together form one glyph. I avoid the term "character" because people use different definitions for it; the term "Unicode code point", by contrast, is unambiguous.

A Python 3 string is a sequence of code points. The above string is represented like this:

  >>> print("d\u034A")
  d͊
  >>> len("d\u034A")
  2
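
To look at the individual code points rather than the rendered glyph, something like the sketch below should work. It uses only the standard-library unicodedata module; the variable names (s, info, utf8_len) are mine, purely for illustration.

```python
import unicodedata

s = "d\u034a"

# One entry per code point: (code point in U+XXXX form, canonical combining class).
# A non-zero combining class marks a code point that attaches to a preceding base.
info = [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in s]

# The UTF-8 encoding has yet another length: 1 byte for "d", 2 bytes for U+034A.
utf8_len = len(s.encode("utf-8"))

print(info)
print(len(s), utf8_len)
```

So the same string is one glyph, two code points, and three UTF-8 bytes, which is why it pays to say which unit you are counting.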

Cutting between the base and the combining code point works as expected, because each index addresses exactly one code point:

  >>> "d\u034a"[0]
  'd'
  >>> "d\u034a"[1]
  '͊'
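
If you would rather not separate a base code point from its combining marks when truncating, a rough sketch could look like the following. The helper name truncate_keep_marks is hypothetical, it assumes a non-negative limit, and checking unicodedata.combining only approximates real grapheme cluster segmentation (it does not handle things like emoji ZWJ sequences).

```python
import unicodedata

def truncate_keep_marks(text: str, max_code_points: int) -> str:
    """Cut text to at most max_code_points code points without leaving
    a base code point stripped of its combining marks."""
    if len(text) <= max_code_points:
        return text
    end = max_code_points
    # If the first dropped code point is a combining mark, the cut falls
    # inside a cluster, so back up until the cut sits before the base.
    while end > 0 and unicodedata.combining(text[end]) != 0:
        end -= 1
    return text[:end]

print(repr(truncate_keep_marks("ad\u034ab", 2)))  # cut would split "d" + mark
print(repr(truncate_keep_marks("ad\u034ab", 3)))  # cut falls on a boundary
```

The design choice here is to drop the whole cluster rather than keep a bare "d", so the result can be shorter than the requested limit.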