by cloudbonsai
85 days ago
> There is no caching of a "utf-8 representation".
No, there certainly is. This is documented in the official API documentation: the UTF-8 representation is created on demand and cached in the Unicode object.
https://docs.python.org/3/c-api/unicode.html#unicode-objects
In particular, Python's Unicode object (PyUnicodeObject) contains a field named utf8. This field is populated when PyUnicode_AsUTF8AndSize() is first called and is reused thereafter. You can check the exact code I'm talking about here: https://github.com/python/cpython/blob/main/Objects/unicodeo... Is it clear enough?
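If you want to poke at this from Python rather than read the C source, here is a rough sketch using ctypes. It assumes CPython (ctypes.pythonapi does not exist in a meaningful way elsewhere), and the helper name utf8_buffer and the sample string are made up for illustration. Getting the same address back twice is consistent with the buffer being cached on the object, though it is an observation and not a proof. Note that for pure-ASCII strings the UTF-8 pointer simply aliases the string's own data, so the sample deliberately uses non-ASCII text.

```python
import ctypes
import sys

# PyUnicode_AsUTF8AndSize is part of CPython's C API; ctypes.pythonapi
# calls it with the GIL held.
as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
as_utf8.restype = ctypes.c_void_p  # raw address of the UTF-8 buffer


def utf8_buffer(s):
    """Return (address, length) of the UTF-8 buffer CPython hands out for s.

    Hypothetical helper for this demo, not part of any library.
    """
    size = ctypes.c_ssize_t()
    addr = as_utf8(s, ctypes.byref(size))
    return addr, size.value


# Build a non-ASCII string at runtime so nothing has asked for its
# UTF-8 form yet.
text = "".join(["naïve ", "café ", "größe"])

size_before = sys.getsizeof(text)
first = utf8_buffer(text)    # first call: the UTF-8 form is created
second = utf8_buffer(text)   # second call: expected to reuse the same buffer
size_after = sys.getsizeof(text)

# Read the bytes back from the address returned by the first call.
cached_bytes = ctypes.string_at(first[0], first[1])

print("same buffer on both calls:", first == second)
print("bytes match str.encode():", cached_bytes == text.encode("utf-8"))
print("getsizeof before/after:", size_before, size_after)
```

On the builds I would expect this to be run on, the reported object size also goes up after the first call, which fits the extra buffer hanging off the object, but treat that as something to check on your own interpreter.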
(And the code search seems to be broken; it can't find the definition of `unicode_fill_utf8` for me, although I'm sure it's obvious enough.)