Hacker News
by infogulch 1465 days ago
Well, that will be necessary until Windows gets UTF-8 APIs. Probably not soon. Until then there are various optimizations you can do, like caching the UTF-16 conversion alongside the UTF-8 string (good for calling OS APIs frequently with long-lived strings), allocating temporary UTF-16 conversions on the stack (good for infrequent calls with strings up to a certain size), or storing raw UTF-16 strings as opaque bytes in Rust memory (good for providing strings back to the OS that you got from the OS).
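To make the first two ideas concrete, here is a rough sketch using only the standard library's `str::encode_utf16`. The names `CachedWide`, `to_wide`, `with_wide_on_stack`, and the 260-unit buffer size are all made up for illustration, not taken from any crate, and this is only one way to do it.

```rust
/// Convert UTF-8 to a NUL-terminated UTF-16 buffer on the heap.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}

/// Hypothetical wrapper: a UTF-8 string plus a lazily cached UTF-16 copy.
/// The conversion runs at most once, which helps long-lived strings that
/// are passed to OS APIs frequently.
struct CachedWide {
    utf8: String,
    wide: Option<Vec<u16>>,
}

impl CachedWide {
    fn new(s: &str) -> Self {
        CachedWide { utf8: s.to_owned(), wide: None }
    }

    /// Returns the cached NUL-terminated UTF-16 form, converting on first use.
    fn as_wide(&mut self) -> &[u16] {
        if self.wide.is_none() {
            self.wide = Some(to_wide(&self.utf8));
        }
        self.wide.as_deref().unwrap()
    }
}

/// Hypothetical helper: convert into a fixed stack buffer for a one-off call,
/// falling back to the heap when the string does not fit.
fn with_wide_on_stack<R>(s: &str, f: impl FnOnce(&[u16]) -> R) -> R {
    const STACK_LEN: usize = 260; // assumed size limit, chosen arbitrarily
    let mut buf = [0u16; STACK_LEN];
    let mut len = 0;
    for unit in s.encode_utf16() {
        // Keep one slot free for the trailing NUL.
        if len + 1 >= STACK_LEN {
            let heap = to_wide(s);
            return f(&heap);
        }
        buf[len] = unit;
        len += 1;
    }
    // buf[len] is still 0, so the slice below is NUL-terminated.
    f(&buf[..=len])
}
```

On Windows you would hand `as_wide().as_ptr()`, or the slice given to the closure, to a -W function. For the third idea (strings that came from the OS), `std::ffi::OsString` is probably the closest thing the standard library offers.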

You should try to avoid calling OS APIs in general and cache the results as much as possible. Who knows what the performance characteristics are of an API that has to serve 7 layers of historical OSes simultaneously. Unless you're directly interfacing with the kernel you shouldn't expect much. Omnibar-like layered calls between your app and the OS are a worst-case scenario regardless of conversions.

1 comments

winapi does support UTF-8 on recent versions:

https://docs.microsoft.com/en-us/windows/apps/design/globali...

Very interesting, I wasn't aware. After glancing over that doc, it looks like they smuggle UTF-8 in through the -A variants of Windows APIs [1] by explicitly setting the CP_UTF8 code page in an application manifest. I wonder whether this actually uses UTF-8 internally to service the API call, or whether it just converts strings to wide form and calls the -W variant on the Windows side instead of making you do it on the app side. If the latter, it may be better to avoid this feature so you don't close the door on potential optimizations like the ones I mentioned above.

[1]: Windows has two variants of many API calls with either -A or -W suffix, where the -A suffix is for strings formatted as 1-byte ASCII (or a specified codepage) and the -W suffix is for strings formatted as 2-byte UTF-16 (kinda). Example: DlgDirListA / DlgDirListW, https://docs.microsoft.com/en-us/windows/win32/api/winuser/n...
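On the "(kinda)": -W strings are sequences of 16-bit units that are not guaranteed to be well-formed UTF-16, so they can contain unpaired surrogates. A small sketch of what that means on the Rust side, using the standard `String::from_utf16` and `String::from_utf16_lossy` (the wrapper names `decode_strict` and `decode_lossy` are just for illustration):

```rust
/// Strict decode: returns None if the units are not well-formed UTF-16.
fn decode_strict(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

/// Lossy decode: each unpaired surrogate becomes U+FFFD, so the original
/// units cannot be recovered from the result.
fn decode_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}
```

For example, the units `[0x0061, 0xD800, 0x0062]` (an unpaired high surrogate between 'a' and 'b') fail the strict decode and come out of the lossy one with U+FFFD in the middle. That is one reason to keep OS-provided strings as opaque data if you intend to hand them back unchanged.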