Hacker News
by feldrim 480 days ago
Hi all. OP here. I added a postscript about surrogate pairs and their status on Linux. I used WSL to access those files under Windows, and generated the same files on Linux. You can see that the behavior differs for the same file names:

1. On Windows, accessed by WSL

2. On Linux (WSL), using UTF-8 locale

3. On Linux (WSL), using POSIX locale

The difference is weird to me as a user. I'd like to know about the decisions behind it. If anyone has information, please let me know.
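For anyone unfamiliar with why surrogates are awkward to carry between a UTF-16 world and a UTF-8 one, here is a minimal Python sketch. It does not reproduce the WSL behavior above; it only shows that a valid surrogate pair and a lone surrogate behave differently under strict UTF-8. The helper names `utf16_units` and `strict_utf8_ok` are made up for this illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text, letting lone surrogates through."""
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def strict_utf8_ok(text):
    """Report whether text can be encoded as strict (standard) UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


emoji = "\U0001F600"   # one code point, stored as a surrogate pair in UTF-16
lone = "\ud83d"        # only the high half of that pair

print([hex(u) for u in utf16_units(emoji)])  # two code units
print([hex(u) for u in utf16_units(lone)])   # one code unit
print(strict_utf8_ok(emoji), strict_utf8_ok(lone))
```

The pair encodes cleanly to UTF-8, while the lone half is rejected unless a non-standard error handler such as `surrogatepass` is used, so any layer that translates names between the two encodings has to pick some policy for it.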

3 comments

The Linux section just seems to show artifacts of the WSL hacks; it has nothing to do with how Linux filenames function. Those are simply bags of bytes. The encoding only matters for displaying them and isn't interpreted internally. ls failing to access the .exe is clearly a WSL filesystem issue and not a Linux / ls issue. You also can't set a UTF-16 locale, because that's not what a locale is. UTF-16/32 vs SBCS and UTF-8 is the wide/narrow character distinction, which is a whole separate thing: different ABIs, different APIs.
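To make the "bags of bytes" point concrete, here is a rough Python sketch. It never touches a real filesystem; the byte string `raw` is a made-up name, and `display_name` / `roundtrip` are hypothetical helpers. The same bytes render differently depending on the encoding assumed for display, while the bytes themselves survive a decode/encode round trip untouched.

```python
def display_name(raw, encoding):
    """Decode raw filename bytes for display, replacing undecodable bytes."""
    return raw.decode(encoding, errors="replace")


def roundtrip(raw):
    """Carry arbitrary bytes through a str and back without losing any."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical name: a Latin-1 e-acute (0xE9) plus a byte that is never valid UTF-8.
raw = b"caf\xe9-\xff.txt"

print(display_name(raw, "latin-1"))  # readable under a Latin-1 assumption
print(display_name(raw, "utf-8"))    # replacement characters under UTF-8
print(roundtrip(raw) == raw)         # the stored bytes are unchanged
```

The `surrogateescape` handler is what Python itself uses for filenames on POSIX systems, which is one way a tool can display a name it cannot decode and still pass the exact bytes back to the kernel.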
WSL-to-Windows, yes, it is due to translations. But within WSL itself, I'm not sure. I'll try to replicate them on an Ubuntu VM for comparison.
Your subscript 2 applies to NTFS. The only characters NTFS does not allow are NUL and "/".

Beyond that, it is up to the API you choose to read the volume with. Win32 of course has many more restrictions than POSIX does, but since Windows NT supports multiple personalities, you could still read and write characters that are illegal in Win32 under NT, e.g. with SFU.

WSL is not Linux, despite whatever Microsoft says.
WSL is Linux -- it's an automatically managed VM with some special sauce for connectivity between the parent partition and guest.
WSL2 is what you describe, WSL1 is not.
WSL2 is what people use now, though. Saying "WSL is not Linux" (because WSL1 isn't) is pedantically analogous to saying "Mac OS is not a Unix-based OS" (because Mac OS 9 and earlier aren't).
Of course, why would we discuss WSLv1 in 2025? If we discuss "Windows" today, it's unlikely we mean Windows 2000.