by feldrim 479 days ago
I see two points here. First, you did not read the article, so you did not see the footnote stating that these are valid in Linux as well.

Second, your comment shows that you lack knowledge of Linux as well. As I wrote in the footnote, Linux accepts any byte in a file name except 0x00 (null) and 0x2F (“/”). Other than those two, every byte is valid in a path. If you consider these a problem, I'd like to remind you that the 2048 surrogate code points are a really small subset of the unrenderable combinations allowed in Linux.
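To make the rule concrete, here is a minimal sketch in Python of the check described above. The function name `is_valid_linux_name` is made up for illustration, and the sketch deliberately ignores everything the comment does not mention (length limits, the special entries "." and "..", and any extra restrictions a particular filesystem may add).

```python
def is_valid_linux_name(name: bytes) -> bool:
    """Hypothetical helper: apply only the two-byte rule quoted above.

    A file name is treated as an opaque byte string. It is rejected if it
    is empty or contains 0x00 (null) or 0x2F ("/"); nothing else is checked.
    """
    return len(name) > 0 and b"\x00" not in name and b"/" not in name


# Names that look broken but pass the rule:
newline_name = b"a\nb"              # embedded newline
surrogate_name = b"\xed\xa0\x80"    # UTF-8-style encoding of surrogate U+D800
random_bytes = b"\xff\xfe\x01"      # not valid UTF-8 at all

# Names that fail the rule:
with_slash = b"a/b"
with_null = b"a\x00b"
```

Under this rule the first three names are accepted and only the last two are rejected, which is the point being made: surrogates are a tiny part of what cannot be rendered.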

Everyone is free to have their own opinion, but before making bold claims, please do your due diligence.

1 comment

> In Linux, as I have written in the foot note, accepts anything but 0x00 (null) and 0x2F (“/”)

POSIX 2024 encourages (but doesn’t require) implementations to disallow newline in file names, returning EILSEQ if you try to create a new file or directory with a name containing a newline. Thus far Linux hasn’t adopted that recommendation, but I personally hope it does some day.

For backward compatibility, it would have to be a mount option. It could be done at VFS level so it applies to all filesystems.

Personally I would go even further and introduce a “require_sane_filenames” mount option, which would block you (at the VFS layer) from creating any file name containing invalid UTF-8 (including overlong sequences and UTF-8 encoded surrogates), C0 controls or (UTF-8 encoded) C1 controls.
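As a rough illustration of what such a check might look like, here is a userspace sketch in Python. It is not kernel code, and `is_sane_filename` is a hypothetical name; “require_sane_filenames” itself is only the option proposed above, not something that exists. The sketch relies on Python's strict UTF-8 decoder, which rejects overlong sequences and encoded surrogates. It takes C0 to mean U+0000 to U+001F and C1 to mean U+0080 to U+009F, so DEL (0x7F) is left alone; that is an assumption about where the line would be drawn.

```python
def is_sane_filename(name: bytes) -> bool:
    """Hypothetical check: valid UTF-8 with no C0 or C1 control characters."""
    try:
        # Strict decoding fails on invalid UTF-8, including overlong
        # sequences and UTF-8-encoded surrogates.
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    for ch in text:
        cp = ord(ch)
        # C0 controls: U+0000..U+001F (this covers newline and null).
        # C1 controls: U+0080..U+009F (two bytes each when UTF-8 encoded).
        if cp <= 0x1F or 0x80 <= cp <= 0x9F:
            return False
    return True


ok_name = "naïve résumé.txt".encode("utf-8")  # ordinary non-ASCII text
newline_name = b"a\nb"                        # C0 control
overlong_slash = b"\xc0\xaf"                  # overlong encoding of "/"
encoded_surrogate = b"\xed\xa0\x80"           # U+D800 encoded as if UTF-8
c1_control = b"a\xc2\x85b"                    # U+0085 (NEL), a C1 control
```

Only `ok_name` passes. Note that "/" is not checked here, since the existing rule already forbids it.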

Also I think it would be great if filesystems had a superblock bit that declared they only supported “sane filenames”. Then even accessing such a file would error because it would be a sign of filesystem corruption.

This I did not know. I know that ZFS has a "utf8only" option, but I am not sure about others.