by mananaysiempre 1829 days ago
Just to share a little more of the weirdness (discovered while reading a couple of the historical URL & URI RFCs several days ago):

Per the original spec, in FTP URLs,

- ftp://example.net/foo/bar will get you bar inside the foo directory inside the default directory of the FTP server at example.net (i.e. CWD foo, RETR bar);

- ftp://example.net//foo/bar will get you bar inside the foo directory inside the empty string directory inside the default directory of the FTP server at example.net (i.e. CWD, CWD foo, RETR bar; what do FTP servers even do with this?);

- and it’s ftp://example.net/%2Ffoo/bar that you must use if you want bar inside the foo directory inside the root directory of the FTP server at example.net (i.e. CWD /foo, RETR bar; %2F being the result of percent-encoding a slash character).
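For illustration, the three cases above can be sketched in Python. This is only a rough sketch of the RFC 1738 reading described in the list, not any real client's code: `ftp_commands` is a made-up name, and it ignores login, `;type=` suffixes, and directory listings.

```python
from urllib.parse import urlsplit, unquote


def ftp_commands(url):
    """Return the FTP commands implied by an ftp:// URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL")
    # The "/" separating the host from the url-path is not part of the path.
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    # Split on literal "/" first; percent-decode each segment afterwards.
    *dirs, name = path.split("/")
    commands = ["CWD " + unquote(d) for d in dirs]
    if name:
        commands.append("RETR " + unquote(name))
    return commands


for url in (
    "ftp://example.net/foo/bar",
    "ftp://example.net//foo/bar",
    "ftp://example.net/%2Ffoo/bar",
):
    print(url, "->", ftp_commands(url))
```

Note that the second URL produces a `CWD ` with an empty argument, which is exactly the case being asked about.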

> what do FTP servers even do with this?

Pretty sure CWD by itself isn't even valid (at least RFC 959 assumes it has an argument), and therefore // isn't valid in FTP URLs.

The %2Ffoo/bar is needed because FTP CWD and RETR paths are system dependent (with, theoretically, system dependent path separators), but URLs are not. So the FTP client breaks the URL on / and sequentially executes CWD down the tree, which means it doesn't need to know what it's connected to.

In other words: URL paths are not system paths, and it's a mistake to think of them as such.

(Alternate in other words: FTP is awful)

> Pretty sure CWD by itself isn’t even valid [...] and therefore // isn’t valid in FTP URLs.

So, I looked it up carefully and it appears that (despite the promises in later RFCs such as 2396 and 3986) the current specification of the ftp scheme is still the ancient RFC 1738 which predates not only the URL / URI distinction but even the notion of relative URLs. In §3.2.2 <https://tools.ietf.org/html/rfc1738#section-3.2.2> it specifically says that a null segment in the path should result in a “CWD ” command (i.e. CWD, space, null string argument) being sent to the FTP server, going against both the current RFC 959 and its predecessor 765 (apparently the earliest formal specification of FTP to include CWD) which require the argument to CWD to be non-null.
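To make the conflict concrete, here is a rough Python sketch that checks a command line against the RFC 959 grammar, where the argument to CWD is a pathname of one or more ASCII characters other than CR and LF. `is_valid_cwd` is a hypothetical helper written for this comment, not part of any library.

```python
import re

# RFC 959: CWD <SP> <pathname> <CRLF>, where <pathname> is a non-empty
# string of 7-bit ASCII characters excluding CR and LF.
_CWD_LINE = re.compile(rb"CWD [^\r\n\x80-\xff]+\r\n", re.IGNORECASE)


def is_valid_cwd(line):
    """Return True if `line` (bytes) is a well-formed RFC 959 CWD command."""
    return _CWD_LINE.fullmatch(line) is not None


print(is_valid_cwd(b"CWD foo\r\n"))  # a normal path segment
print(is_valid_cwd(b"CWD \r\n"))     # what RFC 1738 asks for on a null segment
```

The first call is accepted and the second is rejected, so a client that follows RFC 1738 for `//` ends up sending a line that the FTP grammar itself does not allow.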

Thus apparently a conformant implementation of the ftp URL scheme cannot be a conformant implementation of an FTP client. Joy.

It still seems unlikely that Berners-Lee et al. would specifically call this case out if it were useless at the time... What were the servers that made this necessary, I wonder?

> FTP CWD and RETR paths are system dependent (with, theoretically, system dependent path separators), but URLs are not

Thank you, that’s the insight that I was missing. So a %2F inside an ftp URL component is just performing a (sanctioned) injection of the (supposedly UNIXy) server path syntax.
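A tiny Python sketch of why the order of operations matters here (the function names are made up for illustration): splitting on / before percent-decoding keeps the %2F inside one component, while decoding first turns it into just another separator.

```python
from urllib.parse import unquote


def split_then_decode(path):
    # What the ftp scheme calls for: find the slashes, then decode each piece.
    return [unquote(segment) for segment in path.split("/")]


def decode_then_split(path):
    # The naive order: the decoded %2F becomes indistinguishable from "/".
    return unquote(path).split("/")


path = "%2Ffoo/bar"
print(split_then_decode(path))  # ['/foo', 'bar']
print(decode_then_split(path))  # ['', 'foo', 'bar']
```

With the first order the server sees `/foo` as a single CWD argument in its own path syntax; with the second the client would get a null segment followed by `foo`.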

> FTP is awful

I’d go with “unbelievably ancient, with the attendant problems”, but yes. Funny how it still manages to be better than everything else (that I know) at transferring files by not multiplexing control and data onto the same TCP connection. (I think HTTP over QUIC can do this as well?)

> i.e. CWD, CWD foo, RETR bar; what do FTP servers even do with this?

If you go by shell semantics, that pokes around the home directory of the user running the FTP daemon; hopefully that doesn't actually work.

> it's ftp://example.net/%2Ffoo/bar that you must use if you want bar inside the foo directory inside the root directory

This smells like a security vulnerability for most setups.

> This smells like a security vulnerability for most setups.

Yes, but if you look around on some old FTP servers (like on the few still-extant mirror networks) you’ll find that some do actually let you CWD to the system /, and sometimes they even drop you there by default (so you have to CWD pub or whatever to get at the things you actually want).

> If you go by shell semantics, that pokes around the home directory of the user running the FTP daemon; hopefully that doesn't actually work.

This is why FTP servers have default directories. They're the equivalent of user home directories. By the way, many FTP servers (especially historically) map FTP logins to real, local users.

> This smells like a security vulnerability for most setups.

How do you figure? Surely your sensitive files aren't world-readable... /s

I wanted to mention that in practice, most FTP server implementations are not Unicode-compatible and are very likely vulnerable to effective-power-like abuses of RTL/LTR switching characters as well.

Not to mention that probably all server implementations on Windows seem to have been forks of BSD's original ftpd at some point, which had an RCE vulnerability when the password exceeded the length limit of 256 bytes, iirc.

Even software like ProFTPd was vulnerable over 30 years later.

Just writing this to make a point to stay the fuck away from FTP, because software is heavily outdated in that space and never updated to fix issues. Use ssh/sftp, always.

You know, in a fantasy world where standards of comparable complexity have equally good implementations I would much rather use Telnet and FTP over TLS (1.3) than SSH and SFTP. For all that they show their age they just seem to me to be cleaner designs.

I will have to concede, though, that FTP servers in the real world are surprisingly awful. Even the supposedly easy task of spinning up an anonymous read-only FTP server to serve the current directory for five minutes, all permissions and security be damned, is annoyingly non-trivial.

(Unrelated to that awfulness, does anyone know how to get active FTP to pass through SLIRP networking on Qemu?)

I totally agree with you with regard to complexity. The main issue behind a server's level of security is probably more related to using a memory safe language than we care to admit.

I have the feeling that way too few libraries and implementations written in C use a linter or any kind of mechanism to catch the obvious type errors.

Everyone loves typed languages, but nobody uses their obvious advantages with regard to security. Kinda ironic when you see a -Wall all over the place.

From source code of preferred ftp/http client, maybe this is helpful. Also suggest reading source code for djb's ftp server.

   Parse URL of form (per RFC 3986):
       <type>://[<user>[:<password>]@]<host>[:<port>][/<path>]
 
   XXX: this is not totally RFC 3986 compliant; <path> will have the
   leading `/' unless it's an ftp:// URL, as this makes things easier
   for file:// and http:// URLs.  ftp:// URLs have the `/' between the
   host and the URL-path removed, but any additional leading slashes
   in the URL-path are retained (because they imply that we should
   later do "CWD" with a null argument).
 
   Examples:
        input URL                       output path
        ---------                       -----------
       "http://host"                   "/"
       "http://host/"                  "/"
       "http://host/path"              "/path"
       "file://host/dir/file"          "dir/file"
       "ftp://host"                    ""
       "ftp://host/"                   ""
       "ftp://host//"                  "/"
       "ftp://host/dir/file"           "dir/file"
       "ftp://host//dir/file"          "/dir/file"
 
    If we are dealing with a classic `[user@]host:[path]'
    (urltype is CLASSIC_URL_T) then we have a raw directory
    name (not encoded in any way) and we can change
    directories in one step.
   
    If we are dealing with an `ftp://host/path' URL
    (urltype is FTP_URL_T), then RFC 3986 says we need to
    send a separate CWD command for each unescaped "/"
    in the path, and we have to interpret %hex escaping
    *after* we find the slashes.  It's possible to get
    empty components here, (from multiple adjacent
    slashes in the path) and RFC 3986 says that we should
    still do `CWD ' (with a null argument) in such cases.
   
    Many ftp servers don't support `CWD ', so if there's an
    error performing that command, bail out with a descriptive
    message.
   
    Examples:
                 
    host:                                dir="", urltype=CLASSIC_URL_T
                 logged in (to default directory)
    host:file                            dir=NULL, urltype=CLASSIC_URL_T
                 "RETR file"
    host:dir/                            dir="dir", urltype=CLASSIC_URL_T
                 "CWD dir", logged in
    ftp://host/                          dir="", urltype=FTP_URL_T
                 logged in (to default directory)
    ftp://host/dir/                      dir="dir", urltype=FTP_URL_T
                 "CWD dir", logged in
    ftp://host/file                      dir=NULL, urltype=FTP_URL_T
                 "RETR file"
    ftp://host//file                     dir="", urltype=FTP_URL_T
                 "CWD ", "RETR file"
    host:/file                           dir="/", urltype=CLASSIC_URL_T
                 "CWD /", "RETR file"
    ftp://host///file                    dir="/", urltype=FTP_URL_T
                 "CWD ", "CWD ", "RETR file"
    ftp://host/%2F/file                  dir="%2F", urltype=FTP_URL_T
                 "CWD /", "RETR file"
    ftp://host/foo/file                  dir="foo", urltype=FTP_URL_T
                 "CWD foo", "RETR file"
    ftp://host/foo/bar/file              dir="foo/bar"
                 "CWD foo", "CWD bar", "RETR file"
    ftp://host//foo/bar/file             dir="/foo/bar"
                 "CWD ", "CWD foo", "CWD bar", "RETR file"
    ftp://host/foo//bar/file             dir="foo//bar"
                 "CWD foo", "CWD ", "CWD bar", "RETR file"
    ftp://host/%2F/foo/bar/file          dir="%2F/foo/bar"
                 "CWD /", "CWD foo", "CWD bar", "RETR file"
    ftp://host/%2Ffoo/bar/file           dir="%2Ffoo/bar"
                 "CWD /foo", "CWD bar", "RETR file"
    ftp://host/%2Ffoo%2Fbar/file         dir="%2Ffoo%2Fbar"
                 "CWD /foo/bar", "RETR file"
    ftp://host/%2Ffoo%2Fbar%2Ffile       dir=NULL
                 "RETR /foo/bar/file"
   
    Note that we don't need `dir' after this point.
   
    The `CWD ' command (without a directory), which is required by   
    RFC 3986 to support the empty directory in the URL pathname (`//'),   
    conflicts with the server's conformance to RFC 959.