|
From the source code of my preferred ftp/http client; maybe this is helpful. I also suggest reading the source code of djb's ftp server.

Parse URL of form (per RFC 3986):
<type>://[<user>[:<password>]@]<host>[:<port>][/<path>]
XXX: this is not totally RFC 3986 compliant; <path> will have the
leading `/' unless it's an ftp:// URL, as this makes things easier
for file:// and http:// URLs. ftp:// URLs have the `/' between the
host and the URL-path removed, but any additional leading slashes
in the URL-path are retained (because they imply that we should
later do "CWD" with a null argument).
Examples:
    input URL                   output path
    ---------                   -----------
    "http://host"               "/"
    "http://host/"              "/"
    "http://host/path"          "/path"
    "file://host/dir/file"      "dir/file"
    "ftp://host"                ""
    "ftp://host/"               ""
    "ftp://host//"              "/"
    "ftp://host/dir/file"       "dir/file"
    "ftp://host//dir/file"      "/dir/file"
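
The http:// and ftp:// rows of the table above can be reproduced with a few lines of C. This is only a sketch of the rule as described, not the client's actual parser: the `enum scheme` tags and the `url_path` helper are hypothetical names, and the caller is assumed to have already split off everything up to and including `<host>[:<port>]`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scheme tags, used by this sketch only. */
enum scheme { SCHEME_HTTP, SCHEME_FTP };

/*
 * `rest' is the URL text that follows <host>[:<port>]: either ""
 * or a string beginning with '/'.  Returns the output path; the
 * result points into `rest' or at a string literal, so nothing is
 * allocated.
 */
static const char *
url_path(enum scheme s, const char *rest)
{
	assert(rest != NULL);
	assert(rest[0] == '\0' || rest[0] == '/');

	if (s == SCHEME_FTP) {
		/* Drop only the '/' that separates host and URL-path. */
		return rest[0] == '/' ? rest + 1 : rest;
	}
	/* http: keep the leading '/', and supply one if it is missing. */
	return rest[0] == '\0' ? "/" : rest;
}
```

For example, `url_path(SCHEME_FTP, "//dir/file")` yields `/dir/file`: the first slash is consumed as the separator and the second is kept, which is what later leads to a `CWD ' with a null argument.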
If we are dealing with a classic `[user@]host:[path]'
(urltype is CLASSIC_URL_T) then we have a raw directory
name (not encoded in any way) and we can change
directories in one step.

If we are dealing with an `ftp://host/path' URL
(urltype is FTP_URL_T), then RFC 3986 says we need to
send a separate CWD command for each unescaped "/"
in the path, and we have to interpret %hex escaping
*after* we find the slashes. It's possible to get
empty components here (from multiple adjacent
slashes in the path), and RFC 3986 says that we should
still do `CWD ' (with a null argument) in such cases.

Many ftp servers don't support `CWD ', so if there's an
error performing that command, bail out with a descriptive
message.

Examples:
host:                           dir="", urltype=CLASSIC_URL_T
        logged in (to default directory)
host:file                       dir=NULL, urltype=CLASSIC_URL_T
        "RETR file"
host:dir/                       dir="dir", urltype=CLASSIC_URL_T
        "CWD dir", logged in
ftp://host/                     dir="", urltype=FTP_URL_T
        logged in (to default directory)
ftp://host/dir/                 dir="dir", urltype=FTP_URL_T
        "CWD dir", logged in
ftp://host/file                 dir=NULL, urltype=FTP_URL_T
        "RETR file"
ftp://host//file                dir="", urltype=FTP_URL_T
        "CWD ", "RETR file"
host:/file                      dir="/", urltype=CLASSIC_URL_T
        "CWD /", "RETR file"
ftp://host///file               dir="/", urltype=FTP_URL_T
        "CWD ", "CWD ", "RETR file"
ftp://host/%2F/file             dir="%2F", urltype=FTP_URL_T
        "CWD /", "RETR file"
ftp://host/foo/file             dir="foo", urltype=FTP_URL_T
        "CWD foo", "RETR file"
ftp://host/foo/bar/file         dir="foo/bar"
        "CWD foo", "CWD bar", "RETR file"
ftp://host//foo/bar/file        dir="/foo/bar"
        "CWD ", "CWD foo", "CWD bar", "RETR file"
ftp://host/foo//bar/file        dir="foo//bar"
        "CWD foo", "CWD ", "CWD bar", "RETR file"
ftp://host/%2F/foo/bar/file     dir="%2F/foo/bar"
        "CWD /", "CWD foo", "CWD bar", "RETR file"
ftp://host/%2Ffoo/bar/file      dir="%2Ffoo/bar"
        "CWD /foo", "CWD bar", "RETR file"
ftp://host/%2Ffoo%2Fbar/file    dir="%2Ffoo%2Fbar"
        "CWD /foo/bar", "RETR file"
ftp://host/%2Ffoo%2Fbar%2Ffile  dir=NULL
        "RETR /foo/bar/file"
Note that we don't need `dir' after this point.
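
The split-then-unescape order for FTP_URL_T directories can be sketched in C as follows. This is an illustration under stated assumptions, not the client's code: `hexval`, `unescape` and `cwd_commands` are hypothetical helpers, malformed `%` escapes are simply copied through, and the function only builds the command text (joined by newlines) rather than talking to a server. It expects a non-NULL `dir`; the dir=NULL cases, and the choice not to send any CWD for a bare `ftp://host/', are left to the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int
hexval(unsigned char c)
{
	if (isdigit(c))
		return c - '0';
	return tolower(c) - 'a' + 10;
}

/*
 * Decode %XX escapes in the first `len' bytes of `src' into `dst',
 * which must hold len + 1 bytes.  Malformed escapes are copied
 * unchanged (a simplification made by this sketch).
 */
static void
unescape(const char *src, size_t len, char *dst)
{
	size_t i = 0, o = 0;

	while (i < len) {
		if (src[i] == '%' && i + 2 < len &&
		    isxdigit((unsigned char)src[i + 1]) &&
		    isxdigit((unsigned char)src[i + 2])) {
			dst[o++] = (char)(hexval((unsigned char)src[i + 1]) * 16 +
			    hexval((unsigned char)src[i + 2]));
			i += 3;
		} else {
			dst[o++] = src[i++];
		}
	}
	dst[o] = '\0';
}

/*
 * Split `dir' on literal '/' first, then unescape each piece, and
 * write one "CWD <arg>" per piece into `out', separated by '\n'.
 * Returns the number of CWD commands, or -1 on allocation failure
 * or if `out' is too small.
 */
static int
cwd_commands(const char *dir, char *out, size_t outsz)
{
	const char *p = dir;
	size_t used = 0;
	int count = 0;

	assert(dir != NULL && out != NULL && outsz > 0);
	out[0] = '\0';
	for (;;) {
		size_t len = strcspn(p, "/");	/* up to the next unescaped '/' */
		char *piece = malloc(len + 1);
		int n;

		if (piece == NULL)
			return -1;
		unescape(p, len, piece);
		n = snprintf(out + used, outsz - used, "%sCWD %s",
		    count ? "\n" : "", piece);
		free(piece);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		count++;
		if (p[len] == '\0')
			break;
		p += len + 1;			/* skip the '/' itself */
	}
	return count;
}
```

Called with dir="%2F/foo/bar", the sketch produces "CWD /", "CWD foo", "CWD bar". Decoding before splitting would instead turn the escaped slash into a separator and yield extra empty components, which is why the order matters.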
The `CWD ' command (without a directory), which is required by
RFC 3986 to support the empty directory in the URL pathname (`//'),
conflicts with the server's conformance to RFC 959.
|