Hacker News
by rcconf 4249 days ago
I've had the same issues when developing with Flask in Python. I forgot to URL-encode some query parameters, and it worked fine with the local HTTP server.

But when I put nginx in front as a proxy, it denied all requests.
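For anyone hitting the same thing, here is a minimal sketch of encoding query parameters with Python's standard library before building the URL. The parameter names, values, and the localhost URL are made up for illustration; the comment above doesn't say which characters were actually involved.

```python
from urllib.parse import quote, urlencode

# Hypothetical parameters: the space and "&" must not be sent raw.
params = {"q": "a b&c", "page": "1"}

# urlencode() builds a form-style query string (spaces become "+").
query = urlencode(params)
url = "http://localhost:5000/search?" + query
print(url)  # http://localhost:5000/search?q=a+b%26c&page=1

# quote() percent-encodes a single component; safe="" also encodes "/".
print(quote("a b&c", safe=""))  # a%20b%26c
```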


The thttpd web server doesn't handle requests with too many slashes either, which I only found out recently.

This is treated as an invalid request:

      http://example.com//robots.txt
Unless I'm reading RFC 3986 incorrectly, that's invalid because you can't have an empty segment in the path part of a URI.
I think you're reading it incorrectly.

You can have an empty segment in the path. The BNF for a segment is:

    segment       = *pchar
According to RFC 2234, section 3.6, the `*` prefix means zero or more repetitions, so a segment may be empty.
But then the server may still decide that an empty segment is so meaningless that it will refuse it.
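The `segment = *pchar` rule quoted above can be checked with a rough regex translation. This is only a sketch: the character class is transcribed by hand from the `pchar` definition in RFC 3986, and the names `PCHAR`, `SEGMENT`, and `is_valid_segment` are mine, not from any library.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# segment = *pchar  (zero or more pchar, so the empty string qualifies)
SEGMENT = re.compile(PCHAR + "*")


def is_valid_segment(s: str) -> bool:
    """Return True if s matches the segment rule as transcribed above."""
    return SEGMENT.fullmatch(s) is not None


# Segments of the path "//robots.txt": one empty, one non-empty.
segments = "//robots.txt".split("/")[1:]
print(segments)                                 # ['', 'robots.txt']
print([is_valid_segment(s) for s in segments])  # [True, True]
print(is_valid_segment("a/b"))                  # False: "/" is not a pchar
```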

In fact, it would not be a smart move to just treat double slashes the same as single ones, because of relative URLs: a ".." segment only removes one slash, so the hierarchy levels would get messed up. thttpd is doing the smart thing here.
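To make the relative-URL point concrete, here is a small sketch of the dot-segment removal procedure from RFC 3986, section 5.2.4, written out by hand (the function name is mine). It shows that ".." after an empty segment removes only that empty segment, so collapsing "//" to "/" first gives a different result.

```python
def remove_dot_segments(path: str) -> str:
    """Hand-written sketch of RFC 3986 section 5.2.4, not a library call."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../x" becomes "/x"
            if output:
                output.pop()         # drop the last segment, empty or not
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            nxt = path.find("/", 1)
            if nxt == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:nxt])
                path = path[nxt:]
    return "".join(output)


print(remove_dot_segments("/a//../c"))  # /a/c  (".." ate the empty segment)
print(remove_dot_segments("/a/../c"))   # /c    (same path with "//" collapsed first)
```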

As one of my teachers at university would say: the empty segment is also a segment.

The server can of course interpret the path as it wants, but it should allow an application running under the server to give 'foo//bar' a meaning if that application wants to, IMO.
True. I was writing about the case when the URL simply mapped to a file system location. Applications should be able to apply their own interpretation.
Agreed.

(The problem in my case was just stupid spiders that were crawling my sites.)

Yes, it's a valid URI, but //robots.txt is a different resource from /robots.txt. It seems thttpd is probably doing the right thing.
The difference between `path-abempty` and `path-absolute` is bloody confusing but I think you're right.
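A rough way to see the difference, assuming my regex transcription of the RFC 3986 rules is right: `path-abempty` (used when the URI has an authority such as `example.com`) is `*( "/" segment )`, while `path-absolute` (no authority) is `"/" [ segment-nz *( "/" segment ) ]`, so it cannot start with "//". The constant names below are mine.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
SEGMENT = PCHAR + "*"      # segment    = *pchar
SEGMENT_NZ = PCHAR + "+"   # segment-nz = 1*pchar

# path-abempty  = *( "/" segment )
PATH_ABEMPTY = re.compile(r"(?:/" + SEGMENT + r")*")
# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(r"/(?:" + SEGMENT_NZ + r"(?:/" + SEGMENT + r")*)?")

for path in ("/robots.txt", "//robots.txt"):
    abempty = PATH_ABEMPTY.fullmatch(path) is not None
    absolute = PATH_ABSOLUTE.fullmatch(path) is not None
    print(path, abempty, absolute)
# /robots.txt True True
# //robots.txt True False
```

Since http://example.com//robots.txt has an authority, its path falls under `path-abempty`, which is why the empty first segment is allowed there.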