| HN Mirror



	by sutterd 81 days ago
	Doh! The part past the # does not go to the server, so that wasn't a longer URL. How about: https://chrismorgan.info/%6e%6f-%71%75%65%72%79-%73%74%72%69...


abanana 80 days ago

Indeed, that's not a query string! The #, and following text, is a fragment, is client-side only, and isn't the subject of the blogpost. Neither is percent encoding, which is just another way to send the exact same path from your browser to the server.

Note that it has nothing to do with the length of the URL. That's just the error message he's chosen to use, because "4xx stop pissing about with my URLs" doesn't exist in the spec.


chrismorgan 80 days ago

> percent encoding, which is just another way to send the exact same path

This is not true for all characters. Some can only be expressed by percent-encoding, and decoding them will either break things completely (e.g. %20) or change the meaning of the URL (e.g. %2F, %3F in paths).

Yes, you can encode x as %78 and it should work identically, and you can decode %78 to x and it should work identically—though in both cases, I reckon there’s a strong case for blocking the request as suspicious, and I will probably start doing that soon.

But take these examples of improper decoding:

• /foo%2Fbar/baz.html has path «"foo/bar", "baz.html"».

• /foo/bar/baz.html has path «"foo", "bar", "baz.html"».

• /foo%3Fbar/baz?quux has path «"foo?bar", "baz"» and query "quux".

• /foo?bar/baz?quux has path «"foo"» and query "bar/baz?quux".
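
The four cases above can be reproduced with a minimal Python sketch. It assumes the standard library's urllib.parse is a fair stand-in for a server's URL parser; the names parse and parse_decoding_too_early are made up for illustration. The point is the ordering: split on the delimiters first, and only then percent-decode each piece.

```python
from urllib.parse import urlsplit, unquote


def parse(url):
    """Split into path and query first, then percent-decode each path segment."""
    parts = urlsplit(url)
    # Drop the empty string that precedes the leading "/".
    segments = [unquote(s) for s in parts.path.split("/")[1:]]
    return segments, parts.query


def parse_decoding_too_early(url):
    """Hypothetical buggy variant: decode the whole URL before splitting."""
    return parse(unquote(url))


for url in ("/foo%2Fbar/baz.html", "/foo%3Fbar/baz?quux"):
    print(url, parse(url), parse_decoding_too_early(url))
```

Decoding too early turns %2F and %3F into real delimiters, so the first URL gains a path segment and the second loses part of its path to the query.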


abanana 80 days ago

Indeed, it's essential in some cases. I was talking in the context of sutterd's suggestion, where only lower-case letters have been encoded.

> strong case for blocking the request as suspicious

Yep, as there shouldn't be any "normal" reason to do such a thing.
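
For what it's worth, a check for that kind of needless encoding could look something like the sketch below. It is not what chrismorgan's server does; it assumes "needless" means a percent-encoded character from RFC 3986's unreserved set (letters, digits, "-", ".", "_", "~"), and the function name has_needless_encoding is made up for illustration.

```python
import re
import string

# RFC 3986 unreserved characters: these never need percent-encoding.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
PERCENT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def has_needless_encoding(path):
    """Return True if any %XX in the path decodes to an unreserved character."""
    return any(
        chr(int(match.group(1), 16)) in UNRESERVED
        for match in PERCENT_TRIPLET.finditer(path)
    )


print(has_needless_encoding("/%6e%6f-%71%75%65%72%79"))  # letters encoded for no reason
print(has_needless_encoding("/foo%2Fbar"))                # %2F is a reserved character
```

Encodings such as %2F, %3F, and %20 pass this check, since those characters can't appear literally in a path segment without changing or breaking the URL.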
