by mfer 3181 days ago
tl;dr: A legacy behavior is to treat + as a space. When you've been around a while, you need to keep backwards compatibility.

URLs and URIs are specified separately from HTTP, and those standards have changed over time (older ones have been replaced by newer ones).

Many years ago it was common to encode a space as a + sign. For example, the PHP function urlencode[1] encodes a space as a + sign. If you're a PHP user, don't use this function unless you know you need to. There are better functions now.

[1] http://php.net/manual/en/function.urlencode.php


When was + treated as a space in the path part of the URL? Sure, it's been treated as a space in the query part, but it would be a weird breaking change if the early web treated path and query the same way, and then later standards made them different.

At the time S3 launched, the URL spec was RFC 1738[1] and we had HTML 4.01[2]. And the URI syntax spec (all the way back in 1998) said to use %20 for a space[3].
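
To make the path versus query distinction concrete, here is a minimal Python 3 sketch. The URL is a made-up example, and this only shows how Python's standard library treats the two parts, not how S3 or any particular server behaves.

```python
from urllib.parse import urlsplit, unquote, parse_qs

# Hypothetical URL: the same "a+b" appears in the path and in the query.
url = "http://example.com/a+b?q=a+b"
parts = urlsplit(url)

# Path: only %XX escapes are decoded, so "+" stays a literal plus sign.
path = unquote(parts.path)

# Query: parsed with form-encoding rules, so "+" is decoded as a space.
query = parse_qs(parts.query)

print(path)   # /a+b
print(query)  # {'q': ['a b']}
```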

As far as I can tell, this traces its history back to encoding for forms[4]. It's been used far beyond the encoding for forms and maybe someone can explain why.

It's also not just PHP whose function is that way. In Python, urlencode also encodes a space as a + (at least in 2.7.x).
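
As a rough illustration in Python 3, where the function lives at `urllib.parse.urlencode`: the default output still uses +, and passing `quote_via=quote` switches it to %20. The parameter name and value below are placeholders.

```python
from urllib.parse import urlencode, quote

params = {"q": "hello world"}  # hypothetical query parameter

# Default: urlencode uses quote_plus, so the space becomes "+".
default_encoded = urlencode(params)

# Passing quote_via=quote percent-encodes the space as "%20" instead.
percent_encoded = urlencode(params, quote_via=quote)

print(default_encoded)  # q=hello+world
print(percent_encoded)  # q=hello%20world
```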

I remember working on the web many years ago, when "+" is what was used. This may have been a spec misinterpretation or something else. In any case, it was common enough.

Note, I'm not saying it was right. Just not uncommon.

[1] https://www.ietf.org/rfc/rfc1738.txt
[2] https://www.w3.org/TR/html401/
[3] https://www.ietf.org/rfc/rfc2396.txt
[4] https://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1

> If you're a PHP user, don't use this function unless you know you need to. There are better functions now.

Don't leave me hanging! What are the better functions now?

`rawurlencode()` is what you're after.

And here is where you'd ask that question, a coding forum https://stackoverflow.com/questions/996139/urlencode-vs-rawu...

Thanks!

And here is where you'd answer that question, a coding forum https://stackoverflow.com/questions/996139/urlencode-vs-rawu...

;)

Knowing PHP's standard library, probably something like "urlencode_safe_for_real_this_time".

Kidding aside, IIRC "rawurlencode" is the RFC-compliant one.
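
For anyone who wants to see the difference without a PHP install, here is a loose Python 3 analogy: `quote_plus` behaves roughly like PHP's `urlencode` (space becomes +), and `quote` with `safe=""` behaves roughly like `rawurlencode` (space becomes %20). The pairing is an approximation for illustration, and the input string is invented.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

raw = "a b+c"  # hypothetical value with a space and a literal plus

# Form style: space becomes "+", so the literal plus must become "%2B".
form_style = quote_plus(raw)

# Percent style: space becomes "%20"; the literal plus is still "%2B".
percent_style = quote(raw, safe="")

# Each style round-trips with its matching decoder.
assert unquote_plus(form_style) == raw
assert unquote(percent_style) == raw

# Decoding form-style output with the plain decoder leaves the "+" in
# place, which is the ambiguity discussed in this thread.
mismatched = unquote(form_style)

print(form_style)     # a+b%2Bc
print(percent_style)  # a%20b%2Bc
print(mismatched)     # a+b+c
```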