Hacker News new | ask | show | jobs
by 1718627440 186 days ago
> Your code will match the first parameter that has <param> as a suffix, no necessarily <param> exactly

Depending on your requirements, that might be a feature.

> It also doesn't handle any percent encoding.

This does literal matches, so yes you would need to pass the param already percent encoded. This is a trade off I did, not for that case, but for similar issues. I don't like non-ASCII in my source code, so I would want to encode this in some way anyway.

But you are right, you shouldn't put this into a generic library. Whether it suffices for your project or not, depends on your requirements.

2 comments

This exact mindset is why so much software is irreparably broken and riddled with CVEs.

Written standard be damned; I’ll just bang out something that vaguely looks like it handles the main cases I can remember off the top of my head. What could go wrong?

Most commenters seem to miss that this is the throwaway code for HN, with a maximum allocated time of five minutes. I wouldn't commit it like this. The final code did cope with percent-encoding even though the project didn't took any user generated values at all. And I did read the RFCs, which honestly most developers I meet don't care to do. I also made sure the percent-decodation function did not rely on the ASCII ordering (it only relies on A-Z being continuous), because of portability (EBCDIC) and I have some professional honor.
I get that, but your initial comment implied you were about to showcase a counter to "Hundreds of lines just to grab a query parameter from a URL", but instead you showed "Poorly and incompletely parsing a single parameter can be done in less than 100 lines".

You said you allocated 5 minutes max to this snippet, well in php this would be 5 seconds and 1 line. And it would be a proper solution.

    $name = $_GET['name'] ?? SOME_DEFAULT;
And in the code in C it looks like this, which is also a proper solution, I did not measure the time, it took me to write that.

    name = cgiGetValue (cgi, "name");
    if (!name) name = SOME_DEFAULT;
If you allow for GCC extensions, it looks like this:

    name = cgiGetValue (cgi, "name") ?: SOME_DEFAULT;
That would fail on a user supplying a multiple where you don't expect.

> If multiple fields are used (i.e. a variable that may contain several values) the value returned contains all these values concatenated together with a newline character as separator.

In GP’s defense, there is no standard behavior in the spec for handling repeated GET query parameters. Therefore any implementation-defined behavior is reasonable, including: keeping only the first, keeping only the last, keeping one at random, allowing access to all of them, concatenating them all with a separator, discarding the entire thing, etc.
Why? The actual implementation of cgiGetValue I am talking about does exactly that:

> concatenated together with a newline character

Ampersands are ASCII, but also need to be encoded to be in a parameter value.
Yeah, but you can totally choose to not allow that in your software.
That's true. Your argument about how short parameter extraction can be gets a little weaker though if only solve it for the easy cases. Code can be shorter if it solves a simplified version of the problem statement.