by jrochkind1 4250 days ago
This very example -- requests were technically illegal all the time without devs realizing, but something in the stack changed to start rejecting them -- demonstrates the fallacy of the "be liberal in what you accept, strict in what you issue" principal. If all the web servers involved had been strict in rejecting the illegal request from the start, they would have noticed the bug in development before deploying to firmware in the field.
13 comments

I don't agree that "be liberal in what you accept, strict in what you issue" is a fallacy. The client actually failed to adhere to the "be strict in what you issue" principal, just as the Cowboy was not liberal in accepting. All software will sooner or later exhibit bugs or be stricter or more lenient about a standard.

I think the fallacy is to assume that once stuff works in production, only your changes can trigger a bug. There's way too much software involved in a standard webserver stack to assume anything about it. Any patch, any update to software or devices not under your control has the potential to break your stack. The thing the OP did was the right thing: Monitor, monitor, monitor.

The liberal/strict thing is a terrible idea. It introduces completely busted behavior.

Consider a client that emits \n instead of \r\n. How do you handle it? Liberally? OK, treat 'em like CRLFs. Now you read \n\n. Everything after that is content, right?

Oops, you're now ignoring headers, potentially security-sensitive ones.

I've run into this exact bug in production, leading to a security problem. The client, proxy, and endpoints had different ways of handling CRLF. Some would treat \n\n as the end of headers, some not. Exploiting this, clients could route requests through the proxy and add special headers that only the proxy should have been able to add (like X-Client-IP).
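
A minimal sketch of the disagreement described above, assuming two hypothetical components: one that only recognizes CRLF CRLF as the end of the headers, and one that "liberally" treats a bare LF like CRLF. The function names and the sample request are illustrative assumptions, not taken from any real proxy or server.

```python
def split_crlf_only(raw: bytes):
    """Treat only CRLF CRLF as the end of the header block."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no CRLF CRLF terminator found")
    return head.split(b"\r\n"), body


def split_lenient(raw: bytes):
    """Treat a bare LF like CRLF, so LF LF also ends the header block."""
    normalized = raw.replace(b"\r\n", b"\n")
    head, sep, body = normalized.partition(b"\n\n")
    if not sep:
        raise ValueError("no blank line found")
    return head.split(b"\n"), body


# Hypothetical request: a bare LF LF sits in front of a sensitive header.
raw = (
    b"GET / HTTP/1.1\r\n"
    b"Host: example.test\n\n"
    b"X-Client-IP: 10.0.0.1\r\n"
    b"\r\n"
)

crlf_headers, crlf_body = split_crlf_only(raw)
lenient_headers, lenient_body = split_lenient(raw)

# The two components now disagree about whether X-Client-IP is header or body.
print(crlf_headers, crlf_body)
print(lenient_headers, lenient_body)
```

The CRLF-only splitter still sees X-Client-IP inside the header block, while the lenient one has already pushed it into the body. A component that rejected the bare LF outright would avoid the ambiguity altogether.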

Apart from this, the whole "robustness principle" just leads to a bunch of guessing and even more incompatible implementations. See HTML as another example mess.

> The client actually failed to adhere to the "be strict in what you issue" principal

Well, that's the rub, right? How do you know how strict you're being if your tools accept things liberally? If anything, the lesson here is to test with the strictest possible tools.

> just as the Cowboy was not liberal in accepting

And this is hard too, because on what dimensions should you be liberal? How do you decide what the "real" set of inputs you're going to accept is?

And that leads to my real issue with the principle: what should you, as the liberal accepter, do in those cases? Here it's easy enough to guess what the behavior should be with the extra space (just accept the damn request), but in general it's not -- you're creating implementation-specific behavior; what happens when you accept undefined or incorrect inputs will vary from implementation to implementation, creating a nightmare of uncertainty for people sending you stuff. Of course, you can always say, "they should send stricter stuff!" but then what's really the point of accepting inputs liberally?

The problem is that "be liberal in what you accept" is, by definition, saying to go beyond the standards, accepting things that are technically illegal according to the standards.

So different software will necessarily do it differently. For all software to be doing it the same, there would realistically need to be some specified standard on how to do it, and then we're no longer talking about 'be liberal in what you accept', but just 'accept exactly what the standards say.'

Of course, in this case the client software was not being 'strict in what you issue' -- I am not challenging that part, of course you should _always_ issue exactly correct according to standard requests or other protocol communications. But there will inevitably be bugs, bugs happen.

"Be liberal in what you accept" makes it harder to find those bugs, and leaves them waiting to surprise you when the (non-standard) level of "liberalness" on the receiving end changes, which it inevitably will because it was not according to standard in the first place.

I think the HTML/JS/CSS web provides another good example of the dangers of 'be liberal in what you accept', very similarly -- you may think your web page is 'correct' because one or more browsers render it correctly while being 'liberal', and not realize it's in fact buggy and will not render correctly on one or more other past, present, or future browsers. This example has been commented upon by others, and I think has led to a move away from 'be liberal in what you accept' in web user agents. http://books.google.com/books?id=5WXp4j4eV4UC&pg=PA136&lpg=P...

How about this as a middle-ground:

Be strict in what you issue (duh!), be liberal in what you accept - but both emit strong warnings when the input isn't strict, and have a strict mode.
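
A rough sketch of what that middle ground could look like for an HTTP-style request line, assuming a hypothetical `parse_request_line` function with a `strict` flag. The name `ProtocolWarning` and the choice of Python's `warnings` module are illustrative assumptions.

```python
import re
import warnings


class ProtocolWarning(UserWarning):
    """Emitted when input is accepted even though it violates the grammar."""


# Exactly three whitespace-free tokens separated by single spaces.
_STRICT_LINE = re.compile(r"(\S+) (\S+) (\S+)")


def parse_request_line(line: str, strict: bool = False):
    """Parse 'METHOD SP TARGET SP VERSION'.

    Conforming lines are always accepted. Non-conforming lines raise in
    strict mode; otherwise they are accepted with a ProtocolWarning when
    they still split into three tokens.
    """
    match = _STRICT_LINE.fullmatch(line)
    if match:
        return match.groups()
    if strict:
        raise ValueError(f"malformed request line: {line!r}")
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    warnings.warn(f"non-conforming request line accepted: {line!r}", ProtocolWarning)
    return tuple(parts)


# The doubled space is tolerated here, but it is reported.
print(parse_request_line("GET /  HTTP/1.1"))
```

Running a test suite with `strict=True`, or with warnings turned into errors, would surface a bad issuer during development instead of in the field.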

That doesn't work. Strict mode ends up getting turned off by default, or turned off at the earliest problem. After all, what's the point in being so strict? I've seen security bugs arise from this, nicely commented in source with a "// spec says x but no need to be so pedantic".

If everyone can be strict in what's sent, then the problem is solved. But since that won't happen, even on accident, the only solution is to be harsh on receiving input and hope things fail early in the dev cycle.

Also, text-based protocols are especially prone to this poor handling, A: because spec writers (like HTTP's) go moronically overboard, being all creative (line folding? comments in HTTP headers? FFS!) and B: because text is so easy, everyone just figures anything goes and pays less attention.

I'd say the fault with HTML/JS/CSS is that the implementation of the renderer (the browser) broke the stack by not being strict in what it emitted. Put another way, a badly formed page should render badly and/or issue errors. For historical reasons, browsers did not and do not. Hence, the reason the browsers are "broken".
It's quite a game theoretical problem. Make a strictly standard compliant browser and nobody will use it, since it won't display most of the websites. You have to render badly formed pages somehow if you want your browser to compete with other browsers, since they are doing the same.
Or, maybe instead of ballooning this thread with unending hairsplitting, we should recognize the principle as a heuristic that fails on non-representative or extreme cases...
This goes double if you're hosting on Heroku as you won't be able to correlate the changes Heroku makes with issues showing up for you. They're lucky that they hadn't pushed a change at the same time as Cowboy changed, or the debugging could have taken even longer.
I have to agree. I developed a proprietary embedded web server using a streaming HTTP parser. Complying with the HTTP parsing rules is a headache to say the least. Variable amounts of whitespace; 2 variants of line terminators (\r\n or \n) with the provision that the latter SHOULD be accepted by the server and line continuations make complying with the whole specification a real pain if you only have 100 bytes to parse pieces of your request.

Maybe for a server with massive resources (I am talking about megabytes of RAM compared to kilobytes I work with) being liberal in what you accept works, but not when you are on a budget.
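
To give a feel for the constraint, here is a small sketch of a streaming line splitter with a capped buffer that accepts both \r\n and \n. The class name `LineSplitter` and the 100-byte default are assumptions chosen to mirror the comment above. It is not the embedded server's actual code, and it ignores line continuations entirely.

```python
class LineSplitter:
    """Feed arbitrary chunks, get back complete lines; the buffer is capped."""

    def __init__(self, max_line: int = 100):
        self.max_line = max_line
        self._buf = bytearray()

    def feed(self, chunk: bytes):
        """Return the lines completed by this chunk, without terminators."""
        lines = []
        for byte in chunk:
            if byte == 0x0A:  # LF ends a line, with or without a preceding CR
                if self._buf.endswith(b"\r"):
                    del self._buf[-1]
                lines.append(bytes(self._buf))
                self._buf.clear()
            else:
                if len(self._buf) >= self.max_line:
                    raise ValueError("line exceeds buffer")
                self._buf.append(byte)
        return lines


splitter = LineSplitter()
print(splitter.feed(b"GET / HT"))              # nothing complete yet
print(splitter.feed(b"TP/1.1\r\nHost: a\n\n"))  # three lines, the last one empty
```

Even this toy version has to decide what a lone CR in the middle of a line means, which is exactly the kind of guess the spec's leniency forces on small parsers.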

Every appearance of SHOULD/MAY in a spec is just begging for bugs or incompatibility. We'd be better off if those words were banned. Spec writers would be less inclined (hopefully) to come up with all sorts of arbitrary behaviour that might happen and could be maybe handled.
Should and may are spec weasel words, in specs there must (hah!) be 'MUST' and 'MUST NOT'. Otherwise a spec is just a piece of rope with a pre-tied noose.
I think Postel's law should be read in the context of “when you cannot control the outside”. It's probably the least-bad option when you are forced to support unknown clients – see e.g. http://daniel.haxx.se/blog/2014/10/26/stricter-http-1-1-fram... for a very recent example – but that clearly doesn't apply in this case where they control both sides, or in many other cases where the number of clients is small and/or there's a solid communication mechanism to tell developers when they need to fix something.
Not "principal", "principle".

Not being critical, just pointing out a common mistake.

http://blog.oxforddictionaries.com/2011/08/principle-or-prin...

Principal: Main, most important

Principle: A rule, a system of belief

It's really interesting to see something like this down-voted. There's nothing pedantic about this. It's offered with nothing but respect. Perhaps the comment writer isn't a native speaker and this was an honest point of confusion. What is wrong with trying to be helpful?

It is a common mistake I see all the time here on HN (along with "your" vs. "you're" vs. "you are"). Why is that offensive to the point of deserving a down-vote? Please help me understand.

I didn't downvote.

People downvote corrections because they're usually noise. When someone makes a typo - and homophones are usually slips equivalent to typos - it's noise to point it out.

It's only noise if it has no value. A post such as mine would not have to appear too frequently for HN readers who might be having difficulties with such words to understand the problem and correct their writing. Not going after perfect English, few of us could approach that. But I see a few common patterns on HN all the time and nobody takes a second to say "hey buddy, just in case this wasn't clear to you, here's a helpful tip". Some of these are confusing to non-native speakers. When trying to be helpful is frowned-upon what are you left with?

This comment needed to be left alone. No down vote, perhaps an up-vote by the comment writer if s/he found it helpful and that's it.

One of the things that continues to disturb me the most about HN is how thin skinned the community seems to be. It is impossible to consistently offer a contrasting point of view here without down-vote attacks that make your point of view virtually disappear. Mind you, this particular post isn't that. It just reminds me that HN is really weird.

I get down voted a lot despite the fact that I am a successful entrepreneur since age 15 who has built several companies and continues to do so. My perspective, however, seems seldom welcome here (based on how often I am down-voted) because I don't tow the line of the 20-somethings that are the bulk of this audience. Instead of learning they choose to pound what they don't like out of existence. Weird.

> But I see a few common patterns on HN all the time

> A post such as mine would not have to appear too frequently

Which is it? All the time or not too frequently?

And while you might only make rare posts some people would point out every error and mistake and difference in style. People downvote your post to dissuade those other posts.

About your downvotes: I'm guessing they're for your incredible arrogance.

https://news.ycombinator.com/item?id=8443553

https://news.ycombinator.com/item?id=8440762

https://news.ycombinator.com/item?id=8440847

People see that level of arrogance as ugly. You might want to either change your posting style or stop complaining about the downvotes.

It's easy to make someone sound arrogant by (a) taking comments completely out of context and (b) not bothering to understand the frame of reference by at least asking the question.

HN only does well with well defined technical discussion. On everything else it has degraded to almost what happened to every USENET list in the past. USENET did not have any voting mechanism to make opposing views disappear. In that case those who wanted command of the list and felt ownership of it simply resorted to brutal flaming attacks. Some lists were really horrible places for anyone to say "I disagree".

HN can be like that, in a different way, if you are not a 20-something drinking from the same koolaid bowl. To the point of someone taking the time to take something out of context and then using it to call someone arrogant.

So, come to HN to agree with the herd or risk being called arrogant for presenting a different point of view. Brilliant.

> and nobody takes a second to say "hey buddy, just in case this wasn't clear to you, here's a helpful tip".

"hey" should be capitalized, it's at the beginning of a complete sentence. Also, a comma should be before the quotation.

> No down vote, perhaps an up-vote

down-vote and up-vote should at least be hyphenated consistently.

> One of the things that continues to disturb me the most about HN is how thin skinned the community seems to be. It is impossible to consistently offer a contrasting point of view here without down-vote attacks that make your point of view virtually disappear. Mind you, this particular post isn't that. It just reminds me that HN is really weird.

Likewise, down-vote should be hyphenated consistently with the previous use.

> I get down voted a lot despite the fact that I am a successful entrepreneur since age 15 who has built several companies and continues to do so. My perspective, however, seems seldom welcome here (based on how often I am down-voted) because I don't tow the line of the 20-somethings that are the bulk of this audience. Instead of learning they choose to pound what they don't like out of existence. Weird.

I don't get what kind of prank you're trying to pull here. Is it "down voted" or "down-voted"???

Just being helpful!!

Let me guess. You are 15 years old and just ditched school to screw around on HN. Right?

The difference, in case you did not understand it, is that my earlier comment was purely constructive in nature.

Your comment was a juvenile "I'll show him. I'll rip his writing apart and put him to shame".

One is an adult constructive post. The other is what I would not allow from my eight year old kid.

This is true but typos/misspellings/etc still reflect poorly on the author in most situations. After all, isn't the subject of this post a rather critical typo?
Misspellings don't tend to bother me as much these days because of, well, the iPad. Seriously, I hate typing on that thing with a vengeance. The problem is exacerbated in my case (and those of others) because I have to turn off auto-correction. Why?

Because I communicate in multiple languages and auto-correction/completion makes it very difficult. Switching the keyboard back and forth doesn't help either because it isn't uncommon to use more than one language within a single email or comment (in other words, mixing languages).

My little post was about pointing out a mistake in usage that isn't a spelling problem but rather using the wrong words altogether. I see this A LOT in technical websites, writing, job posts and resume's.

Look around and see how many job positions are asking for a "Principle Engineer" instead of a "Principal Engineer". The first is some kind of a moral cop position within the company, I guess, the second is an engineer in charge of a project or department.

But, yes, you are right. If I know that someone is a native English speaker and they have bad typos, misspellings and generally can't communicate well in written form it does reflect poorly. If they are not native it is a matter of their position. I would expect someone with a university degree to not confuse "principle" with "principal" or "your" with "you're" (and other such examples).

Exactly. Totally agreed on all counts :)

Btw that is why I use the somewhat pretentious sounding "Written on my tablet" or "Written on my phone" in email signatures on devices like that in hopes that people will re-attribute typos that might otherwise reflect poorly. But it is still a good idea to proofread written communication of any significant value...

> resume's
Doesn't it just demonstrate that you shouldn't switch from being liberal to being strict?

For it to hold up, you need to provide the further argument that you frequently need to switch from liberal to strict.

Or... here the problem is that the "be liberal in what you accept" design principle was not captured by the HTTP specification writer, who required a single SP character. It looks like a specification issue to me: using a syntax that is very prone to errors and is not even very visible (you can't easily spot double spaces in protocol traces when checking just with your eyes), and then being strict about it. Even changing the separator, if you want to be strict, already helps, as in "foo|bar|zap" compared to "foo bar zap".

Humans are strange: many will spot "foo||bar|zap" as an error, but not "foo bar zap" as an error as serious as the previous one.

Especially when you have display technologies like HTML that will actively compress whitespace... As happened in your last example.
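
One way to work around the visibility problem when reading traces is to render whitespace explicitly. A tiny sketch, assuming a hypothetical helper `show_whitespace`; the marker characters are an arbitrary choice.

```python
def show_whitespace(line: str) -> str:
    """Return the line with whitespace replaced by visible markers."""
    markers = {" ": "\u00b7", "\t": "\\t", "\r": "\\r", "\n": "\\n"}
    return "".join(markers.get(ch, ch) for ch in line)


# A request line with a doubled space before the version token.
print(show_whitespace("GET /  HTTP/1.1\r\n"))
```

With each space drawn as a middle dot, a doubled separator stands out about as clearly as "||" does.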
I agree http://www.win-vector.com/blog/2010/02/postels-law-not-sure-... . Correct code remains correct under various compositions and transformations (that may happen in the future). Code that is working only due to pity often does not have this property. Some Netflix style chaos-monkey that turns on and off strictness during testing would be cool.
In particular this philosophy is rejected in the Erlang community, where they prefer "crash if anything is not what you expect it to be"
This doesn't demonstrate a fallacy in "be liberal in what you accept" any more than closed source software demonstrates fallacies in Linus's Law.

The problem wasn't liberal acceptance, it was that liberal acceptance ended when Cowboy was added to the mix.

Strict acceptance would have shown the error earlier, but continued liberal acceptance would have allowed continued functionality.

You mean so long as everyone standardizes around a non-standard, rather than the actual specifications of the standard, it'll work?

I think I prefer just adhering to the standard in the first place.

That's what "strict in what you issue" means
If you are always going to blame the issuer for not being strict... what's the point of the accepter being liberal?

I agree that the issuer should always be strict; and if accepters were strict too, then buggy issuers would be detected immediately and never make it into production. Instead they make it into production, where they will sometimes work and sometimes not, depending on the accepter stack in use at the time and context and how the accepter stack chooses to interpret 'liberal'.

It locks you into the particular "liberal" implementation you started with, or at least significantly increases the risk of changing implementations.

"liberal" by definition here means _beyond the spec_, according to no spec. So different implementations may have different varieties or extents of 'liberalness,' and switching implementations will almost necessarily give you a different set of acceptable requests. If they were all the same, that'd be adhering to some spec, not being liberal in your acceptance of it.

"Liberal acceptance" may or may not have ended -- we don't really know if Cowboy accepts only exactly what is legal according to spec or not -- but the bounds of what is liberally accepted definitely changed. As it necessarily will any time you switch implementations, since 'liberal' is by definition not according to any spec.

The right thing, I think, is to "accept but warn". Like those web browsers that used to show a yellow exclamation mark in the status bar when something was off; web devs could check for this and fix it, but normal users were unaffected. More protocols should include a way to indicate "nonfatal errors".
I recall reading that Postel's law did not mean "accept input that flagrantly ignores the standard", but merely wherever the standard might be read differently, accept all conceivable interpretations of the standard. Unfortunately, I can't remember for sure where I read this, or how authoritative it was.

Postel's original formulation is not written in an essay, but an RFC, and does not elaborate on what he meant: https://tools.ietf.org/html/rfc761

Here's one discussion that suggests this interpretation, without precisely ascribing it to Postel: http://cacm.acm.org/magazines/2011/8/114933-the-robustness-p...

> the fallacy of the "be liberal in what you accept, strict in what you issue" principal

The market (players) (can) manipulate it to create an (perceived) competitive advantage.

It's also a source where "evil" in IT comes from.

SIP takes this to the next level. http://tools.ietf.org/html/rfc4475 is a spec for "torture tests", where the SIP authors revel in the hideously complex parsing rules they've come up with (which is basically HTTP parsing).

They even suggest that code should infer the meaning of messages. So I suppose you need some sort of AI to really handle things well.

Binary protocols would be a better choice. Or, a well-defined text format. JSON, XML, anything, really, would eliminate this class of bugs.

So SIP is like IRC?
I imagine IRC has the excuse of wanting the UI to be a simple line-based system, and thus in-band signalling is an unavoidable evil.

This same reasoning is why mail and other messages have the Header: Value format - you can compose in plain text. This is documented at least as far back as RFC 561, in 1973. Again, that's a good excuse: Users must compose by hand. Also, the RFC is just codifying what people were already doing.

HTTP, SIP, and others do not have these excuses. HTTP headers are essentially never written by hand, and the extremely few times they are, we don't need conveniences such as comments, line folding, lenient grammar, etc. (Proof: People write XML and JSON by hand far more often, without major ordeals.)

I think the core issue here is that we're directly manipulating strings instead of using DSLs and tooling based around grammars to build our responses (this has been a solved problem for more than 10 years!)

I'm a strong proponent of "do not manipulate strings". Having library writers be the only one doing that would greatly reduce the attack surface/bug potential.
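
As a sketch of what "do not manipulate strings" could mean for HTTP-style headers: callers hand names and values to a small builder, and only the builder knows about separators and line endings. The `Headers` class below is a hypothetical illustration, not an existing library API, and its validation is deliberately simpler than the real grammar.

```python
class Headers:
    """Collects header fields and serializes them in exactly one place."""

    def __init__(self):
        self._fields = []

    def add(self, name: str, value: str) -> None:
        # Reject anything that could end the line early or split the field.
        if not name or any(c in name for c in ":\r\n \t"):
            raise ValueError(f"invalid header name: {name!r}")
        if "\r" in value or "\n" in value:
            raise ValueError(f"invalid header value: {value!r}")
        self._fields.append((name, value))

    def serialize(self) -> bytes:
        # The only code that ever writes ': ' and CRLF.
        lines = [f"{name}: {value}\r\n" for name, value in self._fields]
        return ("".join(lines) + "\r\n").encode("latin-1")


headers = Headers()
headers.add("Content-Type", "text/plain")
print(headers.serialize())
```

An attempt to smuggle a second header through a value, such as `headers.add("X-Note", "a\r\nX-Client-IP: 10.0.0.1")`, fails at the call site instead of producing a malformed message.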