by mytailorisrich 1969 days ago
RFC 2616 defined header fields as OCTETs, and regarding this change RFC 7230 states:

> Non-US-ASCII content in header fields and the reason phrase has been obsoleted and made opaque (the TEXT rule was removed).

RFC 2616:

field-value = *( field-content | LWS )

field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>

Hence to me fields must be treated as opaque data for backward compatibility and robustness. If anything, existing applications that are compliant with RFC 2616 already do that, right? ;)


RFC 2616 defines OCTET as "<any 8-bit sequence of data>"; nothing is said about their value being opaque.

      TEXT           = <any OCTET except CTLs,
                        but including LWS>
The IETF rewrote the productions so they no longer use TEXT, but stopped short of banning the old behaviour.

So, for instance, where 2616 states:

      Reason-Phrase  = *<TEXT, excluding CR, LF>

7230 has:

      reason-phrase  = *( HTAB / SP / VCHAR / obs-text )

This makes sure that any application that conforms to 2616 still conforms to 7230, by not making it illegal (MUST) to parse obs-text. It is just something you SHOULD NOT do. They are simply making it so that any new header added is defined as SP / VCHAR only (quoted, possibly).
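
To make the difference concrete, here is a minimal Python sketch, assuming the usual ABNF core values (HTAB = 0x09, SP = 0x20, VCHAR = 0x21-0x7E) and obs-text = 0x80-0xFF. The name `classify_reason_phrase` is hypothetical and not taken from any library; it only shows how a 7230-style parser can accept obs-text while still telling it apart from plain ASCII.

```python
def classify_reason_phrase(raw: bytes) -> str:
    """Classify a reason-phrase against the RFC 7230 character classes.

    Returns "ascii" if only HTAB / SP / VCHAR occur, "obs-text" if octets
    0x80-0xFF occur as well, and "invalid" for anything else (CTLs other
    than HTAB, or DEL).
    """
    saw_obs_text = False
    for octet in raw:  # iterating over bytes yields ints in Python 3
        if octet == 0x09 or 0x20 <= octet <= 0x7E:
            continue  # HTAB, SP or VCHAR
        if octet >= 0x80:
            saw_obs_text = True  # obs-text: tolerated, but obsolete
            continue
        return "invalid"  # remaining CTLs and DEL (0x7F)
    return "obs-text" if saw_obs_text else "ascii"
```

A phrase such as `b"Not Found"` comes back as "ascii", while a Latin-1 encoded phrase containing 0xE9 comes back as "obs-text" rather than being rejected.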

Let's not argue semantics here. An arbitrary sequence of bytes is an opaque data type: it has no structure and no meaning, no assumption can be made about it, and it must simply be passed on as is because it can be anything.

That's why they write that it should be treated as opaque data. My point (and the point of the comment I was replying to) is that 'should' is perhaps too weak a word in this context, given the previous history. In any case, for robustness it is a must to treat it that way.
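
A rough sketch of what "opaque" means in practice, assuming a proxy-like component that receives each header line as raw bytes without the trailing CRLF. Both helper names are hypothetical. The value is never decoded into text, so whatever octets arrived are forwarded unchanged.

```python
from typing import Tuple


def split_header_line(line: bytes) -> Tuple[bytes, bytes]:
    """Split b"Name: value" into (name, value) without decoding the value."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("not a header line")
    # Strip only the optional whitespace (SP / HTAB) around the value.
    return name, value.strip(b" \t")


def forward_header_line(line: bytes) -> bytes:
    """Rebuild the header line for forwarding, leaving the value untouched."""
    name, value = split_header_line(line)
    return name + b": " + value
```

With this approach a value like `b"caf\xe9 \xff\xfe"` survives the round trip byte for byte, whereas decoding it as UTF-8 along the way would raise an error.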