Hacker News
by lurkerasdfh8 1867 days ago
fun fact. in the browser wars of 1993(?) i looked at the specs from netscape (mozilla's daddy, for the young folks) and microsoft (what w3c? ha!), and netscape released a browser spec that said "X must support up to Y", as in "url must be up to 1024 chars", "cookies must be up to 1mb", etc...

then microsoft released the IE4 (or 6?) web spec. It was literally a copy of netscape's but with "up to" replaced with "at least".

and from this day on, nobody knows about limits on the standard and everything was up in the air, just so sites could work on IE4 and be broken on netscape. Thanks microsoft!

I did some experiments to test the actual URL limit of IE. at the time it was around 4MB, but IE would still go over if you got creative with hostname levels and odd schemes.

-- quick edit:

keep in mind, in 1993, the money from giving out free browsers was on the servers: netscape server vs microsoft IIS (just like today, with free browsers, the money is in making it easier to access YOUR content -- e.g. default search, etc).

Making your browser crash the competitor's server meant that server was seen as lower quality. (Same thing with google deliberately crashing the performance of firefox on their services today[0])

The point of microsoft making this change was to force netscape to update their server, as they had arbitrarily increased the URL limit for all IE users.

[0] https://www.zdnet.com/article/former-mozilla-exec-google-has...


I was on a failed 12-person project, the kind where you end up owing millions to the govt. We had a problem with the search: we couldn’t get the performance.

I told my boss: “See, they wrote ‘The old search responded in 2 seconds. The new search must take at least the same time.’ We could almost add a sleep(2000) before starting the search.”

He went with it. They struck a deal to drop the search performance requirement by “mutual agreement.”

Ah yes. Checkbox Driven Development. AKA Monkey Paw Development, where you give exactly what was asked for; it remains surprisingly popular in the government and enterprise spaces.
I've worked in such places. The reason it is that way is that you will receive a broken description/specification/story of what you are supposed to implement. You have a choice to make when that happens: you either implement it as specified or you reject it because it is broken. The problem is that if you do reject it, it will take about 6 months to get back a specification that is broken in another way, and then you have to make the same choice...

So after a few iterations you just say "fuck it" and implement it as specified and then hope that you get a chance to fix it before shipping it (or that it doesn't become your headache later on...).

I've been there too, and I know. I'm not speaking to the choices devs make (rock and hard place, as you say), but the choices the org makes. For government work it is driven by Congress' procurement process, but for enterprise it is entirely down to upper leadership's perceived need to avoid risk. Which is ironically hilarious, since such approaches guarantee higher risk, in that they pretty much universally lead to late delivery of broken features.
Enterprise developer here. Exactly this. If you reject the spec, you won't get another one before the deadline that was committed before you got the spec you want to reject.
Instead of implementing the obvious intention of the spec and waiting to see if anyone complains?
The intention usually isn't obvious. The stakeholders have spent so much time documenting, in depth, the solution that they want, while spending no real time documenting or communicating the specific intricacies of the -problem-.

That's the issue in a nutshell; Checkbox Driven Development implies "if we just define the solution well enough upfront, we'll get what we need!" instead of "if we define the problem well enough, and let dev pitch us solutions, and iterate as we go, we'll get what we need". Which implies that the devs are not to be trusted to come up with a solution themselves.

To deviate from expectations and be congratulated, you have to, A. Be certain you're doing the right thing, and B. Have an audience that can recognize you did the right thing. Both of those require a level of trust that is just missing in this sort of org.

"Monkey Paw Development" was a new one for me :) Thank you! Great analogy. Reminds me of this :D https://www.youtube.com/watch?v=cDA3_5982h8
Yeah, I just coined it while making the post. :P Less a "also (currently) known as" and more of a "also (should be) known as". Certainly how I'll be referring to it in cynical moments from here on out.
Requirements are hard in dysfunctional organizations, or those with more stakeholders than capability and agility.
Requirements are hard upfront, period, to the point I'd say that any organization trying to set them upfront is dysfunctional, tautologically. Making all the decisions when you have the least amount of information is a Bad Idea.
Also, how long will it take to build and how much will it cost? :)

This is where Agile ought to improve things, but then SAFe came along and we were back to square one.

but there are some architecturally significant requirements, which have to be known upfront.
I both agree and disagree with you.

There are requirements that will affect architecture that, if they're guessed at and turn out to be wrong, will lead to massive refactoring and/or large amounts of effort being thrown out. 100%.

Where I disagree with most businesses is in the implicit belief they seem to have that it's "better for devs to be idle than for devs to work on code that will be thrown away". I'd rather take a guess and start work; best case we're right and are ahead; worst case we're wrong and have learned some useful lessons.

Which you'll note is the same dilemma as every other decision related to the project, with the only difference being the scope.

I wonder the extent to which those two specifications describe the exact same organizations.
Don't ever go to https://i.reddit.com/r/maliciouscompliance

You only have so many hours in your day.

Way back when, we used to remind people to be careful what they wished for in case they got it.

Do you happen to have the exact wording? As far as I can tell these mean the same thing.

1. "You must support URL length up to 100 characters" -> your browser must support URLs that are 100 characters or less (and may or may not support longer ones)

2. "Your supported URL length must be at least 100 characters" -> You must support URLs that are 100 characters or less (and may or may not support longer ones)

I don't know the exact wording, a gracious reading might be (as directed at people writing html)

1. Never use a URL longer than 100 characters

2. Go ahead and use a URL longer than 100 characters

As for the true intent? I've no clue.

Didn't you confuse 'at least' with 'at most'?
You are welcome to try to find it. I just failed :(
"At least" means more than or equal to. In other words, the 'least' it can be is 100 characters, with no upper bound.
Both sentences require browsers to support 100 characters.

Both sentences permit browsers to support 101 characters.

Exactly, they're functionally the same.
On second pass, you're right. They're the same.
Yes but "you must support up to 100 characters" also has no upper bound - supporting 200 characters also fits that requirement.
So if you were a programmer on a project and you were given a spec that says "up to 100", you would just make it unbounded, and for all intents and purposes completely ignore the spec?
"Must" and "Must Not" are keywords in formal spec. If it says "Must support up to 100" and doesn't say "Must Not support over 100" then I would consider the upper limit to be whatever limit is sane for the data type.
So you would pick an arbitrary upper limit based on your own notion of what is sane. Picking such a limit, you would still need to write the same error handling code for limits, but it would happen at maybe 200. And the next programmer who inherits your code looks at the spec and your code and has to guess "why 200"? And it becomes lore. Which is specifically worse than writing to the spec.
I can see where you're coming from, it does read like "MUST support up to 100 characters (and MAY support more if you choose)".

But honestly I think it's a bad practice to build the "may" part, because it's not explicit. The person who wrote the spec just as easily could have intended it to be "MUST support up to 100 (and may not go over 100)". So by not setting a bound you're gambling with your implementation being rejected, but setting a bound at 100 satisfies both possible "implied clauses" of the requirement and should not be rejected.
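To make the two readings in this subthread concrete, here is a minimal Python sketch. The names `SPEC_LIMIT`, `accepts_strict`, and `accepts_lenient`, and the 2000-character cap, are made up for illustration and do not come from any browser spec. Both functions satisfy "must support up to 100", since every URL of 100 characters or fewer is accepted; they only differ in what happens above the limit.

```python
SPEC_LIMIT = 100  # hypothetical limit taken from the example wording above


def accepts_strict(url: str) -> bool:
    """Reading A: support up to the limit and reject anything longer."""
    return len(url) <= SPEC_LIMIT


def accepts_lenient(url: str, sane_cap: int = 2000) -> bool:
    """Reading B: treat the limit as a floor and accept more,
    up to a cap picked by the implementer (2000 is an arbitrary assumption)."""
    return len(url) <= sane_cap


short_url = "http://example.com/" + "a" * 50   # 69 characters
long_url = "http://example.com/" + "a" * 150   # 169 characters

print(accepts_strict(short_url), accepts_lenient(short_url))
print(accepts_strict(long_url), accepts_lenient(long_url))
```

The strict version is the one that cannot be rejected under either interpretation of the requirement; the lenient one is where the "why 200?" lore question comes from.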

The supported URL length is at least 100 characters, not the URL length.
I spent some time looking at similar specs for more recent browsers, but wasn't able to find anything useful. This was for a proof-of-concept I made that stores entire web pages in URLs (creatively named "URL Pages") by base64-encoding them and putting them in the URL fragment (the part after the "#").

https://github.com/jstrieb/urlpages

The URLs this thing generates get pretty damn big sometimes, since I never got around to implementing compression. I can confirm that massive URLs from pages with inline images do work, but probably take some not-so-optimized code paths because they make my computer's fans spin up. Click at your own peril:

https://git.io/Jss7V
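For anyone curious what "base64-encode the page and put it in the fragment" looks like, here is a rough Python sketch of the general idea. It is not necessarily how URL Pages does it; `BASE_URL`, `page_to_url`, and `url_to_page` are hypothetical names, and the URL-safe base64 alphabet is an assumption on my part.

```python
import base64

# Hypothetical viewer page; not the real URL Pages address.
BASE_URL = "https://example.com/viewer"


def page_to_url(html: str) -> str:
    """Encode an HTML document into the fragment (the part after '#')."""
    encoded = base64.urlsafe_b64encode(html.encode("utf-8")).decode("ascii")
    return f"{BASE_URL}#{encoded}"


def url_to_page(url: str) -> str:
    """Recover the HTML document from the fragment."""
    fragment = url.split("#", 1)[1]
    return base64.urlsafe_b64decode(fragment).decode("utf-8")


html = "<h1>Hi there!</h1>"
url = page_to_url(html)
print(url)
print(url_to_page(url))
```

Since the fragment is never sent to the server, the length limits that matter here are the browser's, not the server's.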

I made a service to store arbitrary files as URLs that is similar. The hard part is files that are too large, I can handle files up to 5mb if you click on them all via local storage. Compression helps a lot as making them base64 increases the size quite a bit.

https://podje.li
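A small Python sketch of why compressing before base64 helps, assuming zlib as the compressor (the service above may well use something else; `pack` and `unpack` are hypothetical names). Base64 turns every 3 bytes into 4 characters, so the uncompressed form is about a third larger than the input.

```python
import base64
import zlib


def pack(data: bytes) -> str:
    """Compress first, then base64-encode so the result is URL-friendly."""
    return base64.urlsafe_b64encode(zlib.compress(data, 9)).decode("ascii")


def unpack(text: str) -> bytes:
    """Reverse of pack(): base64-decode, then decompress."""
    return zlib.decompress(base64.urlsafe_b64decode(text))


# Deliberately repetitive sample payload; real files compress less well.
payload = b"<p>hello world</p>" * 200
plain = base64.urlsafe_b64encode(payload).decode("ascii")
packed = pack(payload)

print(len(payload), len(plain), len(packed))
```

How much this saves depends entirely on the content: already-compressed files such as JPEGs will barely shrink.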

Could you make whole webpages just through urls? So that they would be completely portable? "Portable" being taken with a grain of salt ofc.
Yes, I did this for self-contained reports in maybe 2014. All images referenced (containing diagrams) were embedded as data URIs. Restrictions are AFAIK more picky now, though, so YMMV in 2021.
Cool project! It is kind of interesting that the link is the content, not sure it’s always useful, but for twitter like short form content perhaps?
Well, webpages themselves include links, so the embedding would need to be recursive.
not the entire content, but a hash id is pretty common https://en.wikipedia.org/wiki/Magnet_URI_scheme
I needed to send data over GET in 2012/2013 and built my own tiny LZW-alike compression to squeeze as much as possible into the 100kb which seemed to be the safe limit for non-ie browsers at the time
That's really interesting, I'd wondered if that was feasible! A few years ago I needed to send myself notes and URLs from a work computer to look at later, so I put it into the browser as https://< my website >.com/index.html?saveforlater=note%20to%20myself

When I got home I'd search the server logs for "saveforlater" and retrieve my note. Though it might have been faster to just write it on a slip of paper.
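Roughly what that trick looks like in Python, as a sketch: `note_url` and `note_from_request_path` are hypothetical helper names, and `example.com` stands in for the elided website.

```python
from urllib.parse import parse_qs, quote, urlsplit


def note_url(base: str, note: str) -> str:
    """Build a URL that carries the note in a 'saveforlater' query parameter."""
    return f"{base}?saveforlater={quote(note)}"


def note_from_request_path(path: str) -> str:
    """Pull the note back out of a request path as it might appear in a server log."""
    query = urlsplit(path).query
    return parse_qs(query)["saveforlater"][0]


url = note_url("https://example.com/index.html", "note to myself")
print(url)
print(note_from_request_path("/index.html?saveforlater=note%20to%20myself"))
```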

I did that too, but the limit in my language was around 1024 characters per URL, so I had to send the data in small packets.
Okay, but you can already store web pages in URLs. `data:text/html,<h1>Hi%20there!</h1>`

You can even base64 encode them, if you want to.
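A quick Python sketch of both forms, for reference. The helper names `data_uri_plain` and `data_uri_base64` are made up; the `data:` syntax itself is the standard one.

```python
import base64
from urllib.parse import quote


def data_uri_plain(html: str) -> str:
    """Percent-encode the markup and put it straight after the media type."""
    return "data:text/html," + quote(html, safe="<>/!")


def data_uri_base64(html: str) -> str:
    """Same page, but base64-encoded (note the ';base64' marker)."""
    encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return "data:text/html;base64," + encoded


html = "<h1>Hi there!</h1>"
print(data_uri_plain(html))
print(data_uri_base64(html))
```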

This is true, but linking to data URIs no longer works. Many browsers block them for "security reasons." In Firefox, a link to that page is not clickable for me:

https://git.io/JssFK

From a convenience standpoint, it's also far less likely that a URL with an http: scheme will be blocked by a random web application than one with a data: scheme. For example it makes sharing on social media sites and chat applications more feasible.

I don't know. There are a lot of problems, but to me "at least" sounds like a more helpful phrasing. Browsers run in such heterogeneous compute environments (even back then) that "up to" basically cripples you to the lowest common denominator of all platforms you target. "At least" makes it mostly the HW vendor's problem. Sure, MS was encountering this problem more because Windows ran on such a large range of HW, but think about what the world would look like today if you had browser vendors putting caps on desktop browsers based on what mobile could support.

EDIT: For some limits. For other limits "up to" wording may be more appropriate & is still in use (e.g. storage).

"At least" seems like a very good way of introducing a DoS vector.

I think that 1024 was probably too short as a limit, but I think that it does make sense to impose an arbitrary upper bound to reject malformed requests early.

I don't see what you mean by "the HW vendor's problem", I can assure you that any browser in existence is going to have an issue if you send a 1TB URL, while the NIC will have no issue transmitting it.
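As a sketch of what "reject early" could look like on the server side, in Python: the cap of 8192 and the names `MAX_URL_LENGTH` and `check_request_target` are arbitrary choices for illustration, not taken from any particular server. 414 is the standard HTTP status for "URI Too Long".

```python
MAX_URL_LENGTH = 8192  # arbitrary cap chosen for illustration


def check_request_target(target: str) -> int:
    """Return an HTTP status code for a request target.

    Over-long targets get 414 (URI Too Long) before any further
    parsing or allocation happens; everything else passes with 200.
    """
    if len(target) > MAX_URL_LENGTH:
        return 414
    return 200


print(check_request_target("/index.html"))
print(check_request_target("/" + "a" * MAX_URL_LENGTH))
```

In a real server the length check would have to happen while reading the request line, so that the over-long target is never fully buffered in the first place.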

And here's the answer to the sibling asking why it's a problem, since they mean exactly the same in practice :)

What it literally means and what people understand when reading it aren't the same thing. In this case, for people creating sites, "up to" leads immediately into the real meaning of the phrase, while "at least" strongly implies the opposite. But for people creating browsers, the implication is inverted.

The URL can be up to 1024 characters. The browser must support at least 1024 character URLs.

They're 2 sides of the same coin, but MS didn't actually rephrase the sentence properly. Their version would have every URL have at least 1024 characters in it. Any less than that, and the browser should reject the URL as invalid.

> Any less than that, and the browser should reject the URL as invalid.

lol. that would have been awesome. domain squatters would be running for the 1000-character names while crying about all the money they paid for the three-letter ones :)

It's a lot more likely that the commenter remembering something he read 28(!) years later didn't rephrase it properly.
Originalism vs strict constructionism vs loose constructionism.
I've written a number of front-end projects that used URLs for state, and yes, lol, IE was a hard no for such efforts.
Didn't IE have buffer overflow attacks due both to long headers and long URLs?

Talk about being hoist by your own petard...

I remember there was a site/tool that fit the whole web page content within its URL. And it was limited precisely by this "standard", where every browser behaves differently.