by platz 4158 days ago
> It turns out a number of fundamental aspects of The Web have changed since this time.

I wish they had gone into more detail about what exactly changed and why these browsers no longer work, unless it was the PPP protocol they mentioned that is causing the issue.

6 comments

Old browsers speak HTTP 1.0, which adhered to the old "one service per IP" rule. This doesn't work so well with cloud based stuff, or shared web hosting.

HTTP 1.1 added the `Host:` header, which let a single IP host many domains, which essentially created the web hosting industry.
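To make the `Host:` idea concrete, here is a minimal sketch of how a server on one IP might pick a site from that header. The site table, the page contents, and the function names are all made up for illustration; real servers do far more (ports, IPv6 literals, wildcards), and this sketch ignores all of that.

```python
# Hypothetical site table: one IP address, several hostnames.
SITES = {
    "example.com": "<h1>Example</h1>",
    "example.org": "<h1>Another site</h1>",
}
# What the server falls back to when no usable Host header arrives.
DEFAULT_PAGE = "<h1>Default site</h1>"


def parse_host(raw_request: str):
    """Return the lowercased Host header value (port stripped), or None."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:  # skip the request line
        if line == "":
            break  # a blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional ":port" suffix (IPv6 literals are not handled).
            return value.strip().lower().split(":")[0]
    return None


def pick_site(raw_request: str) -> str:
    """Choose a page by Host header; HTTP/1.0-style requests get the default."""
    return SITES.get(parse_host(raw_request), DEFAULT_PAGE)
```

A request without a `Host:` line, which is what the old browsers send, always lands on `DEFAULT_PAGE`, whichever hostname the user actually typed.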

You can fix this communication problem with an HTTP 1.1 to 1.0 translating proxy server: http://www.jwz.org/blog/2008/03/happy-run-some-old-web-brows...

Original author of the hack here. Several changes:

1) HTTP 1.0 vs 1.1 and lack of the Host: header, as you've all deduced

2) Additional encoding info tacked on after the Content-type causes parsing issues

3) Many sites now redirect to HTTPS by default. While Netscape 1.0 and Mosaic 1.0 both support HTTPS, they used SSLv1, and well, remember POODLE? :P
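For point 2, one possible workaround is to have a proxy trim the `Content-Type` value down to the bare media type before the old browser sees it. This is only a sketch of that idea, not what the hack actually does; the function name and the list-of-tuples header representation are assumptions.

```python
def simplify_headers(headers):
    """Return a copy of (name, value) header pairs with Content-Type parameters removed.

    'text/html; charset=utf-8' becomes 'text/html'; other headers pass through unchanged.
    """
    cleaned = []
    for name, value in headers:
        if name.lower() == "content-type":
            # Keep only what precedes the first ';' (the media type itself).
            value = value.split(";", 1)[0].strip()
        cleaned.append((name, value))
    return cleaned
```

Note this only hides the charset label; it does not re-encode the body, so non-ASCII text may still render badly.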

Newer versions of these browsers tend to work in native DOSBOX but present problems when running on the web. We're working on it.

Biggest problem right now is that the virtualized dial-up ISP is a bit flaky. For some reason PPP over TCP over Websockets via Trumpet Winsock isn't as rock solid as it should be :P

And once you get past all that, the JavaScript that's so pervasive these days won't run in those ancient browsers. My Mac Centris 610 has problems with all those tracking scripts everywhere.

There are a number of problems, which folks like jwz have dealt with, and which we'll look at. Basically, you don't get HTTPS (of course) and the response codes of web servers have shifted enough that older browsers don't know what to do. Proxy injection of needed helper material will help this a bit.

Bear in mind, though, it will ALWAYS be insecure and it will ALWAYS be more of a "try this out" than popping on your unicycle, starting up your bagpipes and riding down the Information Superhighway permanently.

JWZ ran into similar issues when he brought http://home.mcom.com back from the dead and then got it working in old versions of Netscape and Mosaic:

  In order to make these web sites work in the old browsers, it was necessary to
  host them specially. In this modern world, a single server will typically host 
  multiple web sites from a single IP address. This works because modern web 
  browsers send a "Host" header saying which site they're actually looking 
  for. Old web browsers didn't do that: if you wanted to host a dozen sites on
  a single server, that server had to have a dozen IP addresses, one for each
  site. So these sites have dedicated addresses!

  The web server also had to be configured to not send a "charset" parameter 
  on the "Content-Type" header, because the old browsers didn't know what 
  to make of that.
He also wanted to use these old browsers to surf the modern web, so he wrote a proxy that translates between HTTP/1.0 and HTTP/1.1. Maybe the textfiles.com guys can implement something similar.

See here for details: http://www.jwz.org/blog/2008/03/happy-run-some-old-web-brows...
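To give a feel for the request half of such a translating proxy, here is a rough sketch. It is not jwz's code. It assumes the old browser is configured to use a proxy and therefore sends an absolute URL in the request line (`GET http://host/path HTTP/1.0`); `upgrade_request` is a made-up name. The response side (chunked encoding, the charset parameter, redirects to HTTPS) is not handled here.

```python
from urllib.parse import urlsplit


def upgrade_request(raw: bytes) -> bytes:
    """Rewrite a proxy-style HTTP/1.0 request into an HTTP/1.1 request with a Host header."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    # Assumes a full request line: method, absolute URL, version.
    method, target, _old_version = lines[0].split(" ", 2)
    url = urlsplit(target)
    path = url.path or "/"
    if url.query:
        path += "?" + url.query
    # Drop headers that we are about to set ourselves.
    replaced = ("host", "connection", "proxy-connection")
    kept = [
        line for line in lines[1:]
        if line.split(":", 1)[0].strip().lower() not in replaced
    ]
    new_lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {url.netloc}",   # the piece HTTP/1.0 browsers never sent
        "Connection: close",     # avoid keep-alive, which the old client won't expect
    ] + kept
    return "\r\n".join(new_lines).encode("latin-1") + b"\r\n\r\n" + body
```

The hostname is recovered from the absolute URL, which is the only place an old browser tells anyone which site it wanted.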

The most important change is that HTTP 1.0 assumed that each IP had a unique hostname. The HTTP request just got sent to the IP that resulted from the DNS lookup with no indication of what hostname was fed to DNS.

HTTP 1.1 (first specified in RFC 2068 in 1997 and revised as RFC 2616 in 1999, so way postdating Netscape 1.0) has a Host header that's sent with every request, which allows the client to communicate to the server which hostname the client thinks it's talking to. That allows today's world, with multiple hostnames colocated on the same IP, to work properly. But if you leave out the Host header the server doesn't know which of those sites you meant and will do ... something.

For example, this explains the defcon.org failure in their screenshots. In fact, you can try this at home in your favorite command-line:

1) Type

    telnet www.defcon.org 80
You get output like:

    Trying 162.222.171.206...
    Connected to www.defcon.org.
    Escape character is '^]'.
2) Type:

    GET / HTTP/1.0
and hit enter twice. See what it responds with.

3) Repeat, but in step 2 type (or paste, since it closes the connection quickly):

    GET / HTTP/1.1
    Host: www.defcon.org
followed by two newlines. Observe the difference.
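If you'd rather script the experiment than type into telnet, something like the following should do roughly the same thing. It's a sketch: the function names are mine, it only reads the first line of the response, and what a given server answers will vary.

```python
import socket


def build_request(host: str, version: str) -> bytes:
    """Build the bytes typed in steps 2 and 3: bare HTTP/1.0, or HTTP/1.1 with Host."""
    lines = [f"GET / HTTP/{version}"]
    if version == "1.1":
        lines += [f"Host: {host}", "Connection: close"]
    # The trailing blank line is the "hit enter twice" part.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def status_line(host: str, version: str, port: int = 80, timeout: float = 10.0) -> str:
    """Send the request over a plain TCP socket and return the first response line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, version))
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break  # server closed the connection
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("latin-1")


# Example (needs network access):
#   for version in ("1.0", "1.1"):
#       print(version, "->", status_line("www.defcon.org", version))
```

The only difference between the two requests is the `Host:` line (plus `Connection: close` so the 1.1 connection doesn't linger), which is the whole point of the comparison.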

Same thing for www.whitehouse.gov (which is in fact a cname for www.whitehouse.gov.edgesuite.net which is a cname for www.eop-edge-lb.akadns.net which is a cname for a1128.dsch.akamai.net which you can bet needs the Host header to know which site you were accessing!).

And same thing for news.ycombinator.com, which is a cname for news.ycombinator.com.cdn.cloudflare.net which then resolves to an IP but doing a reverse DNS lookup on that IP says it's got at least "ns1.cloudflare.com" and "dns.cloudflare.com" as domain names that resolve to it... so it's clearly going to be looking at the Host header to see what you actually think you're talking to. You can even see this if you compare http://news.ycombinator.com.cdn.cloudflare.net/ to https://news.ycombinator.com/ even though one is a cname for the other.

From what I can tell, Netscape Navigator supported HTTP 1.1 from version 2.0 and up AND Navigator version 4.x ran on Windows 3.1. I wonder if there was a problem getting that version to run on their emulator.
There has been no problem - we were just focused on getting the oldest, earliest browsers we can get running. It's not about getting a fully standards compliant browser made 20 minutes ago to run in a browser made 10 minutes ago. It's about providing easy, hopefully pain-free access to web history.