by ivolimmen 2373 days ago
It's worse. In 1.4 they added URI because URL is very flawed. You should never use URL. The equals method actually does a DNS lookup and compares the IP addresses. This means that two URLs that differ only in their hostname compare as equal whenever both hostnames resolve to the same IP address. A lot of URLs you compare will give you true, as there are a lot of sites running on the same machine with the same IP address.
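For illustration, here is a minimal sketch of doing the comparison with java.net.URI instead, which compares the parsed components as text and does no name resolution. The class and method names (UriEqualsSketch, sameAddress) are made up for this example and are not from any library.

```java
import java.net.URI;

public class UriEqualsSketch {
    // Hypothetical helper: parse both strings as URIs and compare them.
    // URI.equals works on the components only and never touches the network.
    static boolean sameAddress(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        // Identical strings are equal.
        System.out.println(sameAddress("http://example.com/a", "http://example.com/a")); // true
        // Different hosts are never equal, no matter where they are hosted.
        System.out.println(sameAddress("http://example.com/a", "http://example.org/a")); // false
        // Hosts are compared without regard to case.
        System.out.println(sameAddress("http://EXAMPLE.com/a", "http://example.com/a")); // true
    }
}
```

If you are handed a URL object by some API, calling toURI() on it before comparing should give you the same text-based behaviour.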
3 comments

Had this happen in production in a third-party library we used many moons ago. Sysop came and asked us why we did a six-figure number of DNS lookups over a short period of time (24h? Less? Don't remember).

Would probably have gone unnoticed at most AWS/GCP/Azure shops today.

Worse was when I had somebody adding the IP address as an Inet4Address on every message passed between machines in a production environment that explicitly didn't have DNS (banks occasionally have very odd ideas about securing subnets). Every single message was doing a reverse DNS lookup and then timing out. And there were a _lot_ of messages.
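I don't know which call triggered the reverse lookup in that system, so the following is only a sketch of one way to carry an address around without asking a name service: build the InetAddress from raw bytes and read it back with getHostAddress(), which returns the numeric form. The names AddressLabelSketch and label are invented for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressLabelSketch {
    // Hypothetical helper: turn 4 raw bytes into dotted-quad text.
    // getByAddress(byte[]) does not consult any name service, and
    // getHostAddress() returns the numeric address rather than a hostname.
    static String label(byte[] rawIpv4) {
        try {
            InetAddress addr = InetAddress.getByAddress(rawIpv4);
            return addr.getHostAddress();
        } catch (UnknownHostException e) {
            // Thrown only when the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(label(new byte[] {10, 0, 0, 1})); // 10.0.0.1
    }
}
```

By contrast, getHostName() on such an address is the call that goes looking for a name, so that is the one to avoid on a subnet without DNS.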
Wait, seriously? Did whoever designed that have any idea what URLs actually were? Even without DNS, comparing the hosts is no way to compare URLs. I can't think of a single example of two 'equal' URLs whose strings don't match (up until the parameters, at least)...
Yes - bear in mind the URL class has been around since 1995, when the landscape was somewhat different!

At that point, URLs resolving to the same IP were considered equal, even if the host names were different. Even now, there’s no real difference between say “http://example.com” and “http://example.com:80”; it would have been reasonable to consider these equal.
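As a rough sketch of the default-port case: java.net.URI treats the two forms as different, since one has no port and the other has port 80, so you would have to normalize them yourself. The helper names below (DefaultPortSketch, defaultPort, dropDefaultPort, equalIgnoringDefaultPort) are hypothetical, and the port table only covers http and https as an assumption for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DefaultPortSketch {
    // Assumed default ports; extend for other schemes as needed.
    static int defaultPort(String scheme) {
        if ("http".equalsIgnoreCase(scheme)) return 80;
        if ("https".equalsIgnoreCase(scheme)) return 443;
        return -1;
    }

    // Rebuild the URI without its port when the port is the scheme's default.
    static URI dropDefaultPort(URI u) {
        if (u.getPort() == -1 || u.getPort() != defaultPort(u.getScheme())) {
            return u;
        }
        try {
            return new URI(u.getScheme(), u.getUserInfo(), u.getHost(), -1,
                    u.getPath(), u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Text-based comparison after normalizing default ports; no DNS involved.
    static boolean equalIgnoringDefaultPort(String a, String b) {
        return dropDefaultPort(URI.create(a)).equals(dropDefaultPort(URI.create(b)));
    }

    public static void main(String[] args) {
        System.out.println(equalIgnoringDefaultPort("http://example.com", "http://example.com:80"));   // true
        System.out.println(equalIgnoringDefaultPort("http://example.com", "http://example.com:8080")); // false
    }
}
```

Rebuilding through the multi-argument constructor re-encodes the components, so URIs with unusual percent-escapes may not round-trip exactly; treat this as a starting point only.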

Those are certainly not equivalent, as the 1st would not even resolve in the majority of environments. I'm assuming the 2nd host is a reverse proxy for the 1st? Unless that fact is part of your design assumptions (which it certainly can't be for a general URL class in a standard library), those should be considered different.
If the two hostnames resolve to the same IP address, they are equivalent. This could happen if initech.com was a default domain. This is a very standard situation in internal networks, which I am surprised you are not familiar with.
I am quite familiar with that kind of setup and it's even further from equivalent than what I assumed you were referring to. Shared web hosting is the common example, but some other protocols are capable of distinguishing between hostnames.

google.com and maps.google.com resolve to the same address but are not equivalent. Even if your example and many others are equivalent, there are many that aren't. A general library should not make assumptions like that, unless they hold 100% of the time.

You asked for a single example of two 'equal' URLs whose strings don't match up, and now you've got one.
Wow. And of course .equals is a synchronous method, so the whole thread blocks on a network operation whenever you do that.

Helluva way to thread starve your application. :(
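The same trap applies to hash-based collections, since URL.hashCode resolves the host as well, so a HashSet or HashMap keyed on URL can block on the network. A minimal sketch of deduplicating links with URI keys instead; UriSetSketch and distinct are names made up for the example.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UriSetSketch {
    // Hypothetical helper: collect links into a set keyed on URI.
    // URI.hashCode and URI.equals work on the parsed text only, so adding
    // to the set never blocks the calling thread on a DNS lookup.
    static Set<URI> distinct(List<String> links) {
        Set<URI> seen = new LinkedHashSet<>();
        for (String link : links) {
            seen.add(URI.create(link));
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList(
                "http://example.com/a",
                "http://example.com/a",
                "http://example.org/a");
        System.out.println(distinct(links).size()); // 2
    }
}
```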

https://docs.oracle.com/javase/7/docs/api/java/net/URL.html#...