by sgentle 3172 days ago
The root of the issue here is that URLs are trying to be human-meaningful and machine-meaningful at the same time, but those requirements are fundamentally incompatible.

Humans work well with ambiguity and context. When your coworker says "Bob's birthday is this weekend", you know she means her husband Bob, not Bob from accounting who nobody likes. And you even prefer that system to having an unambiguous human identifier, even a friendly one like "Bob-4592-daring-weasel-horseradish".

Machines, on the other hand, hate ambiguity and context. Every bit of context is an extra bit of state that has to be stored somewhere, and now all your results are actually statistical guesses - how inelegant!

In the early days of computing, there was no separation between the internals of the machine and its interface. If you worked on a computer, you were as much the mechanic as the driver. We got used to usernames, filenames, and hostnames because they were a decent compromise; they were meaningful enough to humans, and unambiguous enough for machines, so we could use them as a kind of human-computer pidgin.

But we don't need them anymore, and they were never really very good at either job anyway. Google's (probably accidental) discovery was that we were using the web wrong. Everyone was building web directories and portals because they thought that URLs weren't discoverable, but the real problem was that they weren't usable. Search was the first human interface to the web.

So Google's going to kill the URL, Facebook's going to kill the username, and someone (apparently not Microsoft) is going to kill the filename. There'll be much wailing and gnashing of teeth from the old guard while it happens, but someday our grandchildren will grow up never having to memorise an arbitrary sequence of characters for a computer, and I think that's a future to look forward to.

11 comments

> Machines, on the other hand, hate ambiguity and context.

When I ask my car to call my wife using only her first name, it suggests a list of three people, and I'm not even sure how they got into my contacts list. Siri, on the other hand, gets it right every time with the exact same request. I wouldn't say my car hates ambiguity; rather, the programmers failed to bridge the gap in human/machine interaction and meet the person halfway. ("If you want to talk to a computer, you have to think like one.")

I'd say it's programmers or deadlines that mean that the extra work of accounting for ambiguous data gets skipped. It doesn't take a neural net to look at the recently called list for the most frequent or even most recently dialed [wife's first name].
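As a rough illustration of that heuristic, here is a minimal Python sketch. It assumes a hypothetical call log stored as a list of dialed names, oldest first; the function name `resolve_contact` and the sample data are made up for the example and are not taken from any real car or phone system.

```python
from collections import Counter


def resolve_contact(spoken_name, contacts, call_log):
    """Pick the contact the user most likely means by an ambiguous first name.

    contacts: list of full names in the address book (hypothetical data).
    call_log: list of full names that were dialed, oldest first (hypothetical).
    """
    # Every contact whose first name matches is a candidate.
    candidates = [c for c in contacts if c.split()[0].lower() == spoken_name.lower()]
    if not candidates:
        return None

    # Count how often each candidate was dialed.
    counts = Counter(name for name in call_log if name in candidates)

    def rank(name):
        # Most frequently dialed wins; ties go to the most recently dialed.
        last_index = max((i for i, n in enumerate(call_log) if n == name), default=-1)
        return (counts[name], last_index)

    return max(candidates, key=rank)


contacts = ["Anna Keller", "Anna Ortiz", "Anna Zhou", "Bob Lingendorfer"]
call_log = ["Anna Ortiz", "Bob Lingendorfer", "Anna Keller", "Anna Ortiz"]
print(resolve_contact("anna", contacts, call_log))
```

Here the two calls to "Anna Ortiz" outrank the single call to "Anna Keller", so no list of three "close enough" hits is needed.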

One irony of your "Bob" example is that sometimes using someone's last name actually adds ambiguity: "It's Bob Lingendorfer's birthday this weekend!" ... "Who is Bob Lingendorfer? ... Ohhh, you mean your husband!".

Maybe it's not irony, it's just that people read a lot into data and might assume that all of it is relevant to the task at hand. My car kind of does the opposite and lazily stops at the first three "close enough" hits on my wife's name.

One thing that worries me about computers working with all that contextual information is that they then need to know all that information.

And since computing is so centralized these days, this means that whatever company made the software needs to know that context about you too.

There's something to be said for computers staying dumb. I'm okay with my co-workers knowing my social graph well enough to recognize my spouse's first name by context. I'm not okay with faceless corporations or governments having that same information.

Very good point. Can't disagree with you. I am ok, however, with a contacts system letting me specify a single-name nickname that it prioritizes in matching / searches.

And I'm probably also ok with the computer knowing as much about me as my cellular provider does, since all that is probably hoovered up already. Why should Siri be dumber than the feds?

To take this further afield, it would be interesting to interact with a "smart" assistant that only learned from info likely to be accessible to third party law enforcement and/or aggregator, as a demonstration of the risk & power.

that’s funny. i have the exact inverse problem. when i ask siri to call my wife ( by her first name only ) it gives me a list of two to pick from, whereas my car does the opposite and calls my wife.

darn computers!

Why don't either of you just tell Siri who your wife is? You can say "My wife is" and her name, it will verify that it found the right one, and after that you can just say "call my wife", "sms my wife", etc. You can do the same thing with your boss ("call my boss") and various other tags.
Usernames and filenames are not just compromises, nor arbitrary sequences of characters.

Usernames reflect a fundamental human desire to create an alter ego free from the burden of their legal name and the socioeconomic context they're in. If Samuel Clemens were a blogger, he would write under the username @marktwain. Alonso Quixano might call himself @donquixote69. Anakin Skywalker will want to be known (and feared) as @darth_vader, not because his real name is unusable, but because he prefers to be called Darth Vader.

People have had titles and pseudonyms for ages. Usernames are a continuation of this tradition, not merely an invention of the 20th century. The global uniqueness requirement is of course rather silly, but enforcing a real-name policy on everyone is just as silly. If our grandchildren have no concept of usernames/handles/whatever, it might be more a sign of great oppression and loss of privacy than of technological progress.

Ditto for filenames. We programmers have a habit of using weird filenames that really do look like arbitrary sequences of characters, but most of the rest of the world just uses human-readable filenames like "Financial report 3Q 2017". Change a few numbers inside, and it's still "Financial report 3Q 2017", content-addressing be damned. The document might not be stored as a physical file in the future, but then again, have files ever been physical? Filenames are just labels that we stick on a logical chunk of information. Implementation details can differ, but the concept itself is not going anywhere as long as humans like to put stable labels on mutable things. (This, unfortunately, tends to escape notice when your concept art for a filename-less system only contains a handful of photographs with pretty thumbnails.)

> Filenames are just labels that we stick on a logical chunk of information. Implementation details can differ, but the concept itself is not going anywhere as long as humans like to put stable labels on mutable things.

This is the point that I think is completely lost on the author of the article, probably because of a focus on API design. It's a good thing that we can replace that dog-eared copy of Moby Dick with a shiny new one when the time comes, and our users don't need to change their URLs.

APIs are intended to be used primarily by machines, so it's fine for the URL structure to preference the predictable uniqueness of ids. However, for most URLs intended for use by humans, the forces are different.

A human-readable URL is not a pointer, it's a symlink.

All good points, however one thing missing is that humans also want to be able to refer specifically to "that dog-eared copy of Moby Dick". Facts like "that dog-eared copy of Moby Dick is missing page 34" or "that dog-eared copy of Moby Dick is actually a super valuable early edition" should not change their referent when the library gets a shiny new copy of the book.

And that's exactly how I read the article: both mutable and immutable references are nice to have for different use cases.
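To make the pointer/symlink distinction concrete, here is a small Python sketch. The catalog, the copy ids, and the `/books/moby-dick` path are all hypothetical; the point is only that a human-readable name is resolved through a table the publisher can repoint, while an id keeps referring to the same physical copy.

```python
# Immutable references: an id always names the same physical copy.
copies = {
    "copy-0017": {"title": "Moby Dick", "note": "dog-eared, missing page 34"},
    "copy-0452": {"title": "Moby Dick", "note": "shiny new copy"},
}

# Mutable references: a human-readable path that behaves like a symlink.
links = {"/books/moby-dick": "copy-0017"}


def resolve(path):
    """Follow a human-readable path to whatever copy it currently points at."""
    return copies[links[path]]


def relink(path, copy_id):
    """Repoint a human-readable path; existing ids are left untouched."""
    if copy_id not in copies:
        raise KeyError(copy_id)
    links[path] = copy_id


print(resolve("/books/moby-dick")["note"])   # the dog-eared copy
relink("/books/moby-dick", "copy-0452")
print(resolve("/books/moby-dick")["note"])   # now the new copy
print(copies["copy-0017"]["note"])           # the old id still means the old copy
```

Facts about the dog-eared copy stay attached to its id, while the readable path follows whatever the library currently wants people to find.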

Yes, that's true abstractly. However, A, those sorts of references are much less common in web pages than in physical descriptions (at least in my estimation) (though they're very common in APIs), and B, those repointable references are not the same as a search - I want to uniquely refer to the current value of this pointer, while allowing the publisher to relink as appropriate.

The article reads as universal URL design advice, but I'd argue the points only really apply to APIs.

"Humans work well with ambiguity and context"

Not so sure about the "well" part there. I've encountered people who love to make guesses about the context (and others who actually wish you'd do the same). That coupled with ambiguity creates disasters varying from ordering the wrong lunch to broken relationships.

I'd rather have humans take less pride in being ambiguous and make attempts to be as precise as possible.

There's a video of Dijkstra talking about Mozart and Beethoven as opposite poles -- the former wrote everything neat and right, the latter kept revising by gluing bits of paper in his scores. In order to further mark his position at some point Dijkstra stopped typesetting his papers at all and began to write them right the first time.

So there's this whole ambiguity aversion spectrum. Maybe it correlates to the autism spectrum, maybe it doesn't. It's arguably much more important. Even in mathematics you have Poincaré, a demigod among men that kept publishing papers with significant mistakes, while in the social sciences you have people like Niklas Luhmann and Bruno Latour who approach their subjects with utmost precision and dedication to detail.

I'm a more ambiguous, big-picture-even-in-small-problems thinker; and I thrive with more detail-oriented coworkers that walk me through the trees as I walk them through the forest. This has a lot to do with me being able to think in very ambiguous terms and narrow down as needed to interact or provide for the needs of others. Left to my own devices I come up with extremely abstract philosophical theories that are not useful at all! Conversely left to their own devices precision people become paperclip optimizers.

I want to speculate further into "edgy" territory: maybe the whole gender divide that seems to come up in psychometrics and the labor market and so on is really an ambiguity/precision divide. The evolution of technology has actually increased the value of ambiguity, as computers do much of the precision work for us -- maybe making tech "woman-friendly" is rather about identifying those big-picture/detail-oriented complementarities.

Apple are the ones to kill the file name, with iOS.
Yes. And yet they are still with us, and are still with ordinary users too.
Not in an iOS workflow.
> The root of the issue here is that URLs are trying to be human-meaningful and machine-meaningful at the same time, but those requirements are fundamentally incompatible.

The TLDR of TFA is that an API can support both human-meaningful and machine-meaningful URLs.

Not really. TFA doesn't talk about what happens when the search fails. sgentle is talking about using human-meaningful urls as identifiers, which doesn't work when the search fails.
> So Google's going to kill the URL, Facebook's going to kill the username, and someone (apparently not Microsoft) is going to kill the filename.

Not if we kill them first!

It's kinda funny you mention that Google will kill the url.

For at least the past decade, advertisement in Japan has been showing people which search term to enter to find the website instead of a url.

I can't tell if that's a worse idea than QR codes of URLs.
You have to remember that broad support for Japanese characters in URLs is fairly recent, and hasn't really caught on.

So advertisers want people to type something in their native script in order to get to the product website. So while an English advertisement campaign might tell people to go directly to johnnysmattresses.com, a Japanese campaign couldn't do this, and would instead ask people to search for ジョニーの布団.

Hadn't heard of that. Looks singularly awful.

Like URLs, but proprietary, exclusionary, and riddled with privacy and security issues.

kind of like, 'aol keyword'?
I don’t disagree with your point, but aren’t there languages that do take into account the context in which functions are called in addition to the parameters and namespaces?

Both R and Perl seem like ones where it wouldn’t be extremely strange for the function to also look back to the context of the calling function. Then it could find out if the two parties had an affinity for this person, and whether it was a conversation about something like figuring out an excuse to miss a party or one like finding a gift, in order to work out which Bob.

this was really beautifully written.
> "Bob-4592-daring-weasel-horseradish"

You could easily have a bijective encoding at a frontend proxy that translates between the above and e.g.

> "4592-13f7-de41-203a"

(i.e. discards the descriptive part of the slug, and then reverses the unique words back into their index-positions in the same static 64k-word dictionary used for generation, resulting in a regular UUID.)
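A minimal Python sketch of that translation, under loudly stated assumptions: the five-word `WORDS` list stands in for the static 64k-word dictionary, so the hex groups it produces are list positions (0000, 0001, ...) rather than the values in the example above, and the function names are made up. Note that only the id part round-trips; the descriptive label is discarded on the way in and has to be supplied again on the way out.

```python
# Stand-in for the static 64k-word dictionary; a real one would have 65536 entries.
WORDS = ["daring", "weasel", "horseradish", "placid", "otter"]
INDEX = {word: i for i, word in enumerate(WORDS)}


def slug_to_id(slug):
    """'Bob-4592-daring-weasel-horseradish' -> '4592-0000-0001-0002'.

    Assumes the descriptive label itself contains no hyphen.
    """
    _label, number, *words = slug.split("-")
    # Each word becomes its dictionary position as a 4-digit hex group.
    groups = [format(INDEX[w], "04x") for w in words]
    return "-".join([number, *groups])


def id_to_slug(identifier, label):
    """Inverse of slug_to_id, given the descriptive label to put in front."""
    number, *groups = identifier.split("-")
    words = [WORDS[int(g, 16)] for g in groups]
    return "-".join([label, number, *words])


machine_id = slug_to_id("Bob-4592-daring-weasel-horseradish")
print(machine_id)
print(id_to_slug(machine_id, "Bob"))
```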

That doesn't solve the issue that neither of those is human-friendly, which was the point as I understood it.
Except all the actors you mention create centralized, closed, for-profit systems to replace standard, widely compatible, simple primitives.

So I'm not sure it's a win.