"Unknown or expired link." - Why? (google.com)
145 points by bradharper 5359 days ago
Why would such a valuable site continue to allow itself to be plagued with this type of repulsive usability issue?
22 comments

It's an artefact of the way in which news.arc (actually srv.arc) uses functions for links. Here's the key code:

  (= dead-msg* "\nUnknown or expired link.")

  (defop-raw x (str req)
    (w/stdout str
      (aif (fns* (sym (arg req "fnid")))
           (it req)
           (pr dead-msg*))))
If the fnid (function ID) isn't in the fns* list then you get the dead message.

  (def flink (f)
    (string fnurl* "?fnid=" (fnid (fn (req) (prn) (f req)))))
In many places in the code closures are used to handle requests (see the flink code there). If the fns* list is cleared (say news.arc is restarted or harvest-fnids kills them) then you'll get the message.

The use of closures in this manner means that the code needed to handle say a form submission is really compact and set up when the form itself is generated.
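A rough Python sketch of the same idea may help readers who don't read Arc. It is not a translation of srv.arc: the names `fns`, `new_fnid`, `flink`, `handle_x` and `more_link` are made up for illustration, and the handlers take no request argument to keep it short.

```python
import secrets

DEAD_MSG = "Unknown or expired link."
fns = {}  # fnid -> closure; lives only in server memory


def new_fnid(handler):
    """Store a handler closure and return the id that goes into the URL."""
    fnid = secrets.token_urlsafe(8)
    fns[fnid] = handler
    return fnid


def flink(handler, fnurl="/x"):
    """Build a link whose target is a closure held on the server."""
    return f"{fnurl}?fnid={new_fnid(handler)}"


def handle_x(fnid):
    """Dispatch a request for /x?fnid=...; an unknown id yields the dead message."""
    handler = fns.get(fnid)
    return handler() if handler is not None else DEAD_MSG


def more_link(page):
    # The closure captures `page`, so the URL only carries an opaque id.
    return flink(lambda: f"stories for page {page}")
```

Once `fns` is cleared (a restart or a harvest), every link generated earlier points at nothing, which is exactly the failure people are seeing.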

I would be interested to know how much state is in those closures. If it is less than 200 or so bytes, it would not be impractical to encode it (b64) in the url for the next page (rather than a reference to the state).
You don't want to execute code from URLs. (Yes, you can use cryptography to "sign" URLs you create. Don't try that at home unless you know the difference between MACs and hashes, and how to avoid timing attacks.)
Whoa, whoa. Putting the state from the closure in the url is not the same as putting the closure in the url.
But you are still likely to want to sign the state so you can tell if it has been corrupted (or deliberately doctored) and reject it if so.
Sign or sanity check, whichever you prefer. Personally in a simple interface like this site I'd rather sanity check the few simple parameters.
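For anyone wondering what "sign the state" could look like in practice, here is a minimal Python sketch, assuming the state is plain data (not code) and that the server holds a private key. `SECRET_KEY`, `encode_state` and `decode_state` are hypothetical names, not anything from news.arc.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"hypothetical-server-secret"  # assumption: known only to the server


def encode_state(state):
    """Serialize plain data (never code) and append an HMAC-SHA256 tag."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{tag}"


def decode_state(token):
    """Return the state dict, or None if the token was corrupted or doctored."""
    payload, sep, tag = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest is designed so its timing does not reveal where the
    # two values differ, which is the timing-attack concern raised above.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

The tag is a MAC (keyed), not a bare hash, so a client cannot recompute it after editing the payload. Sanity-checking the decoded values is still worthwhile on top of this.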
I wouldn't call it a closure unless it has something like a "next-instruction" field, which is probably enough to get control - and certainly enough to do nasty things (think 'debug-mode, 'restart-server, 'shutdown.)
The PLT-Scheme (aka Racket) folks do this. It's kind of awesome, and I used it for a side project. It works pretty well except some browsers have a max URL length, so you can only safely shove so much data to the client side, and then you're stuck with old-fashioned IDs again. Still lots of fun to play with :)

In a scary way

Racket has support for serialization of closures and continuations, which its web server makes extensive use of. It seems like it'd be possible to expose the relevant functions to make that happen, though I have no idea how large the closures actually are once serialized.
I believe the Racket web server works similarly to the Arc server. At least it did last I checked (back when it was still PLT Scheme).
> how much state is in those closures [...] to encode it (b64) in the url

I would be interested to know if this is even possible to determine from a macro which variables are bound in the closure.

My (unofficial) documentation of the Arc web server may help understand this: http://files.arcfn.com/doc/srv.html

In particular, harvest-fnids has a maximum number of allowed fnids. If there are too many, it purges any fnids that are older than their expiration time, as well as the oldest 10%.

Thus, the more fnids created (i.e. the more users), the sooner fnids will get harvested and you'll get the expired error.
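The purge policy described above can be sketched in Python roughly as follows. This is an illustration of the description, not the actual srv.arc code: the entry layout, the default limit, and taking "10%" to mean 10% of the limit are all assumptions.

```python
def harvest_fnids(entries, now, max_entries=20000):
    """Purge fnids once the table is over its limit.

    entries maps fnid -> (created_at, lifetime_seconds or None).
    The dict is modified in place; the removed ids are returned.
    """
    if len(entries) <= max_entries:
        return []  # under the limit: nothing is purged, even if expired

    # Step 1: drop every entry that has outlived its expiration time.
    removed = [
        fnid
        for fnid, (created, lifetime) in entries.items()
        if lifetime is not None and now - created > lifetime
    ]
    for fnid in removed:
        del entries[fnid]

    # Step 2: drop the oldest entries, 10% of the limit (an assumption).
    oldest_first = sorted(entries, key=lambda fnid: entries[fnid][0])
    for fnid in oldest_first[: max_entries // 10]:
        del entries[fnid]
        removed.append(fnid)
    return removed
```

With a fixed limit, more traffic means the table fills faster, so step 2 reaches links that are only minutes old.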

A website written in Lisp? That is awesome.
It was created by pg. What other language (type) would he have used?
You get that when your session expires, which takes just slightly less time than writing a well thought out comment.
It doesn't just happen when writing comments; it can happen almost anywhere on the site if you linger too long. It's the most glaring fault with the software behind this site and would be completely impracticable if this site weren't populated by technology-minded people who aren't bothered by error messages.
Agreed. I've pretty much gotten used to refreshing before I click to the next page every time...
I just open up each page I'm interested in reading in its own tab, I've never really had a problem with the error message unless I take too long on a comment.
I've become accustomed to getting the error when I linger on the page for a bit, and even in that context it's pretty irritating, but tolerable. Just this morning however, I'm able to click on the logo link at top-left, immediately navigate to the bottom, select "more," and get the nasty - that's remarkably dysfunctional.
PG is using a really cool programming technique that I'm afraid is ahead of its time relative to current hardware. An upgrade to the HN server should allay the problem.

To see the potential, look at this code snippet from an academic paper on the topic. The web server presents a form asking for a number, then presents a form asking for another number, then displays their product. This technique makes event-driven web applications feel (to the programmer) like sequential imperative programs.

  ;; main body
  `(html (head (title "Product"))
         (body
           (p "The product is: "
              ,(number->string (* (get-number "first") (get-number "second"))))))
The paper: http://cs.brown.edu/~sk/Publications/Papers/Published/khmgpf...
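For readers without a Scheme handy, Python generators give a loose approximation of the style in that snippet. This is only an analogy under stated assumptions: `product_page` and `Session` are invented names, and a generator can be resumed only once per step, whereas real continuations can be re-entered (for example after the back button).

```python
def product_page():
    """Reads like a sequential program; each `yield` is one page sent to the user."""
    first = yield "form: enter the first number"
    second = yield "form: enter the second number"
    yield f"The product is: {int(first) * int(second)}"


class Session:
    """Keeps the suspended program between requests, as the server would."""

    def __init__(self, flow):
        self.flow = flow()
        self.page = next(self.flow)  # run up to the first form

    def submit(self, value):
        # Resume the program where it paused, feeding in the form value.
        self.page = self.flow.send(value)
        return self.page
```

Usage: create `Session(product_page)`, show `session.page`, then call `session.submit("6")` and `session.submit("7")` as the two forms come back. The suspended generator is server-side state, so it has the same expiry problem as the fnid table.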
It's not so much that it's ahead of its time relative to hardware as it is something you do in the early versions of a program.

Using closures to store state on the server is a rapid prototyping technique, like using lists as data structures. It's elegant but inefficient. In the initial version of HN I used closures for practically all links. As traffic has increased over the years, I've gradually replaced them with hard-coded urls.

Lately traffic has grown rapidly (it usually does in the fall) and I've been working on other things (mostly banning crawlers that don't respect robots.txt), so the rate of expired links has become more conspicuous. I'll add a few more hard-coded urls and that will get it down again.

Over the last week the home page appears to be cached longer than the arc timeout, no doubt due to the spike in traffic. As I throw away cookies when closing the browser, I need to login daily. It's been impossible to login from the HN home page because of this. Refreshing the page doesn't help; I've had to click through to a story to be able to login.

You should hard-code that one too.

The problem there is that we switched to a new deliberately slow hashing function for passwords.

Edit: I investigated further, and actually you're right, the problem was due to caching. It should be better now because we're not caching for as long. But I will work on making login links not use closures.

What'd you go with, and how much of a pain was it to get working in Arc?

I ask because I'd love to be able to make a claim like "even Hacker News, which is written in a Lisp, managed to implement a modern password hash".

We use bcrypt. Rtm did it. I never looked at the code till now; it's about a page of Scheme.
Gauche Scheme has a bcrypt implementation, but I don't know what the compatibility story is between mzscheme and Gauche. I think they're both R5RS compliant, so it should work.

I see that newer versions of Arc run on Racket, but I have no idea if that's what HN is using or not.

I haven't seen a scheme powered PBKDF2 implementation so I'd guess that's out.

The only other expensive KDF I can think of is scrypt, but I would be pretty surprised if that's got a scheme implementation.

Of course, I guess pg could have decided to call out to the OS to run any of those functions too.

Is that specifically to inconvenience someone who would break in, steal your password list, and crack it offline?

If not, what was the design goal?

If slowing down web login attempts isn't part of it, why not get a dedicated auth server and offload the crypt stuff onto it?

And if it is the goal, you could use CPU-friendly sleeps on the front-end to give increasing delays to the repeated guesser.
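The "increasing delays" idea in the last paragraph could be as small as this Python sketch. The function name, the base delay and the cap are illustrative assumptions, and the caller would sleep or reschedule for that long rather than burn CPU.

```python
def login_delay(failed_attempts, base=0.5, cap=60.0):
    """Seconds to wait before handling the next login attempt.

    The delay doubles with each consecutive failure and stops growing at `cap`.
    """
    if failed_attempts <= 0:
        return 0.0
    return min(cap, base * 2 ** (failed_attempts - 1))
```

This throttles online guessing only; it does nothing for the offline case where the password list has already been stolen.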

> Is that specifically to inconvenience someone who would break in, steal your password list, and crack it offline?

Probably: http://codahale.com/how-to-safely-store-a-password/

Hashing functions designed for speed are absolutely the wrong thing for passwords.
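To make "deliberately slow" concrete, here is a Python sketch using PBKDF2 from the standard library. Note the swap: HN is said above to use bcrypt, which is not in Python's standard library, so PBKDF2 stands in for it here. The iteration count and function names are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: tune until one hash takes noticeable time


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn unless one is given."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, digest):
    """Recompute the slow hash and compare it without leaking timing."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

The cost is paid once per login by the server but once per guess by someone cracking a stolen list, which is the point of the linked article.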

Out of curiosity, are there any places where the hn codebase would be smaller if you used full continuations instead of just closures, allowing code akin to what I quoted from the PLT paper?
Possibly, but not many. It was surprising (in this application at least) how rarely I needed full continuations.
The HN server uses a table of closures to implement those links (the id code for the closure is the bit after fnid= in the url).

When the HN server starts running out of memory, it drops entries from this table. When your browser asks for an entry that is no longer in this table, you get the "Unknown or expired link" error.

This is a crazy design, but unless someone would like to patch the source code and get PG to accept it, we're stuck with it.

It's not crazy at all, it greatly simplifies development to use callbacks for actions rather than manually encoding the necessary state into the URL. Techniques such as this are what enable a single developer to be so productive by automating boring and time consuming stuff.
Except it doesn't appear to work robustly, which makes it poor design. "Automating boring and time consuming stuff," is all well and good if it actually produces a functional system, but that concern is secondary to robustness.
> but that concern is secondary to robustness.

To you, but that's a value judgement; it obviously was the other way around for pg. Had he not taken those shortcuts, there would be no Hacker News at all; be thankful he automated that boring stuff and bothered to build the site.

It'd work robustly enough if the links didn't expire, and if we believe other posts on this page, the links are expiring due to memory limits on the system. (The other possibility is a timeout, I guess, which is easily fixed.) If it's running out of memory to store the closures it would run out of memory to store the interaction state.

In other words, there's a problem here, but it's not the programming model that pg chose.

Except the links do expire, so it's not robust. I expect that when I visit a web page, I can let it sit for an extended period of time before moving on to the next page and have it work. HN doesn't work.

Furthermore, the technique of holding important state authoritatively in memory like this is not a good web-development practice for various reasons. Doubly so if it's state data which can be round-tripped. Links should not break when the web server or cache (I'm not sure which one it is) runs low on memory. So yes, there is a problem with the programming model that pg chose.

If he hadn't used that technique, there would be no hacker news for you to use at all. You're entirely missing the point that this is a technique to make hobby programming more fun, it's not about being robust or best practice, it's about making programming simpler so pg finds it worth his time to build this site in the first place.
> If it's running out of memory to store the closures it would run out of memory to store the interaction state.

Not necessarily. The way I would approach this is to keep the link cache in memory, but have the links contain the minimal necessary state to reconstruct the link from disk-based storage in the case where the cache is gone. That gives the same excellent median performance without any breakage.
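A Python sketch of that approach, under heavy assumptions: `load_stories` is a stand-in for whatever reads from disk, and the URL layout is invented. The link carries both the fast-path id and the minimal state needed to rebuild the result.

```python
cache = {}  # fnid -> prebuilt handler; may be purged at any time


def load_stories(page, per_page=30):
    """Hypothetical stand-in for reading ranked stories from disk-based storage."""
    start = (page - 1) * per_page
    return [f"story {i}" for i in range(start + 1, start + per_page + 1)]


def more_url(page, fnid):
    """Register the fast path and return a link that also carries the page number."""
    cache[fnid] = lambda: load_stories(page)
    return f"/x?fnid={fnid}&page={page}"


def handle(fnid, page):
    """Use the cached handler when it survives; otherwise rebuild from `page`."""
    handler = cache.get(fnid)
    if handler is None:
        return load_stories(page)  # cache entry purged: reconstruct, no error page
    return handler()
```

The catch, as the replies point out, is that someone has to decide per link what the minimal state is, which is the manual work the closure approach avoids.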

> The way I would approach this is to keep the link cache in memory, but have the links contain the minimal necessary state to reconstruct the link from disk-based storage in the case where the cache is gone.

It's not a cache, and you clearly don't understand the issue. These are callbacks to closures, embedding the state in the URL is what they attempt to avoid because doing that is tedious.

Simplifies development? This is not a complicated piece of software and the techniques to build it are well known. You wouldn't need any more state than the id number you need for the callback anyway.

Building broken software is always much easier than building robust correct software so this is hardly a good argument.

> You wouldn't need any more state than the id number you need for the callback anyway.

I don't think you grasp the issue... the link is expired because the callback no longer exists to link to.

No, I understand the issue. But you're presupposing a specific implementation here. If you were just designing this in, for example, PHP then you'd just need one piece of state: the page # (for more...) or the parent comment id (for the comment) and so on.

The real issue is that there's a whole bunch of saved state on the server for operations that could be (and should be) completely stateless.
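As a sketch of the stateless version for the "more" link, in Python rather than PHP: the `/news?p=N` layout, the function names and the page limit are assumptions for illustration. The only state is the page number, and it is sanity-checked on the way in.

```python
from urllib.parse import parse_qs, urlparse


def more_link(page):
    """Hard-coded URL for the page after `page`; nothing is stored on the server."""
    return f"/news?p={page + 1}"


def page_from_url(url, max_page=100):
    """Read the page number from the query string, falling back to page 1."""
    values = parse_qs(urlparse(url).query).get("p", ["1"])
    try:
        page = int(values[0])
    except ValueError:
        return 1
    return page if 1 <= page <= max_page else 1
```

A link like this never expires, at the cost of writing the parsing and validation by hand for each kind of link.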

No, the issue is time, specifically, pg's time; using callbacks takes less programmer time than manually building every URL statelessly. Yes, it could be done another way, but it wouldn't exist at all if he'd had to do that for every link because it'd have taken too much of his time.
It's not productive to have a site that randomly locks out visitors. No matter how clever the code design is, this is a product flaw.
No, it's a design choice, he favors ease of programming more than user experience. You might not agree with that choice, but it's not a flaw, he did it on purpose and knew the consequences.
A better name might be technical debt. It is a flaw from the perspective of someone seeing the error message. But for the developer it is a way of saving development time, which can be paid off later to fix it.
Yes, this is a better way to say it.
It's not a feature so it's a flaw.

You're right that pg did it this way in the beginning to save time but years have gone by and the site is now more central to his business - especially as a tech demo. This back and forth argument presupposes that there isn't a better fix than the naive 'use old-style code' solution.

Anyways, the discussion is worth having. Only by pointing out problems do you fix them.

You're justifying code that doesn't work on the grounds that it was quick to write. (Facetious comment: if the website doesn't have to work, I can write the whole thing in under a minute. Someone wrote a HN clone on these lines a few weeks back, but I don't know how long it took them.)

Given that the HN code was written by an increasingly busy man in his spare time, his use of an unreliable but quick-to-write implementation technique may be an acceptable trade-off. But it doesn't make the design any less crazy.

> You're justifying code that doesn't work on the grounds that it was quick to write.

No, the code works fine for an acceptable period of time, it doesn't just not work.

If you clicked on the links, you'd know why.

The software stores the current state in a closure. The closure gets cached. When the cache is full, the older closures get flushed, hence the error message.

I find it interesting that most discussions I've seen about this exact topic are about Arc and closures... instead of about the fact that this may well be an interesting programming thing to do but it's a moronic user experience thing to do.
Your comment is in a sense its own refutation, because the ultimate test of user experience is whether users continue to use the software.

Getting user experience right depends on the users. I wouldn't use this technique in an online store. Random online shoppers would be confused by expired links, and you'd lose sales. But HN users aren't confused by them. What HN users care about is the quality of the stuff on the site.

Since I can't work full time on HN, I focus on the things that matter most. What I spend my time thinking about is e.g. detecting voting rings. Those affect what you see on the frontpage, which is what users of this site care most about.

I think you underestimate how annoying the issue is. It's one of those things you put up with because of the content, but which are annoying enough that they detract from the site experience.

So far I'd rate the user experience of the site around 3/5 and the content 5/5. You don't need to work any more on the content unless it starts dropping!

I've been here a long time and seen this expired linky thing happen roughly every other week; on a very few occasions it's been an annoyance, but mostly it makes me smile; after reading news.arc, it's a reminder of what a hack HN is.

I could care less if this issue got fixed. I have never once felt, "man, I'd definitely jump to another site if it didn't have this expired linky thing happen".

(Now, politics stories on the front page, on the other hand... I've often wished for a site with as good a crowd as HN but without the politics...)

You're right, but at times, I've found it nearly impossible to log in because I can't seem to get a new login URL. Nothing works except waiting it out. So please look into that if you can. I almost submitted a story like this because I've spent 5+ minutes trying to log in several times now.
You're right, there was a problem due to caching login links for the last couple days. It should be fixed though. Sorry about that.
I use HN as much as anyone and it doesn't get in my way at all. The most annoying thing about the issue is hearing the repeated discussion around it.

I'd rather have any kind of cool new feature, like messaging, than have this fixed.

We've had messaging for years. I just never turned it on in the general case. (We use it to talk to YC applicants.)
Well, I applied to YC for the first time, so maybe I'll get to see it!
I don't dispute that you're a busy guy.

That said: I come for the community - and the community has obviously noticed that the site occasionally throws up an annoying 'error'. The fact that you've done something cool programmatically has no bearing on what I get from HN.

> because the ultimate test of user experience is whether users continue to use the software.

TIL Windows has always been an amazing user experience. Just look at their numbers.

> But HN users aren't confused by them.

"About 2,200,000 results" -- says they are.

I have a question: What advice would you give to one of your YC startups if they were having this same issue?

"Ah, your users won't be confused. Ignore all evidence that says they are."

I can tell you exactly why people continue to use Hacker News. It's because it's YOUR SITE. They put up with the broken web design where links die after a few minutes.

Your site won't get beat out by a competitor because it has you, and you fund people. So people continuing to use the site is orthogonal to whether the user experience is any good. You have lots of feedback indicating that it isn't.

Aren't you the one who says "listen to your users"?

Please don't call people or their work 'moronic' (or the like) here.
I, too, wanted to ask this question, but feared that someone would respond with something akin to "submit a patch" or "grep the source code"

Obviously, it makes sense for someone who is versed in the news.arc internals to fix the problem; nonetheless this issue certainly bugs me.

Good question, but in my opinion it's somewhat rhetorical. Given bugs like these, the ongoing optimization battle, and fairly reasonable feature requests (see the huge HN topic on that), isn't it about time Paul Graham hired someone full or part time to work all these issues out? Given how important HN is to YC, I would think it's worth it. Are any of these tasks really things pg has to do, or the best use of his time?
It's good to do things you enjoy. Best use of his time? I don't think anyone can definitively answer that.
Steps to reproduce:

1. Open Hacker News

2. Go to lunch

3. Come back from lunch and click next

Every. Time.

Back when it took a lunch break to cause this issue it didn't bother me at all. These days it happens so fast that it is constantly interrupting me while actively browsing through the site.
Refrain from eating, then.
Hm... Didn't know that comments made obviously in jest are ill-favoured on HN. Hackering is serious business, apparently.
So don't do that.
I wonder if this is exacerbated by people enabling the aptly named "noprocrast" setting in their profile, not knowing what it does...

http://ycombinator.com/newsfaq.html

The noprocrast page is an entirely different page from the "Unknown or expired link" page.
The first complaints are from 1575 days (over 4 years) ago, including about the more button breaking, so I am guessing pg has no interest in fixing it.
Hacker News is open source, so it seems as if nobody is interested in fixing it. Or are there fixes and pg has rejected them?
Or it's so ingrained in the architecture of the software that a fix isn't possible without completely rewriting it and changing the entire design philosophy.
Sort of yes, sort of no. It's a rapid prototyping technique. Essentially you fix it case by case, by taking individual bits of code that use this technique and replacing them with the uglier and less flexible but more efficient alternative of a hard-coded url.
Paul, I really do not mean any disrespect here because you are truly a class act and first rate player in the start up world. You are also a great hacker that loves to push the limits. You've created an amazing community here that I have been able to learn a ton from.

I have to ask, and I'll probably get down voted to hell because I'm naive or something, but what is so elegant about a coding technique that breaks under normal usage conditions? If I put out a customer facing piece of code, especially after 4 years, wouldn't it make sense to use an "uglier and less flexible but more efficient alternative" that doesn't break?

I understand your previous explanations of why this happens and of rapid prototyping etc. But at what point does the architecture actually get changed to eliminate this bug?

It doesn't make sense to call any specific amount of traffic "normal conditions."

What's good about this technique, and about rapid prototyping in general, is that you can write an initial version quickly in very little code, then gradually make it more efficient as the demands on the app increase.

The rate of expired links says more about how busy I personally have been lately than about the desirability of storing state in closures.

Maybe pg uses it to 'monitor' the exact amount of time spent by users on the website. :evillaugh:
To be honest, if it was written in a more popular language, there would probably be a fix.
I'm also pretty sure that the downloadable news.arc is not the exact code running on this site.
> Hacker News is open source

Is this true? If so, where can I find the code?

I think it is included with arc (the language).
This is apparently an old problem that hasn't been fixed yet. As you can imagine, pg has a lot to do these days ;)

See here http://news.ycombinator.com/item?id=28944

This happens to me a lot while reading HN. I hit "More" and by the time I am done reading a few comments on a handful of entries, the next "More" has expired, and so has the current.

This only happens on HN (at least to me).

Actually _right_now_ I cannot click "more"(on the first page) without hitting the "Unknown or expired link" page ... so I cannot go past the first page :/ -- Someone should submit a patch :)
I thought this was to combat cross-site request forgery attacks?

Otherwise, couldn't one set up malicious scripts to up- or downvote many stories, or post comments under someone else's name?

Those are orthogonal issues.
Everyone who has literally answered the question "Why?" has completely missed the point. What would PG say about a primary site feature that is so completely broken that it drives users to complain actively, and maybe stop using the site? That it is their problem because they don't understand the technical details? Well, obviously, no one is losing any money here, so maybe that's the answer after all.
I'm getting this error when using the login link right now!

EDIT: The only way I was able to login was to use the 'add comment' button on this post.

I also just tried to log in about 6-7 times in a row (clicking the "login" link on the front page, reloading the front page in between attempts), and I repeatedly received the "expired link" page.

It also reliably happens clicking the "next page" link on the bottom of the front page; by the time I'm done reading the front page the next page link usually expires.

Please fix?

The problem could be made less painful by including a link back to http://news.ycombinator.com on the "Unknown or expired link" page. That would save me fishing around with the mouse and the back button to get a new start.
I would go one step further and just send me back to the homepage.
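Both suggestions are small changes to the dead-link handler. A Python sketch, with the function name and the (status, headers, body) return shape being assumptions rather than anything in news.arc:

```python
HOME = "http://news.ycombinator.com"


def expired_response(redirect=False):
    """Return (status, headers, body) for a request whose fnid is gone."""
    if redirect:
        # Second suggestion: send the user straight back to the front page.
        return 302, {"Location": HOME}, ""
    # First suggestion: keep the message but add a way out.
    body = f'Unknown or expired link. <a href="{HOME}">Back to the front page</a>'
    return 200, {"Content-Type": "text/html"}, body
```

A silent redirect loses whatever the user was trying to do without telling them why, so the message plus link may be the gentler of the two.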
I've been wondering about this for a long time, but just as a data point if anyone cares, it has reached the point recently that HN is basically unusable for me a lot of the time, and I really am starting to give up on trying and spend more time elsewhere instead.

Perhaps one visitor is no great loss -- I'm hardly the personality around here that someone like patio11 is -- but I hope my contribution is constructive, and my comment scores have always suggested so.

However, subjectively, it seems like the quality of posting and voting has taken a sharp nosedive since the "Unknown or expired link" problems have become a several-times-per-session occurrence over the past few weeks. I can't help wondering whether long-standing regular contributors are being put off as a result. If positive contributors can't even log in to refute an objectively incorrect post with a verifiable link or downvote Redditesque diversions, a downward slide seems inevitable, and then the loss of high quality posting and voting becomes a self-sustaining decline.

When writing comments, always make a copy of your text before hitting submit (CTRL+A, CTRL+C). A good strategy for any text form on the web.
Automated way of doing this...Lazarus form recovery is a free plugin that has saved my butt numerous times:

Firefox: https://addons.mozilla.org/en-US/firefox/addon/lazarus-form-...

Chrome: https://chrome.google.com/webstore/detail/loljledaigphbcpfhf...

"go back one page" (alt-left arrow) recovers your text on HN
This is browser dependent (although many modern browsers do keep form content in the history).
Indeed, that became a habit for me a while ago after certain social discussion sites and on-line tools I use frequently went all Web 2.0 and broke the back button when a form submission failed, typically because the form fields were only added dynamically using JS, so when you go back they simply aren't there any more according to your browser. Mercifully, HN has yet to introduce that particular "improvement".

That's not really the point, though, is it? The important thing is whether posters who want to offer a useful comment and/or mitigate a poor comment can do so. Once HN gets into unknown/expired mode at the moment, it seems common that even basic things like "More" links and logging in can fail as soon as you load/refresh a page, at which point the site is effectively unusable: you can't contribute even if you have something worthwhile to add saved away in your clipboard from the previous failed attempt.

Funny, I suspect it's just the opposite. Long-standing regular contributors are unlikely to be put off by the error messages, especially if they're technically knowledgeable and understand why the error is occurring.

On the other hand, new users who might not be accustomed to The Way We Do Things Around Here would be more likely to get upset at the superficial inconveniences and leave.

It's probably even the case that improving the site or adding features to it would work against its best interests by making it more accessible. HN's implementation is such that the more traffic it sees, the more frequently those errors will occur -- and as an unexpected side-effect, the popularity and instability of the site will work against each other until equilibrium is reached.

A small barrier to entry like Reddit's spartan design or MeFi's $5 fee can go a long way toward delaying the onset of the entertainment-seeking masses.

I'm a long-time user (created: 1668 days ago), and I hate this error. I understand why it's occurring, but it seems bizarre that such an obvious flaw has gone unfixed for so long. It feels amateurish. (That said, pg has bigger fish to fry, and he's probably right to ignore this. C'est dommage.)
I'll just throw in a "me too" with the other responders and say this error annoys the dickens out of me -- and I'm a long-time user and a medium-long-time initiate in the knowledge of the error's source (I tracked it down in the source in a fit of pique about 6 months ago after getting the error for the Nth time).

Besides the actual annoyance of the error, what's extra rankling is it is an example of privileging a neat trick over user-experience, which is one of my Least Favorite Things Ever that programmers tend to do.

(As an aside, I am extremely skeptical that increasing rates of this error occurring will help keep the original user community of the site -- it seems equally likely that longtime users will just get fed up and wander off.)

While there are plenty of cautionary tales of fora that failed their original purpose due to popularity, growth and loss of focus, there are equally many cautionary tales of fora that failed due to insular communities, group-think and stagnation.

It's a fine line to walk and it may not be wise to rely on programming bugs to point the way.

It seems kind of ironic to have such an egregious bug on a site dealing with coding and technology...
Silly question, but, if HN is purportedly open-source where can I download the source code?

  1. http://arclanguage.org/install
  2. Extract arc3.tar
  3. See news.arc
However, AFAIK pg forked from the latest public news.arc, so the current Hacker News platform is not open source.
Yes please, fix it already. If this problem were on another site, we would be posting complaints from random blog posts over and over again.
I think some pages are generated statically and expire. Why the login/logout pages "expire", I can't really guess at.

Wow, not sure why this is so deserving of downvotes. Trying to find a source, but I thought there was a previous discussion of many HN pages being statically generated and served quickly with links that expire after a certain time (or become invalid because of what may happen on the server side of things). But oh well.

One reason might be that posts/users are getting axed. Would be nice if everything except spam was unmoderated, imo. Also, the current rate limiters to keep spam out are blocking those that would otherwise be more active.