

by digikata 3172 days ago

This seems like it's vulnerable to some form of abuse.

library.com/books/1as03jf08e/Moby-Dick/

library.com/books/1as03jf08e/Hitchhikers-Guide-to-the-Galaxy

Both now lead to the same place...


madeofpalk 3172 days ago

Eh. You can do that with query strings and hashes in URLs anyway. https://news.ycombinator.com/user?id=digikata&profile=bad-pe...
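
To make that concrete, here is a rough sketch, assuming a server-side handler that only ever reads the `id` parameter. The `resource_key` helper and the `profile=anything` value are made up for illustration; this is not how HN is actually implemented. Extra query parameters and the fragment change the URL text without changing what gets served, while a different `id` does change it.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def resource_key(url: str) -> Tuple[str, Optional[str]]:
    """Return (path, id): the only parts this hypothetical handler looks at."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Unknown parameters (here "profile") and the fragment are simply ignored.
    user_id = query.get("id", [None])[0]
    return parts.path, user_id


# Hypothetical URLs: same id, with and without extra noise.
plain = resource_key("https://news.ycombinator.com/user?id=digikata")
noisy = resource_key(
    "https://news.ycombinator.com/user?id=digikata&profile=anything#whatever"
)
# A different id value is a different resource.
other = resource_key("https://news.ycombinator.com/user?id=someoneelse")
```

Here `plain` and `noisy` come out equal, and `other` does not.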


digikata 3171 days ago

Standards-wise, though, you know the part after the ? is variable...


madeofpalk 3171 days ago

Variable? Not sure if I 100% get what you're saying, but what I know is that https://news.ycombinator.com/user?id=digikataWaitNoThisOther... won't go to the same place as your user profile. There are standards, and then there are "Standards".


always_good 3172 days ago

You would redirect to the canonical one.
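
Something like this minimal sketch, assuming the ID segment is what actually identifies the book and the title segment is decorative. The `BOOKS` table and `resolve` function are hypothetical names, not any particular framework's API.

```python
from typing import Optional, Tuple

# Hypothetical catalogue: opaque ID -> canonical slug.
BOOKS = {"1as03jf08e": "Moby-Dick"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_target) for a /books/<id>/<slug> path."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "books":
        return 404, None
    book_id = segments[1]
    slug = BOOKS.get(book_id)
    if slug is None:
        return 404, None
    canonical = f"/books/{book_id}/{slug}/"
    if path != canonical:
        # Wrong, missing, or misleading slug: send the client to the real URL.
        return 301, canonical
    return 200, None
```

With this, `/books/1as03jf08e/Hitchhikers-Guide-to-the-Galaxy` gets a 301 to `/books/1as03jf08e/Moby-Dick/`, so the address bar ends up showing the real title.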


awj 3172 days ago

I think the concern is in the way it obscures the target. Replace "Moby Dick" with a Chuck Tingle (warning, probably nsfw) book. Now that second link is a serious problem.


always_good 3172 days ago

I see what you're saying, but it doesn't seem like much more than a funny gag you might pull on a friend.

If a website is concerned about that case, then instead of letting it inform their URL design, they should have a "Warning: Adult content. [Continue] [Back]" interstitial like Reddit or Steam.


digikata 3172 days ago

I'm not even sure it's a serious problem - a possible annoyance, and perhaps, for a spammy site owner, maybe even a feature. But as a web user, I'm not really fond of that added uncertainty.


WorldMaker 3172 days ago

You don't necessarily have to redirect, but you should at least include `<link rel="canonical" href="..." />` (as the given example, Stack Overflow, does) so that search robots and other clients of the site (scrapers and/or API consumers) know which path is the canonical one and can avoid duplicated effort.
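
For the client side, here is a rough sketch of how a scraper might pick that up, using only the Python standard library. `CanonicalFinder` and `find_canonical` are names I made up, and the sample page is hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated list of link types, matched case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page served at the "wrong" slug.
page = (
    "<html><head>"
    '<link rel="canonical" href="https://library.com/books/1as03jf08e/Moby-Dick/" />'
    "</head><body>...</body></html>"
)
```

A client that gets a non-None result from `find_canonical(page)` can index or cache under that URL instead of whatever slug it happened to fetch.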


always_good 3172 days ago

That only works for some crawlers. Certainly not for users. Meanwhile, everything obeys redirects.

Since you bring up Stack Overflow, notice that they do the canonical redirect. Change the title in the URL and you'll get redirected.


WorldMaker 3172 days ago

Yes, the best approach is probably both, but knowing the canonical path matters more for crawlers than for users, and a crawler that ignores rel="canonical" is likely no better than (and about as buggy as) a crawler that ignores robots.txt; it's a specification they ignore at their own peril.
