Hacker News
by fabulist 4490 days ago
I'm attempting to replicate this. Searching for the last sentence (which is behind the paywall) brings it up in Google right away, so I think you're right. However, using Googlebot's user agent doesn't work, so the check must be slightly more sophisticated. The result in Google is also not paywalled, though going directly to the link is. So maybe they use a simpler strategy and just mess with the parameters. This is the result from Google: http://online.wsj.com/news/articles/SB1000142405270230388060...
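For anyone else trying to replicate the user-agent test, here is a minimal sketch in Python using only the standard library. The helper names `build_request` and `fetch` are mine, and the user-agent string is the one Google publishes for Googlebot. The URL above is truncated, so you would need to substitute the full one yourself; nothing here is specific to how the site actually decides what to serve.

```python
import urllib.request

# User-agent string Google publishes for its desktop crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Return a Request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Fetch the page body as text, pretending to be the given user agent."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Comparing the output of `fetch(url)` with `fetch(url, user_agent="Mozilla/5.0")` would show whether the user agent alone changes what comes back. As noted above, in this case it apparently does not.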
1 comment

Searching Google for

    "cache:http://online.wsj.com/news/articles/SB10001424052702303880604579405852448992982?"
gets me the full-text article. Could they just be stripping the header from the page and displaying that? I know it detects cached Google pages in some circumstances.
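If you want to build that search for other articles, this is a rough sketch in Python. The function name `cache_query_url` and the `https://www.google.com/search` endpoint with a `q` parameter are my assumptions about how to submit the query; the sketch only constructs the URL and does not check whether Google returns a cached copy.

```python
from urllib.parse import urlencode

# Article URL exactly as quoted in the search above, trailing "?" included.
ARTICLE_URL = "http://online.wsj.com/news/articles/SB10001424052702303880604579405852448992982?"


def cache_query_url(article_url):
    """Build a Google search URL whose query is "cache:" plus the article URL."""
    # urlencode percent-escapes the ":", "/" and "?" inside the query value.
    return "https://www.google.com/search?" + urlencode({"q": "cache:" + article_url})


if __name__ == "__main__":
    print(cache_query_url(ARTICLE_URL))
```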