by ChuckMcM 2349 days ago
Date-based searches in particular are pretty difficult because of how pages are ranked. For a long time (and still to a large extent) Google has ranked 'newer' pages higher than 'relevant' pages. At Blekko[1] there was a lot of code that tried to figure out the actual date of the document (be it a forum post, news article, or blog post). That date would often be months or years earlier than the 'last change' information would have you think.

Sometimes it's pretty innocuous: a CMS updates every page with a new copyright notice at the start of each year. Other times it's less innocuous: the page simply updates the "related links" or sidebar material, which makes the content look refreshed.
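
To make that concrete, here is a rough sketch of one way to separate the "real" document date from the last-change date. This is not Blekko's code; it is an illustration under some assumptions: that you have dated crawl snapshots of a page, that sidebars live in <aside> tags, and that the copyright notice sits on its own line. The names VOLATILE_PATTERNS, content_fingerprint, and estimate_document_date are made up for this example.

```python
import hashlib
import re
from datetime import date

# Hypothetical patterns for text that changes without the article changing.
VOLATILE_PATTERNS = [
    re.compile(r"(?i)copyright\s+(?:©\s*)?\d{4}.*"),  # yearly copyright line
    re.compile(r"(?is)<aside\b.*?</aside>"),           # sidebar / related links
]


def content_fingerprint(html: str) -> str:
    """Hash the page after stripping volatile parts and normalizing whitespace."""
    for pattern in VOLATILE_PATTERNS:
        html = pattern.sub("", html)
    normalized = " ".join(html.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def estimate_document_date(snapshots):
    """Return the earliest crawl date whose stripped content matches the
    most recent snapshot, or None if there are no snapshots.

    snapshots: iterable of (crawl_date, html) pairs.
    """
    ordered = sorted(snapshots, key=lambda snap: snap[0])
    if not ordered:
        return None
    latest = content_fingerprint(ordered[-1][1])
    for crawl_date, html in ordered:
        if content_fingerprint(html) == latest:
            return crawl_date
    return None


snapshots = [
    (date(2010, 3, 1), "<p>How AM works</p>\n<aside>links A</aside>\nCopyright 2010 Example"),
    (date(2011, 1, 2), "<p>How AM works</p>\n<aside>links A</aside>\nCopyright 2011 Example"),
    (date(2012, 6, 5), "<p>How AM works</p>\n<aside>links B</aside>\nCopyright 2012 Example"),
]
estimated = estimate_document_date(snapshots)
```

With these three snapshots the last change is in 2012, but the stripped content is identical back to the first crawl, so the estimate lands on the 2010 date. A real system would need far more robust boilerplate detection than two regexes; the point is only that the date comes from the main content, not from the page as a whole.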

It is still an unsolved ranking relevance problem: a student-written, 3-month-old description of how AM modulation works ranks higher than a 12-year-old, professor-written description. There isn't a ranking signal for 'author authority'. I believe it is possible to build such a system, but doing so doesn't align well with the advertising goals of a search engine these days.

[1] Disclaimer: I worked at Blekko.