

by syx 2552 days ago

This was more of a kludge than a serious project, but I'm glad I was not the only one to find it useful. I will definitely continue working on it.

>Is it possible to avoid duplicates?

I thought about this as well in the past but never actually found a solution. Maybe someone here knows if there's research or an algorithm for uniquely identifying a URL.

Edit: typos

2 comments

333c 2552 days ago

Some pages will link to a canonical URL in the head: https://en.wikipedia.org/wiki/Canonical_link_element
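A minimal sketch of how that could be read, assuming the page HTML has already been fetched. `CanonicalLinkParser` and `find_canonical_url` are hypothetical names for illustration, not part of the project being discussed; only the Python standard library is used.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel is a space-separated token list, e.g. rel="canonical nofollow".
        rel_tokens = attr_map.get("rel", "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def find_canonical_url(html: str, page_url: str) -> Optional[str]:
    """Return the page's canonical URL, or None if it declares none.

    page_url is the URL the HTML was fetched from; it is used to
    resolve a relative href into an absolute URL.
    """
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return None
    return urljoin(page_url, parser.canonical)
```

Since only some pages declare a canonical link, a `None` result would still need another strategy.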


weaksauce 2552 days ago

Stripping out the query string probably works for 80% (maybe even more) of the sites. Maybe do that by default, with an option to search for the whole URL as a fallback?
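A rough sketch of that idea, assuming previously seen URLs are available as a plain list of strings. `strip_query`, `find_duplicate`, and the `match_full_url` flag are hypothetical names for illustration, not the project's actual code.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def find_duplicate(
    url: str,
    seen_urls: Iterable[str],
    match_full_url: bool = False,
) -> Optional[str]:
    """Return the first previously seen URL that matches, or None.

    By default URLs are compared with their query strings removed.
    With match_full_url=True the whole URL must match exactly, which
    is the fallback for sites that put the page identity in the query.
    """
    key = url if match_full_url else strip_query(url)
    for seen in seen_urls:
        candidate = seen if match_full_url else strip_query(seen)
        if candidate == key:
            return seen
    return None
```

The fallback matters for sites that identify the page in the query string itself (Hacker News item pages use `item?id=...`, for example), where stripping the query would merge unrelated pages.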
