HN Mirror


by gingerjoos 5058 days ago

Beat me to it! This was something I had been planning to build on my own for a while, but didn't get around to. Congrats!

Whenever I have tech discussions with friends, I recall something mentioned in an article I read via HN. But it takes me a whole lot of effort to find that link. Oftentimes I simply can't get hold of the link even after an hour of searching.

Please do get the Firefox extension out. Would love to use it. Also, please do make sure the extensions/addons are stable. Have been facing problems with Annotary's extensions [1], for instance.

By the way, do you have a crawler fetch the link content or do you send it from the user's browser?

[1] https://getsatisfaction.com/annotary/topics/unstable_browser...


vinnyglennon 5058 days ago

Cofounder here. Our first version spidered out for the content, but a far more efficient way was to upload a compressed version of the data from the user, as we can then do hash checks for reference counting. The Chrome extension has been used in the wild for the last 3 months on 6 continents. The Firefox extension is too unstable at the moment (there is also Mozilla's ten-day review process), but we hope to get it out within 1-2 weeks. Would love any feedback, good or bad!
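For anyone curious what "hash checks for reference counting" could look like, here is a minimal sketch. It is not the actual implementation: the `PageStore` class, the choice of SHA-256 and zlib, and the in-memory dicts are all assumptions made for illustration. The idea is that identical page content uploaded by several users is stored once and only its reference count goes up.

```python
import hashlib
import zlib


class PageStore:
    """Hypothetical content-addressed store with reference counting."""

    def __init__(self):
        self._blobs = {}      # digest -> compressed page bytes
        self._refcount = {}   # digest -> number of uploads referencing it

    def upload(self, compressed: bytes) -> str:
        # Hash the decompressed content so that different compression
        # settings on the client still map to the same digest.
        content = zlib.decompress(compressed)
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = compressed
            self._refcount[digest] = 0
        self._refcount[digest] += 1
        return digest

    def release(self, digest: str) -> None:
        # Drop one reference; delete the blob when nobody points at it.
        self._refcount[digest] -= 1
        if self._refcount[digest] == 0:
            del self._refcount[digest]
            del self._blobs[digest]

    def refcount(self, digest: str) -> int:
        return self._refcount.get(digest, 0)

    def blob_count(self) -> int:
        return len(self._blobs)
```

Two users uploading the same page would get back the same digest, and the store would hold a single blob with a count of 2.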


gingerjoos 5057 days ago

Was mainly concerned about the scalability. For a large number of users, your server would have to handle a large number of concurrent connections while they uploaded their data. If you used spiders, you could push the URLs to a queue and process them at your convenience.
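To make the queue idea concrete, here is a rough sketch using only the Python standard library. The `process_urls` and `crawl_worker` names are made up for this example, and `fetch` is a placeholder you would swap for a real HTTP fetcher. The extension would only push URLs; a small pool of workers drains the queue at whatever pace the server can afford.

```python
import queue
import threading


def crawl_worker(url_queue, fetch, results, lock):
    """Pull URLs off the queue until a None sentinel arrives."""
    while True:
        url = url_queue.get()
        try:
            if url is None:
                return
            try:
                outcome = fetch(url)
            except Exception as exc:  # keep the worker alive on bad URLs
                outcome = exc
            with lock:
                results[url] = outcome
        finally:
            url_queue.task_done()


def process_urls(urls, fetch, num_workers=2):
    """Queue the URLs and let num_workers threads process them."""
    url_queue = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=crawl_worker,
                         args=(url_queue, fetch, results, lock))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()
    for url in urls:
        url_queue.put(url)
    for _ in workers:
        url_queue.put(None)  # one sentinel per worker
    url_queue.join()
    for worker in workers:
        worker.join()
    return results
```

In a real deployment the in-process queue would presumably be a persistent message queue, but the shape is the same: cheap enqueue on the request path, slow fetching off it.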

How do you deal with 2 users looking at the same URL but seeing different things? example.com/me would be different for user1 and user2.

Some pages would be very dynamic, e.g. Facebook. And not everyone browses Facebook/Twitter over HTTPS (which you do not index). Do you not index social networks?

I like the fact that the extension requires no user input and works silently in the background. Has some trade-offs, but worth it. Cannot comment on the search quality yet because Chrome is only my secondary browser; not enough history to search for anything meaningful.


gingerjoos 5057 days ago

A few annoyances I noted in the FAQ's "What Google search sites does it support?" section: google.co.in is in English by default; you would have to explicitly set it to another language [1]. "Indian" is not a language (Hindi, Malayalam, Bengali, etc. are [2]). Farsi is not spelt "Farsai" [3].

[1] https://www.google.co.in/?hl=ml

[2] http://en.wikipedia.org/wiki/Languages_of_India

[3] http://en.wikipedia.org/wiki/Persian_language


vinnyglennon 5057 days ago

Fixed. Switched the example to Turkish and Iranian, as that is where most of your traffic came from this post. Just read an 800 page book on India, can't believe I made that mistake.


gingerjoos 5055 days ago

It's, unfortunately, a common mistake.


vidarh 5057 days ago

You need to work on stemming and clustering terms, I think... I've visited a number of Postgres-related pages, and some of them contain only "Postgres" while others contain only "PostgreSQL", and searching for Postgres only gives me the former pages. It confused me for a little bit.
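As a rough illustration of the kind of term clustering I mean, here is a small sketch. It is an assumption-laden toy, not a real stemmer: the `ALIASES` table is hand-written and hypothetical, and a production search engine would more likely use a proper analyzer with synonym support. The point is just that both the indexed text and the query go through the same normalization, so "Postgres" and "PostgreSQL" land in the same bucket.

```python
import re
from collections import defaultdict

# Hypothetical alias table mapping variants to one canonical term.
ALIASES = {
    "postgresql": "postgres",
    "pgsql": "postgres",
}


def normalize(term):
    """Lowercase a term and collapse known variants."""
    term = term.lower()
    return ALIASES.get(term, term)


def tokenize(text):
    """Split text into normalized alphanumeric tokens."""
    return [normalize(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(pages):
    """Build an inverted index: normalized token -> set of page ids."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return ids of pages containing every normalized query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return matches
```

With this, a search for "Postgres" would match a page that only ever says "PostgreSQL", and vice versa.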
