| HN Mirror



	by crntaylor 5059 days ago
	Obvious point to raise: the reason people regularly delete their browser history is because they watch porn without turning on private browsing. How do you propose to deal with this? You'd need to provide at least the ability to selectively delete portions of the history. But you can selectively delete portions of your browser history too, and people don't - because it would be too easy to miss something. Instead, they just nuke the whole thing. How is your tool different?

5 comments

rickdale 5059 days ago

I take advantage of my browser's history with porn. When I was in college, the first bash script I wrote was to open the movie in my porn collection that I hadn't watched in the longest time. This was great. But now with streaming porn sites I don't have a huge collection, and I often watch scenes that I can't seem to find later. There is a lot of porn out there.

Sure, people clear their browser history because they're embarrassed by their porn obsession, but I think this tool could be very useful for pornaholics too.

beaumartinez 5059 days ago

You could do something similar today—just hook into your browser's history API

vinnyglennon 5059 days ago

Vinny Glennon, one of the founders here. Thanks very much for the upvotes. The Chrome extension does not work in private browsing. I have a set of porn sites (1.7 million, stored in redis) and I check whether incoming links are a member of it. You can also selectively block sites (https://www.seenbefore.com/blacklist_items).
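
A minimal sketch of the membership check described above, assuming the blocked list is keyed by hostname. A plain Python set stands in for the Redis set so the example runs without a server; with the redis-py client the equivalent test would be a `sismember` call against a set key. The names `BLOCKED_HOSTS`, `host_of`, and `should_record` are hypothetical, and the real list and its key format are not given in this thread.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the Redis set of blocked hosts (the real list is not public).
BLOCKED_HOSTS = {"blocked.example", "also-blocked.example"}


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def should_record(url: str, blocked=BLOCKED_HOSTS) -> bool:
    """Record a visit only when its host is not a member of the blocked set."""
    return host_of(url) not in blocked
```

Under these assumptions, `should_record("https://www.blocked.example/page")` is false and a URL on any other host is recorded.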

andrewthornton 5059 days ago

Where did you get your list from, and can you share it?

user24 5059 days ago

For research purposes.

jparishy 5058 days ago

For science.

krassif 5056 days ago

Perhaps http://www.shallalist.de/ ?

gingerjoos 5059 days ago

Do you store the entire set of 1.7 million entries in redis? Or is redis an index to data stored elsewhere, in a relational DB perhaps?

I was under the impression that redis wouldn't be all that useful to store a lot of data. Would be great if something as quick as redis could work with large data sets.

vinnyglennon 5059 days ago

Storing a list of 1.7 million strings takes 70 MB of memory for us. Testing for membership is an O(1) op. Very happy with it. We use mongo as a dumb data store, as well as a bunch of other infrastructure tools, like http://circleci.com, that we could only have dreamt of years ago.

genwin 5058 days ago

> Testing for membership is an O(1) op

Curious how. O(1) is an array index lookup, not a string lookup, I thought.

milkshakes 5058 days ago

http://redis.io/commands/sismember

genwin 5058 days ago

Again just curious, but I'd still like to know how. Someone there asks the mod how it could be O(1), the mod replies it's a "hash table lookup". But Wikipedia at http://en.wikipedia.org/wiki/Big_O_notation suggests that such lookup is no faster than O(log log n). I think the redis info is incorrect.

O(1) implies that the location of the member in the list is already known, with no search required. I don't see how that could be the case when it's a key lookup. The key could be anywhere in the list, even if the list is sorted. The key would have to be searched for, it seems.
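
For readers following the O(1) question, here is a toy chained hash set. It is an illustrative sketch only, not Redis's implementation, and `ToyHashSet` is a made-up name. The point it shows is that the hash function turns the string into an array index, so a lookup scans one short bucket rather than the whole collection. The cost of hashing grows with the length of the key, not with the number of members, and the constant-time figure is an average case that assumes keys spread evenly over the buckets.

```python
class ToyHashSet:
    """Fixed-size chained hash set; illustrative only."""

    def __init__(self, n_buckets: int = 1024):
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key: str) -> int:
        # The hash maps the string straight to an array index,
        # so no search over all stored keys is needed.
        return hash(key) % len(self.buckets)

    def add(self, key: str) -> None:
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)

    def __contains__(self, key: str) -> bool:
        # Only one bucket is scanned; it stays short while buckets
        # are roughly as numerous as keys.
        return key in self.buckets[self._index(key)]
```

A production hash table also grows its bucket array as members are added, which this sketch leaves out to stay short.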

StavrosK 5058 days ago

Why not use a bloom filter?

vidarh 5058 days ago

The main benefit of Bloom filters is that they can be made small. Given that his database takes only 70MB or so and he's not trying to ship this to devices that might have much in terms of space limitations, there would appear to be little point.

StavrosK 5058 days ago

Eh, true, I guess redis is sufficiently awesome.

genwin 5058 days ago

Maybe because of this fact (according to Wikipedia)?: "The more elements that are added to the set, the larger the probability of false positives."

StavrosK 5058 days ago

That depends on its size, though. You can make it larger and get fewer false positives.
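
The size trade-off can be put in numbers with the standard approximation for a Bloom filter's false positive rate, p ≈ (1 - e^(-kn/m))^k, where n is the number of items, m the number of bits, and k the number of hash functions. The sketch below is only that formula in code; the function names are made up for illustration, and the 1.7 million figure is the list size quoted earlier in the thread.

```python
import math


def bloom_false_positive_rate(n_items: int, m_bits: int, k_hashes: int) -> float:
    """Approximate false positive rate: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def optimal_hash_count(n_items: int, m_bits: int) -> int:
    """Hash count that minimises the rate: (m/n) * ln 2, rounded, at least 1."""
    return max(1, round(m_bits / n_items * math.log(2)))


if __name__ == "__main__":
    n = 1_700_000  # list size quoted in the thread
    for bits_per_item in (8, 16, 24):
        m = bits_per_item * n
        k = optimal_hash_count(n, m)
        p = bloom_false_positive_rate(n, m, k)
        print(f"{bits_per_item} bits/item, k={k}: p = {p:.6f}")
```

At 8 bits per item the filter for 1.7 million entries is about 1.7 MB with a false positive rate of roughly 2%; doubling the bits per item lowers the rate sharply, while adding items to a fixed-size filter raises it.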

siculars 5059 days ago

Wouldn't one simply use one specific browser, say either safari or firefox or chrome, and only that browser for their... unsavory activities? I think that is a great way to keep accounts separate and keep "bad" sites from knowing about "good" sites and vice versa. Just saying. Not that I partake in any such unsavory activities.

user-id 5058 days ago

For testing purposes I use Chrome's "Users" feature to keep an extra profile with no extensions installed handy.

The same could be done for a "Porn" profile too, I guess, sandboxing any history, extensions and bookmarks to that profile. You could even tie it to a Google account for portability.

tammer 5058 days ago

This problem is nullified by private browsing. I think the idea is BRILLIANT, as Google's already tracking all my 'legitimate' searches, and I find that most of what I Google are things I've looked at on other machines, or seen already.

The noise introduced by phrasing my query differently is a real problem in search that Google hasn't fixed yet.

derekorgan 5059 days ago

Porn sites are not recorded

chris24 5059 days ago

How does your system define "porn sites"? What about if it was some porn site no one has ever heard of with an innocent-sounding name/domain?

vidarh 5058 days ago

He apparently uses a list of 1.7 million sites. But you can also blacklist sites and have any existing entries for them removed:

https://www.seenbefore.com/blacklist_items