| HN Mirror



	by HNLurker2 2627 days ago
	I wish people used the deep web for something besides illegal buying and child pornography

8 comments

cyphar 2627 days ago

Less than 3% of Tor traffic is to onion services of any kind (which means 97% is to websites already accessible on the public internet), and the most popular onion service on the internet by a large margin is Facebook's (facebookcorewwwi.onion). More than 2 million people use Tor every day -- are they all bad people? Heck, government agents use Tor when traveling abroad.

Do bad people do bad things using Tor? Yes. Do political dissidents in oppressive regimes use Tor? Yes.

However the vast majority of people are just ordinary citizens using Tor to access the internet -- the cross-section of Tor users is the same as the cross-section of ordinary internet users.

dooglius 2627 days ago

> the most popular onion service on the internet by a large margin is Facebook's

How do you know? It shouldn't be possible to collect this sort of data.

tgragnato 2627 days ago

Counting hits on HSDir(s) and extrapolating a statistic. Related: https://trac.torproject.org/projects/tor/ticket/8106

cyphar 2627 days ago

In 2016, Facebook published an article saying that 1 million people use Facebook (over their onion address) every month[1]. Comparing this with the privacy preserving statistics provided by the Tor project (based on extrapolating HSDir hits) leads you to believe that 1 million per month is the overwhelming majority of .onion site users.

Roger Dingledine mentions this in quite a few of his talks, I'm fairly sure it's an accurate statement.

[1]: https://www.facebook.com/notes/facebook-over-tor/1-million-p...

jandrese 2627 days ago

Exit nodes can track which sites are hit to a degree. CDNs make this more difficult, but it's not too hard to figure out what percentage of your traffic is Facebook. It also won't work if you're going to the Facebook onion site of course.

cyphar 2627 days ago

Exit nodes aren't used like that for .onion sites, so they cannot track usage of .onion sites.

The way it works is that the client and server pick a "rendezvous node" (the server generates 6 HSDir entries, each with 3 random nodes every day, and the client picks a random HSDir entry and a random one of those nodes to use). Then, they communicate through the rendezvous node, which doesn't know who the client or server are (because both are connected through Tor circuits and neither reveals the .onion URL that was looked up in the HSDir).

The way the statistics work is that some Tor relays opt-in to sharing statistics about how many HSDir lookups happened through them, and then those figures are extrapolated to figure out how many .onion service accesses happen. The relay doesn't know which service is being looked up, and the rendezvous node doesn't know which service is being talked to.
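To make the extrapolation step concrete, here is a minimal sketch of plain proportional scaling. It is an assumption for illustration, not the Tor Project's actual estimator; the function name and all numbers are hypothetical.

```python
def extrapolate_total(observed_lookups: float, reporting_fraction: float) -> float:
    """Scale lookups seen by reporting relays up to a network-wide estimate.

    reporting_fraction is the (assumed known) share of all HSDir lookups
    that the opt-in relays are expected to handle.
    """
    if not 0 < reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be in (0, 1]")
    return observed_lookups / reporting_fraction


# Hypothetical figures: reporting relays saw 12,000 lookups and are
# assumed to handle 1.5% of all lookups on the network.
estimate = extrapolate_total(12_000, 0.015)
print(f"estimated network-wide lookups: {estimate:,.0f}")
```

Note that the input is only a count: nothing about which service was looked up enters the calculation.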

cyphar 2626 days ago

(Correction, 3 introduction points and the client picks the rendezvous point -- so even a compromised introduction point is useless because the node used for communication is different for all communications.)
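A toy model of the corrected description, for illustration only: the service picks 3 introduction points and the client picks a fresh rendezvous point per connection. The relay names and uniform random selection are assumptions; real Tor path selection is weighted and considerably more involved.

```python
import random


def pick_introduction_points(relays, rng, count=3):
    """Service side: choose `count` distinct relays as introduction points."""
    return rng.sample(relays, count)


def pick_rendezvous_point(relays, rng):
    """Client side: choose one relay as the rendezvous point for a connection."""
    return rng.choice(relays)


rng = random.Random(0)  # fixed seed so the toy run is repeatable
relays = [f"relay{i}" for i in range(1000)]  # hypothetical relay list

intro_points = pick_introduction_points(relays, rng)
# Each new connection gets its own client-chosen rendezvous point.
rendezvous_points = [pick_rendezvous_point(relays, rng) for _ in range(5)]

print("introduction points:", intro_points)
print("rendezvous points:  ", rendezvous_points)
```

The point of the split is visible in the model: the introduction points only set up the meeting, while the data flows through a node the client chose independently.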

Izmaki 2627 days ago

I wish people would at least learn the difference between "deep web" and "dark web". ;) I bet you use the "deep web" multiple times each week. The "dark web" on the other hand, probably not.

tgragnato 2627 days ago

Depends who you ask.

I transparently use the darknet continuously every day. Multiple home servers owned by me and my colleagues make up a VPN we share with friends and family.

Amongst the trusted recursive resolvers we use there's the DoT v3 onion from Cloudflare. A proxy redirects our traffic for Facebook and DuckDuckGo over the respective onions, same for Debian updates. A next generation firewall inspects our traffic and uses Tor for some websites that are censored or geoblocked.

friendly_chap 2627 days ago

I do and a lot of people I know also do. Just for added privacy, or anything sensitive but legal.

Tor became such a pleasant (and fast, unlike it used to be) experience that it can be used for general anon surfing.

edoo 2627 days ago

You have to be a little wary using Tor. Anyone can run an exit node and it is trivial to rewrite and inject content onto web pages. You can also intercept SSL requests on the fly and generate your own self-signed certificate that fails proper verification but looks real enough, if inspected, to trick a percentage of users. If you've used Tor with any frequency you've probably hit weird SSL cert errors that go away if you change routes.

friendly_chap 2627 days ago

To be fair I mostly use it for not overly sensitive stuff. Let me give you an idea: I prefer to not have my ISP log my requests to reddit.com/r/LSD.

Not because I do anything illegal (I don't even take acid), but in this dystopian world where every action on the internet is recorded, the last thing I want is to end up on lists purely because of my curiosity.

If I would do anything I could get into trouble for (which I won't), I would definitely research more about how to use Tor safely.

ikornaselur 2627 days ago

Please correct me if I'm wrong, but can't your ISP only see that you're requesting reddit.com, as long as you're using https? Now sure, if you go to lsd.reddit.com, it can be logged as a subdomain, but anything beyond reddit.com shouldn't be viewable by your ISP.

I'm not saying that you shouldn't use tor, just that as far as I understand, the whole request, including path and method, is encrypted over tls/ssl after your browser establishes a tcp connection to the server.

friendly_chap 2627 days ago

I do believe the url path is visible even over HTTPS. Off to do some research on this.

Edit: apparently the URL path is not visible, but the domain is (more precisely the IP, which can easily be resolved to a domain).

Same thing still applies, perhaps not with reddit subreddits, but with specific domains/websites.

tialaramex 2627 days ago

With ordinary DNS you are asking in plain text "hey, what's the IP address for reddit.com?" and it does not take a genius to guess that's because you're visiting reddit.com.

With HTTPS using TLS 1.2 or earlier the site sends its certificate in plaintext too, so even if you just remember the IP address, it will tell anybody snooping "Hi, this is reddit.com".

In TLS 1.3 the site's certificate is encrypted. However the SNI, which is used to make virtual hosting work, is not encrypted. So your ISP can see where you said you were going, but not whether they proved they were the real deal.

DPRIVE such as DNS over HTTPS cures the first thing, you use an encrypted transport to do DNS queries against somebody trustworthy who won't rat you out.

eSNI (encrypted SNI) is intended to one day cure the other problem.

Even with both these, seeing that you visited a very popular system like Facebook or Reddit is always going to be easy. So Tor remains important.
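As a rough illustration of the split described above, this sketch separates a URL into what an on-path observer can typically infer and what travels inside the encrypted stream. It assumes HTTPS with ordinary DNS and unencrypted SNI; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def observer_view(url: str) -> dict:
    """Split an HTTPS URL into parts an on-path observer can and cannot see.

    Assumes plain DNS and unencrypted SNI, so the hostname leaks;
    the path and query are sent only inside the TLS session.
    """
    parts = urlsplit(url)
    return {
        "visible": {"hostname": parts.hostname, "port": parts.port or 443},
        "encrypted": {"path": parts.path, "query": parts.query},
    }


view = observer_view("https://www.reddit.com/r/LSD?sort=new")
print(view["visible"])
print(view["encrypted"])
```

With DNS over HTTPS and encrypted SNI the hostname would move out of the "visible" group, but the destination IP address would still be observable.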

Grangar 2627 days ago

Your ISP won't log the request going to /r/LSD. It's over SSL, so the only thing your ISP sees is a request to reddit.com.

friendly_chap 2627 days ago

You are correct. Domains can be still sensitive though.

cyphar 2627 days ago

It is fair to say that using unauthenticated protocols like HTTP over Tor is a pretty bad idea (and there really should be more warning bells about this in the Tor Browser). However on the TLS comment -- almost all modern websites use HSTS, so sslstrip doesn't really work any more.

edoo 2627 days ago

I mean you can intercept the request, retrieve the real cert, generate a self signed cert with the exact same details, then submit that to the user and be man in the middle. Of course the user gets the blank SSL cert error page on the browser, but a percentage of those users will override and continue. Copying the cert details increases that percentage as some will actually look at the invalid cert. It is quite blatant but it is just a numbers game at that point. If you ever hit an SSL cert error with TOR you should force a new onion path.

cyphar 2627 days ago

Yes, you could do that but then your node would be kicked off the Tor network (because you'd need to do it indiscriminately since you don't know who the user is you're trying to target). In addition, relays are load-balanced based on trustworthiness and bandwidth so in order to attack a significant portion of users you'd need to be running a large and trusted node (which would be hard to do if you're just doing this to attack people).

edoo 2627 days ago

I wasn't aware that Tor tested services and had a trustworthiness score but an attack like that could still be quite useful for certain purposes and possibly stay well hidden. If you set something up that only did it for Google IP blocks for example it might go undetected. If you actually got shut down you could refine it by only targeting a small percentage of those users. There would be some rate of account collection, however small.

throwaway66hack 2627 days ago

How about sslstrip2 ([1], check demo)? A weakness of HSTS is that it is stored per domain, and the exit node can also control your DNS traffic. I wonder how hard it is to pull this off as a Tor exit node; for local networks there are tools like bettercap [2].

[1] https://github.com/byt3bl33d3r/sslstrip2

[2] https://www.bettercap.org/legacy/

cyphar 2627 days ago

That is a pretty neat attack, but I disagree it would be useful against Tor.

DNS traffic is funneled through a different Tor circuit than the web traffic. You'd need to apply the bad DNS to all users, which would almost certainly result in your exit node being dropped from the network.

I'm also not sure how this would be handled with HSTS preload lists -- HSTS preload applies to all subdomains so you'd need to come up with a completely different domain (and protections against homograph attacks mean that avenue is restricted). It'd probably be simpler to just set up an actual website with LetsEncrypt than to bother with stripping the TLS in this manner.
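A minimal sketch of why a preload entry that includes subdomains defeats the "invent a nearby subdomain" trick. The preload table here is hypothetical (a dict mapping domain to an include-subdomains flag), not a real browser's list or lookup code.

```python
def covered_by_preload(host: str, preload: dict) -> bool:
    """Return True if `host` must use HTTPS according to the preload table.

    `preload` maps a domain to True when the entry also covers subdomains.
    """
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in preload:
            # Exact match always applies; a parent match applies only
            # when that entry includes subdomains.
            if i == 0 or preload[candidate]:
                return True
    return False


# Hypothetical entries for illustration.
PRELOAD = {"example.com": True, "example.org": False}

print(covered_by_preload("wwww.example.com", PRELOAD))  # made-up subdomain
print(covered_by_preload("www.example.org", PRELOAD))   # entry excludes subdomains
print(covered_by_preload("examp1e.com", PRELOAD))       # a different domain entirely
```

Only the last case escapes the list, which matches the point that an attacker would need a completely different domain.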

throwaway66hack 2627 days ago

You are right. With different Tor circuits, the attacker needs to control a lot of exit nodes to correlate the initial HTTP request for the ssl-stripped page with the DNS query (i.e. to be a global adversary).

sametmax 2627 days ago

They do; not everybody enjoys democracy.

r3bl 2627 days ago

Unfortunately, such cases will always be less appealing to write about compared to "assassins for hire on the dark web", leading to this wrong generalization of what Tor is about.

mikorym 2627 days ago

If I'm using the right definition of the dark/deep web, then it includes traffic to and from any networked entity that does not have a URL (or an otherwise public frontend).

This could then include stored data, VPNs or other company/govt/organisational data that is not accessible via normal web traffic.

bepvte 2627 days ago

I believe that's just the definition for deep web.

r3bl 2627 days ago

Both terms are just stupid.

Deep web: stuff not indexed by search engines. Private forums, non-public social media accounts, Telegram rooms, Discord servers etc. are technically "deep web".

Dark web: a subset of deep web that requires specific software or configuration to access. Slightly more precise, but still includes every possible use case for IPFS, Dat, ".onion" etc. Note that this is nowhere close to what people usually mean when they use the term "dark web". They're referring to the subset of a subset of deep web that's used for criminal activities.

rendx 2627 days ago

There are gateways to onion services and IPFS, so those are "indexed by search engines" without any change necessary. Furthermore, any search engine has to be adapted to the medium used, and there are specialized search engines for pretty much anything including Freenet and I2P etc, so saying that the "dark web" is a subset of the "deep web" is incorrect. There is some overlap, but it's not a "part of" relationship.

The problem is that there is one (academic) definition of "deep web", but many incompatible definitions of "dark web", invented by the media basically for whatever they want it to be.

superkuh 2627 days ago

I host all kinds of completely normal websites (ie, amateur radio) as tor hidden services. TOR is great because you actually own your domain instead of just leasing it on the whim of some corporation.

Once you get past the controversy TOR hidden services are more like the 1990s web than what you describe.

Nursie 2627 days ago

I've used it to maintain normal(ish) internet service for myself when visiting places like China.

edoo 2627 days ago

When I went to China I expected problems, so I set up my laptop with an SSL tunnel on port 443 to a virtual server and then routed openvpn over that. It worked like a charm. My favorite feature of openvpn is that it can maintain state, so even if the tunnel resets and openvpn has to reconnect, all the TCP connections just pick up where they left off.

walrus01 2627 days ago

This will work for a short while, but consistent long-term openvpn-matching packets are now detected by the GFW's automated DPI systems, and eventually the IP of your non-China VPN endpoint will get blocked.

Nursie 2627 days ago

That sounds much more prepared than I was, I arrived and then wanted a quick solution on the fly, Tor fit the bill nicely.

I would probably use my StrongSwan IPSEC VPN setup to home now that I have one.

mikorym 2627 days ago

What is the difference between this and just having a normal SSH tunnel; for example, how does this differ from using sshuttle?

edoo 2627 days ago

Openvpn allows you to connect to and have a routable IP on the network. SSH tunnels are great for some things but being logically on a network is another thing.

bepvte 2627 days ago

For one, openvpn can use udp unlike ssh, which means the annoying overhead of double tcp is gone

mikeq101101 2627 days ago

Thank you for your service.