by ohmygodel 2320 days ago
Running a hosting server for onion services, as was done in this case, is a terrible idea. It greatly increases the risk of deanonymization. The question is less how this hosting service was discovered and more how it ever stayed up long enough to become so notorious. Here's why:

1. Each hidden service chooses a "guard" relay to serve as the first hop for all connections.

2. A server running multiple hidden services has a guard for each of them. Each new guard is another chance to choose a guard run by the adversary.

3. An adversary running a fraction p of the guards (by bandwidth) has a probability p of being chosen by a given hidden service. A hosting service with k hidden services is exposed to k guards and thus has ~kp probability of choosing an adversary's guard. With, say, 50 hidden services, an adversary with only 2% of guards has nearly 100% chance of being chosen by one of those 50 hidden services.

4. The adversary can tell when it is chosen as a guard by connecting to the hidden service as a client and looking for a circuit with the same pattern of communication as observed at the client. Bauer et al. [0] showed long ago that this works even using only the circuit construction times.

5. The adversary's guard can observe the hidden service's IP directly.

The risk of deanonymization with onion services in general (i.e. even without an onion hosting service) is significant against an adversary with some resources and time. Getting 1% of guard bandwidth probably costs <$500/month using IP transit providers (e.g. relay 8ac97a37 currently has 0.3% guard probability with only ~750Mbps [1]). And every month or so a new guard is chosen, yielding another chance to choose an adversarial guard. Not to mention the risk of choosing a guard that isn't inherently malicious but is subject to legal compulsion in a given jurisdiction (discovering the guard of a hidden service has always been and remains quite feasible with little time or money, as demonstrated by Øverlier and Syverson [2]).

[0] "Low-Resource Routing Attacks Against Tor" by Kevin Bauer, Damon McCoy, Dirk Grunwald, Tadayoshi Kohno, and Douglas Sicker. In the Proceedings of the Workshop on Privacy in the Electronic Society (WPES 2007), Washington, DC, USA, October 2007.

[1] https://metrics.torproject.org/rs.html#details/014E24C0CD21D...

[2] "Locating Hidden Servers" by Lasse Øverlier and Paul Syverson. In the Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
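To put rough numbers on the single-service case described above, here is a minimal Python sketch. It rests on assumptions that are not in the comment itself: guard share is taken to scale linearly with bandwidth (using the quoted 0.3% at ~750Mbps as the reference point), each guard pick is treated as independent, and one new guard is picked per month. The function names are made up for illustration.

```python
def bandwidth_for_share(target_share, ref_share=0.003, ref_mbps=750.0):
    """Scale the reference relay's bandwidth linearly to a target guard share.

    Assumption: guard selection probability is proportional to bandwidth.
    """
    return ref_mbps * target_share / ref_share


def p_bad_guard_single(p, picks):
    """Chance that at least one of `picks` independent guard choices
    lands on an adversary holding a fraction `p` of guard selection."""
    return 1.0 - (1.0 - p) ** picks


if __name__ == "__main__":
    print(f"~{bandwidth_for_share(0.01):.0f} Mbps for a 1% guard share")
    for months in (1, 6, 12, 24):
        # Assumption: one fresh guard pick per month, as in the comment.
        risk = p_bad_guard_single(0.01, months)
        print(f"{months:>2} months: {risk:.1%}")
```

Under these assumptions a 1% adversary needs on the order of a few Gbps, and a single onion service faces a risk that grows roughly linearly with the number of guard rotations while the per-pick probability stays small.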

10 comments

This is some great info for the less technically knowledgeable about Tor (like me!). However, I think your math in #3 is wrong.

Assuming random assignment/selection of the guards, each time one is chosen it has a 98% chance of not being "caught" by choosing an adversary's guard. Going with 50 services as you said would be .98^50=.364, meaning the chance of getting caught is 1-.364=.636, or 63.6%. This is vastly different from being nearly 100%.

Fair enough! I was using as a heuristic the expected number of compromised guards, which would be 0.02*50 = 1. Moreover, things degrade exponentially over time. If half the guards rotate every month, the chance of choosing a bad guard after 2 months is >86%, after 4 months is >95%, and after 6 months is >98%.
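For anyone who wants to check the arithmetic in this exchange, here is a small Python sketch. It assumes independent guard picks and reads "half the guards rotate every month" as k/2 fresh picks per month on top of the initial k. That reading is an assumption, though it does reproduce the thresholds quoted above. The function names are illustrative only.

```python
def p_any_bad_guard(p, guards_seen):
    """Chance that at least one of `guards_seen` independent picks
    is a guard run by an adversary with guard fraction `p`."""
    return 1.0 - (1.0 - p) ** guards_seen


def total_guards_seen(k, months, monthly_turnover=0.5):
    """Initial k guards plus the fresh picks made by rotation.

    Assumption: a fraction `monthly_turnover` of the k guards is
    replaced by newly chosen guards every month.
    """
    return k + round(k * monthly_turnover * months)


if __name__ == "__main__":
    p, k = 0.02, 50
    print(f"expected bad guards (heuristic kp): {k * p:.1f}")
    print(f"at setup: {p_any_bad_guard(p, k):.1%}")
    for months in (2, 4, 6):
        seen = total_guards_seen(k, months)
        print(f"after {months} months ({seen} guards): "
              f"{p_any_bad_guard(p, seen):.1%}")
```

The kp heuristic is the expected number of adversarial guards, which only approximates the probability while kp is well below 1. The exact form 1-(1-p)^k is what saturates toward 100% as more guards are seen.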
There was a post a week or two ago from a person running a legit Tor service who was analyzing all of the attacks he received.

He said something seemed to be dos'ing the guard nodes, causing his service to automatically choose a new guard, in an attempt to get his service to connect to a guard node controlled by the adversary. He said in one case, they found his server's actual IP address and dos'd it.

Could that be what happened?

I assume you refer to [0]. He says "If [the adversary] can knock me off enough guards, my tor daemon will eventually choose one of his guards. Then he can identify my actual network address and directly attack my server. (This happened to me once.)" I question how the author is sure this is what happened to him. But he may be right, and moreover that attack may have been performed against the "dark web tycoon" that is the subject of this post. However, it does seem to be somewhat challenging to perform, as Tor keeps trying to use all recent guards ever contacted, and so you'd have to simultaneously make all chosen guards unresponsive until a malicious guard is selected.

[0] http://www.hackerfactor.com/blog/index.php?/archives/868-Dea...

> 5. The adversary's guard can observe the hidden service's IP directly.

So does the guard know that it is a guard and that the traffic comes from a hidden service? I thought Tor worked by jumping from node to node, and that each node didn't know whether the traffic came from the original client/service or from another node in the chain. So each time you make a connection over Tor you're essentially telling a guard node "here's my real IP, send this traffic to this hidden service and return the response please" and you have to trust that they keep it a secret? I feel like I'm missing something here.

The Tor protocol doesn't explicitly signal the guard relay that it is in the guard position. However, the guard relay (call it R) can use several indicators to conclude that the preceding hop (call it S) is indeed the source (e.g. the onion service):

1. S is at an IP address that is not a public Tor relay as listed in the Tor consensus. It's not impossible that S is a bridge (i.e. private Tor relay), but statistically unlikely because using a bridge isn't all that common.

2. During circuit construction, S extends the circuit beyond R two times. I don't see why Tor couldn't easily create dummy circuit extensions to fool R, but it doesn't (probably because there are so many other indicators that this change alone wouldn't solve the problem).

3. R observes what appear to be HTTP-level request-response pairs between it and S at about the same round-trip time (RTT) as the RTT R observes between it and S at the TCP layer, which should only happen if there were no more hops beyond S.

If I recall correctly, Kwon et al. [0] describe several more statistical indicators of being a guard for an onion service.

Also, you are right that a client doesn't tell the guard node the destination (e.g. the onion service) of its traffic. The guard node is not trusted with that because it already directly observes the client, and so giving it the other side would deanonymize the connection.

[0] https://www.usenix.org/conference/usenixsecurity15/technical...
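The three indicators listed above can be written down as a toy check. This is only a sketch of the reasoning, not anything taken from Tor's code: the `HopObservation` record, its fields, the tolerance value, and the example addresses (from documentation ranges) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HopObservation:
    """What a relay R might note about the previous hop S (hypothetical)."""
    prev_hop_ip: str    # address S connects from
    extends_seen: int   # circuit extensions made beyond R
    app_rtt_ms: float   # apparent request/response round-trip time
    tcp_rtt_ms: float   # round-trip time measured at the TCP layer


def source_indicators(obs, consensus_ips, expected_extends=2, rtt_tolerance=0.25):
    """Return the names of the indicators that suggest S is the source."""
    fired = []
    # 1. S is not a public relay listed in the consensus.
    if obs.prev_hop_ip not in consensus_ips:
        fired.append("not-a-listed-relay")
    # 2. S extended the circuit beyond R the expected number of times.
    if obs.extends_seen == expected_extends:
        fired.append("extend-count")
    # 3. Application-level RTT is close to the TCP-level RTT to S.
    if abs(obs.app_rtt_ms - obs.tcp_rtt_ms) <= rtt_tolerance * obs.tcp_rtt_ms:
        fired.append("rtt-match")
    return fired


if __name__ == "__main__":
    consensus = {"192.0.2.1", "192.0.2.2"}  # stand-in for the relay list
    obs = HopObservation("203.0.113.7", extends_seen=2,
                         app_rtt_ms=42.0, tcp_rtt_ms=40.0)
    print(source_indicators(obs, consensus))
```

No single indicator is conclusive (a bridge would trip the first one, for instance), which is why they are reported separately rather than folded into one yes/no answer.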

I always assumed the issue was not just finding the servers, but that they are often in countries that are hostile to US law enforcement.

You can do fancy attacks all you want, if the server is in Russia they're probably not going to be honoring any MLATs

Wasn't this 2013??

It's 2020 now so much has to have changed. Tor sucked 7 years ago.

Tor has made some improvements that would reduce the threat of deanonymizing an onion service, but none affect the above analysis (or rather, the above analysis has taken them into account). The main improvements, in my opinion, have been:

1. The biggest improvement is that (in 2014 or 2015?) they reduced the number of entry guards from 3 to 1 [0], reducing the risk of a malicious guard by a factor of 3.

2. The time until a guard choice expires was increased from 2–3 months to 3–4 [1] (this maybe happened 3 years ago?). This increases by ~40% the expected time an adversary would need to passively wait to have his relay selected as a guard by a victim.

3. The bandwidth threshold to become a guard relay was raised from 250KB/s to 2000KB/s [2] (looks like in 2014). However, 2000KB/s=16Mbit/s is still a very low bar, and, moreover, for an adversary that can run relays above the threshold, this change increases the adversarial guard fraction as there are fewer guards above the threshold to compete with.

4. A new guard-selection algorithm was implemented that prevents a denial-of-service attack from forcing a large number of guards (i.e. > 20) to be selected in a short period of time [3]. I believe this was merged in 2017. If an adversary can force guard reselection by an attack, you are still extremely vulnerable, though, as a limit of 20 still provides a 20x risk multiple.

[0] https://trac.torproject.org/projects/tor/ticket/12688

[1] https://trac.torproject.org/projects/tor/ticket/8240

[2] https://trac.torproject.org/projects/tor/ticket/12690

[3] https://trac.torproject.org/projects/tor/ticket/19877
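The figures in the list above are easy to sanity-check. The Python sketch below assumes the "~40%" comes from comparing the midpoints of the two expiry ranges, and that forced guard picks are independent. Both are assumptions made for illustration, as are the function names.

```python
def expiry_increase(old_range=(2, 3), new_range=(3, 4)):
    """Relative increase in guard lifetime, comparing range midpoints.

    Assumption: expiry times are spread evenly over each range.
    """
    old_mid = sum(old_range) / 2
    new_mid = sum(new_range) / 2
    return (new_mid - old_mid) / old_mid


def kbytes_to_mbit(kbytes_per_s):
    """Convert KB/s to Mbit/s (1 byte = 8 bits, decimal prefixes)."""
    return kbytes_per_s * 8 / 1000


def forced_reselection_risk(p, max_guards=20):
    """Chance of hitting an adversarial guard if an attacker can force
    up to `max_guards` independent guard picks."""
    return 1.0 - (1.0 - p) ** max_guards


if __name__ == "__main__":
    print(f"expiry increase: {expiry_increase():.0%}")
    print(f"2000 KB/s = {kbytes_to_mbit(2000):.0f} Mbit/s")
    for p in (0.001, 0.01, 0.02):
        risk = forced_reselection_risk(p)
        print(f"p={p}: risk {risk:.1%}, multiple {risk / p:.1f}x")
```

The 20x multiple is the small-p limit. As the adversary's guard fraction grows, the multiple from 20 forced picks falls somewhat below 20, though the absolute risk keeps rising.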

These are well-known attacks. In the case of Freedom Hosting, this may have been how the server was found. Mitigations exist. Today, big illegal darknet websites run lots of Tor servers on their own. You can also manually set trusted guards or other nodes in the chain so no malicious node will ever be part of your path through the network.
Yes, if you manually and wisely choose your own guard nodes, then you can avoid these attacks. You should be sure that those guards can't themselves be linked to you, either.
Interesting. Looking for more info on what you were talking about (with regard to "guards"), I dug up this post[1] which has some info too.

[1]: https://blog.torproject.org/announcing-vanguards-add-onion-s...

This is probably the best description of how Tor uses guards: https://gitweb.torproject.org/torspec.git/tree/guard-spec.tx....
The page you link describes "vanguards", which apply the guard logic to positions beyond the first hop. They are only available as a plug-in that you must separately download and configure. My understanding is that no plans currently exist to integrate vanguards into Tor, due to the cost of the engineering challenges that would appear if everybody were to use them (especially how they would affect load balancing).
Thanks for the follow up info and additional explanation!
That only leads you to the server though, not to the person managing it.
In this case, the main question is how the server was discovered, not how the operator was then deanonymized. As the article describes, after the server was discovered to be in France and run by OVH, authorities used legal treaties ("MLATs") to obtain the subscriber information, leading them to the person who recently pleaded guilty in court.
This seems incredibly naive. Who would register a VPS hosting different kinds of the most illegal content imaginable using their real name or IP address? Even if they thought hidden services were impenetrable, there are always other possible slip-ups you could make which could disclose the server's real IP, and of course they'd be ignorant to think any security measure is impenetrable, including Tor.

DPR made extremely careless mistakes, too, to the point that even a random amateur investigator could've identified him, using only Google.

It's shocking how many of these people aren't caught sooner when they don't even know OPSEC 101.

To people who were paying attention to the wishful thinking at the time about tor's security guarantees, it doesn't seem so incredible.
Sure, but even if you assumed Tor was perfectly secure, there are still other ways of being exposed (like someone causing your web server to issue a network request to a host they control).

No matter one's assumptions, it makes no sense to me that someone would register a VPS with their own information when it's pretty trivial to do so anonymously. Especially if you're running an illegal content hosting empire.

DPR's mistakes at least made sense to me; they're something anyone could have overlooked, even if they were still very naive mistakes. But I doubt DPR used his personal information when paying for servers. That's well beyond "unrealized mistake" into pure incomprehensibility.

They supposedly caught on to him by connecting an email address associated with DPR to his real-world identity. Wouldn't surprise me if that was an ex post facto lie concocted to conceal the true method, though.
But that's all they need though.

A simple national security letter (NSL) without even needing to get a warrant and BOOM you can tap the server and get all info about the person running it.

Not if the server is paid for anonymously and you only connect to it over tor. That connection isn't through a hidden service and so isn't vulnerable to this attack.
A national security letter cannot compel someone to tap a server for the government or allow the government to tap a server. An NSL can only request existing collected records. So, for example, an NSL could request any logs a service provider has regarding who paid for the server or any access logs they retain regarding the server. If they do not have any logs, an NSL can't compel them to start collecting them. An NSL which requests actions or information outside of the scope allowed by law can be challenged in court.
That's a very good explanation!
Saving this answer, thanks!