by feral 4959 days ago
>Given that Bitcoin records all transactions for posterity, and given the ongoing rise of "big data" analytics, I'd say Bitcoin is likely to be harder, in the long run, to use for shenanigans. A government currency has forms in which transactions create no paper trail. Bitcoin does not.

I did some research on anonymity (mentioned in that document); I find it hard to project how private Bitcoin will be in future.

Our impression was that many users were currently careless, and that many identities (in the form of publicly disclosed Bitcoin address ownerships) were linked to meaningful transactions, such that with basic network analysis it was possible to passively observe semantically meaningful transactions, like "the person who owns this Twitter account, which seems to be a real person, donated to WikiLeaks" or "this account which is a public organisation donation address is linked to an address that transferred bitcoins to that other organisation".

We speculated that if Bitcoin became widely used, without changes in usage patterns, then a large e-commerce site (someone like Amazon accepting payments in Bitcoin - leaving questions of scalability aside) could passively observe much of what was going on on the network, because they had so many known identity-address pairs to start with (e.g. shipping addresses).
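As a rough illustration of that kind of passive observation, here is a minimal Python sketch. It assumes the widely used multi-input heuristic (addresses spent together as inputs of one transaction are treated as having one owner); that heuristic is an assumption for this example, not necessarily the method used in the research mentioned above. The transaction layout, the addresses, and the helper names `cluster_addresses` and `label_clusters` are all hypothetical.

```python
from collections import defaultdict

def cluster_addresses(transactions):
    """Group addresses with union-find under the multi-input heuristic (an assumption)."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving keeps the trees shallow
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            find(addr)                 # register every input address
        for other in inputs[1:]:
            union(inputs[0], other)    # co-spent inputs share one owner
        for addr in tx["outputs"]:
            find(addr)                 # outputs start as their own clusters

    clusters = defaultdict(set)
    for addr in parent:
        clusters[find(addr)].add(addr)
    return list(clusters.values())

def label_clusters(clusters, known_identities):
    """Spread each known identity to every address in the same cluster."""
    labels = {}
    for cluster in clusters:
        names = {known_identities[a] for a in cluster if a in known_identities}
        if len(names) == 1:            # skip unlabelled or ambiguous clusters
            name = next(iter(names))
            for addr in cluster:
                labels[addr] = name
    return labels

# Hypothetical data: a merchant knows who owns A1 from a shipping address.
transactions = [
    {"inputs": ["A1", "A2"], "outputs": ["SHOP"]},
    {"inputs": ["A2", "A3"], "outputs": ["DONATION"]},
    {"inputs": ["B1"], "outputs": ["SHOP"]},
]
known_identities = {"A1": "alice"}

labels = label_clusters(cluster_addresses(transactions), known_identities)
```

In this toy data one known pair (A1, alice) is enough to attribute the donation paid from A2 and A3 to the same person, while B1 stays unlabelled. The more identity-address pairs an observer starts with, the more clusters get a name.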

But the other argument is that it's probably relatively easy for usage patterns to change.

I think you have to assume that end-users will always be careless. We see this in almost every security setting. So it doesn't matter whether it's possible for sophisticated users to guard their privacy, if you don't get privacy by default.

But people could build overlay systems which are backed onto Bitcoin. A lot of the wallet services are like this already, and are not readily amenable to blockchain-level analysis (although of course you are then trusting your wallet service with your privacy and money). Alternatively, core or client developers could add protocol-level or low-level Bitcoin mixing (again, with an overhead cost, so there might be scalability concerns), or develop client interfaces which encourage more privacy by default.

It's too early to tell how observable/analyzable it'll be in steady state, if it builds traction. I think it's possible the system will end up much more observable, for casual users, than cash or even credit cards currently are, but I don't think that's inevitable.


It seems that to do shenanigans with Bitcoin would require the same care to avoid information leakage that would be required to do similar things with global digital payment networks and banks.

The key would be to begin by considering any address linked to an identity that is you or related to you "dirty," and to be careful about avoiding linkage to any of those dirty addresses. To really be careful I think you'd have to delve into graph theory and data mining a bit yourself, or follow the precautions of someone who knows what they're talking about. You'd also have to take care to pay attention to network addresses under the protocol, using Tor or similar proxying systems and considering any address that's been used from an IP that can be linked to you similarly "dirty." Use of VPSes that accept anonymous payment would also be an option, though again... don't SSH to them from your house! Use Tor or smurf the data around by way of proxies and drop sites and such.

Laundering money would require extreme caution to avoid such contamination, and would present many of the same challenges as money laundering in the fiat currency world. BTC/fiat conversions would be very risky. In-person BTC/fiat conversions are vulnerable to old-fashioned gumshoe police work: "hey, that BTC you exchanged on localbitcoin... you happen to remember what that guy looked like?" These are also only feasible for small quantities. Large quantities would present a huge challenge.

What this really means is that Bitcoin is not intrinsically an instrument for villainy as some lazy press articles make it out to be. In fact, criminal use of Bitcoin requires orders of magnitude more technical sophistication, which the vast majority of criminals do not have. A highly educated or sophisticated criminal or intelligence network could surely pull off shenanigans with Bitcoin, but I doubt your average thug or child porn wanker is going to even comprehend the stuff I wrote above. So at the very least, Bitcoin is only a criminal tool for very geeky high-IQ criminals.

If they were to automate some big system that cycles money among new addresses throughout the network while preserving ownership, then any address connected with someone would be instantly emptied and its money mixed in with numerous unknown addresses.

Unless you intend to prosecute everyone who spends BTC that was ever connected with a "known, even address" (which could very well be an option they take!), anonymity and laundering capability is preserved.

No, such a simple approach would be vulnerable to "big data" mining: it would show up as an unusual cluster all connected to itself. Non-laundering transactions splay out quickly.

You mean, connected to a constant stream of new addresses. And wouldn't finding such a cluster be NP-complete?

Finding clusters in graphs is a big research interest in the research group I'm part of.

When we started looking at Bitcoin we thought that we would have to use such sophisticated algorithms to uncover interesting structure, but it turned out to be much easier than we expected to find structure and meaning, so we never got too sophisticated.

There's a very active field of research on these cluster finding algorithms - the term to search with is 'community finding algorithms' - http://en.wikipedia.org/wiki/Community_structure has a reasonable introduction.

>And wouldn't finding such a cluster be NP-complete?

That isn't a problem, in practice.

Finding the maximum clique in a graph is an NP-hard problem (its decision version is NP-complete), which you might be thinking of - but 'clique' is a stricter definition than most people would use for 'community' (in that a clique requires all nodes to be connected to each other), and even then clique-finding algorithms that run quickly in practice exist (e.g. the Bron-Kerbosch algorithm).
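For reference, a minimal Python sketch of Bron-Kerbosch with pivoting. Note that it is an exact enumeration of all maximal cliques with exponential worst-case time, and tends to be fast on sparse real-world graphs. The adjacency-dict layout and the toy graph are assumptions for illustration only.

```python
def bron_kerbosch(graph):
    """Yield every maximal clique of an undirected graph {node: set(neighbours)}."""
    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored nodes
        if not p and not x:
            yield set(r)               # nothing can extend r, so it is maximal
            return
        # Pivot on the node with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())

# Toy graph: a triangle a-b-c with a pendant edge c-d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

cliques = list(bron_kerbosch(graph))
largest = max(cliques, key=len)
```

The largest of the maximal cliques is a maximum clique, which is how an exact enumerator like this gets used for the maximum-clique question on graphs small or sparse enough to allow it.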

Some community finding algorithms have objective functions which are NP-hard to maximise, but again, fast heuristics are often available.
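One commonly cited objective of this kind is Newman-Girvan modularity; picking it here is an assumption for illustration, since the thread does not say which objective functions were meant. Scoring a given partition is cheap, as in the sketch below. The hard part is searching over all partitions for the best score, which is where heuristics come in. The function name and toy graph are hypothetical.

```python
def modularity(edges, communities):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    community_of = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)  # edges with both ends inside community i
    degree = [0] * len(communities)    # total degree of the nodes in community i
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge share) - (expected share)
    return sum(
        internal[i] / m - (degree[i] / (2 * m)) ** 2
        for i in range(len(communities))
    )

# Toy graph: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

q_split = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(edges, [{1, 2, 3, 4, 5, 6}])
```

Splitting the toy graph into its two triangles scores higher than lumping everything into one community; a heuristic would move nodes between groups looking for changes that raise this score.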

Consequently, there are many good community finding algorithms out there that will quickly find clusters on networks the size of the Bitcoin graph - we ran some, but we didn't do much with their output.

It's difficult to dig into such problems without a ground truth, and again, we could uncover a lot of meaning using simpler techniques.

Thanks for the explanation!

I think the benefit of this cycling, though, is in the size, not the obscurity. That is, if 60% of the users (and 99% of the addresses) are cycling money to obscure connection to a person, then either:

- You have to accept that "Joe spent a bitcoin that was once in a crime" is insufficient evidence Joe had any connection whatsoever to it, since "most users have touched that bitcoin too"; or

- You have to make it a crime to be a part of such a cycler altogether, which would effectively require an outright ban on Bitcoin.

These conclusions follow no matter how much structure to the trades you can detect.
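A toy Python sketch of the "most users have touched that bitcoin too" point: it measures what fraction of addresses sit downstream of one flagged address, with and without a shared cycler. The payment lists, the `mix` address, and the `tainted_fraction` helper are hypothetical, and simple reachability is a deliberately crude stand-in for any real taint measure.

```python
from collections import deque

def tainted_fraction(payments, source):
    """Fraction of all addresses reachable downstream from `source`."""
    graph = {}
    addresses = set()
    for sender, receiver in payments:
        graph.setdefault(sender, set()).add(receiver)
        addresses.update((sender, receiver))
    if not addresses:
        return 0.0
    seen = {source}
    queue = deque([source])
    while queue:                       # breadth-first walk along payment edges
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen & addresses) / len(addresses)

# Without a cycler, coins from "crime" only ever reach address "a".
no_cycler = [("crime", "a"), ("b", "c"), ("d", "e"), ("joe", "f")]

# With a cycler, "a" pays into a shared mix that pays out to many users.
with_cycler = no_cycler + [("a", "mix"), ("mix", "c"), ("mix", "e"), ("mix", "f")]

before = tainted_fraction(no_cycler, "crime")
after = tainted_fraction(with_cycler, "crime")
```

In the toy data the flagged coins reach a quarter of the addresses without the cycler and two thirds of them with it, so "was downstream of the crime address" says much less about any one holder.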

Determining whether a thing's a single stream would be down to finding where it touched down in the world of either fiat currency exchange, or purchasable stuff.

I don't know if it's NP-complete. But my guess is no. I can immediately think of algorithms (maybe crude ones, I am no statistician) that could be used to attack it, and they require a lot of iteration, not an explosion of recursion.

What about bitcoin tumbling services?

They obscure transactions quickly and cheaply.

As long as they don't keep records - the mixer knows everything. And I'm not aware of any running instances of a decentralized mixer like e.g. described here: http://blog.ezyang.com/2012/07/secure-multiparty-bitcoin-ano...

Obscure being the key word. There's no proof of security there, right?

Obscuring isn't the method... it's the goal. The question is how obscure they make things.