by ihinsdale 2443 days ago
The cost of complying with such regulations would effectively fall on companies as a regressive tax. Which companies benefit in relative terms from a regressive tax? The already-large ones.
4 comments

Only large companies are capable of building a car that complies with emissions and safety expectations, but that doesn't mean that we'd be better off if there were tens of millions of black-smoke-spewing death-trap shitboxes on the freeway.

Frankly, given the long history of data abuse, security problems, and generally anti-human practices by IT and tech companies, I'd be perfectly fine if cowboy developers were not allowed to touch this product space.

That's not to say that large tech companies are blameless angels in any of this.

Unfortunately I think your view is going to become the prevailing view and the internet is going to become cable 2.0. People are just obsessed with finding corners to round.

For some reason people think comparing loss of privacy when voluntarily visiting a website to deaths caused by car accidents isn't intellectually dishonest.

Why would it be?
You're almost right.

Sure, I don't value my privacy, the security of my financial information, etc., as much as my life.

But it's close.

Again a false equivalence. Financial information is already protected by numerous regulations. We are talking about generic cookies that are being used for tracking user behavior for a large number of useful and benign purposes. Most of the data is already pseudonymous.
So it just falls on users as a regressive theft instead. Great!
Many data protection regulations, GDPR included, have exemptions that ensure that smaller organizations are not impacted until they are sufficiently large to bear responsibility for and provability of their actions.
I’d agree that theoretically the regulations could be made more complex in order to mitigate their regressive second-order effect. I believe that familiarity with the evolution of regulation in other domains should disabuse one of the idealistic notion that such complexity would “ensure that smaller organizations are not impacted” by the regulation. In practice, such complexity has a way of inhibiting companies from growing, for one thing because it creates levers which the incumbents can co-opt to make life more difficult for their aspiring competition.
Yang is also fairly pro-blockchain. One of the key innovations of blockchains is being able to encode data as a property right and enforce it through software, moving the cost of compliance from the legal system (expensive) to the CPU (cheap).
How do blockchains help you "enforce" restrictions on downstream uses of personal data?
Encrypt your data; publish to a public blockchain. Instead of ever giving out the data itself, you give out the right to do something with that data, in the form of signing a method call on a smart contract platform. This is basically the capability security model applied to data in the cloud. Capabilities have well-known patterns for things like revoking access over time (see e.g. the Membrane Pattern).
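
A rough sketch of the revocation idea, in plain Python rather than on any real smart contract platform. It shows the simplest revocable forwarder (sometimes called a caretaker), not a full membrane, and every name in it (`make_revocable`, `count_visits`, the sample data) is made up for illustration:

```python
class Revoked(Exception):
    """Raised when a revoked capability is invoked."""


def make_revocable(target):
    """Wrap `target` in a forwarder that the grantor can later switch off.

    Returns (capability, revoke). The holder only ever sees `capability`,
    never `target` itself.
    """
    cell = {"target": target}

    def capability(query):
        fn = cell["target"]
        if fn is None:
            raise Revoked("access was revoked by the data owner")
        return fn(query)

    def revoke():
        cell["target"] = None

    return capability, revoke


# Hypothetical private data that never leaves the owner's side.
_location_history = ["40.7,-74.0", "40.8,-73.9"]


def count_visits(prefix):
    # The only operation exposed: an aggregate, not the raw records.
    return str(sum(1 for p in _location_history if p.startswith(prefix)))


cap, revoke = make_revocable(count_visits)
first = cap("40.")  # works while the grant is live
revoke()            # any later call to cap(...) raises Revoked
```

The point is only that the grantee holds a handle to an operation, not the data, and the owner keeps the switch.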

To prevent cases where the "something" that the smart contract does is "copy all your data bit-for-bit and upload it to my evil masters", you could perhaps apply information entropy, on a platform level, to the source and output data, and only allow the transaction if the output data contains many fewer bits than the input. So say you have all of your location data and review history on the blockchain, encrypted with a private key known only to you, and you want to grant an application the ability to recommend nearby restaurants that you might like. You authorize the transaction, and expect that the contract will release ~1K to you (a restaurant name, description, menu, reviews, and geocode) and will increase the data stored within its own data ownership by 0 bytes. If it does something otherwise, it's broken the contract, and can be automatically penalized financially (because this is a blockchain, it inherently needs a cryptocurrency).
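
To make the budget check concrete, here is a toy version. It assumes raw byte counts are an acceptable stand-in for information entropy (they are not, really; a proper measure would have to deal with compression and side channels), and `settle`, `Settlement`, the 1024-byte cap, and the deposit of 100 are all invented for the example:

```python
from dataclasses import dataclass


@dataclass
class Settlement:
    allowed: bool
    penalty: int  # amount deducted from the contract's deposit


def settle(input_bytes, output, stored_delta, max_output=1024, deposit=100):
    """Allow a call only if it stays within its declared data budget.

    Byte counts are a crude proxy for the entropy comparison described
    above; this is an illustration, not a sound leakage bound.
    """
    within_budget = (
        len(output) <= max_output      # roughly the ~1K restaurant record
        and len(output) < input_bytes  # output strictly smaller than the input
        and stored_delta == 0          # contract kept no new data of its own
    )
    return Settlement(allowed=within_budget,
                      penalty=0 if within_budget else deposit)


# A well-behaved recommendation call, and one that quietly kept 4 KB.
ok = settle(input_bytes=5_000_000, output=b"x" * 900, stored_delta=0)
bad = settle(input_bytes=5_000_000, output=b"x" * 900, stored_delta=4096)
```

The second call returns the same small answer but grows the contract's own storage, so it forfeits the deposit.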

I've had an idea for something like this since hearing about Google's Federated Machine Learning research paper and reading the Ethereum spec, but have other more pressing projects right now and don't have time to implement it. If anyone feels it's interesting, feel free to steal - I'd still love to work on it at some point in the future, but there're still some holes in the idea (notably around the information theory & federated learning aspects) and another speculative research project isn't really what I'm looking for now.

To do this meaningfully on any dataset that changes (finance, health, etc.) will cost too much to store on a blockchain. It’s expensive enough trying to store it on S3 or even Glacier. If the chain stores everything forever it will get too expensive too fast.

I think a more realistic approach (but way less money for speculators) is to store PKI on a blockchain, then encrypt any blob anywhere and sign it. Send that signature to the smart contract and have it pull the blob from a non-blockchain store.

If it’s something that the owner has agency issues with (e.g., calculating a FICO score) then register the hashes of the data with a blockchain.

No need to store the data on the chain unless you’re worried about it disappearing.
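
A minimal sketch of the hash-registration part, with plain dicts standing in for the chain and for the off-chain store. The names `registry`, `blob_store`, `register`, and `fetch_verified` are hypothetical, and the signing step is left out because the Python standard library has no public-key signatures:

```python
import hashlib

# Stand-in for on-chain state: owner id -> SHA-256 digest of their blob.
registry = {}
# Stand-in for an off-chain store (S3 or similar): digest -> encrypted blob.
blob_store = {}


def register(owner, blob):
    """Store the blob off-chain and record only its digest 'on-chain'."""
    digest = hashlib.sha256(blob).hexdigest()
    blob_store[digest] = blob  # bulk data stays off-chain
    registry[owner] = digest   # a 32-byte digest, hex-encoded here
    return digest


def fetch_verified(owner):
    """Pull the blob from the off-chain store and check it against the chain."""
    digest = registry[owner]
    blob = blob_store[digest]
    if hashlib.sha256(blob).hexdigest() != digest:
        raise ValueError("blob does not match the registered hash")
    return blob
```

Anyone pulling the blob can tell whether it was altered after registration, while the chain itself only ever holds the digest.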

Yeah, this assumes the existence of something like FileCoin/Storj that stores the actual data off-chain, with metadata & access keys on-chain. The blockchain is used to validate the integrity of the data blob and to financially compensate the host(s) that are physically storing the data.
This isn't making a whole lot of sense to me. Capabilities seem to be about making sure that sandboxed software doesn't get unauthorized access; it presumes you have a trusted environment (whether software, OS, or hardware) to enforce the capabilities.

Meanwhile, blockchain computation is about getting useful work out of untrusted participants. It doesn't seem like a fit.

Also, how do you do any calculation at all without decrypting the data? Or if you're thinking homomorphic encryption, what does a blockchain have to do with it?

It's cool, but the older I get the less time I want to spend learning tricks like this, in the same way that I don't want to waste as much time learning the intricacies of new videogames. Technology can be liberating but only in proportion to the amount of time people can invest in it. If technological liberation is just about getting an asymmetrical advantage for oneself and not extending that to everyone else (without demanding that they become experts in this latest way of gaming the system) then it's trash.