|
|
|
|
|
by jandrewrogers
378 days ago
|
|
> As an engineer, I find it neat that figuring out how to delete data is often a more complicated problem than figuring out how to create it. Unfortunately, this is a deeply hard problem in theory. It is not as though it has not been thoroughly studied in computer science. When GDPR first came out I was actually doing core research on “delete-optimized” databases. It is a problem in other domains. Regulations don’t have the power to dictate mathematics. I know of several examples in multiple countries where data deletion laws are flatly ignored by the government because it is literally impossible to comply even though they want to. Often this data supports a critical public good, so simply not collecting it would have adverse consequences to their citizens. tl;dr: delete-optimized architectures are so profoundly pathological to query performance, and a lesser extent insert performance, that no one can use them for most practical applications. It is fundamental to the computer science of the problem. Denial of this reality leads to issues like the above where non-compliance is required because the law didn’t concern itself with the physics of computation. If the database is too slow to load the data then it doesn’t matter how fast your deterministic hard deletion is because there is no data to delete in the system. Any improvements in the situation are solving minor problems in narrow cases. The core theory problems are what they are. No amount of wishful thinking will change this situation. |
|
Seems that companies are capable of moving mountains when the task is tracking the user and bypassing privacy protections. But when the task is deleting the users data it’s “literally impossible”