Hacker News
by AgentMatt 49 days ago
> There are ways to robustly clean this up analytically but it is largely beyond the capabilities of current tech stacks.

Can you expand on that? Even just conceptually that sounds really hard, how would you know whether you're measuring genuine (unexpected) changes in the environment rather than the result of (possibly sophisticated and coordinated) deliberate manipulation?

If you don't trust your measurements, and you shouldn't because all physical world measurements are proxies, the alternative is to find several unrelated and redundant proxies for "ground truth" and have them corroborate and correct each other. There are a lot of errors, bugs, noise, idiosyncratic behavior, etc. degrading the data even ignoring intentional manipulation, so you really should be doing this anyway.
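To make the corroboration idea concrete, here is a toy sketch of my own, not a production method. It assumes every proxy has already been converted into an estimate of the same scalar quantity, and it uses a median plus median absolute deviation (MAD) so that a minority of bad proxies cannot drag the consensus around. The function name, the proxy names, and the threshold `k` are all made up for illustration.

```python
from statistics import median

def corroborate(readings, k=3.0):
    """readings: dict of proxy name -> estimate of the same quantity.

    Returns (consensus estimate, sorted list of proxies flagged as outliers).
    """
    values = list(readings.values())
    center = median(values)
    # Median absolute deviation: a robust measure of spread.
    mad = median(abs(v - center) for v in values)
    # 1.4826 rescales MAD to a standard deviation under a normal assumption.
    # Fall back to a tiny scale if all proxies agree exactly.
    scale = 1.4826 * mad or 1e-9
    flagged = sorted(
        name for name, v in readings.items() if abs(v - center) > k * scale
    )
    trusted = [v for name, v in readings.items() if name not in flagged]
    return median(trusted), flagged

# Hypothetical proxies for the same quantity; one disagrees badly.
estimate, suspects = corroborate(
    {"sat": 10.1, "buoy": 9.9, "radar": 10.0, "ais": 10.2, "feed_x": 25.0}
)
```

In this example `feed_x` is the only proxy flagged, and the consensus comes from the remaining four. Real data rarely arrives as tidy scalars, which is where the harder problems start.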

Stitching unrelated proxies and sensing modalities into a coherent data model is a spatiotemporal graph reconstruction problem. The join predicates require non-trivial inference algorithms if you want to avoid being buried in false positives. From this you can derive an estimate of ground truth and a model of uncertainty at a point in space and time.
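As a rough illustration of what a join predicate looks like here, the sketch below asks whether two observations from different sources could plausibly be the same entity: the spatial gap has to be explainable by the elapsed time plus the position uncertainty of both observations. It is deliberately simplistic; the `Obs` type, the speed bound, and the `k` multiplier are my assumptions for the example, and a real predicate would need far more inference than this to keep false positives down.

```python
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Obs:
    x: float        # metres, in a shared local frame
    y: float
    t: float        # seconds
    sigma_m: float  # 1-sigma position uncertainty, metres

def may_be_same_entity(a, b, max_speed_mps=30.0, k=3.0):
    """True if b is reachable from a given elapsed time and both uncertainties."""
    gap = hypot(a.x - b.x, a.y - b.y)
    # Farthest the entity could have moved between the two timestamps.
    reachable = max_speed_mps * abs(a.t - b.t)
    # Combined position uncertainty of the two observations.
    slack = k * hypot(a.sigma_m, b.sigma_m)
    return gap <= reachable + slack

a = Obs(0.0, 0.0, 0.0, 5.0)
near = Obs(100.0, 0.0, 10.0, 5.0)   # 100 m apart after 10 s: plausible
far = Obs(1000.0, 0.0, 1.0, 5.0)    # 1 km apart after 1 s: not plausible
```

Running this predicate over every pair of observations is quadratic, which hints at why the full problem gets expensive quickly.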

The model of uncertainty is dynamic and unpredictable. It is difficult to manipulate the measurement without producing data that falls outside the uncertainty model across every proxy by which someone might construct that uncertainty model. This is similar to how e.g. GPS spoofing is detected in military systems. All GPS updates must fit within a (classified) dynamic uncertainty model relative to INS; if an update falls outside the model then the GPS signal is presumed compromised and updates are ignored.
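The actual military models are classified, so the following is only a generic gating check of the kind found in textbooks, not how any real system does it. It assumes the INS position uncertainty grows linearly with time since the last accepted fix; the class name and every number in it are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class InsGate:
    base_sigma_m: float = 5.0    # assumed uncertainty right after an accepted fix
    drift_m_per_s: float = 0.5   # assumed INS drift growth rate
    gate: float = 3.0            # accept fixes within this many sigmas

    def sigma(self, seconds_since_fix):
        # The uncertainty model is dynamic: it widens as the INS drifts.
        return self.base_sigma_m + self.drift_m_per_s * seconds_since_fix

    def accepts(self, gps_xy, ins_xy, seconds_since_fix):
        dx = gps_xy[0] - ins_xy[0]
        dy = gps_xy[1] - ins_xy[1]
        distance = (dx * dx + dy * dy) ** 0.5
        return distance <= self.gate * self.sigma(seconds_since_fix)

gate = InsGate()
gate.accepts((10.0, 0.0), (0.0, 0.0), 0.0)    # small disagreement: accepted
gate.accepts((100.0, 0.0), (0.0, 0.0), 0.0)   # 100 m jump right after a fix: rejected
gate.accepts((100.0, 0.0), (0.0, 0.0), 60.0)  # same jump after 60 s of drift: accepted
```

The third call shows why the model has to be dynamic: the same 100 m disagreement is suspicious immediately after a fix and unremarkable after a minute of drift.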

At the limit, this restricts manipulation to values within the uncertainty model. If you have a lot of unrelated proxies, you can make the window of uncertainty tight enough that manipulation becomes effectively impossible. At a minimum, the adversary would need to be able to manipulate every proxy and modality feeding your uncertainty model simultaneously.
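A back-of-the-envelope way to see the window tightening is inverse-variance weighting. This sketch assumes the proxies have independent, roughly Gaussian errors, which is exactly the assumption an adversary controlling several proxies would break; the function name and inputs are illustrative only.

```python
def fuse(estimates):
    """estimates: list of (value, sigma) pairs from independent proxies.

    Returns the inverse-variance weighted value and its combined sigma.
    """
    weights = [1.0 / (sigma * sigma) for _, sigma in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    # The combined uncertainty shrinks as more independent proxies are added.
    sigma = (1.0 / total) ** 0.5
    return value, sigma

one = fuse([(10.0, 2.0)])          # sigma stays at 2.0
four = fuse([(10.0, 2.0)] * 4)     # sigma drops to 1.0
sixteen = fuse([(10.0, 2.0)] * 16) # sigma drops to 0.5
```

With equally good proxies the window shrinks with the square root of their count, so quadrupling the number of independent sources halves the room an adversary has to work in.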

These graph, spatial, and spatiotemporal algorithms scale very poorly on traditional data infrastructure, and these data models easily run into petabytes if you are stacking multiple independent data sources.

I recommend you read Feyerabend's Science in a Free Society.
I'm not sure I see the relevance?
Yours is a rather pedestrian dorm-room take on epistemology and the relevance of the moral dimension to social progress, whose flaws are addressed in long form by Feyerabend.
What a bizarre response.

I was relaying the technical details of working in these data environments based on deep, real-world operational experience in the domain. There is no "moral dimension" to it, I was describing the world as it exists.

Does Feyerabend also have an opinion on compiler flags and sorting algorithms?

There is a moral dimension, you're just choosing not to acknowledge it.

Feyerabend speaks to things that add context and nuance to the effects, consequences, and provisions of the things you've felt comfortable discussing so far.

Hence the recommendation. Your awareness could use some expanding, if I may be so blunt.