by eganist 1190 days ago
Neat concept. In regulated environments, how would you propose implementing this to minimize the spread of live, regulated data into non-prod environments? Think full PANs (PCI), etc.

I don't know that there's necessarily a wrong answer here (well, there probably are, but wrong only in the sense that a given solution might be prohibited by the regulation), just want to see how y'all have thought through the prompt.

2 comments

We have an anonymiser that identifies common sensitive or personally identifiable data, such as credit card numbers and zip codes, and replaces it with anonymised data.

We also provide a configuration option to specify additional fields that need to be anonymised.
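
For readers wondering what that could look like, here is a minimal sketch in Python. It is not the product's actual implementation: the regexes, the placeholder strings, and the names `scrub_text`, `anonymise`, and `extra_fields` are all assumptions made up for illustration, and real detection would need to be much more thorough.

```python
import re
from typing import Any

# Hypothetical patterns, for illustration only.
# 13-19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# US-style ZIP or ZIP+4.
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def scrub_text(value: str) -> str:
    """Replace card-like and ZIP-like substrings with placeholders."""
    value = CARD_RE.sub("<CARD>", value)  # cards first, so ZIP_RE never sees them
    return ZIP_RE.sub("<ZIP>", value)


def anonymise(payload: Any, extra_fields: frozenset[str] = frozenset()) -> Any:
    """Walk a JSON-like structure and anonymise it.

    extra_fields stands in for the "additional fields" configuration option:
    any dict key listed there is redacted regardless of its value.
    """
    if isinstance(payload, dict):
        return {
            key: "<REDACTED>" if key in extra_fields
            else anonymise(val, extra_fields)
            for key, val in payload.items()
        }
    if isinstance(payload, list):
        return [anonymise(item, extra_fields) for item in payload]
    if isinstance(payload, str):
        return scrub_text(payload)
    return payload  # numbers, booleans, None pass through unchanged
```

Calling `anonymise(body, frozenset({"ssn"}))` on a recorded request body would redact the `ssn` field by name and scrub card-like and ZIP-like strings everywhere else. Note that fixed placeholders like these do not stay consistent across flows, which is what the next question is about.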

Do you tokenize that data so that it stays consistent through all the flows? Does it fit the same parameters, etc.?
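
To make the question concrete, here is one rough way "consistent and fits the same parameters" could be done for PANs, sketched in Python. This is an assumption-laden illustration, not anything the vendor has described: it uses a keyed hash (HMAC-SHA256) to derive a deterministic, one-way surrogate with the same length and a valid Luhn check digit. It is not vault-based tokenization or standardized format-preserving encryption, it does not preserve the BIN, and a surrogate could in principle coincide with a real card number. The names `tokenize_pan`, `luhn_check_digit`, and `luhn_valid` are hypothetical.

```python
import hashlib
import hmac


def luhn_check_digit(body: str) -> str:
    """Return the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left, doubling every other digit starting with the rightmost.
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """True if number is all digits and its last digit is a valid Luhn check."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def tokenize_pan(pan: str, key: bytes) -> str:
    """Map a PAN to a same-length, Luhn-valid surrogate.

    Deterministic for a given key, so the same card gets the same surrogate
    in every recorded flow. One-way: the original cannot be recovered.
    """
    digits = "".join(c for c in pan if c.isdigit())  # ignore spaces and hyphens
    mac = hmac.new(key, digits.encode(), hashlib.sha256).digest()
    body_len = len(digits) - 1
    # Reduce the MAC to body_len decimal digits, left-padding with zeros.
    body = str(int.from_bytes(mac, "big") % 10**body_len).zfill(body_len)
    return body + luhn_check_digit(body)
```

With a key that is kept secret and out of the non-prod environment, `tokenize_pan("4111 1111 1111 1111", key)` and `tokenize_pan("4111111111111111", key)` return the same 16-digit surrogate, so downstream validation and joins across recorded requests still line up.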
This seems like the burning question, maybe that's the .ai...

HAR can already be recorded in middleware (e.g. loadmill/har-recorder) and replayed in multiple CI compatible ways.

How would that potentially require AI?
Unrelated to this, but responding to your other question, which was deleted but is still valuable:

> What do you mean by "live, regulated data into non-prod environments" exactly? Could you provide some examples?

Credit card numbers, card verification codes, protected health data/electronic medical records... list goes on.

Every environment that has live data in it functionally increases the valuable attack surface for most adversaries. I.e., why bother attacking production when they can slurp up production data from test environments that are less likely to be well protected?

As for the "AI," I think the OP was just commenting on the TLD used by the startup.