by munchler 1003 days ago
> This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly, as more of their engineers now work with massive amounts of training data.

It seems like a stretch to associate this risk with AI specifically. The era of "big data" started several years before the current AI boom.


This is the risk of using, checks notes, Azure and working with Microsoft.

Except there is no risk for them. They've proven time and again that they can have major security snafus and not be held accountable.

Virtual networks are a nightmare to set up and manage in Azure, which is why everyone just takes the easy path and doesn't bother.

Almost every Azure service we deal with has virtual networks as an afterthought because they want to get to market as quickly as possible, and even to them managing vnets is a nightmare.

Not to excuse developers/users though. There are plenty of unsecured S3 buckets, Docker containers, and GitHub repos that expose too much "because it's easier". I've had a developer check their FTP creds into a repo the whole company has access to. He even broke the keys up and concatenated them in shell to work around the static checks "because it's easier" for their dev/test flow.

They have all the regulatory paperwork in place, so it must be fine.
They are also the top line investment for the majority of mutual and pension funds. Don't crab too much, they are funding your retirement.
Agreed. It should say "new risks organizations face when starting to leverage the power of Azure" or "the power of cloud computing". But that wouldn't be as clickbait-worthy a title.
The second clause covers that: this isn’t an AI problem, just as it wasn’t a big data problem when the same kind of thing happened a decade ago. It’s a problem caused when you set up something new outside of what the organization is used to and have people without appropriate training asked to make security decisions. I’d bet that this work was being done by people who were used to the academic style, blending personal and corporate use on the same device, etc., and simply weren’t thinking of this class of problem. The description sounds a lot like the grad students & postdocs I used to support. You’d see some dude with Steam on his workstation because it was faster than his laptop and since he was in the lab 70 hours a week anyway, why not 90?

The challenge for organizations is figuring out how to support research projects and other experiments without opening themselves up to this kind of problem or stymieing R&D.

This comment is a good bit of rationalization, and whatever categorical mismatch you feel is happening, it misses the overarching point. The focus should be on the broader systemic issue: data security is not a first or second tier priority to "big data" or "AI", largely because there's no cost to doing it poorly.
With big data comes big responsibility
AI has magnified the use cases, though. Before, Big Data was an advertising machine meant to tokenize and market to every living being on the planet. Now, machine learning can create "averaged" behavior of just about anything, given enough data and specificity.