by rademacher 2775 days ago
This latest craze of "AI" research seems to be fueled by a sudden glut of computational power (GPUs) that wasn't available previously. I think that most technical people would agree that the mid 2020s is extremely ambitious. I'd also argue that we're actually more likely to experience another AI winter.

The frightening part of the current deep learning research is how susceptible they are to adversarial attacks. Adding small amounts of noise causes misclassification in images, and some papers even explore the inevitability of adversarial examples [1]. This is especially frightening given the amount of autonomous vehicle work being done. I could imagine a situation in which the sensor noise varies just enough to cause such an error. Obviously, the systems will have redundancies built in, but I'm convinced the self-driving cars are still a ways off as well.

EDIT: As others have stated, just adding noise is not enough; noise is in fact often used to help the model generalize. The paper does discuss that the perturbations needed to cause this deviation can be incredibly small, and that the set of such perturbations may be larger than expected, especially for complex images.

Regarding the AI winter, I suppose I should have defined it as a reduction in the amount of research and the extent of the progress being made in the area rather than the utility of such research.

[1] https://arxiv.org/abs/1809.02104


This is a common misconception. It is not a small amount of noise that causes misclassification of images. It is a carefully designed and quite unique pattern that causes misclassification. It only looks like noise to the human eye, but it really isn't.

Yes, neural networks are susceptible to adversarial attacks. No, just adding noise to an image doesn't break neural networks.
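
To make the distinction concrete, here is a minimal sketch, assuming a toy linear classifier rather than a real deep network. The weights `w`, the input `x`, and the budget `eps` are all made up for illustration. Both perturbations change every "pixel" by exactly `eps`; the only difference is whether the signs are coin flips or are chosen against the gradient (the fast-gradient-sign idea, which for a linear score reduces to using the sign of `w`).

```python
import random

def score(w, x):
    """Linear score; the predicted class is 1 if the score is positive, else 0."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sign(v):
    return (v > 0) - (v < 0)

dim, eps = 1000, 0.1
# Hypothetical "trained" weights and an input that is classified positive.
w = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
x = [0.05 * wi for wi in w]

# Random noise: every pixel moves by +eps or -eps, chosen by coin flip.
rng = random.Random(0)
x_noisy = [xi + eps * rng.choice((-1.0, 1.0)) for xi in x]

# Crafted perturbation: every pixel moves by eps against the gradient of the
# score, which for a linear model is simply w.
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]

print("clean:  ", score(w, x))
print("noisy:  ", score(w, x_noisy))
print("crafted:", score(w, x_adv))
```

In this toy setup the clean score is about 50 and the crafted input lands near -50, so the label flips, while the coin-flip noise with the same per-pixel magnitude typically shifts the score by only a few units. Real networks are not linear, so this only illustrates the direction-versus-magnitude point.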

Adding small amounts of noise is actually sometimes used to improve the performance of various AI techniques. It helps prevent overfitting.
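
A minimal sketch of what that can look like, assuming plain lists of numbers as samples. The function name `augment_with_noise` and the values for `copies` and `sigma` are hypothetical choices, not taken from any particular library.

```python
import random

def augment_with_noise(samples, labels, copies=2, sigma=0.05, seed=0):
    """Return the originals plus `copies` jittered versions of each sample."""
    rng = random.Random(seed)
    out_x, out_y = list(samples), list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            out_x.append([v + rng.gauss(0.0, sigma) for v in x])
            out_y.append(y)  # the label is unchanged by the jitter
    return out_x, out_y

# Hypothetical two-sample dataset.
xs, ys = augment_with_noise([[0.0, 1.0], [1.0, 0.0]], ["a", "b"])
print(len(xs), ys)
```

The model then trains on `xs`, `ys` instead of the raw data, so it cannot rely on exact pixel values.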

In fact, if your technique or model is seriously affected by a little noise this is usually enough to brand it brittle and maybe even a failure, as it's a sign of overfitting. Anyone working in this field knows to look for this and will try to make what they create more robust.

The design of visual captchas is one obvious indication of just how successful AI techniques have been at image recognition in the presence of noise. It's no longer enough to make them a little noisy. In order to resist being solved by mechanical means, visual captchas have to include so much noise that even humans have problems recognizing them.

Read it as: a small amount of carefully constructed noise. Then you are consistent with both the literature and pop-science; no misconception needed. There are 1-pixel attacks now. Randomly shuffling a small number of pixels around can cause predictions to shift.

The issue is that there is no scene understanding. No common sense. No 3D modeling. Just 10x10 pattern matching on a very large fuzzy database of natural images (which works really really well in most cases).

The hype of ML is driven by 3 things: Big companies vying for AI dominance, militaries that want to finally use neural nets that work, and international competition between the West and the East to be the first to largely automate their economies (or AGI if you want to call it that). Catalysts were big data hoarding, GPU training on ImageNet, and then AlphaGo.

I am not a machine learning expert, but couldn't these adversarial example issues be resolved by approaching an image classification problem as follows: (1) produce multiple non-equivalent classification solutions with adequate accuracy, then (2) fuse them (e.g. by voting) to produce a consensus classification? (3) Maybe also randomly shuffle which X of Z solutions get to vote in each classification attempt.

What might fool one solution might not fool another, and adversarial examples seem to depend on idiosyncrasies of a particular solution.
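
The three steps above can be sketched roughly as follows. This assumes each "solution" is just a callable that returns a label; `ensemble_predict`, the pool of models, and the choice of `k` are all hypothetical.

```python
import random
from collections import Counter

def ensemble_predict(models, x, k, rng):
    """Majority vote among k models drawn at random from the pool."""
    voters = rng.sample(models, k)
    votes = Counter(model(x) for model in voters)
    return votes.most_common(1)[0][0]

# Hypothetical pool: four models are right, one has been fooled.
pool = [lambda x: "dog"] * 4 + [lambda x: "cat"]
print(ensemble_predict(pool, x=None, k=3, rng=random.Random(0)))  # prints "dog"
```

With one fooled model out of five and three voters, every possible draw still contains at least two correct votes. An odd `k` avoids ties in the two-class case; with an even `k`, `Counter.most_common` simply returns whichever tied label was counted first.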

When we do that (and we usually do it for accuracy improvements, not resistance to attack, though that happens at the same time for the same reasons - adversarial examples are a misclassification problem), we change the exact attack that breaks the system, but there is no 100% accurate system - and since it's completely foreign, the examples that a machine would misclassify would likely not confuse a human.

The issue is classification currently relies on a very small embedding of the data which is pattern-matched, with no semantics. It has no way of telling that the difference between a dog and an elephant ISN'T that noise gradient, at least some of the time!

Some of them, yeah. There is active research on this. But it is also possible to create adversarial images for soft voting ensembles of the 6 most popular architectures. Those strong adversarial images that beat the consensus also have a large chance to fool new neural network architectures that the adversarial image creator never had access to.
Or just adding a small amount of random noise to the input, which would wipe out the carefully constructed attack.
You can try out this technique at https://github.com/google/unrestricted-adversarial-examples (my guess is it would have the same result as adding noise to the normal images too, resulting in slightly worse performance overall).
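
For reference, the noise-before-classification idea can be sketched like this. It is only an illustration of the mechanics: `noisy_vote`, `sigma`, `n`, and the stand-in classifier are all assumptions, and it classifies several jittered copies and takes the majority rather than a single noisy copy.

```python
import random
from collections import Counter

def noisy_vote(classify, x, sigma=0.1, n=25, seed=0):
    """Classify n randomly jittered copies of x and return the majority label."""
    rng = random.Random(seed)
    votes = Counter(
        classify([v + rng.gauss(0.0, sigma) for v in x]) for _ in range(n)
    )
    return votes.most_common(1)[0][0]

# Hypothetical stand-in classifier: label 1 if the pixel sum is positive.
classify = lambda x: int(sum(x) > 0)
print(noisy_vote(classify, [1.0] * 10))
```

Whether this actually wipes out an attack depends on how far the crafted input sits from the decision boundary relative to `sigma`; an attack with a large margin would survive the jitter, which fits the guess that the main effect may just be slightly worse accuracy.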
Can I have a tool to add this noise please? So that Facebook et al. can't find me and build a profile on me based on random images that I didn't even know existed?
This is the most cyberpunk thing I've ever seen
This style made an appearance on Elementary a few years ago: https://www.youtube.com/watch?v=A1_9aHo0S30
Nice. I just sent that to a friend who has a hair salon in SF.
That is a really cool research/art project!
Sounds like those networks need an adversarial network or two to improve their performance and make them less susceptible to attack.
> I'd also argue that we're actually more likely to experience another AI winter.

We'll experience an AI winter again like we experienced an Internet winter in 2001-2004. Which is to say, not really at all. AI is now being widely commercialized for the benefit of consumers and businesses. That process will not stop, even if the hype train deflates before rising again at a later date. There is large, tangible commercial value in AI at the current general level of capability and near-term potential. That will result in pursuing maxing out whatever this era is capable of, before the next leap occurs at some point down the road. It's a progress track of higher highs during the exploratory boom and higher lows during the winter.

I don't think adversarial examples give any evidence of relevant problems with these models because they occur on a very specific subset of images that can only be discovered using detailed knowledge of how these networks process images.

For all we know, humans have similar problems on some obscure subset of images, but we can't find humans' adversarial examples because we don't have detailed knowledge of how the brain processes images.

I think we do actually know enough about how the human mind does process images to have some idea of what is different. It is not that uncommon for humans to be uncertain about what they are looking at, but the first thing about such occurrences is that the human is usually aware of the fact that they are having a problem, and the second thing is that they take steps to resolve it, such as making hypotheses as to what's going on and checking them out, and/or seeking to get a better view (or other evidence) in a way that is specifically designed to resolve the uncertainty. It is this higher-level semantic analysis that is missing from current image processing software.

In these discussions, someone always mentions optical illusions, but only humans (so far) understand the concept of 'optical illusion', and recognize that they are experiencing them.

> It is not that uncommon for humans to be uncertain about what they are looking at, but the first thing about such occurrences is that the human is usually aware of the fact that they are having a problem, and the second thing is that they take steps to resolve it

This is true, but step one is "move your head" (or in your words, "get a better view" -- but you get more value from just the fact that your head is in a different place than from the possibility of a better angle on whatever you're looking at).

That strategy doesn't work at all when you're trying to classify static images rather than physical objects.

That raises the interesting question of how object recognition in streams of images is progressing, beyond being just object recognition within the individual frames. Humans are capable of extracting a lot of additional information in such situations, and are actually helped when the perspective on a given object changes. One cannot give current machine vision a pass if, through lacking this capability, it is under-performing.

And moving one's head to get a better view is only one thing that a human might do. Firstly, of course, we must recognize that we are having a difficulty, and current machine vision seems to be somewhat deficient in this regard. Then, even without being able to get a different perspective, we will do things like make guesses as to what might be there (using our extensive semantic models of the world) and figure out if they might be a good fit to what we see, and/or we might try to extract specific features of the problematic area and search our memories for objects that might plausibly match, bearing in mind that it might be from a different perspective than we are accustomed to. We are also quite good at estimating whether an object might be a problem for us, even if we have not positively identified it. There is a lot more to it than just moving one's head.

GP's statement applies as much to observing objects in 3D space as it does to looking at photos, where just moving your head ain't gonna help you much. Optical illusions are great for studying this process, because most of them are delivered in the form of flat images on paper or a computer screen.
Optical illusions are delivered as flat images because moving your head doesn't affect those.
Humans are rarely aware of optical illusions unless they're extreme images they don't see in real life - crawling dots, impossible geometry - or they're explicitly labelled as optical illusions.

Some more subtle examples:

http://www.terrycolon.com/1features/optical-illusions.html

In fact human perceptual processes are only kind of reliable some of the time. Low and/or unusual light, suggestibility, and unusual contexts all have a very negative effect on reliability, but humans are often unaware of this.

Cognitive and semantic illusions are even more persistent. People literally believe all kinds of nonsense, and will carry on believing it even when offered robust evidence that they're wrong.

The point being that human perception and cognition are not some kind of gold standard. They have plenty of issues of their own. But there's a kind of assumption/requirement of perfection with machine intelligence that doesn't apply to human cognition. So bugs in our own evolutionary firmware tend to be overlooked, while equivalent-level bugs in ML are seen as terrible failures which undermine the entire premise of AI.

Recent update — we do know, and humans are vulnerable to perturbations created to fool a multitude of existing AI: https://arxiv.org/pdf/1802.08195.pdf
I think adversarial examples for humans are called "optical illusions".
There is a categorical difference between "a [specifically designed] image that can be construed as a duck or a rabbit" and "a human can regularly mis-categorize random pictures of ducks as rabbits if a weird filter is overlaid". The first is well-known and fun and trite -- the second is unheard of and probably impossible for humans, yet provably possible for trained computers.
It's called camouflage. The natural world is full of adversarial examples.
I'd imagine GP was referring to "humans perceive straight lines to be curved when certain shapes are overlaid", or "humans perceive shapes of the same color to be different colors when filters are applied" sorts of optical illusions.

There are plenty of those, and personally I think they're probably analogous to how adversarial filters fool AI classifiers.

It's really easy to cause humans to misclassify all kinds of images as containing faces ;). Humans also regularly misclassify random noise as words. You can even suggest which words we hear by telling us what the noise is supposed to be.
The point outlined is that we don't know enough about how we identify objects to rule out a simple adversarial attack; probably not a filter-based one, but maybe something else.
"probably impossible for humans"

Based on what?

The difference is that humans are aware that there is an illusion happening, they just can't help seeing it.
The important difference here is that most adversarial examples for the human perception:

(a) do not occur frequently in nature,

(b) are not frequently - if at all - produced in man-made architecture or transit-constructions,

(c) often contain repetitive and regular geometric and chromatic patterns which further make them stand out from everything else, and

(d) practically cannot be produced by digital (ergo noisy/less-than-perfect) images of any common real-world scenario.

In short: optical illusions don't accidentally occur in places where they can be seen by meatbag drivers.

I don’t see how you can make any claim about “most human adversarial examples”. There is a huge space of images and we have explored a negligible part of it.

Also a) and b) empirically seem to be true of the test sets people have collected thus far of the natural world for these models.

In short, we have no evidence that adversarial examples of the type being studied occur commonly in images collected by self driving cars.

The issue with regard to self-driving cars is that these cases demonstrate a disturbing level of fragility: we don't have a good handle on where the boundary between acceptable and chaotic responses lies.

You hypothesize that there are comparable examples for humans somewhere out there in the domain of all possible images, but the fact that, for all the countless cases of people looking at things that have occurred in humanity's existence, no-one has found a good example, suggests that, from the pragmatic point of view that you propose, image-recognition software has some catching-up to do.

Maybe a system that seeks consensus among several differently-trained models would be more robust.

https://arxiv.org/pdf/1802.08195.pdf

Looks like we are starting to find examples.

I think your intuition is wrong because humans are adapted to what exists naturally so of course there are no naturally occurring adversarial examples. It seems like the same is true for models trained on large natural image sets though.

My point is not “wow, let’s stop developing neural networks, they are perfect.” It’s more “let’s go collect real-world test sets to find and then fix gaps.” Adversarial examples actually help very little in making nets more robust in the ways that matter.

The difference is that you can calculate an adversarial example for our classifiers, but it's too slow to calculate on a human.

Even if you could, the result would be specific to that particular person, so it won't work as well on others. And these bastards learn while you're constructing the example (which isn't fair at all to a helpless classifier that's just sitting there and doesn't change).

Fairness doesn't come into it - machine vision has to be up to the task it is given, period. If humans depend on their more general intelligence to deal with problem cases, machine vision either has to do something similar, or compensate adequately in some other way.
> The important difference here is that most adversarial examples for the human perception:

> (a) do not occur frequently in nature

You've never heard of walking sticks? Ever seen one of those leaf moths?

If you had seen one, would you realize you had?

Is a deer visible on that stretch of hillside, or is that just dead grass?

Well yes, adversarial images cannot appear naturally because by definition they are created from the network itself, but they highlight the same issue that caused misclassification of the lady in the Uber incident.
Not true. There are black and grey box adversarial techniques as well.
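
As a rough illustration of an attack that never looks inside the model, here is a sketch that only queries the score and estimates which way each coordinate pushes it by finite differences. This is closer to what is usually called a score-based (grey-box) attack; the hidden weights, `eps`, and `h` are made up, and the model is a toy linear scorer.

```python
def make_black_box():
    """Return a scoring function whose weights are hidden from the caller."""
    hidden_w = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]  # hypothetical
    return lambda x: sum(wi * xi for wi, xi in zip(hidden_w, x))

def query_attack(query, x, eps=0.1, h=1e-3):
    """Lower the score using queries only: probe each coordinate, then step."""
    base = query(x)
    x_adv = list(x)
    for i in range(len(x)):
        probe = list(x)
        probe[i] += h
        slope = query(probe) - base  # finite-difference estimate
        if slope > 0:
            x_adv[i] -= eps
        elif slope < 0:
            x_adv[i] += eps
    return x_adv

query = make_black_box()
x = [0.05 if i % 2 == 0 else -0.05 for i in range(100)]  # scored positive
x_adv = query_attack(query, x)
print(query(x) > 0, query(x_adv) > 0)  # prints "True False"
```

It costs one query per coordinate, which is why practical query-based attacks on real images need smarter sampling than this.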
> This latest craze of "AI" research seems to be fueled by a sudden glut of computational power (GPUs) that wasn't available previously. I think that most technical people would agree that the mid 2020s is extremely ambitious. I'd also argue that we're actually more likely to experience another AI winter.

I think that is extremely unlikely. "AI" (read: machine learning) is actually being used for business purposes now, it's delivering enormous value to nearly every business on the planet. We're now in a long phase of descending the gradient of the current batch of broad techniques. This is likely a decently long gradient, with lots of marginal improvements to be made for a long time. And whereas with research projects, people don't care much about marginal improvements, they really do for business use-cases. For those reasons, I think AI/ML is basically here to stay just as much as basic biological research, or physics, or whatever is, if not more.

>> "AI" (read: machine learning) [is] delivering enormous value to nearly every business on the planet

That statement appears to contain two fairly bold claims - could you share sources?

Not OP, but of course even 0.1% improvements are worth millions to search engines, social network feeds, financial forecasters, and self driving car companies. Also worth thousands or millions to manufacturing processes, and to small businesses, which might be using ecommerce optimization systems through providers.
Every big website uses some kind of machine learning to prevent fraud. Banks do the same. Data mining is used everywhere to improve customer experience. Data mining is used in industrial applications for preventative maintenance.
I don't think it is just a matter of computational power: I have been quite surprised by how effective word embedding, for example, has been in abetting language translation (note that I am not claiming that it has solved the problem; I am too familiar with the problems of idiomatic Vietnamese-English translation.) Of course, if you had different intuitions (or more knowledge) than me, you might not be so impressed.

Having said that, I agree that the projections seem highly optimistic, but maybe I will be surprised again.

>I'm convinced the self-driving cars are still a ways off as well.

Technology-wise, absolutely they are. The problem is that in actuality, they aren't. Companies will continue to push as hard as they can for as wide of a launch as they can, while governments (and any kind of sorely-needed oversight) will be ages behind.

> how susceptible they are to adversarial attacks. Adding small amounts of noise causes misclassification in images, and some papers even explore the inevitability of adversarial examples

What if, in a distant future, computers turn out to be the correct ones, and humans' perception is biased?

In what sense can an image of a rabbit perturbed by adversarial noise be _correctly_ recognized as a duck?

At most, there can be a general consensus that the image looks like a duck. And if humans see a rabbit and AIs see a duck, there just won't be consensus.