by mistrial9 2435 days ago
This paper reports measured results from "popular image recognition services", including Azure, AWS, Google, IBM and other current commercial offerings (the immediate implication is that the tested services run some DeepLearning system on the server side). The paper explicitly takes "the point of view of a software developer". It spends quite a bit of effort questioning the assumptions a user of these services might make, and identifies potential pitfalls that arise when those assumptions do not hold, including consistency over time, consistency between the services themselves, and the expectation of a machine that produces deterministic outcomes rather than probabilistic ones. The paper looks at Vision-as-a-Service from a Software Quality Assurance (SQA) point of view: are the results of commercial services on the web reliable over time? Liability within safety-critical environments is also questioned.

The comments here (so far) address "does DeepLearning image analysis work", which is a broader question than the one the paper addresses. Importantly, other kinds of image analysis methods, including other ML approaches, are not compared.

The authors seem to be raising a bit of an alarm about services like these, reflected (weakly) in the paper title and more directly in its hypotheses:

[RH1] Computer vision services do not respond with consistent outputs between services, given the same input image.

[RH2] The responses from computer vision services are non-deterministic and evolving, and the same service can change its top-most response over time given the same input image.

[RH3] Computer vision services do not effectively communicate this evolution and instability, introducing risk into engineering these systems.
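For anyone who wants to check RH1 and RH2 against their own usage, here is a minimal sketch of the two comparisons. It assumes each service's response has already been reduced to a list of (label, confidence) pairs; that shape, the service names, and the sample data are all made up for illustration and are not taken from the paper or from any real API.

```python
from typing import Dict, List, Tuple

# Hypothetical response shape: (label, confidence) pairs from one service call.
Response = List[Tuple[str, float]]


def top_label(response: Response) -> str:
    """Return the highest-confidence label, lowercased for comparison."""
    label, _ = max(response, key=lambda pair: pair[1])
    return label.lower()


def services_agree(responses: Dict[str, Response]) -> bool:
    """RH1-style check: do all services give the same top label for one image?"""
    return len({top_label(r) for r in responses.values()}) == 1


def label_drifted(earlier: Response, later: Response) -> bool:
    """RH2-style check: did one service change its top label between two calls?"""
    return top_label(earlier) != top_label(later)


# Made-up data for a single image; not real service output.
responses = {
    "service_a": [("dog", 0.97), ("pet", 0.91)],
    "service_b": [("Canine", 0.88), ("dog", 0.85)],
}
earlier = [("dog", 0.97), ("pet", 0.91)]
later = [("pet", 0.95), ("dog", 0.93)]

print(services_agree(responses))      # False: "dog" vs "canine"
print(label_drifted(earlier, later))  # True: "dog" became "pet"
```

Note that exact string comparison treats "dog" and "canine" as a disagreement. Whether that counts as a real inconsistency or just a vocabulary difference is the labeling question raised just below.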

To a non-specialist, this seems like a detailed description of a useful real-world investigation, like a lab. The authors' skepticism is healthy, and the paper overall looks good. On the negative side, the discussion of labels in Computer Vision does not seem to distinguish sufficiently between fundamental problems in taxonomy and classification, problems with data grouping in general, and problems specific to this kind of DeepLearning image identification.

1 comments

One thing to keep in mind is that any neural net based ML system is essentially just a mathematical function imitator. The observations in this paper are spot on in the sense that many mathematical functions can have the same subset of results (success within your training data set), but can have wildly varying behavior in the general case.

This is closely related to overfitting, and it's one of the main things that should cast doubt on any ML system's capability to reliably produce results outside of a training set.
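A toy illustration of functions that agree on the training data but diverge elsewhere. This is just a hand-built pair of polynomials, not a neural net; the functions and the three training points are arbitrary choices for the sake of the example.

```python
def f(x: float) -> float:
    """The 'true' function: the identity."""
    return x


def g(x: float) -> float:
    """An impostor: the extra term vanishes at every training point (0, 1, 2)."""
    return x + x * (x - 1) * (x - 2)


train_xs = [0, 1, 2]

# On the training set the two functions are indistinguishable.
print(all(f(x) == g(x) for x in train_xs))  # True

# Outside the training set they disagree badly.
print(f(10), g(10))  # 10 730
```

Any evaluation restricted to the training points scores both functions identically, so nothing in that evaluation tells you which one you actually got.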

In a way, these outcomes are to a point predictable (in the sense of "the possibility exists" as opposed to "this set of weights yields these generalizability results") if you take the whole "neural network" thing a bit more literally than many academics are comfortable with you doing. The human brain, or any collection of biological neurons, is in a state of constant flux, creating different networks in order to react to stimuli in the environment and implement actions that bring us closer to achieving $goals.

Who hasn't experienced an off day where the gears of your mind just aren't producing what you darn well know they should be? It's just a fundamental change in the primary set of neural tools you've got to work with that day. When the weights change, so too do the output and the function modeled. The cerebellum, if I recall correctly, actually acts as a QA-like function built into our own minds.

There's nothing magic about simulating neural networks in silicon that suddenly gets you to a more "free-of-mistakes" state, besides being able to condense way more dimensions of data into the network's "sensory space", as it were. Even then, though, the possibility of suboptimal functions being imitated is inescapable.

Try explaining that to someone who wants to save millions on workforce, or automate safety-critical tasks without concern for the consequences, though. It's amazing the cognitive barriers we can build.