Great question! This is actually a surprisingly deep problem in ML, known as "anomaly detection" or "out-of-distribution" (OoD) detection.
Another way to formulate this question: "given training data that only tells you about digits, how do you know whether something is a digit or not?" Given that the training data never actually defines what isn't a digit, how can we ensure that the model actually sees a digit at test time? If we cannot ensure this (e.g. an adversary or the real world supplies inputs), how can we "filter out" bad inputs?
A quick hack solution that works well in practice is to examine the "predictive distribution" across digit classes. Researchers have empirically found that entropy tends to be higher (i.e. more smooth) when the model sees an OoD input. However, the OoD problem is not fully solved.
Note that methods that tie OoD to the task at hand (classification) are not actually solving OoD, they are solving "predictive uncertainty" of the task.
You mean to get either 0-9 or 'no number'? Here are two approaches:
1) Integrated. Represent 'no number' as class number 11 in the original model. Retrain it with this additional class (needs additional training data).
2) Cascading. Train a dedicated model for 'number' versus 'no number' (binary classifier), and use that in front of the original model.
Note that the MNIST data comes already extracted from original image, centered in fixed-size images of 28x28 pixels. In a practical ML application these steps would also need to be done before classification can be performed.
In the work shown in the article, the segmentation and centering of digits looks to be done by the user holding the camera. Which can be workable for some applications!
The predictions variable has a confidence value for each digit. You can put a cutoff and say if none is above a certain confidence, assume there's no number at all.
This could work, but it is important to note that a lot of ML algorithms trained in a closed domain (no "other" class) will be pretty bad at knowing what they don't know. This is an open problem in ML.
Choosing the threshold will be hard. And (as mentioned by other poster) the model is unlikely to generalize well to classes of data it has not seen.
I suspect that this approach will get things similar to numbers wrong quite often, like handwritten characters (a,b,c). Including these into the training set is much more likely to yield a model which will successfully discriminate it.
Another way to formulate this question: "given training data that only tells you about digits, how do you know whether something is a digit or not?" Given that the training data never actually defines what isn't a digit, how can we ensure that the model actually sees a digit at test time? If we cannot ensure this (e.g. an adversary or the real world supplies inputs), how can we "filter out" bad inputs?
A quick hack solution that works well in practice is to examine the "predictive distribution" across digit classes. Researchers have empirically found that entropy tends to be higher (i.e. more smooth) when the model sees an OoD input. However, the OoD problem is not fully solved.
Here's a nice survey paper on the topic: https://arxiv.org/abs/1809.04729
Note that methods that tie OoD to the task at hand (classification) are not actually solving OoD, they are solving "predictive uncertainty" of the task.