
You could still do the human work: audit a random sample by hand and use statistics to estimate what percentage of them are valid.
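To make the random-audit idea concrete, here is a minimal sketch. It assumes a reviewer manually checks a simple random sample and records how many entries hold up, then puts a Wilson score interval around the valid fraction. The function names, catalog size, sample size, and counts are all hypothetical and do not come from the paper.

```python
import math
import random


def draw_audit_sample(candidate_ids, sample_size, seed=0):
    """Pick a simple random sample (without replacement) to review by hand."""
    rng = random.Random(seed)
    return rng.sample(candidate_ids, sample_size)


def wilson_interval(valid, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = valid / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical numbers: 100,000 candidates, 400 audited, 380 judged valid.
sample = draw_audit_sample(list(range(100_000)), 400)
low, high = wilson_interval(valid=380, n=400)
print(f"audited {len(sample)}; estimated valid fraction in [{low:.3f}, {high:.3f}]")
```

The Wilson interval is just one reasonable choice here; it behaves better than the plain normal approximation when the valid fraction is close to 0 or 1.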

There was probably peer review of some sort to make sure it isn't just a crackpot claim, given that they link his published paper:

https://iopscience.iop.org/article/10.3847/1538-3881/ad7fe6

And the model was calibrated against known objects:

"We opt to primarily measure the success of our model using the F1 score as a more robust metric than overall accuracy. In order to ignore the effects of our class imbalances in the true positive catalog, we take the macro averages of F1 score, precision, and recall. As can be derived from this confusion matrix, the model achieves a precision of 0.918, a recall of 0.910, an accuracy of 92.2%, and an F1 score of 0.914. These values are satisfactory for our studies. It should be noted that the confusion between the null class and all other classes is the most important to keep track of. Another confusion matrix is available in Figure 9(b), which is the result of simply collapsing all the variable classes from the four-class confusion matrix into one, in order to study the real–bogus distinction that VARnet is making. The result is a precision of 0.973, a recall of 0.975, an accuracy of 97.4%, and an F1 score of 0.974. When observing the final confusion matrix, it is also apparent that there is the most confusion between the pulsator and transit classes. This is understandable, as some transits, particularly eclipsing binaries of the W Ursae Majoris type, have smoother short-period fluctuations in brightness, very similar to short-period pulsators. If the distinction between the pulsator and transit class were to be made via other methods after a secondary classification step, we could combine these classes for this step and greatly improve performance. By combining both the synthesizers and true positives for the pulsators and transits and retraining the model, the final confusion matrix in Figure 9 showcases the result of this approach. It yields an improved precision, recall, and F1 score of 0.980 and an accuracy of 97.4%."