by eqvinox
405 days ago
> 1.5 million potentially new variable objects

How many of these have been peer reviewed? If it's not peer reviewed, it's not science.

(NB: peer review doesn't necessarily mean manual human work, just independent work. In this case, that could be some 'classic' machine processing. Or even a distinct AI model, if there's some consensus for that.)
There was probably peer review of some sort to make sure it isn't just a crackpot claim, given that they link his published paper:
https://iopscience.iop.org/article/10.3847/1538-3881/ad7fe6
And calibrated against known objects:
"We opt to primarily measure the success of our model using the F1 score as a more robust metric than overall accuracy. In order to ignore the effects of our class imbalances in the true positive catalog, we take the macro averages of F1 score, precision, and recall. As can be derived from this confusion matrix, the model achieves a precision of 0.918, a recall of 0.910, an accuracy of 92.2%, and an F1 score of 0.914. These values are satisfactory for our studies. It should be noted that the confusion between the null class and all other classes is the most important to keep track of. Another confusion matrix is available in Figure 9(b), which is the result of simply collapsing all the variable classes from the four-class confusion matrix into one, in order to study the real–bogus distinction that VARnet is making. The result is a precision of 0.973, a recall of 0.975, an accuracy of 97.4%, and an F1 score of 0.974. When observing the final confusion matrix, it is also apparent that there is the most confusion between the pulsator and transit classes. This is understandable, as some transits, particularly eclipsing binaries of the W Ursae Majoris type, have smoother short-period fluctuations in brightness, very similar to short-period pulsators. If the distinction between the pulsator and transit class were to be made via other methods after a secondary classification step, we could combine these classes for this step and greatly improve performance. By combining both the synthesizers and true positives for the pulsators and transits and retraining the model, the final confusion matrix in Figure 9 showcases the result of this approach. It yields an improved precision, recall, and F1 score of 0.980 and an accuracy of 97.4%."
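For anyone wondering how macro-averaged scores and the "collapsed" real-bogus matrix in that quote relate, here is a minimal Python sketch. The confusion matrix counts are invented for illustration and are not the paper's data; the function names `macro_metrics` and `collapse_to_binary` are hypothetical, and I am assuming macro F1 means the average of per-class F1 scores, which is one common convention and may not be exactly what the authors did.

```python
# Hypothetical 4-class confusion matrix (rows = true class, columns = predicted).
# Index 0 is the null (non-variable) class; indices 1-3 are variable classes.
# These counts are made up for illustration and are NOT taken from the paper.
CONFUSION = [
    [95, 2, 2, 1],
    [3, 90, 2, 5],
    [2, 1, 92, 5],
    [4, 6, 8, 82],
]


def macro_metrics(cm):
    """Return macro-averaged precision, recall, F1, plus overall accuracy."""
    n = len(cm)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    return {
        # Macro average: every class counts equally, regardless of its size.
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": correct / total,
    }


def collapse_to_binary(cm, null_index=0):
    """Merge all non-null classes into one 'variable' class (real vs. bogus)."""
    out = [[0, 0], [0, 0]]
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            out[int(i != null_index)][int(j != null_index)] += count
    return out


if __name__ == "__main__":
    print("four-class:", macro_metrics(CONFUSION))
    binary = collapse_to_binary(CONFUSION)
    print("collapsed matrix:", binary)
    print("real-bogus:", macro_metrics(binary))
```

Collapsing the classes removes every error where one variable type was mistaken for another, so the real-bogus scores come out higher than the four-class ones, which is the same pattern the quoted passage reports.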