| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by thesz 80 days ago

Thanks.

The paper [1] referenced in your link follows the lagacy of the paper on the HIGGS dataset, and does not operate with quantities like accuracy and/or perplexity. HIGGS dataset paper provided area under ROC, from which one had to approximate accuracy. I used accuracy from the ADMM paper [2] to compare my results with. As I checked later, area under ROC in [1] mostly agrees with [2] SGD training results on HIGGS.

  [1] https://arxiv.org/pdf/2505.19689
  [2] https://proceedings.mlr.press/v48/taylor16.pdf

I think that perplexity measure is appropriate there in [1] because we need to discern between three outcomes. This calls for softmax and for perplexity as a standard measure.

So, my questions are: 1) what perplexity should I target when dealing with "mc-flavtag-ttbar-small" dataset? And 2) what is the split of train/validate/test ratio there?

2 comments

dguest 79 days ago

For better or worse the people working on this don't really use perplexity or accuracy to evaluate models. The target is whatever you'd get for those metrics if you used the discriminants that were provided in the dataset (i.e. the GN2v01 values).

As for why accuracy and perplexity aren't reported: the experiments generally choose a threshold to consider something a "b-hadron" (basically picking a point along the ROC curve) and quantify the TPR and FPR at that point. There are reasons for this, mostly that picking a standard point lets them verify that the simulation actually reflects data. See, for example, the FPR [1] and TPR [2] "calibrations".

It's a good point, though, the physicists should probably try harder to report standard metrics that the rest of the ML community uses.

[1]: https://arxiv.org/pdf/2301.06319

[2]: https://arxiv.org/abs/1907.05120

mpierini 76 days ago

Perplexity, aka measuring how much a network is sure about its answer. Which might be wrong. It would not pass the pier review of any particle physics journal. (Real) science is about being right, not about being sure about itself.