by sea-shunned 1663 days ago
It's also a shame that this paper in turn doesn't cite "Testing Heuristics: We Have It All Wrong" (J. N. Hooker, 1995) (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.71....), which discusses these issues in much the same way. It's good to see in "The Benchmark Lottery", however, that they look more into specific tasks and their algorithmic rankings and provide some sound recommendations.

One thing that I'd add (somewhat selfishly, as it relates to my PhD work) is the idea of generating datasets that are deliberately challenging for different algorithms. Scale this across a test suite of algorithms, and their relative strengths and weaknesses become clearer. The caveat is that it requires a set of measures that quantify different types of problem difficulty, which, depending on the task/domain, can range from well-defined to near-impossible.
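
To make that concrete, here is a toy sketch of the idea, not the method from my PhD work or from either paper. Everything in it is an assumption chosen for illustration: the two generators (`make_blobs`, `make_rings`), their difficulty knobs (`separation` for class overlap, `noise` for ring thickness), and the two stand-in algorithms (a nearest-centroid classifier and 1-nearest-neighbour). It uses only the Python standard library.

```python
import math
import random


def make_blobs(n, separation, rng):
    """Two unit-variance Gaussian blobs; smaller separation means more class overlap."""
    data = []
    for label, cx in ((0, -separation / 2), (1, separation / 2)):
        for _ in range(n):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0)), label))
    return data


def make_rings(n, noise, rng):
    """Two concentric rings (radius 1 and 3); not linearly separable."""
    data = []
    for label, radius in ((0, 1.0), (1, 3.0)):
        for _ in range(n):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.gauss(0.0, noise)
            data.append(((r * math.cos(angle), r * math.sin(angle)), label))
    return data


def nearest_centroid(train):
    """Fit one mean point per class; predict the class with the closest mean."""
    sums = {}
    for (x, y), label in train:
        sx, sy, k = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, k + 1)
    centroids = {lab: (sx / k, sy / k) for lab, (sx, sy, k) in sums.items()}

    def predict(p):
        return min(centroids, key=lambda lab: math.dist(p, centroids[lab]))

    return predict


def one_nn(train):
    """Predict the label of the single closest training point."""

    def predict(p):
        return min(train, key=lambda item: math.dist(p, item[0]))[1]

    return predict


def accuracy(fit, train, test):
    predict = fit(train)
    return sum(predict(p) == label for p, label in test) / len(test)


def run_suite(seed=0, n=200):
    """Evaluate every algorithm on every generated dataset (n points per class)."""
    rng = random.Random(seed)
    generators = {
        "blobs, separation=6": lambda: make_blobs(n, 6.0, rng),
        "blobs, separation=1": lambda: make_blobs(n, 1.0, rng),
        "rings, noise=0.1": lambda: make_rings(n, 0.1, rng),
    }
    algorithms = {"nearest_centroid": nearest_centroid, "one_nn": one_nn}
    results = {}
    for ds_name, gen in generators.items():
        train, test = gen(), gen()  # independent draws from the same generator
        for alg_name, fit in algorithms.items():
            results[(ds_name, alg_name)] = accuracy(fit, train, test)
    return results


if __name__ == "__main__":
    for (ds_name, alg_name), acc in run_suite().items():
        print(f"{ds_name:22s} {alg_name:18s} {acc:.2f}")
```

On the well-separated blobs both classifiers should do well, while the rings are built to break nearest-centroid (both class means sit near the origin) and leave 1-NN largely unaffected. The hard part in practice is the one mentioned above: here the difficulty knobs are known because the data is synthetic, whereas real tasks need measures that quantify those kinds of difficulty in the first place.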

1 comment

I've been looking at parsing paragraph structure and have started thinking about a conceptual Mechanical Turk/e e cummings line in the sand, past which it's just going to be easier to pay some kid with a cell phone to read words for you. The working implementations I've seen are heavily tied to domain and need to nail down language, which isn't really a thing.

Quantification is fascinating; it seems to be something I take for granted until I actually want to make decisions. It's like I'm constantly trying to forget that analog and digital are two totally separate concepts. I wouldn't really recommend reading Castaneda to anyone, but he describes people living comfortably with mutually exclusive ideas in their heads, walled off by context, and I'd like that sort of understanding.