Hacker News
by amelius 1202 days ago
We should be benchmarking this kind of tool. Instead of saying "this version/implementation sometimes gives interesting results", we should get a score out of it (like a test score). Then we could compare different versions more reliably and also check whether the version we just installed is actually working as it should.