by pu_pe 149 days ago
Every time I see a complex orchestration like this, I feel that the authors should have compared it to simpler alternatives. One of the metrics they use is that human review suggests the system is right 83% of the time. How much of that performance would they achieve by just having a single reasoning "judge" decide, without the rest of the procedure?
1 comment

I agree. If they're not testing against a simple baseline of standard best practice, then they're either ignorant of how to do even basic research, or trying to show off / win internet points. Occam's razor, folks.