Hacker News new | ask | show | jobs
by mattcollins 254 days ago
I'm the person who ran the test.

To hopefully clarify a bit...

I intentionally chose input data large enough that the LLM would be scoring in the region of 50% accuracy in order to maximise the discriminative power of the test.

1 comments

Can you expand on how you did this?
I did a small test with just a couple of formats and something like 100 records, saw that the accuracy was higher than I wanted, then increased the number of records until the accuracy was down to 50%-ish (e.g. 100 -> 200 -> 500 -> 1000, though I forget the precise numbers.)