by bjornlouser 801 days ago
‘We provide a total of 50 tuples… and then ask the model to generate y51, corresponding to x51…’

I wonder if testing only a single point is masking a larger LLM error.

1 comment

The experiments were repeated with 100 random seeds.