Hacker News
by Havoc 930 days ago
That actually looks like a pretty good rebuttal of the original test.

I wonder if this also works on other 200k models like Yi.

1 comment

Yes, I think I agree, if I am understanding correctly: the test is not a good fit for how the model works, because the model "wants" to weigh things based on surrounding context and to give lower weight to things that it feels are out of place. That likely makes it a great candidate for certain kinds of work, like sentiment analysis and overall literary understanding.