Hacker News
by jbay808 491 days ago
Can you say more precisely what you mean?

I mean that gradient descent may be a passable sorting algorithm once the weights have been learned well enough to describe ordering. Sorting things well may simply be a speciality of transformers, which wouldn't tell us much about whether they are mentalists or not.