by jbay808 1025 days ago
It is only difficult for an LLM to sort a list of numbers if the list is longer than half of the context window (source: I tested this myself [1]). The sorts are not error-free every time, but with sufficient training they become error-free the vast majority of the time, even for long lists. This is not especially surprising, because transformers are capable of directly representing sorting programs [2].

[1] https://jbconsulting.substack.com/p/its-not-just-statistics-...

[2] https://arxiv.org/abs/2106.06981
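
For anyone who wants to measure "error-free the vast majority of the time" concretely, here is a minimal sketch of an exact-match harness. It is not the code from [1]. The names `exact_match_rate` and `make_noisy_sorter` are hypothetical, and the noisy sorter is only a stand-in for a real model call, which you would have to supply yourself.

```python
import random
from typing import Callable, List

Sorter = Callable[[List[int]], List[int]]

def exact_match_rate(sorter: Sorter, n_trials: int = 200,
                     length: int = 50, seed: int = 0) -> float:
    """Fraction of random lists for which `sorter` returns the exact sorted list."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Distinct values, so any swapped pair counts as a real error.
        xs = rng.sample(range(10_000), length)
        if sorter(list(xs)) == sorted(xs):
            hits += 1
    return hits / n_trials

def make_noisy_sorter(error_prob: float, seed: int = 1) -> Sorter:
    """Hypothetical stand-in for a model: sorts correctly, but with
    probability `error_prob` swaps one adjacent pair in the output."""
    rng = random.Random(seed)

    def noisy(xs: List[int]) -> List[int]:
        out = sorted(xs)
        if len(out) > 1 and rng.random() < error_prob:
            i = rng.randrange(len(out) - 1)
            out[i], out[i + 1] = out[i + 1], out[i]
        return out

    return noisy

if __name__ == "__main__":
    for p in (0.0, 0.05, 0.5):
        rate = exact_match_rate(make_noisy_sorter(p))
        print(f"error_prob={p}: exact-match rate {rate:.3f}")
```

To test an actual model, replace the noisy sorter with a function that formats the list into a prompt, calls the model, and parses the reply back into a list of integers. Exact match is a strict metric by design: a single misplaced number fails the whole list.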

1 comment

Of course you can train a neural network to sort numbers, but I'm talking about a general LLM that hasn't been trained specifically to sort numbers. A GPT network trained only to sort numbers is not what I would consider a Large Language Model.