Hacker News new | ask | show | jobs
by sadkingbilly 4167 days ago
"so I implemented a small neural network and trained it to sort 10 6-bit numbers, which was easy to do to my surprise"

Does anyone know what the inputs and outputs of a neural network that sorts numbers would look like?

1 comments

Input: a 60-dimensional vector that is the concatenation of 10 6-dimensional binary vectors encoding the binary representation of the input numbers.

Output: the same, sorted.

At least that's one dead simple way to formulate the problem, multiple other solutions would work as well, and some would probably work better.

I started playing with this. I take each digit and normalize it by dividing by 9. Then use each normalized digit as an input:

Example sorting 987654 and 123456

  Input: 1, .9, .8, .7, .6, .5, .2, .3, .4, .5, .6, .7
  Expected output: .2, .3, .4, .5, .6, .7, 1, .9, .8, .7, .6, .5
You can then encode/decode the inputs and outputs accordingly. if (value <= 1) digit = 9; if (value <= 0.9) digit = 8; ... if (value <= 0.2) digit = 1; if (value <= 0.1) digit = 0; etc.

I'm able to get 100% accuracy on a limited training set with 2 hidden layers of 10 nodes. 33% accuracy on the test set (but likely need a lot more data to train with).

Update: I was able to train the network to sort sets of two 3-digit numbers. I used a neural network with 2 hidden layers of 25 nodes. The training/test accuracy after 10 minutes is 78%/74%. Not bad. https://github.com/primaryobjects/nnsorting
Interesting. It sounds like he used binary. I wonder how different bases affect how well the network can learn things like sorting?
you'd be better off if you mapped those 10 numbers to a number in [0:(10!-1)] interval since there are only 10! different potential answers. So you make the job of the algorithm easier (figuring out the "structure" of the function it's trying to solve for is easier when the result space is compressed without any 'signal' loss)

another improvement could be "enhancing" of the inputs: when you figure out how you will permute the numbers to sort them, create specific and randomized variations of that specific list of numbers and feed the learning algorithm with the correct results of those permutations too. for instance if you have 5 70 2 13 as a training input, the trainer algorithm could generate the following extra inputs based on this so that the algorithm will get a better chance of figuring out the sorting for a test input like 2 15 5 65:

2 5 23 70 2 5 33 70 ... 2 5 63 70 2 5 73 70 also: 2 5 14 70 2 5 13 69 etc. also modify more than 1 number at the same time(both systematically and also randomly) to generate even more "gray"-input

Could also have an 18 bit output where each 3 bit block told you the index of that input in the sorted array.