| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by YeGoblynQueenne 2173 days ago
	No, that is the training agent implemented as an instance of quicksort. I don't know why the OP reports that it's "a few percent faster than quicksort" because I can't find where that is claimed in the paper. In fact, as far as I understand it the paper claims that the learned model (i.e. the student agent) has learned to reproduce the behaviour of this teacher agent after training for a smaller number of steps on average than what the teacher needs to sort a list. That is what the entire claim about superior efficiency rests on (there is no example of a learned model and no asymptotic analysis of the program such a model represents).