


	by CasperDern 1600 days ago
	They used a fixed-size transformer, where the vocab determines the functions and the input/output range. So unless the model needs more 'memory' for your class of expressions, there wouldn't necessarily be a big change in performance. The paper has experiments with bigger and smaller vocabs.
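
	To make the "fixed size" point concrete, here is a rough back-of-the-envelope sketch of where vocab size shows up in a transformer's parameter count. The function name and all hyperparameters (d_model, n_layers, d_ff, the vocab sizes) are made up for illustration and are not taken from the paper; biases, layer norms and cross-attention are ignored.

```python
def transformer_param_counts(vocab_size, d_model=512, n_layers=6, d_ff=2048):
    """Approximate parameter counts, split into (body, vocab-dependent).

    Hypothetical sizes for illustration only, not the paper's configuration.
    """
    # Per layer: four attention projections (Q, K, V, output), each
    # d_model x d_model, plus a feed-forward block of two matrices.
    per_layer = 4 * d_model * d_model + 2 * d_model * d_ff
    body = n_layers * per_layer
    # Token embedding table plus an untied output projection,
    # each vocab_size x d_model.
    vocab = 2 * vocab_size * d_model
    return body, vocab


if __name__ == "__main__":
    for v in (50, 100, 200):
        body, vocab = transformer_param_counts(v)
        print(f"vocab={v}: body={body}, vocab-dependent={vocab}")
```

	Under these assumptions the body stays the same for every vocab size and only the embedding and output layers grow, which fits the idea that changing the vocab changes what the model can express more than how much capacity it has.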