| HN Mirror



	by jonincanada 940 days ago
	Balderdash? "Q-star". Yes, the Q as in Q-learning -- optimize a long-term goal. The "star points" are the embedded algorithms discovered and joined within the transformer/NN architecture. Stars were formed after SGD discovered the best representation of said embedded algorithm type. I'm running a scaled-down version myself -- somewhat impressive. Do it at 1k B parameters? Hold my beer.
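
For readers unfamiliar with the "Q" being referenced: below is a minimal sketch of plain tabular Q-learning, the textbook algorithm for optimizing a long-term (discounted) reward. It is not the commenter's scaled-down system and says nothing about what "Q-star" actually is. The toy chain environment, the function name `q_learning_chain`, and all hyperparameters are assumptions made up for illustration.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1. Action 0 moves left (clamped at state 0),
    action 1 moves right. Reaching the last state gives reward 1 and ends
    the episode. All of this is an illustrative toy setup.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Purely random behaviour policy; Q-learning is off-policy,
            # so it can still learn the greedy (optimal) action values.
            a = rng.randrange(2)
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # Q-learning update: bootstrap from the best action in the next state.
            # q[goal] is never updated, so it stays at 0 for the terminal state.
            target = r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q = q_learning_chain()
# Greedy policy for the non-terminal states (1 means "move right").
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
```

The point of the sketch is the `target` line: each action's value is pulled toward the immediate reward plus the discounted value of the best follow-up action, which is how the delayed reward at the end of the chain propagates back to earlier states.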