| HN Mirror



	by metasj 611 days ago
	The cost is 5 cents per neuron with GPT-4o-mini, for pretty satisfying descriptions. "we fine-tune Llama-3.1-8B-Instruct to directly predict per-token activations ... [this] allows us to use smaller models, and the task of directly predicting the output (integer from 0-10) gets rid of the extra tokens, making the prompt much shorter."
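	To make the "integer from 0-10" target concrete, here is a rough sketch of how raw per-token activations might be bucketed into that range, plus the cost arithmetic at the quoted 5 cents per neuron. The quote does not say how the activations are normalized, so dividing by a per-neuron maximum and clipping negatives to 0 is an assumption, and the function names `quantize_activations` and `estimate_cost_usd` are made up for illustration.

```python
def quantize_activations(activations, max_activation):
    """Map raw per-token activations to integers in 0..10.

    Assumption: values are scaled by a per-neuron maximum and clipped to
    [0, 1] before rounding. The quoted text only gives the 0-10 range.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    buckets = []
    for a in activations:
        scaled = max(0.0, min(1.0, a / max_activation))
        buckets.append(round(scaled * 10))
    return buckets


def estimate_cost_usd(num_neurons, cents_per_neuron=5):
    """Total description cost in dollars at the quoted 5 cents per neuron."""
    return (num_neurons * cents_per_neuron) / 100


if __name__ == "__main__":
    print(quantize_activations([0.0, 2.5, 5.0, -1.0, 7.0], max_activation=5.0))
    print(estimate_cost_usd(1000))
```

	With one integer per token as the target, the simulator's output stays short, which matches the point in the quote about dropping the extra tokens.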