HN Mirror



	by mark_l_watson 904 days ago
	Maybe LLMs should follow the best practice from 1980s-style backprop models and later deep learning models: starve the model of capacity to force maximum generalization and minimal memorization.
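A toy sketch of the capacity argument, not anything LLM-specific. It assumes a made-up linear target with Gaussian noise and compares a two-parameter least-squares line against a 1-nearest-neighbour lookup that stores every training pair. The function names (`fit_line`, `fit_memorizer`, `truth`) and all constants are hypothetical, picked only for illustration.

```python
import random

def truth(x):
    # Hypothetical noise-free target, assumed for this sketch.
    return 2.0 * x + 1.0

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b: only two parameters."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b

def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: capacity grows with the training set."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_x = [0.2 * i for i in range(50)]
train_y = [truth(x) + rng.gauss(0.0, 1.0) for x in train_x]  # noisy labels
test_x = [x + 0.05 for x in train_x]      # unseen inputs between grid points
test_y = [truth(x) for x in test_x]       # scored against the clean target

line = fit_line(train_x, train_y)
memo = fit_memorizer(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
memo_train, memo_test = mse(memo, train_x, train_y), mse(memo, test_x, test_y)

print("line  train/test MSE:", line_train, line_test)
print("memo  train/test MSE:", memo_train, memo_test)
```

The lookup model reproduces its training labels exactly, noise included, so its training error is zero while its error on unseen inputs stays near the noise variance. The starved model cannot store the noise, so its training error is higher but it lands closer to the underlying function. Whether this trade-off carries over to LLM-scale models is exactly the open question the comment raises.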