| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by libraryofbabel 288 days ago
	This is precisely what I recommend to people starting out with LLMs: do not start with the architecture, start with their behavior - use them for a while as a black box and then circle back and learn about transformers and cross entropy loss functions and whatever. Bottom-up approaches to learning work well in other areas of computing, but not this - there is nothing in the architecture to suggest the emergent behavior that we see.

1 comments

teucris 288 days ago

This is more or less how I came to the mental model I have that I refer to above. It helps me tremendously in knowing what to expect from every model I’ve used.

link