by pushedx
108 days ago
Yes, most people (myself included) do not understand how modern LLMs work, especially given the most recent architectural and training improvements. The 3b1b video series does a pretty good job, but we are now interfacing with models that probably have more parameters in each layer than the first models we interacted with had in total. The novel insights these models can produce are truly shocking, I would guess even for someone who does understand the latest techniques.
[1] https://www.manning.com/books/build-a-large-language-model-f...