by godelski
1081 days ago
> The goal at the end is to have a deep understanding of the LLM space and its adjacency.

This is kind of a hard thing to quantify. How are we defining "deep"? Do you want to understand how they work? The Karpathy videos are good for that, but I wouldn't call that "deep". If you want to get down into the weeds and into the mud, you need a hell of a lot more than 13 hrs of education. You're also going to have a hard time doing this, because most people approach it from an engineering perspective of "enough to work with it" rather than "I fundamentally want to understand all the inner workings". If you're in the former camp, the fastai course and others are great for you.

If you want to really get deep, though, you're going to need a lot more than programming. You're going to need some pretty advanced maths too: high dimensional statistics, metric theory, and optimization theory, to name a few. (Most researchers aren't doing this, btw.) If you do go down this path, you'll also be able to understand the full spectrum of generative models and have a clearer picture.

I should also say that there is still a black box element to these models: they are so large that they are near impossible to analyze. It is definitely achievable to learn a 2 layer autoregressive transformer network and fully understand its inner workings, but programming skills alone won't get you there.
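To give a sense of how small that 2 layer case is, here's a rough NumPy sketch of a forward pass. It assumes a lot to stay short: single-head attention, no MLP blocks, no layer norm, random untrained weights, and made-up names (`init_params`, `forward`, `causal_attention`) that don't come from any particular library. Treat it as a toy to poke at, not a reference implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(x, Wq, Wk, Wv, Wo):
    """Single-head causal self-attention over a (seq_len, d_model) input."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Mask future positions so token t only attends to tokens <= t.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v @ Wo

def init_params(vocab, d_model, max_len, n_layers=2, seed=0):
    # Hypothetical initialisation: small Gaussian weights, nothing trained.
    rng = np.random.default_rng(seed)
    scale = 0.02
    return {
        "tok_emb": rng.normal(0, scale, (vocab, d_model)),
        "pos_emb": rng.normal(0, scale, (max_len, d_model)),
        "layers": [
            {name: rng.normal(0, scale, (d_model, d_model))
             for name in ("Wq", "Wk", "Wv", "Wo")}
            for _ in range(n_layers)
        ],
        "unembed": rng.normal(0, scale, (d_model, vocab)),
    }

def forward(params, tokens):
    """Return next-token logits of shape (seq_len, vocab)."""
    tokens = np.asarray(tokens)
    x = params["tok_emb"][tokens] + params["pos_emb"][: len(tokens)]
    for layer in params["layers"]:
        # Residual connection: each layer adds its output to the stream.
        x = x + causal_attention(x, **layer)
    return x @ params["unembed"]

params = init_params(vocab=10, d_model=8, max_len=16)
logits = forward(params, [1, 2, 3, 4])
print(logits.shape)
```

The whole thing is a handful of matrix multiplies, which is why the small case is tractable to analyze. Writing it is the easy part; explaining why the trained weights end up doing what they do is where the maths comes in.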