by rvz
932 days ago
Rather than looking at the visuals of this network, it is better to focus on the actual problem with these LLMs, which the author has already shown in the transformer section:
> As is common in deep learning, it's hard to say exactly what each of these layers is doing, but we have some general ideas: the earlier layers tend to focus on learning lower-level features and patterns, while the later layers learn to recognize and understand higher-level abstractions and relationships.
That is the problem: these black boxes are just as explainable as a magic scroll.
For decades we’ve puzzled over how the inner workings of the brain function, and though we’ve learned a lot, we still don’t fully understand it. So, we figure, we’ll just make an artificial brain and THEN we’ll be able to figure it out.
And here we are, finally a big step closer to an artificial brain and once again, we don’t know how it works :)
(Although, to be fair, we’re spending all of our effort on making the models better and better, not on learning their low-level behaviors. Thankfully, when we do decide to study them, it’ll be a wee bit less invasive and actually doable, in theory.)