by thaway_thaway34
1102 days ago
>> I've always felt like with new technology, I could grok at a high level how things worked. But LLMs like GPT seem like magic

This is exactly how I feel. I felt so out of my depth looking at the ML architectures, and I could not make any sense of them. I thought perhaps the layers were inspired by neuroscience. But a friend who works on LLMs mentioned that the architectures of large ML models are mostly discovered experimentally, not designed. If that's the case, that's even worse... it means an entire field which could perhaps replace me in the future doesn't even have a knowledge foundation for its breakthroughs, but just goes by experiment... I thought it was only the weights inside the model that evolve, not the architecture itself.

Which body of knowledge do I study then, and is it even engineering anymore? It's something else, and I am not sure my programming experience applies. The amount of GPU/capital it takes to evolve such architectures and run such experiments has to be prohibitive.
If you held a gun to my head and asked me to tell you even at a sky-high architectural level (let alone in any detail) how ChatGPT worked, well... tell my family I love them. This is the first time in my 20+ year career I have felt like some computing thing is total unexplainable black magic.