Hacker News new | ask | show | jobs
by seydor 1194 days ago
Well if the model is so smart, could it be that it is actually aware of its layers and parameters?
1 comments

It's no problem to put model's architecture and even some python code into the generous 32k context window, the real problem seems to be as you say "awareness" - at least the facet of it that'd allow to answer complex novel questions.

For me the slightly positive takeaway from OpenAI's paper is good uncertainty calibration of the base pretrained GPT-4. It could be interpreted as one example of "awareness" of its inner workings.

Of course it's hard to say much about the model we don't even know architecture of, not to mean such luxuries as access to weights... Meta's LLaMA release did more to democratize deep learning than OpenAI's GPT-4, that's for sure.