| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by kir-gadjello 1193 days ago

It's no problem to put model's architecture and even some python code into the generous 32k context window, the real problem seems to be as you say "awareness" - at least the facet of it that'd allow to answer complex novel questions.

For me the slightly positive takeaway from OpenAI's paper is good uncertainty calibration of the base pretrained GPT-4. It could be interpreted as one example of "awareness" of its inner workings.

Of course it's hard to say much about the model we don't even know architecture of, not to mean such luxuries as access to weights... Meta's LLaMA release did more to democratize deep learning than OpenAI's GPT-4, that's for sure.