by Workaccount2 851 days ago
I think what the author is trying to get across, and what I tend to agree with, having at least touched on the mathematics behind transformers, is that we don't know how these models actually arrive at the outputs they do.

We know the rules they play by thoroughly - we made those ourselves (the math/model structure). But in many cases the outputs we are getting were never explicitly outlined in the rule set. We can follow the prompts step by step, but we quickly end up on seemingly nonsensical paths that explode into a web of what appear to be completely unrelated concepts. It could be that our meat brains simply don't have the working memory necessary to track the meta and meta-meta emergent systems at play that arrive at an output.