by nickpsecurity 611 days ago

“This particular sufficiently advanced technology has been around long enough now that it is no longer indistinguishable from magic.”

I wouldn’t say that. Researchers don’t even really know how these models work. Papers periodically challenge a fundamental claim about them, for instance about training or reasoning.

What we do know also isn’t clear or formulaic enough to reliably predict model behavior. That’s why they do all the “YOLOs.” It is more an art form than a science.

I think a few principles about them are well understood. We know how to assess what models can and can’t do well. Past that, a lot about them is unknown. Both the experimenters and the field of mechanistic interpretability are trying to figure out the rest.