Hacker News
by the8472 14 days ago
Humans come vaguely prealigned due to whatever is encoded in their genes and also due to limitations of human bodies that put important constraints on individuals (e.g. no infinitely copyable trusted subagents). Even if you made them superhuman in some aspects, a lot of that would still remain. It seems unlikely that minds constructed by a different process would end up humanlike, because they lack the evolutionary path-dependencies that shaped humans. Current models appear somewhat human due to imitation learning/pretraining, but A) this could be deceptive, as we don't know what's going on inside, and B) history has shown that imitation learning becomes unnecessary once RL becomes good enough (e.g. AlphaGo -> AlphaZero), meaning we might end up with minds created from random initialization.