Can we bootstrap AI Safety despite being unable to even define it?

	Can we bootstrap AI Safety despite being unable to even define it? (arxiv.org)
	2 points by cryptohell 219 days ago

2 comments

conartist6 212 days ago

AI output is modeled on human behavior. Are humans safe?

cryptohell 219 days ago

Given several models, assuming only that some unknown subset is "safe", can we construct a single model as safe as that subset? This reduces obtaining a trustworthy model to a plausibly easier task.
