Hacker News new | ask | show | jobs
by jey 3571 days ago
What is your definition of practical?
2 comments

MIRI is entirely Blue Team - they work to create theoretical [0] safeguards on AI agents yet to be developed. I've long envisioned a counterpart Red Team that does nothing but build AIs that attempt to subvert these safety features, since the rest of the world of non Friendly AI [1] research is only unco-ordinated para-red behavior.

[0] In the sense of "valid under these known precepts", not "speculative".

[1] Non "Friendly AI", not "non-Friendly" AI.

I find that perception fairly surprising, as for a very long time it felt like we did more red team than blue team. I do acknowledge that this has been changing recently, but only significantly in the context of building on the results in this paper.
Would you please direct me to an example of MIRI's Red Team efforts that isn't the recent "Malevolent AI" paper [0]? Adherence to the belief that UFAI is a threshold-grade existential risk seems to compel a "define first, delay implementation" strategy, lest any step forward be the irrevocable wrong one.

[0] http://arxiv.org/abs/1605.02817

This makes me think the only thing we disagree on is the meaning of the words "red team" and "blue team" :)

When I say it feels like we spend a lot of time red teaming, that means I think we spend somewhere between 30 and 60% of research time trying to break things and see how they fail. This is fully compatible with not immediately implementing things - it's much less expensive to break something /before/ you build it.

It is refreshing when only the maps, and not the objects, are under serious contention. I suspect I still might prefer walking a shade closer to the line dividing definitely intra and potentially extra boxed agents, but you are the ones actually in the arena - do keep up the interesting work.
My concept of practical is stuff like: implementing things, doing experiments with prototypes, producing stuff that is imperfect but does something and can be improved upon.

Theoretical stuff is like: proving theorems, conceptualizing the task at hand, philosophical inquiry into the nature of agents/intelligence/reasoning/goals/human values

I'm not trying to argue which is more important, but surely MIRI focuses more on the theoretical.