Hacker News
by joe_the_user 1846 days ago
Looks like an interesting project. The thing is, I don't think ideal qualities like "reliable, interpretable, and steerable" can really be simply added "on top of" existing deep learning systems and methods.

Much is made of GPT-3's ability to sometimes do logic or even arithmetic. But that ability is unreliable and, what's more, spread through the whole giant model. Extracting a particular piece of specifically logical reasoning from the model is a hard problem. You can do it, at N times the cost of the model. And in general, you can add extras to the basic functionality of deep neural nets (few-shot, generational, etc.), but at a cost of, again, N times the base (plus decreased reliability). But the "full" qualities mentioned initially would require many, many extras, each equivalent to one-shot, and they would need to happen on the fly. (And one-shot seems fairly easy. Take a system that recognizes images by label ("red", "vehicle", etc.). Show it thing X, and it uses the categories thing X activates to decide whether other things are similar to thing X. Simple, but there's still lots of tuning to do here.)
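To make the one-shot idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the label set, the activation numbers, the choice of cosine similarity, and the 0.9 threshold are all made up, and a real system would get its activations from a trained recognizer rather than hand-written lists.

```python
import math

# Hypothetical label set; each vector below holds one activation per label.
LABELS = ["red", "vehicle", "animal", "round"]

def cosine(a, b):
    """Cosine similarity between two activation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def is_similar(example, candidate, threshold=0.9):
    """Call a candidate 'like thing X' if it activates roughly the same labels.

    The threshold is an arbitrary assumption; choosing it is part of the
    tuning the comment refers to.
    """
    return cosine(example, candidate) >= threshold

# Made-up activations standing in for a recognizer's output.
thing_x = [0.9, 0.8, 0.05, 0.1]       # the single example shown to the system
candidate_a = [0.85, 0.75, 0.1, 0.2]  # activates mostly "red" and "vehicle"
candidate_b = [0.1, 0.05, 0.9, 0.3]   # activates mostly "animal"

print(is_similar(thing_x, candidate_a))
print(is_similar(thing_x, candidate_b))
```

With these numbers the first candidate lands above the threshold and the second well below it. The similarity measure, the threshold, and which labels count are all knobs here, which is where the tuning comes in.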

Just to emphasize, I think they'll need something extra in the basic approach.

1 comments

Go check out the entire Captum project for PyTorch. I assure you that gradient-based explanations can be simply added to existing deep learning systems...
All sorts of explanation schemes can be, and have been, added to existing processes. They just tend to fail to be what an ordinary human would take as an explanation.
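
For reference, the simplest kind of gradient-based explanation is just the gradient of the model's output with respect to its input. The sketch below is not Captum's API; it is a hand-rolled toy in plain Python, with a made-up three-feature logistic model whose weights and input are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Toy logistic model: p = sigmoid(w . x + b)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def input_gradient(weights, bias, x):
    """Gradient of the output with respect to each input feature.

    For this model, dp/dx_i = p * (1 - p) * w_i.
    """
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]

# Made-up model and input.
weights = [2.0, -1.0, 0.0]
bias = 0.0
x = [1.0, 1.0, 1.0]

saliency = input_gradient(weights, bias, x)
print(saliency)
```

The result says the first feature pushes the output up, the second pushes it down, and the third does nothing. That is cheap to bolt on, and it is also a list of numbers rather than something an ordinary person would accept as an explanation.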

Note: I never argued that "extras" (including formal "explanations") can't be added to deep learning systems. My point is that you absolutely can add some steps, generally at high cost. The argument is that such a sequence of small steps won't get you to the ideal of broad flexibility that the OP landing page outlines.