I totally agree with your statement about AGI, but I wouldn't be as pessimistic about neural networks in general. Of course our data is enough for useful deep neural models! Many problems can be solved without them, but in areas like computer vision and speech recognition they seem to be the best (currently known) choice.
Your point that an AGI would need to ask questions about data provenance is super interesting. Are you aware of the line of inquiry into active learning? It's fascinating and has a long history:
https://papers.nips.cc/paper/1011-active-learning-with-stati...