Most "black box" quantitative hedge funds still use massive amounts of public data, even public data that might appear somewhat obvious. They absolutely develop new insights from that data.
Yeah, to clarify, I was agreeing that those firms certainly pull and use that data, but not necessarily in the same traditional fashion that paper seems to imply.