Because saying "more than a quarter of all new code at Google is generated by free crowdsourcing from internet scraping" doesn't roll off the tongue as easily ;)
Google has two billion lines of proprietary code that conform to their style guides and proprietary requirements. I can't imagine they'd poison their model with non-conformant third-party source.