I have an open-source project that crawls public datasets and makes them searchable in one place: https://github.com/findopendata/findopendata.