

by dunham 2295 days ago

Many years ago there was a browser plugin out of MIT called "Piggy Bank". It included a browser for exploring RDF data (Longwell), and the ability to define scrapers of RDF data on a per-site basis. (I think the scrapers were JavaScript and tagged with the hostname of the site, but it's been a while.) It stored stuff locally, but could also upload to a server-hosted Longwell instance.

Every now and then I wish I still had something like that, but the team has long since moved on and it's bit-rotted a bit.

More recently, I've found that a lot of recipe web sites have been embedding the recipe as JSON-LD data (I presume to appease Google), so I've written a Greasemonkey script to collect those as I browse and post them to a personal CouchDB instance. I haven't gotten around to putting a UI on that or further processing the data yet (e.g. I need some agents to fetch images or pull them from the browser cache). Too many irons in the fire, but maybe someday.
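
For anyone curious what the collection step looks like, here is a rough sketch. It is not my actual script (that one is a userscript running in the browser); this is the same idea in standalone Python, assuming the page embeds schema.org Recipe objects in `<script type="application/ld+json">` tags. The names `JsonLdCollector`, `extract_recipes` and `build_couchdb_request`, the sample page, and the `http://localhost:5984/recipes` database URL are all made up for illustration.

```python
import json
import urllib.request
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def iter_nodes(data):
    """Yield every dict found at the top level, in a list, or under @graph."""
    if isinstance(data, list):
        for item in data:
            yield from iter_nodes(item)
    elif isinstance(data, dict):
        yield data
        yield from iter_nodes(data.get("@graph", []))


def is_recipe(node):
    """True if the node's @type is (or includes) "Recipe"."""
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return "Recipe" in types


def extract_recipes(html):
    """Return all Recipe objects embedded as JSON-LD in the given HTML."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    recipes = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the page
        recipes.extend(node for node in iter_nodes(data) if is_recipe(node))
    return recipes


def build_couchdb_request(recipe, db_url="http://localhost:5984/recipes"):
    """Build (but do not send) a POST that would store the recipe as a new
    CouchDB document. The database URL is a placeholder."""
    body = json.dumps(recipe).encode("utf-8")
    return urllib.request.Request(
        db_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical page used only to exercise the functions above.
SAMPLE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Pancakes",
 "recipeIngredient": ["flour", "milk", "eggs"]}
</script>
</head></html>"""

if __name__ == "__main__":
    for recipe in extract_recipes(SAMPLE):
        request = build_couchdb_request(recipe)
        print(recipe["name"], "->", request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it. Sites vary in whether the recipe is the top-level object, one entry in a list, or nested under `@graph`, which is why the sketch walks all three.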