

by gklitt 2296 days ago

Strongly agree with the original article, and it's fun to see all the niche use cases that people are mentioning here.

But I have a major frustration with user scripts: writing them requires experience with JavaScript and with reverse engineering websites. This is fine for the HN crowd, but it locks out most web users, who can't program at all.

I bet that if it were slightly easier to develop user scripts, there'd be 10x as many of them. I'm sure I'm not the only one who's helped a coworker write a bookmarklet / user script essential for their workplace sanity.

Would be curious what people's experiences have been helping nontechnical people extend websites, or if you know of tools in this area.

My current attempt at this is a project called Wildcard, which requires a programmer to write some site-specific scraping code, but then shows the scraped data to the end user in a spreadsheet and lets them decide what to do with it:

https://www.geoffreylitt.com/wildcard/salon2020/

1 comments

matheusmoreira 2295 days ago

Would be nice if every website had a scraper. People maintain huge content blocking databases so why not scraping code? It should be possible to treat every website like an API.
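To make the idea concrete, here is a minimal sketch of what a shared scraper database could look like: a registry keyed by hostname, where each entry turns a page's HTML into plain rows. All names here (`Row`, `Scraper`, `registerScraper`, `scrapeFor`) and the `example.com` entry are hypothetical, and a real scraper would use a proper DOM parser rather than a regular expression.

```typescript
// Hypothetical types: a scraper turns one page's HTML into flat rows.
type Row = Record<string, string>;
type Scraper = (html: string) => Row[];

// Registry of site-specific scrapers, keyed by lowercase hostname.
const scrapers: { [hostname: string]: Scraper } = {};

function registerScraper(hostname: string, scraper: Scraper): void {
  scrapers[hostname.toLowerCase()] = scraper;
}

// Look up the scraper for a hostname and run it; undefined if none is known.
function scrapeFor(hostname: string, html: string): Row[] | undefined {
  const key = hostname.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(scrapers, key)) {
    return undefined;
  }
  return scrapers[key](html);
}

// Illustrative entry for a made-up site: every <h2> becomes a row.
registerScraper("example.com", (html) => {
  const rows: Row[] = [];
  const pattern = /<h2>([^<]*)<\/h2>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    rows.push({ title: match[1].trim() });
  }
  return rows;
});
```

In a user script the hostname would presumably come from `location.hostname`, and the registry itself would be the community-maintained part, much like a content blocking list.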

dunham 2295 days ago

Many years ago there was a browser plugin out of MIT called "Piggy Bank". It included a browser for exploring RDF data (Longwell), and the ability to define scrapers of RDF data on a per-site basis. (I think the scrapers were JavaScript and tagged with the hostname of the site, but it's been a while.) It stored stuff locally, but could also upload to a server-hosted Longwell instance.

Every now and then I wish I still had something like that, but the team has long since moved on and it's bit-rotted a bit.

More recently, I've found that a lot of recipe websites embed the recipe as JSON-LD data (I presume to appease Google), so I've written a Greasemonkey script to collect those as I browse and post them to a personal CouchDB instance. I haven't gotten around to putting a UI on that or processing the data further (e.g. I need some agents to fetch images or pull them from the browser cache). Too many irons in the fire, but maybe someday.
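For anyone curious what the collection step might look like, here is a rough sketch, not the actual script. It assumes the page's `<script type="application/ld+json">` contents have already been read into strings, and it picks out objects whose `@type` is `Recipe`, including ones nested under `@graph`. The names `JsonLdNode`, `hasType`, and `extractRecipes` are made up for illustration.

```typescript
// Hypothetical shape for a parsed JSON-LD object.
type JsonLdNode = { [key: string]: unknown };

function isObject(value: unknown): value is JsonLdNode {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// "@type" may be a single string or an array of strings.
function hasType(node: JsonLdNode, wanted: string): boolean {
  const t = node["@type"];
  if (Array.isArray(t)) {
    return t.some((entry) => entry === wanted);
  }
  return t === wanted;
}

// Parse each JSON-LD block and collect every Recipe object found,
// looking at top-level objects, top-level arrays, and "@graph" members.
function extractRecipes(blocks: string[]): JsonLdNode[] {
  const recipes: JsonLdNode[] = [];
  for (const text of blocks) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(text);
    } catch (err) {
      continue; // skip blocks that are not valid JSON
    }
    let queue: unknown[] = Array.isArray(parsed) ? parsed.slice() : [parsed];
    while (queue.length > 0) {
      const node = queue.shift();
      if (!isObject(node)) continue;
      if (hasType(node, "Recipe")) recipes.push(node);
      const graph = node["@graph"];
      if (Array.isArray(graph)) queue = queue.concat(graph);
    }
  }
  return recipes;
}
```

In a user script, the blocks would presumably come from something like `document.querySelectorAll('script[type="application/ld+json"]')` and each element's `textContent`. Each recipe could then be sent as a JSON document to the CouchDB database, though a cross-origin request from a user script likely needs the script manager's own request function rather than plain `fetch`.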