by xviia 3675 days ago
How would this be done? What kind of language, etc. would enable this?
4 comments

I decided to try it.

https://gist.github.com/eli173/c089c27db9c1302fe4e003716b402...

Perhaps I am the kind of person they are trying to avoid, but I did this in a bit less than an hour. I am very ignorant of the Python library ecosystem, so if I had known the request library better and if I knew an html parsing library, I might have been able to complete this within the time limit. I also suspect there are a few cases that break my solution.

> if I knew an html parsing library

Beautiful Soup: https://pypi.python.org/pypi/beautifulsoup4
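For a rough idea of what an HTML parsing library saves you, here is a minimal sketch. The thread never spells out the exact task, so this assumes it involves pulling links out of a fetched page. It uses the standard library's `html.parser` instead of Beautiful Soup so it runs with no installs; `LinkCollector` and `extract_links` are made-up names for illustration, not anything from the gist.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # assumed: the URL the page was fetched from
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links like "/about" into absolute URLs
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

With Beautiful Soup installed, the collection step would shrink to roughly `soup.find_all("a", href=True)`, and you would still want `urljoin` to resolve relative links.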

Wget/sh/sed/awk/perl soup
A simple shell script with curl piped through sed or awk...
Any language with support for web crawling? I can sorta see it being done in Node with cheerio[1] and an HTTP client.

[1] https://github.com/cheeriojs/cheerio

Would probably be my first choice these days too...
Thanks! Always good to learn new things