by x3blah 2222 days ago

The long sed one-liner is out of date and thus "broken". For something simpler that works, try this:

   nyt tr |sed 's/ *//;/</!d'|uniq > travel.html

This will produce a simple web page of titles and URLs for each article page.
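To see what the filter does, here is a minimal sketch run on made-up input. The sample lines and the function names `sample_input` and `filter_links` are hypothetical stand-ins, not the real output of `nyt tr`.

```shell
# Hypothetical stand-in for the output of `nyt tr`: indented lines, one
# without markup, and one anchor repeated on adjacent lines.
sample_input() {
  printf '%s\n' \
    '   <a href="https://example.com/a">Article A</a>' \
    '   <a href="https://example.com/a">Article A</a>' \
    'plain text with no markup' \
    ' <a href="https://example.com/b">Article B</a>'
}

# s/ *//  strips leading spaces from each line;
# /</!d   deletes every line that contains no "<";
# uniq    collapses adjacent duplicate lines.
filter_links() {
  sed 's/ *//;/</!d' | uniq
}

sample_input | filter_links
```

With this input, the two distinct anchor lines should survive, unindented. Note that uniq only removes duplicates that are adjacent, so repeats further apart would need sort -u instead, at the cost of losing the original order.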

An interesting point of discussion might be the amount of third-party cruft on the template article page versus the more dynamic front page. When JavaScript is disabled, each article page displays all of its images and shows no ads. Downloading any video in the page is as simple as

   curl -O `grep -o 'https://[^"]*mp4' article.html`
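If a page embeds more than one video, a single -O only covers the first URL, so a loop is safer. A rough sketch, assuming the mp4 URLs appear inside double-quoted attributes in article.html; the function names `extract_mp4_urls` and `download_videos` are made up for illustration:

```shell
# Print each distinct https URL ending in "mp4" found in the given HTML file.
# Assumes the URLs contain no double quotes.
extract_mp4_urls() {
  grep -o 'https://[^"]*mp4' "$1" | sort -u
}

# Fetch every extracted URL, one curl invocation per file, so that
# each download gets its own -O.
download_videos() {
  extract_mp4_urls "$1" | while IFS= read -r url; do
    curl -O "$url"
  done
}

# Usage: download_videos article.html
```

Splitting extraction from download makes it easy to inspect the URL list first by running extract_mp4_urls article.html on its own.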