by x3blah
2222 days ago
The long sed line is out of date and therefore "broken". For something simpler that works, try this: nyt tr |sed 's/ *//;/</!d'|uniq > travel.html
This produces a simple web page of titles and URLs for each article page.

An interesting point of discussion might be the amount of third-party cruft on the template article page versus the more dynamic front page. With JavaScript disabled, every image on an article page displays and there are no ads. Downloading any video in the page is as simple as curl -O `grep -o https://[^\"]*mp4 article.html`
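
If a page embeds more than one video, or lists the same file twice, the one-liner could be stretched into a small loop. This is only a rough sketch: extract_mp4_urls is a made-up helper name, and it assumes the article was already saved as article.html in the current directory.

```shell
# Print every unique https URL ending in "mp4" found in a saved HTML file.
# extract_mp4_urls is a hypothetical helper name used only for this sketch.
extract_mp4_urls() {
    grep -o 'https://[^"]*mp4' "$1" | sort -u
}

# Download each URL under its remote file name.
# Assumes article.html was saved beforehand; nothing happens if it is absent.
if [ -f article.html ]; then
    extract_mp4_urls article.html | while IFS= read -r url; do
        curl -O "$url"
    done
fi
```

Quoting the grep pattern keeps the shell from treating the brackets as a glob, and sort -u drops repeated URLs so each file is fetched once.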