| "It's text, just send the damn text." They only send what the user requests. Using a software program that makes automatic requests that you are not easily in control of, e.g., a popular web browser, might give the impression that they control what is sent. They do not control what is sent. The user does.^1 The user makes a request and they send a response. One of the requests a fully-automatic web browser makes to NYT is to static01.nyt.com. Personally, as a user who prefers text-only, this is the only request I need to make. As such I don't really need a heavily marketed, fully-automatic, graphical, ad-blocking web browser to make a single request for some text.^2
#! /bin/sh
case $1 in
world |w*) x=world # shortcut: w
;;us |u*) x=us # shortcut: u
;;politics |p*) x=politics # shortcut: p
;;nyregion |n*) x=nyregion # shortcut: n
;;business |bu*) x=business # shortcut: bu
;;opinion |o*) x=opinion # shortcut: o
;;technology |te*) x=technology # shortcut: te
;;science |sc*) x=science # shortcut: sc
;;health |h*) x=health # shortcut: h
;;sports |sp*) x=sports # shortcut: sp
;;arts |a*) x=arts # shortcut: a
;;books |bo*) x=books # shortcut: bo
;;style |st*) x=style # shortcut: st
;;food |f*) x=food # shortcut: f
;;travel |tr*) x=travel # shortcut: tr
;;magazine |m*) x=magazine # shortcut: m
;;t-magazine |t-*) x=t-magazine # shortcut: t-
;;realestate |r*) x=realestate # shortcut: r
;;*)
echo "usage: $0 section"
exec sed -n '/x=/!d;s/.*x=//;/sed/!p' "$0"
esac
curl -s "https://static01.nyt.com/services/json/sectionfronts/$x/index.jsonp"
Example: Make a simple page of titles, article URLs and captions, where the above script is named "nyt".
nyt tr | sed -n '/\"headline\": \"/{s//<p>/;s/\".*/<\/p>/;p};/\"full\": \"/{s//<p>/;s/..$/<\/p>/;p};/\"link\": \"/{s///;s/ *//;s/\".*//;s|.*|<a href=&>&</a>|;p}' > travel.html
firefox ./travel.html
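For anyone who finds the one-liner hard to read, here is a rough sketch of the same three rules unpacked one per line. The function name nyt2html is made up for illustration, and it assumes, as the one-liner does, that each "headline", "full" and "link" field sits on its own line of the response. It uses -n so only the matched lines are printed, and it quotes the href value, which the one-liner does not.

```shell
# Hypothetical helper (name is an assumption, not part of the original script).
# Reads the section JSONP on stdin, writes HTML fragments on stdout.
nyt2html() {
    sed -n '
        # title: replace the key with <p>, cut at the next quote
        /"headline": "/{s//<p>/;s/".*/<\/p>/;p;}
        # caption: replace the key with <p>, drop the last two characters
        /"full": "/{s//<p>/;s/..$/<\/p>/;p;}
        # url: drop the key and leading spaces, cut at the quote, wrap in an anchor
        /"link": "/{s///;s/ *//;s/".*//;s|.*|<a href="&">&</a>|;p;}
    '
}
```

Usage would be the same as before, e.g. nyt tr | nyt2html > travel.html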
Source: https://news.ycombinator.com/item?id=22125882
The truth is that they are just sending the damn text. However you are voluntarily choosing to use a software program that is automatically making requests for things other than the text of the article, i.e., "cruft".
1. The Google-sponsored HTTP/[23] protocol is seeking to change this dynamic, so if websites sending stuff to you without you requesting it first bothers you, you might want to think about how online advertisers and the companies that enable them might use these new protocols.
2. However I might use one for viewing images, watching video, reading PDFs, etc., offline. Web browsers are useful programs for consuming media. It is in the simple task of making HTTP requests that their utility has diminished over time. The user is not really in control. |
I go to a restaurant and I can't just walk into the kitchen and grab a plate of food. Nor can I walk into the refrigerator, grab some supplies, and then walk over to the stations and start cooking. Instead I have to wait to be seated, order indirectly via a waiter, wait for the chef and staff to prepare my order, etc.
It seems to me visiting a website is similar. The user chooses to visit the site. That includes the 3rd parties and less control. Just like I don't get to pick what sources the restaurant uses for their food, nor do I have any say in their hiring or management practices. Nor do I have any choice in the music they play or the TVs they have on (bar-like restaurants often have TVs). If I don't like their choices, my choice is to be or not be a customer. I don't get to hack around that, walking in the back door and taking the food.
I know the analogy isn't perfect. It's my computer and I have no obligation to let them use it as they please vs as I please. But still, there's some middle ground IMO between the 2 extremes.