by 1vuio0pswjnm7 1488 days ago
Taking the examples from https://www.youtube-nocookie.com/embed/hA1ZsxE8VJg, here is how I approach the simple problems in the video without using Python or having any knowledge of CSS selectors.

Retrieving the HTML

   echo https://www.nytimes.com|yy025|nc -vv proxy 80 > 1.htm

yy025 is a flexible utility I wrote to generate custom HTTP from URLs. It is controlled through environment variables. nc is a TCP client, such as netcat. proxy is a HOSTS file entry for a localhost TLS proxy. The sequence "yy025|tcpclient" is normally contained in a shell script that adds a <base href> tag, something like:

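   #! /bin/sh
   # yy025 writes the generated HTTP request to stdout (.2) and,
   # judging by its use below, the hostname to fd 5 (.1)
   yy025 5>.1 >.2
   read x < .1
   echo "<base href=https://$x />"
   # send the request through the proxy, then strip chunked encoding
   nc -vv proxy 80 < .2|yy045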

yy045 is a utility that removes chunked transfer encoding.
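
For readers who do not have yy025 or yy045, here is a rough sketch of what the two stages do, written as plain shell functions. These are hypothetical stand-ins, not my actual utilities: the names url2http and dechunk are made up for illustration, url2http only emits a bare GET request with no environment variable controls, and dechunk assumes the HTTP headers have already been removed from its input.

```shell
#!/bin/sh
# Hypothetical stand-ins for illustration only; NOT yy025 or yy045.

# url2http: read URLs on stdin, print a minimal HTTP/1.1 GET request for each.
url2http() {
    while read -r url; do
        rest=${url#*://}            # drop the scheme (https://)
        host=${rest%%/*}            # everything before the first slash
        case $rest in
            */*) path=/${rest#*/} ;;  # keep the path if there is one
            *)   path=/ ;;            # otherwise request the root
        esac
        printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' \
            "$path" "$host"
    done
}

# dechunk: read a chunked message body on stdin (headers already removed),
# print the decoded body on stdout.
dechunk() {
    cr=$(printf '\r')
    while read -r size; do
        size=${size%"$cr"}          # strip the trailing CR of the size line
        size=${size%%;*}            # ignore any chunk extensions
        n=$((0x$size))              # chunk sizes are hexadecimal
        [ "$n" -eq 0 ] && break     # a zero-size chunk ends the body
        dd bs=1 count="$n" 2>/dev/null   # copy exactly n bytes of chunk data
        read -r skip                # consume the CRLF after the chunk data
    done
}
```

With these, "echo https://www.nytimes.com|url2http|nc -vv proxy 80" would stand in for the first example. dd with bs=1 is slow but reads exactly the number of bytes asked for, which keeps the sketch simple.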

The benefit of using separate, small programs that do one thing will be illustrated in the solution for Problem 3.