|
"Have you ever tried to download videos from YouTube? I mean manually without relying on software like youtube-dl, yt-dlp or one of "these" websites. It's much more complicated than you might think."

This reminds me of some sort of fizzbuzz test. It is not complicated at all: there is no need to use the Range header or run Javascript. The short script below does not download anything because there is no need to. It makes only one TCP connection and fetches JSON, from which one can simply extract the videoplayback URLs and put them in a locally hosted HTML page with no Javascript.

#!/bin/sh
# usage: echo videoId | $0 <-- this will indicate len to use
# usage: echo videoId | $0 len | openssl s_client -connect www.youtube.com:443 -ign_eof
# usage: $0 len < videoId-list | openssl s_client -connect www.youtube.com:443 -ign_eof
(
while read x;do
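# skip anything that is not an 11-character videoId
test ${#x} -eq 11||continue
# no len argument: print the Content-Length to use and exit
# (bytes in the JSON template line without the literal $x, plus the videoId length)
if test $# -ne 1;then
len=${#x}
x=$(grep -m1 '^{' "$0"|sed 's/\$x//'|wc -c)
exec echo usage: ${0##*/} $((x+len))
fi
# append CR to header lines and to the empty line so they end in CRLF
cr=$(printf '\r')
sed "/^[a-zA-Z].*: /s/$/$cr/;s/^$/$cr/" << eof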
POST /youtubei/v1/player?key=AIzaSyA8eiZmM1FaDVjRy-df2KTyQ_vz_yYM39w HTTP/1.1
Host: www.youtube.com
Content-Type: application/json
Content-Length: $1
Connection: keep-alive

{"context": {"client": {"clientName": "IOS", "clientVersion": "17.33.2" }}, "videoId": "$x", "params": "CgIQBg==", "playbackContext": {"contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}}, "contentCheckOk": true, "racyCheckOk": true}
eof
done
printf '\r\n'
printf 'GET /robots.txt HTTP/1.0\r\nHost: www.youtube.com\r\nConnection: close\r\n\r\n';
)
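The extraction step mentioned above (pull the videoplayback URLs out of the JSON and put them in a static HTML page) might look roughly like the sketch below. It is only an illustration, not the C utilities described next. The function names `extract_urls` and `urls_to_html` are made up, and it assumes each stream URL appears in the response as a single JSON string containing `/videoplayback?`, possibly with `&` written as `\u0026`. It also relies on `grep -o`, which is common but not POSIX.

```shell
#!/bin/sh
# Sketch only: helper names and the assumed JSON shape are illustrative.

# Read JSON (or a raw HTTP response) on stdin, print one videoplayback URL per line.
extract_urls() {
    # Match every quoted https string containing /videoplayback?,
    # strip the surrounding quotes, and undo the JSON \u0026 escape for "&".
    grep -o '"https://[^"]*/videoplayback?[^"]*"' |
    sed 's/^"//; s/"$//; s/\\u0026/\&/g'
}

# Read URLs on stdin, print a minimal HTML page of plain links (no Javascript).
urls_to_html() {
    echo '<!DOCTYPE html><html><body><ul>'
    n=0
    while IFS= read -r url; do
        n=$((n+1))
        # Escape & so the URL is valid inside an HTML attribute.
        href=$(printf '%s\n' "$url" | sed 's/&/\&amp;/g')
        printf '<li><a href="%s">stream %d</a></li>\n' "$href" "$n"
    done
    echo '</ul></body></html>'
}

# Possible usage, with response.json as a hypothetical file holding the JSON body:
#   extract_urls < response.json | urls_to_html > index.html
```

This assumes the JSON body was saved intact. If the server replies with chunked transfer encoding, a chunk boundary could fall inside a URL, so the body would need to be de-chunked first.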
For processing the JSON I wrote custom utilities in C that (a) extract videoIds and other useful strings, (b) generate HTTP similar to above, and (c) filter the returned JSON into CSV, SQL or HTML. For me, these run faster than Python and jq and are easier to edit. Using these utilities I can also do full searches that return hundreds to thousands of results and I can easily exclude all "suggested" or "recommended" videos.

CSV output

1666520150,23 Oct 2022 10:15:50 UTC,22,aqz-KE-bpKQ,"Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film",00:10:35,635,UCSMOQeBJ2RAnuFungnQOxLg,19211597,"Blender"

SQL output

INSERT INTO t1(ts,utc,itag,vid,title,dur,len,cid,views,author) VALUES(1666520150,'23 Oct 2022 10:15:50 UTC',22,'aqz-KE-bpKQ','Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film','00:10:35',635,'UCSMOQeBJ2RAnuFungnQOxLg',19211597,'Blender') ON CONFLICT(vid) DO UPDATE SET views=excluded.views;

HTML output

Looks just like CSV except vid is a hyperlink |
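To show how the CSV and SQL outputs above relate, here is a rough shell/awk sketch that turns a CSV row in that column order into the matching upsert statement. It is not the C utility described above, and the function name `csv_to_sql` is made up. It assumes exactly ten columns per row, double-quoted fields with `""` as the escaped quote, and unsigned integers in the ts, itag, len and views columns. Rows that do not fit are skipped.

```shell
#!/bin/sh
# Sketch only: csv_to_sql is an illustrative name; the column order is taken from the sample row.
csv_to_sql() {
    awk -v sq="'" '
    # Split one CSV record into f[1..n], honoring double-quoted fields.
    function split_csv(line, f,    n, i, c, inq, cur) {
        n = 0; inq = 0; cur = ""
        for (i = 1; i <= length(line); i++) {
            c = substr(line, i, 1)
            if (inq) {
                if (c == "\"") {
                    # "" inside a quoted field is a literal quote; a lone " ends the field.
                    if (substr(line, i + 1, 1) == "\"") { cur = cur "\""; i++ }
                    else inq = 0
                } else cur = cur c
            } else if (c == "\"") inq = 1
            else if (c == ",") { f[++n] = cur; cur = "" }
            else cur = cur c
        }
        f[++n] = cur
        return n
    }
    # Quote a string for SQL by doubling embedded single quotes.
    function q(s) { gsub(sq, sq sq, s); return sq s sq }
    BEGIN { num[1]; num[3]; num[7]; num[9] }   # columns emitted unquoted
    {
        if (split_csv($0, f) != 10) next
        for (k in num) if (f[k] !~ /^[0-9]+$/) next
        printf "INSERT INTO t1(ts,utc,itag,vid,title,dur,len,cid,views,author) VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s) ON CONFLICT(vid) DO UPDATE SET views=excluded.views;\n",
            f[1], q(f[2]), f[3], q(f[4]), q(f[5]), q(f[6]), f[7], q(f[8]), f[9], q(f[10])
    }'
}

# Possible usage, with videos.csv and videos.db as hypothetical file names:
#   csv_to_sql < videos.csv | sqlite3 videos.db
```

The `ON CONFLICT(vid)` clause only works if `vid` has a UNIQUE or PRIMARY KEY constraint in table `t1`.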