by 101914 4229 days ago
Remarkably, YouTube makes scripting downloads very easy. The script below needs only sed and some http client and it has worked for years. I have only had to change it once when there was a change at YouTube; the change was very small.

   # this script uses sh, sed, awk, tr and some http client
   # here, some http client = tnftp
   # awk and tr are optional
   
   
   # wrapper for tnftp to accept urls from stdin
   ftp1(){
   while read a;do 
   ftp ${@--4vdo-} "$a" 
   done;}
   
   
   # uniq
   awk1(){ awk '!($0 in a){a[$0];print}' ;}
   
   
   # some url decoding
   f1(){
   sed '
   s,%3D,=,g;
   s,%3A,:,g;
   s,%2F,/,g;
   s,%3F,?,g;
   s/'"$(printf '\r')"'//g;
   #  ^ strips carriage returns (originally a literal Ctrl-V then Ctrl-M typed in vi)
   ' 
   }
   
   # remove redundant itags
   f0(){
   sed -e '
   s/&itag=5//;t1
   s/&itag=1[78]//;t1
   s/&itag=22//;t1
   s/&itag=3[4-8]//;t1
   s/&itag=4[3-6]//;t1
   s/&itag=1[346][0-9]//;t1
   ' -e :1
   }
   
   # separate urls 
   f2(){
   sed '
   s,http,\
   &,g' 
   }
   
   # remove unneeded lines
   f3(){
   sed '
   #/^http%3A%2F.*c.youtube.com/!d;
   /^http%3A%2F.*googlevideo.com/!d;
   /crossdomain.xml/d;
   s/%25/%/g;
   s,sig=,\&signature=,;
   s,\\u0026,\&,g;
   /&author=.*/d;
   ' 
   }
   
   
   
   # separate cgi arguments for debugging
   f4(){
   sed '
   s,%26,\
   ,g;
   s,&,\
   ,g;
   ' 
   }
   
   # remove more unneeded lines
   f5(){
   sed '
   /./!d;
   /quality=/d;
   /type=/d;
   /fallback_host=/d;
   /url=/d;
   /^http:/!s/^/\&/
   /^[^h].*:/d;
   /^http:.*doubleclick.net/d;
   /itag.*,/d;
   '
   }
   
   # print urls 
   f6(){
   sed 's/^http:/\
   &/' | tr -d '\012' \
   |sed '
   s/http:/\
   &/g;
   ' 
   }
   
   f8(){
   sed 's/https:/http:/'
   }
   
   # tnftp reads the user agent from the environment, so export it
   FTPUSERAGENT="like OSX"; export FTPUSERAGENT
   
   case $# in
   0) 
   echo|$0 -h 
    ;;
   [12345])
   case $1 in
   
   -h|--h)
   echo "url=http[s]://www.youtube.com/watch?v=..........."
   echo usage1: echo url\|$0 -F \(get itag-no\'s\)
   echo usage2: echo url\|$0 -g \(get download urls\)
   echo usage3: echo url\|$0 -fitag-no -4o video-file
   echo N.B. no space permitted after -f
   
    ;;
   -F)
   $0 -g \
   |tr '&' '\012' \
   |sed '
   /,/d;
   /itag=[0-9]/!d;
   s/itag=//;
   /^17$/s/$/ 3GP/;
   /^36$/s/$/ 3GP/;
   /^[56]$/s/$/ FLV/;
   /^3[45]$/s/$/ FLV/;
   /^18$/s/$/ MP4/;
   /^22$/s/$/ MP4/;
   /^3[78]$/s/$/ MP4/;
   /^8[2-5]$/s/$/ MP4/;
   s/.*?//;
   '|awk1
    ;;

   -g)
   while read a;do
   n=1
   while [ $n -le 10 ];do
   echo "$a"|f8|ftp1||
   echo "$a"|f8|ftp1 &&
   break
   n=$((n+1))
   done \
   |f2|f3|f1|f0|f4|f5|f6|f1|sed '/itag='"$2"'/!d'
   done
    ;;

   -f*)
   while read a;do
   n=1
   while [ $n -le 10 ];do
   echo "$a"|$0 -g ${1#-f} |ftp1 $2 $3 $4 $5 ||
   echo "$a"|$0 -g ${1#-f} |ftp1 $2 $3 $4 $5  && 
   break
   n=$((n+1))
   done
   done
    ;;

   esac
   esac
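
To see what the first filter stages do, here is a small sketch that runs copies of f2 and the percent-decoding part of f1 on a made-up string. The sample input and its host name are invented for illustration and are not real YouTube output.

```shell
# copies of two stages from the script above
f2(){
sed '
s,http,\
&,g'
}

f1(){
sed '
s,%3D,=,g;
s,%3A,:,g;
s,%2F,/,g;
s,%3F,?,g;
'
}

# made-up sample: two encoded urls run together on one line
sample='url=http%3A%2F%2Fr1.example.googlevideo.com%2Fvideoplayback%3Fitag%3D18,url=http%3A%2F%2Fr1.example.googlevideo.com%2Fvideoplayback%3Fitag%3D22'

# f2 starts a new line at each "http", f1 decodes = : / ?
printf '%s\n' "$sample" | f2 | f1
```

Each url ends up at the start of its own line, which is what the later stages (f3 onwards) rely on when they delete lines by pattern.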

There are separate scripts for extracting www.youtube.com/watch?v=........... urls from web pages to feed to this script.
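
A minimal sketch of what such an extraction step could look like, assuming the page's HTML arrives on stdin and the links contain www.youtube.com/watch?v= followed by an 11-character video id. The function name yturls is hypothetical; this is not one of the separate scripts mentioned above.

```shell
# hypothetical helper: print each distinct watch url found in HTML on stdin
yturls(){
# put every candidate link at the start of its own line
sed 's,www\.youtube\.com/watch?v=,\
&,g' |
# keep only the host, path and 11-character id; drop everything else
sed -n 's,^\(www\.youtube\.com/watch?v=[A-Za-z0-9_-]\{11\}\).*,https://\1,p' |
# uniq without sorting, same idea as awk1 in the script
awk '!($0 in a){a[$0];print}'
}

# usage: the same video linked twice yields one url
printf '%s\n' '<a href="http://www.youtube.com/watch?v=5zs1ClgqhLw">x</a> <a href="https://www.youtube.com/watch?v=5zs1ClgqhLw&t=1">y</a>' | yturls
```

The output can be piped straight into the script, e.g. with -g, since it reads urls from stdin and accepts both http and https.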
2 comments

The problem is that this only works for some YouTube videos (for example it will fail for basically all VEVO videos), not to mention maintainability issues.
I had to look up what "VEVO" was: a joint venture of several major record labels and Google, launched in 2009.

Personally I have no need for "VEVO" videos. Nor do I ever encounter VEVO youtube urls posted to websites, like HN. I wonder why?

As for maintainability, I beg to differ. This script's raison d'être was frustration: early YouTube download solutions, e.g. gawk scripts, clive, etc., kept breaking whenever something at YouTube changed. I got tired of waiting for these programs to be fixed, if that ever happened.

If YouTube changes something, I can fix this 164-line script faster than a third party can fix something far more complex that they developed. Moreover, it does not rely on Python. Is there something wrong with DIY?

I see someone posted a link in this thread to another 208-line script, yget, that uses sed and awk. This further demonstrates the relative simplicity of downloading YouTube videos.

>Personally I do not watch "VEVO" videos but I am curious what they are. //

Go to youtube.com (you may need to scroll, though most likely not) and bam! A "VEVO" video with x-million views: it's a music video promotion brand.

Actually it's not globally promoted, so outside of Western Europe and the USA I'd guess you don't get VEVO vids so much?

According to https://www.youtube.com/watch?v=5zs1ClgqhLw their 100th most viewed video has 200 million views. Top 10 are all above 600 million.

They're quite a big brand.

Interesting.

An alternative to goofing around on the youtube.com web site, scrolling constantly and getting hit with advertising and endless lists of "related" videos is to search and retrieve youtube urls from the command line via gdata.youtube.com.

Pastebin next time, please.