Hacker News
by x3blah 2217 days ago
What if PubMed had something like Google's "I'm feeling lucky"? What if we could explore PubMed by selecting a random PubMed URL instead of searching? This script generates a random PubMed URL.

To do this we need to know the maximum PMID in the PubMed database. The current max is included in the script and is saved in a 9-byte file named "max-PMID" when the script is run.

If run with the argument "update", the script searches for a newer max PMID. If one is found, it updates the number in the max-PMID file and in the script itself. An alternative is to use the FTP server [1] to find the max PMID, but I noticed the latest FTP update was missing new PMIDs caught by this script.

If run without any arguments, it selects a random PMID between 1 and the max and outputs a URL.

Uses socat and GNU sed, and requires a fifo named "1.fifo".

[1] ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/

       #!/bin/sh
       test -s max-PMID||echo 32446294 > max-PMID;read x < max-PMID;x=$((x-1));h=pubmed.ncbi.nlm.nih.gov;
       test ${#x} -eq 8||exec echo weird max-PMID;sed -i "/test/s/echo [0-9]\{8\} /echo $x /" $0;
       case $1 in update) mkfifo 1.fifo 2>/dev/null;test -p 1.fifo||exec echo need 1.fifo;
       (grep "<title>PMID .* is not available" < 1.fifo|sed 1q|sed 's/<title>PMID //;s/ *//;s/ .*//;' >max-PMID)&
       y=$((x+10000));seq $x $y|sed '$!s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: keep-alive\r\n\r\n|; 
       $s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: close\r\n\r\n|'|socat - ssl:$h:443 >1.fifo 2>/dev/null;
       ;;"")awk -v min=1 -v max=$x 'BEGIN{srand();printf "https://'$h'/" int(min+rand()*(max-min+1)) "/\n"}';esac
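For readers who find the one-liner dense, the no-argument branch can be unpacked roughly as follows. This is only a sketch: the function name random_pubmed_url is made up for illustration, and the max is passed in as an argument instead of being read from the max-PMID file.

```shell
#!/bin/sh
# Sketch: print a random PubMed URL for a PMID in the range [1, max].
# random_pubmed_url is a hypothetical helper name, not part of the script above.
random_pubmed_url() {
    max=$1
    host=pubmed.ncbi.nlm.nih.gov
    awk -v min=1 -v max="$max" -v host="$host" 'BEGIN {
        srand()   # seeded from the time of day, so two runs in the same second repeat
        printf "https://%s/%d/\n", host, int(min + rand() * (max - min + 1))
    }'
}

random_pubmed_url 32446294
```

Since rand() returns a value in [0, 1), the result always falls between 1 and max inclusive. Not every PMID in that range necessarily resolves to a record, so some of the generated URLs may be dead ends.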
1 comment

Improved

   #!/bin/sh
   test -s max-PMID||echo 32449615 > max-PMID;read x < max-PMID;h=pubmed.ncbi.nlm.nih.gov;
   test ${#x} -eq 8||rm max-PMID;sed -i "s/[0-9]\{8\}/$x/" $0;
   case $1 in update) mkfifo 1.fifo 2>/dev/null;test -p 1.fifo||exec echo need 1.fifo;
   (grep "<title>PMID .* is not available" < 1.fifo|sed 1q|sed -n 's/<title>PMID //;s/ *//;s/ .*//;wmax-PMID')&
   y=$((x+10000));seq $x $y|sed '$!s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: keep-alive\r\n\r\n|; 
   $s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: close\r\n\r\n|'|socat - ssl:$h:443 2>/dev/null|grep -o '<title>[^<]*' >1.fifo;
   ;;"")awk -v min=1 -v max=$((x-1)) 'BEGIN{srand();printf "https://'$h'/" int(min+rand()*(max-min+1)) "/\n"}';esac
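The update branch is easier to follow when split into its two halves: building a batch of pipelined HTTP requests, and pulling the first unavailable PMID out of the response titles. The sketch below runs both halves offline, without socat or the fifo. The names build_requests and extract_pmid are hypothetical, and the sample title line is an assumption inferred from the grep pattern in the script, not a captured response.

```shell
#!/bin/sh
# Sketch of the two halves of "update" mode, run without any network access.
# build_requests and extract_pmid are hypothetical helper names.

# Emit one GET request per PMID from $1 to $2 for host $3.
# Every request asks for keep-alive except the last, which closes the connection.
build_requests() {
    first=$1; last=$2; host=$3
    n=$first
    while [ "$n" -le "$last" ]; do
        if [ "$n" -lt "$last" ]; then conn=keep-alive; else conn=close; fi
        printf 'GET /%s/ HTTP/1.1\r\nHost: %s\r\nConnection: %s\r\n\r\n' "$n" "$host" "$conn"
        n=$((n + 1))
    done
}

# Read response text on stdin and print the PMID from the first
# "<title>PMID N is not available" line, using the same sed steps as the script.
extract_pmid() {
    grep '<title>PMID .* is not available' | sed 1q |
        sed 's/<title>PMID //;s/ *//;s/ .*//'
}

# Show the request lines that would be sent for three PMIDs.
build_requests 32449615 32449617 pubmed.ncbi.nlm.nih.gov | grep '^GET'

# Assumed sample title, fed through the parser.
printf '%s\n' '<title>PMID 32449616 is not available' | extract_pmid
```

The number saved this way is the first PMID reported as unavailable, not the last one that exists, which appears to be why the random branch uses x-1 as its upper bound.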