Hacker News
by vortico 3900 days ago
Nice! I've been using

    sort -R | head -n100
but obviously this requires the entire file to be shuffled before printing the first 100 lines.
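One way to avoid shuffling the whole file is reservoir sampling (Algorithm R), which keeps only k lines in memory and reads the input once. This is a minimal sketch, not something proposed in the thread; the function name sample_lines and the ${RANDOM:-0} seed are assumptions made for illustration.

```shell
# sample_lines K: print K lines chosen uniformly at random from stdin in one pass.
# Hypothetical helper for illustration; the output order is not itself shuffled.
sample_lines() {
  awk -v k="$1" -v seed="${RANDOM:-0}" '
    BEGIN { srand(seed) }
    NR <= k { keep[NR] = $0; next }      # fill the reservoir with the first k lines
    {
      j = int(rand() * NR) + 1           # uniform integer in 1..NR
      if (j <= k) keep[j] = $0           # replace a slot with probability k/NR
    }
    END {
      n = (NR < k) ? NR : k              # short inputs are printed in full
      for (i = 1; i <= n; i++) print keep[i]
    }
  '
}
```

Usage would look like `sample_lines 100 < bigfile`. If the input has fewer than K lines, all of them are printed.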

-R sorts by a random hash of each line, so it will do the wrong thing in the face of repeated lines: identical lines hash to the same value and stay together. (Either you get all instances of a repeated line or none.)
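A quick way to see the grouping, assuming a sort that supports -R (GNU coreutils does): count the runs of adjacent identical lines in the output. The helper name adjacent_groups is made up for this sketch.

```shell
# adjacent_groups: run sort -R on stdin and count runs of adjacent identical lines.
# uniq collapses each run to one line, so the line count equals the number of runs.
# Assumes a sort with -R (e.g. GNU coreutils); hypothetical helper for illustration.
adjacent_groups() {
  sort -R | uniq | wc -l | tr -d ' '
}

# Four copies each of "a" and "b". Equal lines get equal hashes, so the output
# is always two runs (all a's then all b's, or the reverse), never interleaved.
printf 'a\nb\na\nb\na\nb\na\nb\n' | adjacent_groups
```

A true shuffle of those eight lines would usually interleave the a's and b's and give more than two runs.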
Since the -R option is not available on OS X, you might do something like

  awk "BEGIN { srand($RANDOM) } { print int(rand() * 1000000), \$0 }" | sort -n | cut -d' ' -f2-
to shuffle the input.
Note that hnov's awk command is the equivalent of "sort (random order)" at that link and shows good randomness properties in the plot. However, that link shows "sort (random comparator)" by default, which looks terrible at shuffling lists. hnov's awk script should be suitable for most needs, though I'd tweak it a bit:

     awk 'BEGIN { srand() } { print rand(), $0 }' | sort -g | cut -d' ' -f2-
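which is shorter and allows more than 1,000,000 distinct random values, in principle up to ~52 bits in awk's 64-bit implementations. (Note that print formats non-integer numbers with OFMT, by default %.6g, so you'd need printf with a wider format to actually keep the extra digits.)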
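For illustration, here is the same decorate, sort, strip pipeline wrapped as a function that writes the key with printf. The name shuffle_lines, the %.17f format, the tab separator, and the ${RANDOM:-0} seed are all choices made for this sketch, not something from the thread.

```shell
# shuffle_lines: print the lines of stdin in random order.
# Each line is prefixed with a random key, sorted by that key, then the key is cut off.
# Hypothetical helper for illustration. LC_ALL=C keeps "." as the decimal point.
shuffle_lines() {
  LC_ALL=C awk -v seed="${RANDOM:-0}" '
    BEGIN { srand(seed) }
    { printf "%.17f\t%s\n", rand(), $0 }   # key, a tab, then the original line
  ' | LC_ALL=C sort -n | cut -f2-          # cut splits on tab by default
}
```

Because every line gets its own key, repeated lines are shuffled independently rather than grouped. Using a tab as the separator also leaves lines that contain spaces or further tabs intact.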