Hacker News
by minimax 4946 days ago
Reservoir sampling allows you to return a specific number of samples regardless of the size of the input rather than a specific fraction of samples (which is what your awk script does).
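To make the distinction concrete, here is a minimal Python sketch of reservoir sampling (the classic "Algorithm R" variant). The function name `reservoir_sample` and its parameters are my own illustrative choices, not anything from the original script. It keeps exactly `k` items no matter how long the input is, or all of them if the input has fewer than `k`.

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Return up to k items chosen uniformly at random from an iterable
    whose length is not known in advance (hypothetical helper)."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing
            # one only if the slot falls inside the reservoir, which
            # happens with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

Something like `reservoir_sample(open("app.log"), 100)` would give 100 lines whether the file has a thousand lines or a billion. The catch is that you only have the final sample once the input ends.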

On the other hand, the awk script handles streaming quite nicely. If you have a few hundred machines and want to get a sense of the log messages on all of them in real time (say you're about to change something and want to eyeball whether the mix of error messages changes), you could do something very quick and dirty with that awk script.
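For comparison, here is a rough Python equivalent of that kind of per-line filter. The awk script itself isn't reproduced in this thread, so I'm assuming it keeps each line independently with some fixed probability `p`; `sample_fraction` is a hypothetical name. Because it is a generator, each kept line is emitted as soon as it arrives, which is what makes it usable on a live log.

```python
import random

def sample_fraction(lines, p, rng=random):
    """Yield each input line independently with probability p
    (assumed behavior of the awk one-liner, not a copy of it)."""
    for line in lines:
        # rng.random() is in [0.0, 1.0), so p=1.0 keeps everything
        # and p=0.0 keeps nothing.
        if rng.random() < p:
            yield line
```

Feeding it `sys.stdin` with `p=0.01` on each machine would pass through roughly 1% of the log lines as they are written, with no need to wait for the stream to end.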