I've written a few useful scripts that everyone should have.

histogram - simply counts the occurrences of each line and then outputs them from most to least frequent. I've implemented this program in several different languages for learning purposes. There are practical tricks one can apply, such as storing a hash in place of any line that is longer than the hash itself.

unique - like uniq, but doesn't need sorted input! Again, one can simply hash very long lines to save memory.

datetimes - looks for numbers that might be dates (seconds or milliseconds within certain reasonable ranges) and appends the human-readable version of each date as a comment at the end of the line it appears in. This is probably my most used script (I work with protocol buffers that often store dates as int64s).

human - reformats numbers with unit suffixes, using either powers of 2 or powers of 10. Inspired, obviously, by the -h and -H flags from df.

I'm sure I have a few more, but if I can't remember them off the top of my head, then they clearly aren't quite as generally useful. Anyone else have some useful scripts like these?
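For anyone who wants to try these, here is a rough sketch of the four scripts as Python functions. It is not the poster's actual code. Python, SHA-256, the 2000 to 2100 date window, the unit suffixes, and every function name are assumptions picked for illustration. The hashing trick is applied only to unique, since histogram still has to print the original text of each line.

```python
import hashlib
import re
from collections import Counter
from datetime import datetime, timezone

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256 (assumed hash)

# Assumed "reasonable" window: 2000-01-01 to 2100-01-01 UTC, in seconds.
MIN_S, MAX_S = 946_684_800, 4_102_444_800
# Standalone runs of 9 to 13 digits cover that window in seconds and milliseconds.
NUMBER = re.compile(r"(?<![\d.])\d{9,13}(?![\d.])")


def line_key(line):
    """Return the line's bytes, or its digest when the line is longer than the digest."""
    raw = line.encode("utf-8")
    # Hash collisions are treated as negligible in this sketch.
    return hashlib.sha256(raw).digest() if len(raw) > DIGEST_SIZE else raw


def unique(lines):
    """Yield each distinct line once, in first-seen order; no sorting required."""
    seen = set()
    for line in lines:
        key = line_key(line)
        if key not in seen:
            seen.add(key)
            yield line


def histogram(lines):
    """Return (count, line) pairs, most frequent first."""
    return [(count, line) for line, count in Counter(lines).most_common()]


def to_datetime(n):
    """Interpret n as epoch seconds or milliseconds if it falls in the window."""
    if MIN_S <= n < MAX_S:
        return datetime.fromtimestamp(n, tz=timezone.utc)
    if MIN_S * 1000 <= n < MAX_S * 1000:
        return datetime.fromtimestamp(n / 1000, tz=timezone.utc)
    return None


def annotate(line):
    """Append a comment with the readable form of any plausible timestamps."""
    stamps = []
    for match in NUMBER.finditer(line):
        dt = to_datetime(int(match.group()))
        if dt is not None:
            stamps.append(dt.isoformat(timespec="seconds"))
    return f"{line}  # {', '.join(stamps)}" if stamps else line


def human(n, base=1024):
    """Scale n by powers of base (1024 or 1000) and attach a unit suffix."""
    units = ["", "K", "M", "G", "T", "P"]
    value = float(n)
    for unit in units:
        if abs(value) < base or unit == units[-1]:
            return f"{value:.1f}{unit}"
        value /= base
```

To use these as filters, something like `for line in sys.stdin: print(annotate(line.rstrip("\n")))` would do. The date window is the knob most likely to need tuning: too wide and ordinary IDs start getting annotated as dates.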
Is this much different from `histogram() { sort "$@" | uniq -c | sort -nr; }`? (A shell function rather than an alias, since aliases can't take arguments.)
Sidenote: I started https://github.com/jldugger/moarutils as a means of publishing and sharing these, but it turns out I don't even have a lot of dumb ideas. Will probably end up bookmarking this HN post for "later."