HN Mirror

by adimitrov 3436 days ago

The awk line for removing duplicate lines without changing the ordering is… wow. Awk is such an underused tool (at least for me). My solution was:

    awk '{printf "%2d %s\n", NR, $0}' < faces.txt | sort -k 2 -u | sort -n | sed 's/^...//'

Ironically, it also involved awk! But I only used it to prepend line numbers, which uniq and sort can ignore, and then re-sorted the result by line number. Very much not an ideal solution!
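The fixed-width prefix in the one-liner above (`%2d`, then stripping exactly three characters) only holds up to 99 lines. A rough sketch of the same number, dedupe, re-sort, strip idea with a tab-separated prefix instead. The function name `dedupe_keep_order` is made up for illustration, and it assumes `sort -u` keeps the first line of each group of equal keys, which GNU sort does but POSIX does not promise:

```shell
# dedupe_keep_order: hypothetical helper, reads the named files (or stdin).
dedupe_keep_order() {
    tab=$(printf '\t')
    awk '{ printf "%d\t%s\n", NR, $0 }' "$@" |   # prefix each line with "N<TAB>"
        LC_ALL=C sort -t "$tab" -k 2 -u |        # dedupe on the text, ignoring the number
        sort -n |                                # restore the original order
        cut -f 2-                                # drop the "N<TAB>" prefix
}

# usage: dedupe_keep_order faces.txt
```

`LC_ALL=C` is there so that the comparison is bytewise and two different lines are never treated as equal by locale collation rules.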

2 comments

ianmcgowan 3436 days ago

Mine was more procedural - store each line in a hash as you visit it, and only print lines you haven't already visited. awk is awesome ;-)

    awk '!h[$0] {h[$0]=1; print}' faces.txt

schoen 3436 days ago

I also used this approach, but the original creator's version posted on GitHub is remarkably more concise (thanks to a clever use of the ++ operator).

ianmcgowan 3436 days ago

    awk '!h[$0]++' faces.txt

That is clever! Awk golf hole-in-one ;-)
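
For anyone puzzling over why that works, here is a sketch of the same filter with the implicit steps written out. The names `first_seen`, `seen` and `count` are made up for illustration; as far as I can tell the behaviour matches the one-liner:

```shell
# first_seen: hypothetical long-hand equivalent of awk '!h[$0]++'
first_seen() {
    awk '{
        count = seen[$0] + 0   # a never-set entry reads as empty, i.e. 0 as a number
        seen[$0] = count + 1   # the ++ part: bump the counter after reading it
        if (count == 0)        # the ! part: true only on the first visit
            print              # what awk does by default when a pattern has no action
    }' "$@"
}

# usage: first_seen faces.txt
```

The golfed version gets away with so little because `h[$0]++` evaluates to the old value, so the pattern is true exactly once per distinct line.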

jenrzzz 3436 days ago

I love awk. I used to be afraid of it, but I started using it just to pluck columns, and over a couple years actually started to grok it.

Also, I learned today that cat takes a -n option to prepend line numbers to the output! My solution ended up being:

    cat -n faces.txt | sort -k2 -k1n | uniq -f1 | sort -nk1,1 | cut -f2-