HN Mirror


by kragen 237 days ago

Hey, this is great! Thanks!

It does look a lot like what I was thinking would be necessary. About 9 of the 19 lines are concerned with splitting the input into words. Also, I think you have omitted the secondary key sort (alphabetical ascending), although that's only about one more line of code, something like

  #'(lambda (a b)
       (or (< (car a) (car b)) 
           (and (= (car a) (car b)) 
                (string> (cadr a) (cadr b)))))
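For illustration, here is a minimal sketch of a two-key predicate in use. It assumes each entry is a list of the form (count word), which matches the CAR/CADR accessors in the lambda above, and it assumes the wanted order is highest count first with ties broken alphabetically. The name count-desc-then-word-asc is hypothetical. If the surrounding code reverses the list after sorting, the comparisons flip, as in the lambda above.

```lisp
;; Hypothetical predicate: count descending, then word ascending.
;; Entries are assumed to be lists of the form (count word).
(defun count-desc-then-word-asc (a b)
  (or (> (car a) (car b))
      (and (= (car a) (car b))
           (string< (cadr a) (cadr b)))))

;; SORT is destructive, so build fresh lists rather than quoting a literal.
(defparameter *sorted-entries*
  (sort (list (list 2 "b") (list 3 "z") (list 2 "a"))
        #'count-desc-then-word-asc))
;; *sorted-entries* is now ((3 "z") (2 "a") (2 "b"))
```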

Because the lines of code are longer, it's about 3× as much code as the verbose Perl version.

In SBCL on my phone it's consistently slower than Perl on my test file (the King James Bible), but only slightly: 2.11 seconds to Perl's 2.05–2.07, a gap of roughly 2–3%. It's pretty surprising that they are so close.

1 comment

eadmund 236 days ago

Doh, I missed the secondary sort.

Were I trying to optimise this, I would test whether a hash table of alphabetical characters is faster than just checking (or (and (char>= c #\A) (char<= c #\Z)) (and (char>= c #\a) (char<= c #\z))). The accumulator would probably be better as an adjustable array with a fill pointer, allocated once, filled with VECTOR-PUSH-EXTEND and reset each time. It might also be better to use DO, initializing C and declaring its type.
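A rough sketch of how those suggestions might fit together, not the code from the parent comment: the function names ascii-alpha-p and count-words are hypothetical, words are kept in their original case (an assumption), and whether any of this is actually faster would need measuring.

```lisp
;; Hypothetical helper: the range test for ASCII letters.
(defun ascii-alpha-p (c)
  (or (and (char>= c #\A) (char<= c #\Z))
      (and (char>= c #\a) (char<= c #\z))))

;; Hypothetical word counter: returns a hash table from word to count.
(defun count-words (stream)
  (let ((counts (make-hash-table :test #'equal))
        ;; The accumulator is allocated once and reused for every word.
        (acc (make-array 32 :element-type 'character
                            :adjustable t :fill-pointer 0)))
    (flet ((flush ()
             (when (plusp (fill-pointer acc))
               ;; COPY-SEQ gives the table a key that does not share
               ;; storage with the accumulator.
               (incf (gethash (copy-seq acc) counts 0))
               ;; Reset by moving the fill pointer back to the start.
               (setf (fill-pointer acc) 0))))
      (do ((c (read-char stream nil nil) (read-char stream nil nil)))
          ((null c) (flush) counts)
        (declare (type (or null character) c))
        (if (ascii-alpha-p c)
            (vector-push-extend c acc)
            (flush))))))
```

Something like (with-open-file (in "input.txt") (count-words in)) would run it over a file; the file name is a placeholder. Copying the accumulator on every word is simple but allocates; looking the word up first and copying only for new keys is one variation to benchmark.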

Also worth giving it a shot with (optimize (speed 3) (safety 0)) just to see if it makes a difference.
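As a sketch of where such a declaration could go, here it is applied to a single function rather than globally. The name ascii-alpha-fast-p is hypothetical. With (safety 0) the compiler may trust the type declaration without checking it, so passing a non-character has undefined consequences; it is worth confirming that the rest of the program is correct at default settings first.

```lisp
;; Hypothetical variant of the letter test, compiled for speed.
(defun ascii-alpha-fast-p (c)
  (declare (optimize (speed 3) (safety 0))
           (type character c))
  ;; CHAR<= accepts more than two arguments, so each range is one call.
  (or (char<= #\A c #\Z)
      (char<= #\a c #\z)))
```

The same OPTIMIZE form can also be given to DECLAIM at the top of a file to cover everything compiled after it.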

Yes, definitely more verbose. Perl is good at this sort of task!