by teekert 3714 days ago
Natsort is a lifesaver when working with filenames numbered by humans (like file1, file2 ... file11); those will be sorted correctly. Beats asking people to "Please add leading 0's, oh, and when you suspect you will pass 100, add 2 leading 0's."
2 comments

I dislike how it changes behavior from release to release. For example, foo-1.2: is that foo 1.2 or foo -1.2? The default depends on the release of natsort, with new routines added to restore the previous behavior.
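To make the ambiguity concrete, here is a rough sketch using only the standard library. It is not natsort's implementation; first_number and its signed flag are hypothetical names made up for this illustration. The same filenames come out in opposite orders depending on whether the "-" is read as a separator or as a minus sign.

```python
import re

def first_number(name, signed):
    # Hypothetical helper: signed=True treats a leading "-" as a minus sign,
    # signed=False treats it as a separator and ignores it.
    pattern = r"-?\d+(?:\.\d+)?" if signed else r"\d+(?:\.\d+)?"
    return float(re.search(pattern, name).group())

names = ["foo-1.2", "foo-3.4", "foo-0.5"]

# "-" as separator: keys are 1.2, 3.4, 0.5
unsigned_order = sorted(names, key=lambda n: first_number(n, signed=False))

# "-" as minus sign: keys are -1.2, -3.4, -0.5
signed_order = sorted(names, key=lambda n: first_number(n, signed=True))

print(unsigned_order)  # ['foo-0.5', 'foo-1.2', 'foo-3.4']
print(signed_order)    # ['foo-3.4', 'foo-1.2', 'foo-0.5']
```

Whichever reading a library picks as its default, a change between releases silently reorders existing data, which is the complaint here.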
FWIW, the sort method (and the sorted built-in) take a 'key' argument, where you can pass a function that computes the key to sort the sequence by. So in your file11 case, you can do:

sorted(files, key=lambda x: int(x[4:]))

, and it will do the right thing.

Although with natsort, you don't have to parse the actual strings yourself.

That is a neat trick, but it would be incredibly brittle. Kids, don't try this at home!
Pass in an re.match or re.search based function; I would imagine that would be powerful enough to meet most needs.

import re

x = ['foo12901','fooo900','fooooooo980090']

x = sorted(x, key=lambda x: int(re.search(r'\d+', x).group()))

print(x)
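The snippet above sorts on the first run of digits only and raises an error if a name has no digits at all. A slightly more general sketch, assuming you want every digit run compared as an integer and the text in between compared case-insensitively, could look like this (natural_key is a hypothetical name, and this is not how natsort itself works):

```python
import re

def natural_key(text):
    # re.split with a capturing group alternates non-digit and digit runs,
    # so the odd-indexed parts are always the digit runs.
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

files = ["file11", "file2", "file1", "img3_v10", "img3_v9"]
ordered = sorted(files, key=natural_key)

print(ordered)  # ['file1', 'file2', 'file11', 'img3_v9', 'img3_v10']
```

Because every key alternates str, int, str, ... in the same positions, Python never ends up comparing a string to an integer. Signs, decimals, and version-like names are deliberately left out of this sketch.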

+1 this is the right way to build a custom sorting function. The only thing worse than relying on ad-hoc heuristics for processing your data is relying on heuristics that somebody else maintains!