| HN Mirror


	by teekert 3714 days ago
	Natsort is a lifesaver when working with filenames numbered by humans (like file1, file2 ... file11); it sorts them in the order a person would expect. Beats asking people to "Please add leading 0's, oh, and when you suspect you will pass 100, add 2 leading 0's."

2 comments

mzs 3714 days ago

I dislike how it changes behavior from release to release. For example, with foo-1.2, is that foo 1.2 or foo -1.2? The default depends on the release of natsort, with new routines added to restore the previous behavior.
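
To make the ambiguity concrete, here is a minimal sketch using only the standard `re` module. It does not use natsort and says nothing about which reading any natsort release picks. The two patterns and the sample names are assumptions chosen for illustration.

```python
import re

names = ['foo-1.2', 'foo-0.5']

def unsigned_key(name):
    # Reading 1: the hyphen is a separator, so 'foo-1.2' carries the number 1.2.
    return float(re.search(r'\d+(?:\.\d+)?', name).group())

def signed_key(name):
    # Reading 2: the hyphen is a minus sign, so 'foo-1.2' carries the number -1.2.
    return float(re.search(r'[-+]?\d+(?:\.\d+)?', name).group())

print(sorted(names, key=unsigned_key))  # ['foo-0.5', 'foo-1.2']
print(sorted(names, key=signed_key))    # ['foo-1.2', 'foo-0.5']
```

The same two names come out in opposite orders depending on the reading, which is why a changed default is noticeable.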

herge 3714 days ago

FWIW, the sort method (and the sorted built-in) take a 'key' argument, where you can pass a function that computes the key used to sort the sequence. So in your file11 case, you can do:

sorted(files, key=lambda x: int(x[4:]))

, and it will do the right thing.

Although with natsort, you don't have to parse the actual strings yourself.
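
For comparison, a hand-rolled natural sort key might look roughly like the sketch below. It is not natsort's implementation; the helper name `natural_key` and the sample filenames are made up for illustration, and it only handles unsigned integers.

```python
import re

def natural_key(name):
    """Split a name into text and digit chunks so digit runs compare as integers."""
    # With a capturing group, re.split keeps the digit runs in the result,
    # so chunks alternate: text, digits, text, digits, ...
    parts = re.split(r'(\d+)', name)
    # Odd positions are always digit runs; even positions are text (possibly empty).
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

files = ['file11', 'file2', 'file1', 'file10']
print(sorted(files, key=natural_key))  # ['file1', 'file2', 'file10', 'file11']
```

Because text and numbers always sit at matching positions in the key, comparing two keys never mixes a string with an integer.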

daveguy 3714 days ago

That is a neat trick, but it would be incredibly brittle. Kids, don't try this at home!

solaxun 3714 days ago

Pass in an re.match or re.search based function; I would imagine that would be powerful enough to meet most needs.

import re

x = ['foo12901', 'fooo900', 'fooooooo980090']

x = sorted(x, key=lambda s: int(re.search(r'\d+', s).group()))

print(x)
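
One caveat with this approach: re.search returns None for a name with no digits, so calling .group() on it raises AttributeError. A hedged variant that tolerates such names is sketched below. The helper name `first_number`, the choice to sort digit-free names first, and the extra 'readme' entry are all assumptions for illustration.

```python
import re

def first_number(name):
    """Sort key: the first run of digits in name, with digit-free names placed first."""
    match = re.search(r'\d+', name)
    if match is None:
        # No digits at all: give the name a key that sorts before any numbered name.
        return (0, 0)
    return (1, int(match.group()))

x = ['foo12901', 'fooo900', 'readme', 'fooooooo980090']
print(sorted(x, key=first_number))
# ['readme', 'fooo900', 'foo12901', 'fooooooo980090']
```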

shoyer 3714 days ago

+1 this is the right way to build a custom sorting function. The only thing worse than relying on ad-hoc heuristics for processing your data is relying on heuristics that somebody else maintains!
