by CoolGuySteve 2631 days ago
You should try with the pathological but relatively common case of thousands of files named 'logname.YYYYMMDD.log.gz'
1 comment

The final implementation of ListDir has two worst-case scenarios. One is when all file names share the same first 8 characters but then differ almost immediately. This is bad because I'm telling memcmp that I have 256 bytes of data, which causes it to enter the vectorized loop only to exit on the first iteration. The other is when all file names are 255 characters long and differ only at the very end. This is bad because string comparisons become very expensive. I don't show these benchmarks in the article, but the implementation performs very well even in these cases.
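To make the two cases concrete, here is a rough sketch of inputs that would hit them. This is not the ListDir code from the article: the 256-byte zero-padded buffer (NAME_BUF) and every helper name below are assumptions made up for illustration. It only shows where the first differing byte lands in each case.

```c
#include <stdio.h>
#include <string.h>

/* Assumed layout: 255 name bytes plus a terminating NUL, zero-padded. */
#define NAME_BUF 256

/* Hypothetical helper: copy a name into a zero-padded fixed-size buffer. */
static void pad_name(char out[NAME_BUF], const char *name) {
    memset(out, 0, NAME_BUF);
    strncpy(out, name, NAME_BUF - 1);
}

/* Compare two padded names over the whole buffer, i.e. telling memcmp
   that there are 256 bytes of data regardless of the real name length. */
static int cmp_padded(const char a[NAME_BUF], const char b[NAME_BUF]) {
    return memcmp(a, b, NAME_BUF);
}

/* Index of the first differing byte, or NAME_BUF if the buffers are equal. */
static size_t first_diff(const char a[NAME_BUF], const char b[NAME_BUF]) {
    size_t i = 0;
    while (i < NAME_BUF && a[i] == b[i]) i++;
    return i;
}

/* Case 1: an 8-byte shared prefix ("logname.") followed by a number,
   so names built from different numbers can diverge right at byte 8. */
static void make_early_diff(char out[NAME_BUF], unsigned n) {
    char tmp[NAME_BUF];
    snprintf(tmp, sizeof tmp, "logname.%u", n);
    pad_name(out, tmp);
}

/* Case 2: a 255-character name whose only varying byte is the last one. */
static void make_late_diff(char out[NAME_BUF], char last) {
    memset(out, 'x', NAME_BUF - 2);  /* 254 identical bytes */
    out[NAME_BUF - 2] = last;        /* byte 254 is the only difference */
    out[NAME_BUF - 1] = '\0';
}
```

Building two names with make_early_diff(a, 1) and make_early_diff(b, 2) puts the first difference at byte 8, while make_late_diff(a, 'a') and make_late_diff(b, 'b') push it out to byte 254, so the comparison has to walk nearly the whole buffer before it can decide.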