by tyingq 1750 days ago
Strace shows "ls -f" calls getdents64() also. Same for perl -E 'opendir(my $f,".");say while readdir($f)'. Both call it with a 32k buffer.

That is, glibc seems aware that getdents64() is the right syscall in both of those contexts.

Either runs in 300ms or so for a directory with a million files on an old laptop via WSL2.

Is this solving a problem Linux used to have maybe? It doesn't seem to be a problem now so long as you call ls in a way that doesn't make it stat() every file. Or does using a larger than 32k buffer with getdents64() really make that much of a difference?
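For anyone who wants to see the stat() cost in isolation, here is a rough Python analogue of the two ways of listing. It is only a sketch: the function names are made up for illustration, and it assumes CPython's os.scandir(), which is documented to return entries without an extra stat() call per file when the OS supplies the file type.

```python
import os
import tempfile


def list_names_no_stat(path):
    """Return entry names only; no per-entry stat() is requested."""
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def list_names_with_stat(path):
    """Return (name, size) pairs; the lstat() per entry is the extra cost."""
    return [
        (name, os.lstat(os.path.join(path, name)).st_size)
        for name in os.listdir(path)
    ]


if __name__ == "__main__":
    # Small demo on a throwaway directory with three empty files.
    with tempfile.TemporaryDirectory() as d:
        for name in ("a", "b", "c"):
            open(os.path.join(d, name), "w").close()
        print(sorted(list_names_no_stat(d)))
        print(sorted(list_names_with_stat(d)))
```

Running both under strace on a large directory should show the second one issuing one stat-family call per file, which is the behaviour being discussed here.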

1 comments

ls actually uses readdir() from glibc, and glibc's readdir() calls getdents()/getdents64() (I checked the source)

It's not solving a problem Linux used to have - I first used "-f" or "find" to avoid this issue sometime in the mid-90s.

But a lot of people aren't aware of this, and it's perhaps a poor default. It'd also be nice if ls printed a warning with a hint when a stat of the directory indicates it's likely to be a big one.
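A minimal sketch of that kind of check, in Python rather than in ls itself. The function name and the 1 MiB threshold are arbitrary assumptions, and the heuristic is filesystem-dependent: on ext4 a directory's st_size grows with the number of entries, but other filesystems report it differently, so treat the result as a hint only.

```python
import os
import tempfile


def looks_like_big_directory(path, threshold_bytes=1 << 20):
    """Guess from a single stat() of the directory whether it is large.

    threshold_bytes is an arbitrary illustrative cutoff, not a value
    taken from ls or any other tool.
    """
    return os.stat(path).st_size >= threshold_bytes


if __name__ == "__main__":
    # Example: check the system temp directory before listing it.
    tmp = tempfile.gettempdir()
    if looks_like_big_directory(tmp):
        print(f"{tmp} looks large; consider 'ls -f' or 'find'")
    else:
        print(f"{tmp} looks small enough for a plain listing")
```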

The buffer size probably does make a difference (and current glibc scales the buffer based on the st_blksize reported by stat, up to a maximum of 1MB), but on large directories "-f" and/or avoiding anything that triggers a stat() will be a far bigger deal than upping the buffer size.
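As a sketch of the scaling described above, not glibc's actual code: the bounds are assumptions taken from this thread (32k is the buffer strace showed, 1MB is the cap mentioned here), and the function name is made up.

```python
import os

# Assumed bounds, taken from this thread rather than copied from glibc.
MIN_BUF = 32 * 1024      # the 32k buffer seen in strace
MAX_BUF = 1024 * 1024    # the 1MB cap described above


def readdir_buffer_size(st_blksize):
    """Clamp the filesystem's preferred block size into [MIN_BUF, MAX_BUF]."""
    return max(MIN_BUF, min(st_blksize, MAX_BUF))


if __name__ == "__main__":
    blksize = os.stat(".").st_blksize
    print(blksize, readdir_buffer_size(blksize))
```

With a typical st_blksize of 4096 this gives 32768, which would match the 32k buffer observed in strace.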

It would be nice if there were an easy way of adjusting the glibc buffer sizes though (currently it takes modifying constants in the source and recompiling glibc to increase them).

> ls actually uses readdir() from glibc, and glibc's readdir() calls getdents()/getdents64() (I checked the source)

That's what I meant by "glibc seems aware that getdents64() is the right syscall in both of those contexts".