

by cgag 2941 days ago

I just tried switching loc from mmapping every file to simply reading all of the file's bytes, and its run time on the Linux kernel went from about a second to ~530ms, with tokei at around 750ms.

edit: Ryzen 1700, 3.8GHz, 8 cores / 16 threads, Fedora (kernel 4.16.13-300.fc28.x86_64)
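For anyone who wants to try the same comparison on their own machine, here is a minimal sketch of the two read strategies. It is Python for illustration only, not the Rust code in loc, and the function names `count_lines_mmap` and `count_lines_read` are made up for this example.

```python
import mmap
import os


def count_lines_mmap(path):
    """Count newline bytes by memory-mapping the file (hypothetical helper)."""
    # An empty file cannot be memory-mapped, so handle it up front.
    if os.path.getsize(path) == 0:
        return 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count


def count_lines_read(path):
    """Count newline bytes after reading the whole file into memory (hypothetical helper)."""
    with open(path, "rb") as f:
        return f.read().count(b"\n")
```

Both functions should return the same count for a given file. Which one is faster probably depends on file sizes and the OS; for a tree of many small source files, the per-file cost of setting up a mapping can plausibly outweigh any benefit.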

Tokei is definitely more accurate though, by probably a pretty wide margin. I'm hoping to get around to handling nested comments correctly soon and maybe strings.

I'll definitely have to take a look at this in detail when it's not 6am. Or on a day where I wake up at 6am instead of staying up until 6am.


boyter 2941 days ago

Not that I didn’t believe BurntSushi, but I wanted my own validation. Not surprised you got the same result. I think with whitelisting there is probably no need to mmap for these tools.

The nested comments and strings will probably slow you down a lot. I know they did for me. I’m looking forward to the new GC settings in Go so I can tweak it for faster performance.
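To illustrate where the extra cost presumably comes from: nested comments mean keeping a depth counter and scanning each line for tokens, rather than a cheap prefix check. A minimal sketch, in Python for illustration; `count_code_lines` is a hypothetical helper, not code from loc, tokei, or scc, and it deliberately ignores strings and line comments.

```python
def count_code_lines(source, open_tok="/*", close_tok="*/"):
    """Count lines that contain code outside (possibly nested) block comments.

    Assumption: strings and line comments are not handled at all.
    """
    depth = 0  # current block-comment nesting depth, carried across lines
    code_lines = 0
    for line in source.splitlines():
        has_code = False
        i = 0
        while i < len(line):
            if line.startswith(open_tok, i):
                depth += 1
                i += len(open_tok)
            elif depth > 0 and line.startswith(close_tok, i):
                depth -= 1
                i += len(close_tok)
            else:
                # Any non-whitespace character outside a comment counts as code.
                if depth == 0 and not line[i].isspace():
                    has_code = True
                i += 1
        if has_code:
            code_lines += 1
    return code_lines
```

A counter that stops at the first closing token would treat the tail of `/* outer /* inner */ still comment */` as code; tracking depth avoids that, at the price of examining every character.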

Did you ever work out why loc was performing so badly on multi-core systems? Sounds like you did, but I'm curious what the bottleneck was. I don’t understand Rust well enough to be able to guess, sorry.


cgag 2941 days ago

I thought I saw similar utilization in tokei on my machine. I copied the concurrency pattern straight from ripgrep. I'll have to take a look tomorrow. I assumed I was still bottlenecked on reads.


boyter 2941 days ago

Interesting. Actually being blocked on reads sounds about right. I’ll try running the benchmarks again one of these days. Probably after I get access to the Go GC controls that are apparently on the way.
