by apjana 2762 days ago
Author of `nnn` here. Thanks for the appreciation.

Please let us know what you like in vifm which isn't available in `nnn` and we will consider the features. However, I must say the last thing we want to see in `nnn` is feature bloat. You can extend it as you wish through scripts which `nnn` supports in any number.

No, `nnn` scans the complete dir tree, otherwise du won't work. It rescans because data on disks keeps changing, e.g. on our server where 17 people run VMs. It can be easily changed to static but the rescan is very fast and users specifically asked for it. The memory usage is less because `nnn` uses much less memory for all operations in general.
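To make the rescan approach concrete, here is a minimal sketch of a disk-usage walk that keeps only a running total and a stack of pending directories in memory. It is not `nnn`'s actual code: the function name `disk_usage` is made up for illustration, and it sums apparent file sizes (`st_size`) rather than allocated blocks, which is an assumption made to keep the example portable.

```python
import os

def disk_usage(root):
    """Sum the apparent sizes of all non-directory entries under root.

    Hypothetical helper: nothing per-file is kept after it is counted,
    so the whole walk has to be repeated to refresh the number.
    """
    total = 0
    stack = [root]  # directories still to be scanned
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    try:
                        # follow_symlinks=False gives lstat-like behaviour
                        st = entry.stat(follow_symlinks=False)
                        is_dir = entry.is_dir(follow_symlinks=False)
                    except OSError:
                        continue  # entry vanished or is unreadable
                    if is_dir:
                        stack.append(entry.path)
                    else:
                        total += st.st_size
        except OSError:
            continue  # directory vanished or is unreadable
    return total
```

Calling `disk_usage(".")` again after files change gives a fresh total, at the cost of touching every entry once more.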


I appreciate your work, but you're not being very honest with your claims.

nnn is not keeping information about 400K files in memory in that benchmark. As a result, the rescan is necessary when changing directory. The rescan may be fast in many cases and in some cases it may even be what you'd want, but I can also name many cases where you certainly won't want it (large NFS mounts being one example).

Sorry for the pedantry. I spent a fair amount of time optimizing ncdu's memory usage, so I tend to have an opinion on this topic. :)

I think we are saying the same thing in different lingo. I am trying to say, you do not need to store it if you can have fast rescans.

Coming to memory usage, if you store the sizes of every file you need 400K * 8 bytes = ~3 MB.

Now `ncdu` uses ~60 MB and `nnn` uses ~3.5 MB. How do you justify that huge gap?

> but you're not being very honest with your claims

No, I am completely honest within the limits of my technical understanding. Your tool uses 57 MB extra which would be considerable on a Raspberry Pi model B. To an end user, it's not important how a tool shows the du of `/`, what's important is - is the tool reasonable or not? I don't know how `ncdu` manages the memory within, I took a snapshot of the memory usage at `/`.

In fact, now I have questions about your very first line beginning with `This looks incredibly cool` and then the comparisons of it with different utilities in negative light. (I must be a fool realizing it now, I should have seen it coming.)

And I'm saying you can't have fast rescans in all cases - it very much depends on the filesystem and directory structure.

I'm not trying to downplay nnn - I meant it when I said it's a cool project! I'm saying each project has its strengths and weaknesses, but your marketing doesn't reflect that (or I missed it).

ncdu's memory usage is definitely its weak point - that's not news to me - but it's because I chose the other side of the trade-off: No rescans. If you're curious where that 60MB goes to, it's 400K of 'struct dir's: https://g.blicky.net/ncdu.git/tree/src/global.h?id=d95c65b0#...

I honestly don’t understand why you’re getting down voted when all you’re doing is explaining the design decisions behind your own utility.

You’re being very snarky considering how quick you were to start the debate, where you boasted about how much better optimised your tool was in your GitHub README.

I grow rather tired of comparisons where one tool tries to make itself look better than another based purely on a solitary arbitrary metric like memory usage. It’s not a fair benchmark and really it’s just an excuse to make yourself look better by bad-mouthing someone else’s code.

What’s to say the other tools haven’t employed the same algorithms you vaguely stipulated you had (I say “vaguely” because you don’t even state which highly optimised algorithms you’ve used)? Have you read the source code of the other projects you’ve commented on to check they’re not doing the same, or even better? Because your README is written like you’re arguing that other projects are not optimised.

What’s to say that the larger memory usage (which is still peanuts compared to most file managers) isn’t because the developer optimised performance over literally just a few extra KB of system memory? Their tool might run circles around yours for FUSE file systems, network mounted volumes, external storage devices with lower IOPS, etc.

But no, instead you get snarky when the author of one of the tools you were so ready to dismiss as being worse starts making more detailed points about practical, real world usage beyond indexing files on an SSD.

It wasn't a debate. You asked, I answered. And if you read carefully, there is _not_a_single_comment_ on the quality of a single other utility in the README. We recorded what we saw and I have shared the reason why.

I am not going to respond any further and would appreciate it if you refrain from getting personal with "not being completely honest", "being very snarky" etc. Please don't judge me by the project page of a utility which is a work of several contributors. That's all.

I’m not related to the GP.

Let me explain the point further:

Your readme has a performance section, and that section focuses on nnn vs two other tools. You only benchmark memory usage under normal circumstances (i.e. no other performance metric, no other file system nor device types, etc). Then you have a whole other page dedicated to “why is nnn so much smaller” which is directly linked to from the performance comparisons. There’s no other way to take that than that you’re directly comparing nnn to other tools and objectively saying it’s better.

So with that in mind, I think the developers of the other tools are totally within their right to challenge you on your claims.

Edit: the “multiple contributors” point you made is also rather dishonest too. It’s your personal GitHub account for a project you chiefly contribute to, and the documents in question were created and edited by yourself (according to git blame). Yes, nnn has other contributors too, but it was yourself who wrote and published the claims being questioned.

> totally within their right to challenge you on your claims

Yes, and within the limits of common courtesy.

The other utility does only one thing - reports disk usage so there's not much to compare. The dev did mention that `ncdu's memory usage is definitely its weak point`.

> no other performance metric, no other file system nor device types

because lstat64() is at the core of the performance metric of the feature we are comparing here, and with the same number of files on the same storage device the number of accesses is exactly the same. The only metric that differentiates the utilities is memory usage.

> Edit: the “multiple contributors” point you made is also rather dishonest too.

Not really, I prefer to edit the readme myself because I want to keep the documentation clean. You will see features contributed by other devs for which I have written the docs from readme to manpage. Regarding the metrics, sometimes I have taken the data and sometimes I have requested someone else to collect it. Or doesn't that count as contribution?

Your calculation (`400K * 8 bytes = ~3 MB`) is way off. What would be the point of storing only the size? You need to map it back to the file.

60MB gives you about 150 bytes for file path or file name and its size, which sounds plausible.
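For reference, the two figures being argued over can be checked with a few lines of arithmetic. This is only a sketch using the round numbers quoted in the thread (400K files, ~60 MB for ncdu, 8 bytes per stored size); the variable names are made up for illustration.

```python
FILES = 400_000            # file count quoted in the thread
NCDU_BYTES = 60 * 10**6    # ~60 MB reported for ncdu
SIZE_FIELD = 8             # bytes for one 64-bit size value

# Memory needed if only the sizes were stored, with nothing to map them back to files
sizes_only_mb = FILES * SIZE_FIELD / 10**6

# Memory available per file if the whole 60 MB is spread over 400K entries
per_file_budget = NCDU_BYTES / FILES

print(f"sizes only: {sizes_only_mb:.1f} MB")                       # sizes only: 3.2 MB
print(f"budget per file at 60 MB: {per_file_budget:.0f} bytes")    # budget per file at 60 MB: 150 bytes
```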

Maybe you shouldn't store the file path but just the name, and a parent pointer. That brings you down to 8 bytes size + Parent pointer + a short string. Regarding the string you can go for offsets into a string pool (memory chunk containing zero terminated strings).

So I think 50 bytes per file is easy to accomplish if (name/parent/path) + size is all you want to cache. For speed-up, I would add another 4 or 8 bytes index to map each directory to its first child.
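A rough sketch of that layout, assuming nothing about how ncdu or nnn actually store entries: each entry gets an 8-byte size, a parent index, and an offset into a shared pool of zero-terminated names. The class name `CompactTree` and its methods are hypothetical, and the optional first-child index is left out to keep it short.

```python
from array import array

class CompactTree:
    """Hypothetical compact store: size + parent index + name offset per entry."""

    def __init__(self):
        self.pool = bytearray()     # zero-terminated names, back to back
        self.name_off = array("I")  # offset of each entry's name in the pool
        self.parent = array("i")    # index of the parent entry, -1 for the root
        self.size = array("Q")      # unsigned 64-bit size per entry

    def add(self, name, parent, size):
        """Append an entry and return its index."""
        self.name_off.append(len(self.pool))
        self.pool += name.encode() + b"\0"
        self.parent.append(parent)
        self.size.append(size)
        return len(self.size) - 1

    def name(self, idx):
        start = self.name_off[idx]
        end = self.pool.index(0, start)  # find the terminating zero byte
        return self.pool[start:end].decode()

    def path(self, idx):
        """Rebuild the full path by walking parent indices up to the root."""
        parts = []
        while idx != -1:
            parts.append(self.name(idx))
            idx = self.parent[idx]
        return "/".join(reversed(parts))

    def bytes_per_entry(self):
        """Fixed fields plus the average pooled name length (ignores allocator overhead)."""
        fixed = self.name_off.itemsize + self.parent.itemsize + self.size.itemsize
        return fixed + len(self.pool) / len(self.size)

t = CompactTree()
root = t.add("srv", -1, 0)
logs = t.add("logs", root, 0)
f = t.add("app.log", logs, 1_048_576)
print(t.path(f))  # srv/logs/app.log
```

With short names the fixed fields dominate, so the per-entry cost in this sketch stays well below the 50-byte estimate; real file names and allocator overhead would push it up.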

I can think of at least 3 possible algorithms to use much much less memory even with a static snapshot of the complete subtree. And all of them are broken because the filesystem is supposed to stay online and change. It's realistically useful to scan an external disk to find the largest file etc., but not accurate on a live server, a desktop with several ongoing downloads, video multiplexing etc.

Earlier in the thread you suggested that it's hard to justify ncdu using 60MB while it takes only 3.5MB to store 400K 8-byte numbers. The number you came up with is just silly and overlooks the actual complexity of the problem.

Given that you are making an implicit judgement about the other program, don't be sloppy with your estimates.

> don't be sloppy with your estimates

I'm not. You can, and I'm sure eventually you will arrive very close to the approximation.

I'd been a big time fan of `ncdu` for years and even wrote in an e-journal about it once. Maybe that's why the sharp adjectives became more difficult to digest. Anyway, good luck!

About vifm (don’t know if nnn has this):

- split screen (files left, file contents on the right)
- customise file viewers
- quick file search
- customise key-bindings