by Dylan16807 434 days ago
Loading up your parsing code and reopening the file every time a setting is queried sounds to me like it would increase the average memory use of most programs.

The ssh config format has almost no context, and the code is static and always "loaded up". I can all but guarantee this isn't correct. Modern hackers tend to wildly overestimate the complexity of ancient tasks like parsing.
If you're actually concerned about the handfuls of bytes a settings object would take, you would make the page/segment containing parser code able to be unloaded from memory.
You don't care about average memory use, you care about peak memory use.
Same criticism. When the program is in the middle of busy runtime activity, with all the memory that entails, it's the worst time to also load up the parser.
Doesn't really sound much better. You still load up the file(s) and the parser either way, so parsing everything once versus parsing on demand is just a question of how much computation you do. Considering how many config options are used, on-demand parsing seems really wasteful, especially after startup.
> load up the file

I/O is done piecewise, a line at a time. The file is never "loaded up". Again you're applying an intuition about how parsers are presented to college students (suck it all into RAM and build a big in-memory representation of the syntax tree) that doesn't match the way actual config file parsers work (read a line and interpret it, then read another).

I didn't mean that all of the file is loaded into memory, just the part being processed at any given time (e.g. one line, as you said). Either way, reading the file costs the same amount of memory.
The GP is correct in terms of super old systems.

In said systems, RAM was such an expensive resource that we had to save individual bits wherever we could. Such as only storing the last two digits of the year (aka the millennium bug).

The computational cost of infrequently rescanning the config files and then freeing the memory afterwards was much cheaper than the cost of keeping those config files in RAM. And I say “infrequently rescanning” because you weren’t talking about people logging in and out of TSSs at rapid intervals.

That all said, sshd was first written in the 90s, so I find it hard to believe RAM considerations were the reason for the “first match” design of sshd’s config. More likely, it inherited that design choice from rsh or some other 1970s predecessor.

> hard to believe RAM considerations were the reason for the “first match” design of sshd’s config

And I repeat: first match involves less code. It's a simpler design. The RAM point was an interesting digression, I literally put it in parentheses!

I don’t think it does require less code. I don’t think it requires more code either. It’s just not a fundamental code change.

The difference is just either: overwriting values or exiting in the presence of a match. Either way it’s the same parser rules you have to write for the config file structure.
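To make that concrete, here is a minimal sketch of a single-pass parser where the two precedence rules differ by one branch. It assumes a simplified `Key value` line format with `#` comments and no `Host`/`Match` blocks; the `parse` function and its `first_match` flag are hypothetical names for illustration, not anything from OpenSSH.

```python
def parse(lines, first_match=True):
    """Parse simplified 'Key value' lines into a dict of settings.

    Hypothetical sketch: no Host/Match blocks, no quoting, no 'Key=value'.
    """
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(" ")
        key = key.lower()
        if first_match and key in settings:
            continue  # first match wins: ignore later duplicates
        settings[key] = value.strip()  # otherwise a later value overwrites
    return settings


config = ["Port 22", "# comment", "User alice", "Port 2222"]
first = parse(config, first_match=True)
last = parse(config, first_match=False)
```

With the sample `config`, `first["port"]` is `"22"` and `last["port"]` is `"2222"`; the tokenizing rules are identical in both modes.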

OK, but now that's a performance regression. The assumption upthread was that the whole file needed to be parsed into an in-memory representation. If you don't do that, sure, you can implement precedence either way. But parsing all the way to the end for every read is ridiculous. The choice is between "parse all at once", which allows for arbitrary precedence rules but involves more code, and "parse at attribute read time", which involves less code but naturally wants to be a "first match" precedence.
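A rough sketch of the "parse at attribute read time" side of that trade-off, under the same simplifying assumptions (plain `Key value` lines, no `Host`/`Match` blocks). `lookup_first` and `lookup_last` are hypothetical helpers, not OpenSSH functions; each returns the value plus the number of lines it had to read, to show why first match fits on-demand lookup naturally.

```python
import io


def _split(raw):
    """Return (key, value) for a setting line, or None for blanks/comments."""
    line = raw.strip()
    if not line or line.startswith("#"):
        return None
    key, _, value = line.partition(" ")
    return key.lower(), value.strip()


def lookup_first(stream, wanted):
    """First match: read line by line and stop as soon as the key is found."""
    wanted = wanted.lower()
    lines_read = 0
    for raw in stream:
        lines_read += 1
        pair = _split(raw)
        if pair and pair[0] == wanted:
            return pair[1], lines_read  # early exit
    return None, lines_read


def lookup_last(stream, wanted):
    """Last match: must always read to the end of the stream."""
    wanted = wanted.lower()
    lines_read = 0
    found = None
    for raw in stream:
        lines_read += 1
        pair = _split(raw)
        if pair and pair[0] == wanted:
            found = pair[1]  # keep overwriting until EOF
    return found, lines_read


text = "Port 22\nUser alice\nPort 2222\n"
first = lookup_first(io.StringIO(text), "Port")
last = lookup_last(io.StringIO(text), "Port")
```

Here `first` stops after one line while `last` has to scan all three, which is the performance difference described above. Neither version keeps more than the current line in memory.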