by jakevoytko 5797 days ago
"I need polling" => "Here are my options, which one is better?" => "They're good for different things" => "I'll pick the best one for the environment" is a reasonable design process. More so than some decisions that I make! Yes, there's a fixed time budget. But you're suggesting selectively ignoring evidence when designing a program, preferring random guessing and pattern matching to actual numbers. Should he have collected them to begin with? Maybe not, but sometimes you can't help your curiosity on a hobby project :). I understand your concern about this hypothetical production system, but the fact of the matter is that there is no production system right now, and no way to measure how it will handle certain things, but there are benchmark numbers. Better than nothing, I say!

> => "I'll pick the best one for the environment"

Which, in reality, is "I'll spend a lot of design and implementation effort designing a new one which may or may not improve the measurable, global performance of my new web server, because it's not yet at the point where I can benchmark these sorts of things to verify that I'm not wasting a whole ton of effort that could be better spent by deciding that epoll is fast enough."

Maybe Zed knows from his previous server experience that {e}poll is where he hits a bottleneck; it's just that if there's any chance that it's not, he could be wasting a bunch of time implementing "superpoll".

(Or maybe he just wants to do it because it's neat, or because it's innovative (which it is), or for any number of other reasons. I'm just pointing out that he's doing much more than picking "the best one for the environment".)

It's an idea I had after actually measuring. If it doesn't work then I tried something out.

What you really should be getting from it though is that epoll is not faster. It is not O(1). It is not faster on smaller vs. larger lists of FDs. Pretty much all the things you were told as advantages of epoll are total crap.

The only advantage of epoll is that it's O(N=active), while poll is O(N=total). That's it.

So at a minimum I've done some education and spent some time learning something.
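
To make the O(N=active) versus O(N=total) point concrete, here is a minimal sketch, not taken from Mongrel2 or the benchmark under discussion. It assumes Linux (for `select.epoll`), and the helper name `ready_fds` is made up for illustration. It opens `total` pipes, makes `active` of them readable, and asks both poll and epoll which read ends are ready. Both give the same answer; the difference is only in how much work goes into producing it.

```python
import os
import select


def ready_fds(total, active):
    """Hypothetical helper: open `total` pipes, make `active` of them
    readable, and return the ready read ends as seen by poll and by epoll.
    Linux only, because select.epoll is Linux-specific."""
    pipes = [os.pipe() for _ in range(total)]
    try:
        # Writing one byte makes the matching read end readable.
        for _, write_end in pipes[:active]:
            os.write(write_end, b"x")

        # poll: the whole interest set (all `total` descriptors) is handed
        # to the kernel and checked on every call, ready or not.
        poller = select.poll()
        for read_end, _ in pipes:
            poller.register(read_end, select.POLLIN)
        via_poll = sorted(fd for fd, _ in poller.poll(0))

        # epoll: the interest set is registered once and kept in the kernel;
        # each wait hands back only the descriptors that are ready.
        epoller = select.epoll()
        try:
            for read_end, _ in pipes:
                epoller.register(read_end, select.EPOLLIN)
            via_epoll = sorted(fd for fd, _ in epoller.poll(0))
        finally:
            epoller.close()

        expected = sorted(read_end for read_end, _ in pipes[:active])
        return via_poll, via_epoll, expected
    finally:
        for read_end, write_end in pipes:
            os.close(read_end)
            os.close(write_end)
```

Calling `ready_fds(20, 3)` returns the same three descriptors from both interfaces. It says nothing about which is faster for a given ratio of active to total descriptors; that still has to be measured.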

I tried to make sure I gave lots of reasons that justify your work; I do think it's cool.

I just wanted to say that it is not an unquestionable design decision.

Rock on with the superpoll, I hope it's awesome and very successful.

... or it's good "PR" so people get behind the project. It screams "I know what I'm doing!!! I'm even optimizing this!!!". ;)
The funny thing is that he's either going to pick the wrong solution for the workload or spend a lot of time on creating a hybrid which will work as well as epoll would without augmentation.

Zed is clearly out to change the world, and I would very much like him to succeed, but he seems to be missing the obvious here: idle connections are the ones to worry about (because they're very expensive!), so his benchmark at this point in time is useless.

First off, again you miss the point that I've already got poll working in Mongrel2. If this fails (which I'll know since I, like, measure stuff rather than carry a bottle of cheetah blood around like you) then I'll just go put epoll in there or leave it with poll.

But hey, if I don't make it back alive from my completely dangerous experiment in disproving that epoll is always the way to go, you can come get me. Bring a big gun because this stuff is so scawy and howwible I might not make it out alive.

"I need a filesystem" => "Here are my options, which one is better?" => "They're good for different things"...

STOP

For most things, it doesn't matter. A filesystem is a filesystem.

You only need to make decisions like that when you properly measure and decide that X may be a bottleneck, or you need features that Y has but X doesn't.

In nearly every large system I've been involved in designing, assuming "a filesystem is a filesystem" would have given you a great chance of rendering the system non-functional, or of making it perform so badly it might as well be non-functional, when simple design decisions based on knowledge about the problem would have made that easy to avoid in the first place.

Premature optimization is a sin, but making design decisions you know from experience have a major impact is not, as long as you measure to confirm afterwards.

We don't design by throwing dice - most decisions we make are based on experience or assumptions about what works and what doesn't. Measurements are important to challenge those assumptions, but they don't take away the value of using experience to create a reasonable baseline.

Where "premature optimization" comes in is where you start expending unnecessary extra effort to implement a more complicated solution without evidence to back up the need for it.

Spending a little bit of time to think through the requirements for major aspects of your system is not extra effort.