Drop millions of allocations by using a linked list | HN Mirror

	Drop millions of allocations by using a linked list (github.com)
	223 points by jcn 4159 days ago

16 comments

bcg1 4159 days ago

I'm not a ruby dev, so I guess maybe my perspective is not that great on this particular issue... but hats off to the dev with the fix, indeed this is how free software collaboration is supposed to work in my opinion.

Even the dev with the fix wasn't rude about the original problem, he seemed pretty humble about it actually.

If you think the Ruby guys are such shitty programmers you should be able to dive into their codebases and find the plethora of problems to show them what's up... so either give them a pull request or STFU ;)

michaelfeathers 4159 days ago

> Even the dev with the fix wasn't rude about the original problem, he seemed pretty humble about it actually.

The Ruby community is very good interpersonally from my experience. It's a culture that I think comes from this:

http://en.wikipedia.org/wiki/MINASWAN

nevinera 4159 days ago

Unfortunately, the rails community are the visible minority, and they follow DHH's example more than Matz's.

thekaleb 4159 days ago

David Heinemeier Hansson[1] for anybody else that was confused about what "DHH" referred to.

[1]: http://en.wikipedia.org/wiki/David_Heinemeier_Hansson

thinkbohemian 4156 days ago

Tenderlove, the super nice guy who submitted that PR is on Rails core. Too bad to hear he's part of the evil visible minority that you just made up in your head.

nevinera 4154 days ago

Right, because I claimed that there are no nice people working on rails at some point..?

We were talking about the overall culture of the rails community. Nor did I intend to imply that DHH was evil, just abrasive.

"nice" != "good"

EpicEng 4159 days ago

>so either give them a pull request or STFU

Most people have neither the time nor inclination to fix problems with the language they use. If I buy a power saw and it turns out to be a POS, I just won't buy that brand again.

wukerplank 4159 days ago

You're talking about OSS - you didn't buy it. OSS does not entitle you to anything. Tenderlove and the Rubygems maintainers spend their time so you don't have to. If it doesn't fit your needs - okay. But don't act as if somebody owes you anything.

EpicEng 4159 days ago

Once you have a product, free or not, which is used and relied upon by a large group of people, you do owe them something. If you don't want that responsibility then don't release your code or get out when things become serious. I get the OSS philosophy, and I like many things about it, but I also understand why some won't touch it with a ten foot pole.

"Oh, I made a boneheaded error and now your code is 100x slower than it should be? Tough sh*t, fix it yourself. I owe you nothing!"

That's not a reasonable mentality to have if you want adoption. Obviously the Ruby devs don't feel this way, but you seem to. That sort of attitude is nonsense and short sighted.

coldtea 4158 days ago

>Once you have a product, free or not, which is used and relied upon by a large group of people, you do owe them something.

No, you really don't. The fact is: they owe you something, but it's OK, since you give it away for free.

>If you don't want that responsibility then don't release your code or get out when things become serious.

Or else?

>That's not a reasonable mentality to have if you want adoption.

What if you DON'T want adoption? Or you want just the smart kids that can contribute back to adopt your code?

EpicEng 4157 days ago

And philosophical OSS people wonder why so many people chose closed alternatives. If you don't want adoption, fine, but don't pretend like that's true for every OSS project out there. I have no problem with that mentality until it comes into stark contrast with the goals of those pushing OSS (see: EFF.) You're not going to be taken seriously by people if you don't support the code you put out there.

>Or else?

Or else people just don't use your software and don't buy into your ideology. If you don't care, fine, but obviously many OSS proponents do.

wukerplank 4159 days ago

> Tough sh*t, fix it yourself.

I never said that, but most projects welcome pull requests.

> I owe you nothing!

True, it's the short version of the MIT license.

> That's not a reasonable mentality to have if you want adoption.

Yes, maintainers should always try to take care of their projects and to accommodate the needs of their communities.

> Obviously the Ruby devs don't feel this way, but you seem to.

You are over-interpreting. But I hate seeing more and more OSS contributors getting burnt out and steamrolled by a self-entitled rout.

EpicEng 4159 days ago

Sorry if I misinterpreted your position, I thought that was exactly what you meant.

chowyuncat 4159 days ago

Please describe why you believe the product owners owe the userbase something.

EpicEng 4157 days ago

Because, for those projects who want a thriving userbase, the developers said "come, use our software to solve your problems." Once you do that you are making a promise to your users, whether explicit or implicit, that you'll be there when things go wrong.

If you release some script or small program which was useful to you thinking that it may help someone else down the road, I agree, you don't owe your users anything. However, when you position your solution as something people should adopt (e.g. RoR, etc) then you do owe a level of quality and support to your users (not saying RoR doesn't, just a random example.)

I love OSS, but every time it bites a user in the behind it loses mindshare.

skj 4159 days ago

Clearly that would require becoming adept with Ruby, which would require actually looking at Ruby code. That's a bit of a show-stopper.

jph 4159 days ago

Great pull request. Ruby makes it easy to duplicate data by calling `.dup` or `+`, and this does help with state isolation. But duplication is an expensive operation.

Ruby's standard libraries don't have much support for immutability, or deep cloning, or copy on write, or linking concatenation. There's no standard library way to ask for a snapshot of an object.

So in the early days for Ruby, an idiom was: if you're writing a method that takes a list, and you need to be sure your list doesn't change out from under you, then duplicate it, get it working, and if it becomes a bottleneck then optimize it.
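To make the point about `.dup` and `+` concrete, here is a minimal sketch in plain Ruby (the variable names are illustrative only). It shows which Array operations hand back a new object and which mutate in place, and that `.dup` is only a shallow copy: it isolates the outer array but not the objects inside it.

```ruby
original = [[1], [2]]

plus_result = original + [[3]]   # `+` returns a new Array; `original` is untouched
dup_result  = original.dup       # `.dup` returns a new Array holding the same inner objects
push_result = original << [3]    # `<<` mutates `original` and returns that same object

puts plus_result.equal?(original)   # false
puts dup_result.equal?(original)    # false
puts push_result.equal?(original)   # true

# Shallow copy: the inner arrays are shared, so this change shows up in both.
dup_result[0] << 99
puts original[0].inspect            # [1, 99]
```

The last two lines are why the "duplicate it to be safe" idiom only goes so far: there is no standard one-call deep snapshot, as the comment notes.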

rfrey 4159 days ago

I'm really surprised by the amount of smugness in the comments here. A bit of good-natured teasing, followed by a wheelbarrow full of "ruby-devs" this and "web-devs" that.

Take off your Hats of Superior Coding. Any one of us, regardless of honorific titles, could have made this mistake, and you know it. Being steeped in CS Fundamentals does not immunize you against bugs.

Congratulations to tenderlove for finding the bug. Remember the details - it'll be a great war story in a few years.

agentultra 4159 days ago

Smugness: the dark-side of hubris. Hubris being one of the three virtues [0].

I like to remember the past of computer programming as though it was once friendly and receptive to people of all skill levels. I owe quite a lot to the geeks who came before me and answered my stupid questions, gave me powerful tools to learn with, and accepted my contributions; flawed as they were. Without making a few mistakes along the way I wouldn't be where I am today.

I don't know whether I would have given up if the people I met were more smug and mean-spirited but my progress might have been slowed by it. Life is too short to waste bothering with people who are miserly with their good fortunes. It doesn't cost you anything to be nice and share your knowledge and wisdom. It may pay off when the person you're sharing with rises to stand upon your shoulders one day and pay homage to you.

[0] http://threevirtues.com/

duaneb 4159 days ago

You can be proud of your achievements without being smug.

Also, hubris being a virtue is crap. It may be decent for your self esteem but it makes you miserable to be around. Case in point: Larry Wall.

bigtunacan 4159 days ago

Agreed, hubris is not a virtue.

Merriam-Webster - hubris - "a great or foolish amount of pride or confidence"

It's just a synonym used for arrogance and smugness. Humbleness is a virtue.

wutbrodo 4159 days ago

From the way you phrased this, I think you may have missed the context link a couple comments above. Everyone is aware of what hubris means and that it's not a virtue per se. The point of the linked page is that it's (somewhat tongue-in-cheek) taking three vices and positioning them as virtues in a narrow context. It's just semantics; the saying could easily have used synonyms that have positive connotations, but using negative words instead is part of the joke.

bigtunacan 4159 days ago

Yes; apparently I had. Thanks.

ryanjshaw 4159 days ago

Is this even a bug or just a case of "in version 0.1 we'll do this quick & dirty", i.e. unaddressed technical debt?

I make a point to keep track of all technical debt in my projects so that I have an easy way to quickly identify opportunities for improvements when there is spare capacity, and also so that technical debt isn't left unaddressed.

mkopinsky 4159 days ago

How do you keep track of the technical debt? Ticket system?

kybernetyk 4159 days ago

    //TODO:

    //FIXME:

;)

evincarofautumn 4159 days ago

Wink all you like, but I get a lot of value from greppable, well written fixmes directly in the source they pertain to. If I’m working on a feature and I discover some odd misbehaviour, there is often a comment right there in the source, explaining precisely what I need to do next.

bigtunacan 4159 days ago

Agreed. A # TODO: or # FIXME: lives in the code for all to see. I use ticketing & project management tools as well (Pivotal, Trello, others), but these tools are more useful for "X needs implemented and has measurable priority or Y needs fixed ASAP".

Things that we know need to be addressed, but the priority level is "when we have time" are better served living in the code. They tend to get lost in project management and ticketing systems, whereas they live until the code dies if they are in the code.
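As a rough illustration of the "greppable" workflow described in these comments, the sketch below lists TODO/FIXME comments with their line numbers. The method name `debt_markers` and the marker pattern are assumptions made for this example, not part of any tool mentioned in the thread; in practice a plain `grep -rn` over the source tree does the same job.

```ruby
# Matches a `#` or `//` comment that starts with TODO or FIXME (assumed convention).
MARKER = %r{(?:#|//)\s*(?:TODO|FIXME)\b}

# Returns [line_number, stripped_line] pairs for every marker found in `source`.
def debt_markers(source)
  source.each_line.with_index(1).filter_map do |line, number|
    [number, line.strip] if line.match?(MARKER)
  end
end

sample = <<~CODE
  total = items.sum # TODO: cache this
  puts total
  // FIXME: leaks a handle
CODE

debt_markers(sample).each { |number, text| puts "#{number}: #{text}" }
```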

GoodIntentions 4159 days ago

If it is something expedient and thoroughly embarrassing I use

//kludge

bglusman 4159 days ago

Shameless plug: https://github.com/bglusman/debt_ceiling

varjag 4159 days ago

No need to track it, just default on the debt when it's too much to bear.

tenderlove 4159 days ago

Thanks, I really appreciate the kind words! <3<3<3<3

j_baker 4159 days ago

I think that Arrays are so ubiquitous and (usually) so fast that most devs reach to them by default unless there's a good reason not to. I can count on one hand the number of times a linked list has really truly been the correct solution to a programming problem I've faced.

EpicEng 4159 days ago

Yet "most devs" aren't developing mainstream programming languages (nor should they). I can forgive e.g. a front-end dev for making that sort of mistake, but a language designer/implementer? No, sorry, you should know your data structures. You should be profiling this stuff and this many allocations should set off alarms. Modeling memory allocations and complexity in time and space should be second nature and, if not, should be understood before moving on.

engendered 4159 days ago

Is this legitimately a bug, though? It could be tagged as a performance defect, but there is code like this through projects across the land.

I think the reason this rubs some people the wrong way is that implementations like this can be the result of the "no premature optimizations!" philosophy and its advocacy. I've encountered this firsthand at a number of organizations, and it apparently was somewhat endemic in the Ruby space.

Make something that works, and benchmark later to find that one magical hog that you can quickly change and then everything is optimal. Only it almost always ends up being a performance death by a thousand (million) cuts, performance and resource malaise so endemic that fixing it almost seems impossible.

nirvdrum 4159 days ago

Not to get all philosophical, but "legitimately a bug" is a hard question to answer. Some people strictly consider correctness to be the definition of a bug. Others extend the definition to mean when things don't work the way they probably should (e.g., performance or usability). Performance is an especially interesting one to me because if it's not really a bug, then does a performance regression constitute a bug even if it's producing the same results? If someone is relying on that timing or it otherwise affects interaction with the code, its execution speed is a functional component.

In my experience, projects that adopt a slightly broader definition of "bug" have a better track record of improving on those fronts. Nobody likes to have bugs around, after all.

mbrock 4159 days ago

Semi-related: does anyone know why installing gems is so ridiculously slow? What is the thing doing? Downloading tarballs, yes, but then? It's a dynamic language; there is no compilation or verification! Why can I install Ruby packages using apt almost immediately, when gem/bundle install takes half a coffee break?

I'm growing more impatient with the years. I have measured out my life with slow software. We talk about saving developer time with dynamic languages, but, as Flight of the Conchords sang, the sneakers don't seem to get much cheaper; what are your overheads?

steveklabnik 4159 days ago

This may sound a bit glib, but the reason it's so slow is because basically every Rubyist goes "Does anyone know why installing gems is so slow? What is this thing doing?" And then _takes a coffee break instead of figuring it out_.

There are very, very, very few people who actually do any work on core infrastructure projects. I don't blame them. The codebase of rubygems is... not exactly welcoming. I myself did some work a while back, got frustrated, and quit. But if you want the real reason, there it is.

Oh, and when you _do_ overcome all these barriers and then actually make an improvement, people will say "lol rubyists learning data structures" instead of saying "thank you for saving a bit of the most precious resource I have every single day, time."

mbrock 4159 days ago

Yeah, good point. Still, it seems likely that a number of people have done at least some work on profiling and optimizing without success. Hasn't Gem slowness been an issue since, like, 2006? The reason I haven't even tried is mostly that it seems likely to be difficult and treacherous. But I might be wrong.

Note that I wasn't only complaining. I just figured that there is some real reason for the slowness and that this reason might be known by the community (and I didn't find much googling for "why is ruby gems slow"). Call it preliminary research...

steveklabnik 4159 days ago

> Still, it seems likely that a number of people have done at least some work on profiling and optimizing without success.

You'd think that, wouldn't you? The reality is... not that. :/

And yeah, this isn't really about you specifically, sorry if it came off that way. This is just a general problem in the Ruby OSS world. I can't speak to too many other OSS worlds, as the Ruby one is the one I've been most involved in. If you're looking for more research, Andre Arko is one of the other people who's actually doing work in this area, and he's given a number of talks about why Bundler is slow: https://vimeo.com/67807956 and related. He's been making a lot of strides, but it's not easy.

mbrock 4159 days ago

Cool, thanks, that's good info! I'll have to investigate these talks.

moe 4159 days ago

> The codebase of rubygems is... not exactly welcoming.

What a very polite way to put it.

IMHO that whole mess (rubygems + bundler) would ideally be replaced from scratch, removing the need for bundler in the process.

If any generous sponsor wants to improve Ruby as a whole, that's where their money should go. Imagine the productivity gains if everyone's test-cycle was suddenly >10% faster, and nobody would have to waste energy on bundler/rbenv/rvm issues anymore.

Perhaps we could even fix the deployment nightmare in the process, with e.g. jar-style packaging, but now I'm really dreaming...

steveklabnik 4159 days ago

I may or may not have threatened to do this after having had one too many drinks, but when I sobered up, I... sobered up. ;)

While I love a good 'burn the world down re-write,' it's a _lot_ of work and isn't guaranteed to succeed. It's been tried before, and in other languages too: check out wheels in Python.

That said, Rubygems did replace what came before it, and Gemcutter replaced what came before it... so it could be done. It's just non-trivial.

moe 4159 days ago

> I... sobered up.

Same here, still have my napkin notes. It's actually one of my bucket list projects, if there wasn't the dreadful food-on-table constraint...

> check out wheels in Python.

Valid point. Python is indeed a good example for an even worse situation. In fairness, most languages are still worse off than Ruby even now. I wouldn't trade maven, CPAN, etc. for bundler with all its warts.

However, there's also languages pulling ahead. npm seems to be slowly getting there (after a rough start) and the Go experience (godep), while still in flux and not directly comparable, is also something to draw lessons from.

riffraff 4159 days ago

I'm reasonably sure some of the rubygems maintainers stated that in $FUTURE_VERSION bundler's functionality will be folded back in rubygems itself.

Rubygems is a very old project and it was basically stalled for a long time, it has already made quite big strides recently.

But I don't think it's rubygems' role to fix what rvm/rbenv fix (multiple rubies). Jars don't do that either :)

moe 4159 days ago

> But I don't think it's rubygems' role to fix what rvm/rbenv fix

Well, it all sticks together. I would say ruby-build should be retained as an external tool to conveniently fetch and install ruby versions.

However, we should very much replace the god awful environment magic that rbenv/rvm perform with native ruby/rubygems support for version/project-scope gemsets.

Rbenv is a well designed crutch - but still a crutch.

In an ideal world you'd checkout a ruby project, point any recent ruby-binary at its Gemfile, and it would download/install not only the required gems, but if necessary also the required Ruby version, as specified by the Gemfile.

It would store everything in './.ruby', which could optionally be backed by a common shared directory (~/.ruby) for space efficiency.

ishtu 4159 days ago

  gem: --no-document --verbose

in your ~/.gemrc may speed up some things and show slowest steps (where '--no-document' means the same as deprecated '--no-rdoc --no-ri').

richthegeek 4159 days ago

Not that I install gems very often (Node.JS is my primary platform) but I find that the majority of the time is installing the different docs.

Check the difference between "gem install sass" and "gem install sass --no-rdoc --no-ri" and be amazed.

shawabawa3 4159 days ago

Which as far as I'm concerned is a bug.

I don't even know how to view gem documentation, and I've never wanted or needed to. --no-document should be the default

They could make it download the docs on first view

dragonwriter 4159 days ago

> Semi-related: does anyone know why installing gems is so ridiculously slow? What is the thing doing? Downloading tarballs, yes, but then? It's a dynamic language; there is no compilation or verification!

Source gems that include C extensions may well have compilation; there is also, IIRC, automatic doc generation in the usual, default gem installation.

> Why can I install Ruby packages using apt almost immediately, when gem/bundle install takes half a coffee break?

Because the apt packages are prebuilt for your OS and architecture, they don't need to resolve platform constraints, possibly build after download, and do doc generation after download.

shime 4159 days ago

also, your ISP provider often has some shitty DNS so switching to Google's might help. give the big brother all your datas!

ing33k 4159 days ago

imo resolving dependencies is one of the factors

tobeportable 4159 days ago

You can also run with the -j4 option to get 4 workers doing the task; you can also add it to your global config: bundle config --global jobs 4

kilotaras 4159 days ago

I'm torn on this one. It's a great performance improvement, but on the other hand I would expect this way sooner than after almost 4 years of usage.

vidarh 4159 days ago

It's because the main bottleneck for most apps that uses lots of gems is elsewhere (the load path grows with each extra gem, meaning a simple 'require' gets more and more expensive the more gems your app uses).
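A toy model may help picture the cost described here. It assumes, as a simplification, that resolving a `require` means probing each load-path directory in order until the file turns up; Ruby's real lookup involves more than this, and `probes_needed` plus the `/gems/...` paths are made up for the illustration.

```ruby
require "set"

# Simplified model: count how many load-path directories are checked
# before `feature` is found (or the whole path has been searched).
def probes_needed(load_path, files_on_disk, feature)
  load_path.each_with_index do |dir, index|
    return index + 1 if files_on_disk.include?(File.join(dir, "#{feature}.rb"))
  end
  load_path.size
end

# One lib directory per installed gem; the wanted file lives in the last one.
few_gems  = (1..5).map   { |i| "/gems/gem#{i}/lib" }
many_gems = (1..500).map { |i| "/gems/gem#{i}/lib" }

puts probes_needed(few_gems,  Set["/gems/gem5/lib/foo.rb"],   "foo")   # 5
puts probes_needed(many_gems, Set["/gems/gem500/lib/foo.rb"], "foo")   # 500
```

Under this model the worst-case cost of a single lookup grows linearly with the number of gems on the load path, which is the effect the comment points at.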

mjs 4159 days ago

Are there any profiling tools that would have found this? Flame graphs showing the time spent in traverse, perhaps?

It seems like this should have been trivially detectable, since the difference is so dramatic.
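Short of a full profiler, one crude check is to count object allocations around a piece of code. The sketch below uses `GC.stat(:total_allocated_objects)`, which is available in current MRI releases; the helper name `allocations` is hypothetical, and this is only an illustration, not a claim about how the issue in the pull request was found.

```ruby
# Returns how many objects were allocated while the block ran.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

with_plus = allocations do
  trail = []
  1_000.times { |i| trail = trail + [i] }   # two new Arrays per iteration
end

with_push = allocations do
  trail = []
  1_000.times { |i| trail << i }            # appends in place
end

puts "using +:  #{with_plus} objects"
puts "using <<: #{with_push} objects"
```

The exact counts vary by Ruby version, so the sketch prints them rather than promising specific numbers; the gap between the two is what matters.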

noir_lord 4159 days ago

It's always nice when a small change is a big win.

This reminds me of the gc_disable() PR that reduced composer install times by half a few months back.

Someone1234 4159 days ago

Here's an article on the composer change: http://blog.ircmaxell.com/2014/12/what-about-garbage.html

cranium 4159 days ago

All your tests passed, nothing broken, a small bit of code for tremendous optimization,... I can feel the satisfaction!

kendallpark 4159 days ago

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%"

--Donald Knuth

icebraining 4159 days ago

I'm not very familiar with Ruby; where exactly did the duplication occur in the original code?

barrkel 4159 days ago

I believe it's in the Gem::Specification.traverse method here:

http://ruby-doc.org/stdlib-1.9.3/libdoc/rubygems/rdoc/Gem/Sp...

The 'trail' parameter, an array, was implicitly duplicated by applying the '+' operator on each recursion through a dependency.

weavie 4159 days ago

    trail = trail + [self]

That + operator looks so innocent, so seductively simple..
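For contrast, here is a minimal sketch of the two shapes. It is not the RubyGems code: `TrailNode`, `walk_with_array`, and `walk_with_list` are hypothetical names. With `+`, each recursion level builds a fresh array holding the whole trail so far; with a linked list, each level allocates one small node that points at the existing trail.

```ruby
# Hypothetical immutable linked-list node: a value plus a pointer to the older trail.
class TrailNode
  include Enumerable
  attr_reader :value, :tail

  def initialize(value, tail = nil)
    @value = value
    @tail = tail
  end

  # Yields values from the newest entry back to the oldest.
  def each
    return enum_for(:each) unless block_given?
    node = self
    while node
      yield node.value
      node = node.tail
    end
  end
end

# Array trail: `trail + [depth]` creates a new array at every level.
def walk_with_array(depth, trail = [])
  return trail if depth.zero?
  walk_with_array(depth - 1, trail + [depth])
end

# Linked trail: one node per level; the older trail is shared, never copied.
def walk_with_list(depth, trail = nil)
  return trail if depth.zero?
  walk_with_list(depth - 1, TrailNode.new(depth, trail))
end

p walk_with_array(4)        # [4, 3, 2, 1]
p walk_with_list(4).to_a    # [1, 2, 3, 4] (newest first)
```

For a recursion of depth n, the array version creates trails of sizes 1 through n, i.e. n(n+1)/2 element slots in total, while the list version creates n nodes.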

airblade 4159 days ago

So many armchair critics!

It's very easy to pontificate in hindsight when somebody else has done the hard work of actually finding something that can be improved.

steipete 4159 days ago

Ruby people discovering algorithms ducks

dang 4159 days ago

> Ruby people discovering algorithms ducks

This was not a good comment to post to HN. When you toss a Molotov cocktail into an HN thread and "duck", here is what we are going to get:

  web programming people aren't familiar with even basic CS

  I stand before you as a counterpoint to your foolish generalisation

  Oh cute, a web dev. You've just reinvented 1975 [...] 
  Would you like a pat on the head?

  Stop whining

  your comment is bullshit

  you deserve a slap

For any large group like "Ruby people" or "web programming people", HN has many users who either identify with being part of that group or identify with not being part of it. Given the numbers, there will always be a few who are having a bad day or feeling defensive or what have you, enough to respond angrily when someone posts a slur. Then their counterparts feel attacked, and down the whole thing goes.

Social padding prevents disputes like this from turning ugly, but we don't have that on HN. Even when you know another user from their comment history, that's not much information. Imagination fills in the gaps and then we imagine the other in the worst light. That's why the community here is fragile.

To be a good community member, please don't post things that could easily set the thread on fire. If after editing them out, your comment has nothing substantive left, please don't post it.

steveklabnik 4159 days ago

Thank you.

jph 4159 days ago

Java had the same kind of issue for years.

In Java and Ruby, the standard library uses an object `+` operator to mean concatenation (not numeric addition), and the implementation did immediate data copies, rather than doing reference linking and copy-on-write (or immutability).

tr352 4159 days ago

But in Java this happens only with strings doesn't it?

bcg1 4159 days ago

Sort of... technically the '+' operator only applies to strings... however it is essentially the same issue as the pull request, from what I can gather (I assume the issue comes from allocating a new array with size length+1 and copying the original each time a record is added). Same thing happens with java.util.ArrayList.add() and its cousins though, and from my experience people rarely use the constructor specifying an initial capacity, so the default gets used even if it would obviously be woefully small (default is 10 BTW, in case you're curious & lazy :).

Also one might argue that the problem is actually much worse with strings, because string concatenation is so common and the syntactic sugar of the '+' operator for strings encourages the "wrong" way.

this_user 4159 days ago

Java's JIT compiler optimises string concatenation by substituting a StringBuilder and has been doing so for a while.

As to using the wrong data type, that's really the programmer's fault. If you don't allocate enough capacity or use another data type (e.g. LinkedList) if you don't know the required capacity, you are doing a bad job.

bcg1 4159 days ago

To nitpick, Java's JIT compiler compiles Java bytecodes to machine code, but I know that javac does do the type of optimization that you are describing. There are plenty of scenarios though where it can't/won't do that optimization ... and if you are using the binary version of a library compiled without that optimization I'm pretty sure you're out of luck, especially if the method you're calling is not JIT'd.

I'm not really trying to knock any particular language or runtime here, the point I was trying to make is that nearly every language I've used has quirks that encourage convenience over optimization, and that just because you're coding in language foo, it doesn't mean you're off the hook when it comes to being intentional about the choice between them.

diek 4159 days ago

Just to clarify, java.util.ArrayList.add() grows the backing array by 50%, not just by length()+1.
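The difference between the two growth policies discussed here can be checked with a little arithmetic. The sketch below is a simulation in Ruby, not Java's actual `ArrayList` code: `copies_for` is a hypothetical helper that counts how many element copies a run of appends would cost when the backing array is reallocated under a given policy, starting from the default capacity of 10 mentioned above.

```ruby
# Counts element copies caused by reallocation during `n` appends.
# The block receives the current capacity and returns the new one.
def copies_for(n, initial_capacity: 10)
  capacity = initial_capacity
  size = 0
  copies = 0
  n.times do
    if size == capacity
      copies += size               # reallocating means copying every existing element
      capacity = yield(capacity)   # the growth policy picks the new capacity
    end
    size += 1
  end
  copies
end

grow_by_one  = copies_for(1_000) { |cap| cap + 1 }           # length+1 each time
grow_by_half = copies_for(1_000) { |cap| cap + (cap >> 1) }  # grow by 50%

puts "grow by one:  #{grow_by_one} copies"
puts "grow by half: #{grow_by_half} copies"
```

In this model, 1,000 appends cost roughly half a million copies under the +1 policy and only a few thousand when growing by 50%, which is why the distinction matters.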

WickyNilliams 4159 days ago

Um... isn't this a data structure not an algorithm?

picks_at_nits 4159 days ago

Programs = Data Structures + Algorithms, and it is often the case that there are deep relationships between the data structures and the algorithms.

For example, any linear recursive algorithm that deals with the head of a list and the tail/rest/butHead of a list is optimized for a linked list implementation. So... to understand a linked list, you really need to be familiar with algorithms that bisect lists in this manner, and the reverse: To understand algorithms that bisect lists in this manner, you have to be familiar with a linked list.

So... Yes it’s a data structure, but it’s joined at the hip to the algorithms that operate best on it.
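A small sketch of that head/tail relationship, using a hypothetical `Cell` struct as the linked list (none of these names come from the thread). The same linear recursion is written twice: on the linked list, taking the tail just follows a reference; on an Array, `array[1..-1]` creates a new Array object at every step.

```ruby
# Hypothetical cons cell: a head value plus a reference to the rest of the list.
Cell = Struct.new(:head, :tail)

# Builds a linked list from an Array, preserving order.
def list_from(values)
  values.reverse.inject(nil) { |rest, value| Cell.new(value, rest) }
end

# Handle the head, recurse on the tail. The tail already exists, so nothing is built.
def sum_list(cell)
  cell.nil? ? 0 : cell.head + sum_list(cell.tail)
end

# Same recursion on an Array: each step creates a new Array for "the rest".
def sum_array(array)
  array.empty? ? 0 : array.first + sum_array(array[1..-1])
end

puts sum_list(list_from([1, 2, 3, 4]))   # 10
puts sum_array([1, 2, 3, 4])             # 10
```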

WickyNilliams 4159 days ago

I know, I was intentionally curt in my reply because snark is best solved with more snark </snark>

randomdata 4159 days ago

Except the linked list implementation was added to the project in 2013[1], and is presumably used elsewhere in the program. This only fixes a couple of bugs using that same mechanism.

[1] https://github.com/rubygems/rubygems/blob/800f2e63bc6174b5b4...

perdunov 4159 days ago

Really. Every time I whine publicly how web programming people aren't familiar with even basic CS, I get a slap.

But really, I should move to web programming. I'll be an expert computer scientist there, probably.

aidos 4159 days ago

Annnndddd.... what reaction do you expect? "Oh, please come and do web development so we can bask in the glow of your self-righteousness and infinite knowledge of computers."

/snark (apologies for being offensive, but good lord, what a silly statement - unless I missed the joke)

I stand before you as a counterpoint to your foolish generalisation, and guess what, I know plenty of other people that don't fit your stereotype either.

I'm not saying there's not an element of truth to your statement. See, here's the thing. People throughout the software (or any) industry have different collections of knowledge. There are an endless number of things to learn and each one of us is different and brings a different set of skills to the table.

To be great at web development you spend years learning the subtleties of developing for a vast domain of different platforms. There are bugs in the platforms decades old that I know the intimate details of and have workarounds for especially constructed to fit in with the other bugs in the other platforms we deal with. And that's a tiny facet of what you need to know.

You know what would really happen when you came over to web development? You'd find that a lot of the skills as a computer scientist aren't altogether useful.

You wouldn't be an expert computer scientist. You'd be a junior developer, probably.

twsted 4159 days ago

The statement was a little strong, but in my experience it contains some truth. People who have never used a lower-level programming language often can't recognize these performance issues.

lmm 4159 days ago

I've often found the opposite; people who use C get excited about using >> rather than / because it saved a few cycles on very old compilers, but fail to notice where they could've made it a million times faster by using a hash table rather than a linked list.
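The data-structure half of that claim is easy to sketch. In the example below an Array scanned from the front stands in for the linked list, and Ruby's `Set` (hash-backed) stands in for the hash table; the helper `scan_comparisons` and the `gem-N` names are made up for the illustration.

```ruby
require "set"

names = (1..100_000).map { |i| "gem-#{i}" }
name_set = Set.new(names)

# Linear scan: how many entries are examined before `target` is found.
def scan_comparisons(list, target)
  position = list.index(target)
  position ? position + 1 : list.size
end

puts scan_comparisons(names, "gem-100000")   # 100000
puts name_set.include?("gem-100000")         # true, found via a hash lookup
```

The scan examines every entry to reach the last one, while the hash lookup does an amount of work that does not grow with the collection in the expected case. That gap dwarfs anything a shift-versus-divide substitution could recover.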

pointernil 4159 days ago

Unless a senior developer / a "wise" developer reviews the artifact created and allows him/herself to state: "This is not effective code. Refactor it by x y z".

The problem is, I think, this is NOT happening or at least too seldom because it "does not pay off", because "the aws boxes are sooo cheap!"

I'd say: the aws boxes are way too cheap and effectively misused. What should have been a way to scale up apps with reasonable effort turned into an energy and resources burning landfill. Granted, at least it is a shared landfill, which really helps from the resources/energy point of view.

(Junior) coders should be able to power the machines by riding a bicycle ergometer ... as a way to make homo sapiens grasp the effects of their "mental work" ;)

perdunov 4159 days ago

I completely agree that software development is such a vast field that it is completely impossible to be proficient in everything. I am kind of a lamer in web development, for example.

But fundamental CS is a different thing.

"Annnndddd.... what reaction do you expect?"

Admitting your shortcomings and trying to fix them is not an option at all?

When some central web frameworks do lamest mistakes like making an O(n^2) queue, it is frightening. And instead of deflecting any critique, one may try to fix this situation somehow.

grey-area 4159 days ago

Almost everyone does do lamest mistakes at some point, e.g.:

goto:fail - https://news.ycombinator.com/item?id=7281378

shellshock - https://news.ycombinator.com/item?id=8365110

I'm sure whoever wrote this would feel a little chagrin when coming back to their code and seeing how inefficient it was (though it got the job done when the lists were small), and they probably would admit their shortcomings, why wouldn't they?

Condescending snark is really easy, and it's easy to say in retrospect and with time to reflect that most useful code has flaws and point them out - software is never finished, and there are a lot of different levels of experience and requirements. If rubygems had never become popular, this wouldn't even be an issue.

PS Rubygems isn't a web framework, it's a package management tool, so the straw man you're hacking away at is the wrong one.

perdunov 4159 days ago

Mistakes come for different reasons: the human brain's limitations; carelessness; bad methodologies; ignorance.

I was talking about the last one here.

"PS Rubygems isn't a web framework, it's a package management tool, so the straw man you're hacking away at is the wrong one."

My statement about web frameworks was not about RubyGems, it was about web frameworks, and it was an example of the state of affairs in web development.

pnathan 4159 days ago

"Oh cute, a web dev. You've just reinvented 1975, but this time without algorithmic analysis. Would you like a pat on the head?"

/snark

Bluntly, web devs often screw up in the fundamentals of algorithmic operations and data. Web dev comes out of the horrific slap-it-up-i-tude of the HTML/Perl days of the mid-90s, and its tooling is still incredibly shoddy compared to desktop development. And the really fun part is? Web devs don't even get it. They often think they are the top of the food chain, with the best tools ever built. I still can't even find a tool to match VB 5's capabilities.

LnxPrgr3 4159 days ago

I think you over-estimate the average skill of non-Web devs, and over-estimate the attention paid to performance even by groups who probably know what they're doing.

One example, from Chrome: https://groups.google.com/a/chromium.org/forum/#!msg/chromiu...

pnathan 4159 days ago

It's less about skill and more about culture, knowledge, and the valuation of wisdom/knowledge within that culture.

but, yeah, I've seen some awful non-web code. :)

Sir_Cmpwn 4159 days ago

I'm a senior dev in both domains and I can safely say that your comment is bullshit.

aidos 4159 days ago

What does any of that even mean?

Edit Could you clarify? Which part of what I said is "bullshit"?

Swizec 4159 days ago

I keep repeating myself on Hacker News, but once more, we've found the difference between Software Engineer and Computer Scientist. One makes things work, the other is a mathematician.

Why do we keep conflating the two?

robmccoll 4159 days ago

You should be careful with the term engineer. By definition, engineering is the application of scientific and mathematical knowledge to solving practical problems. Without knowing and understanding the science and math behind computing and software, one can hardly claim to be a software engineer.

FLUX-YOU 4159 days ago

The title is cheap in the US and in some companies (cough) they slap it on every position that directly touches the product.

Dewie 4159 days ago

So many programmers have this weird inferiority complex when it comes to the term "engineer". Not you, but those who think that most programming can never be called "engineering" because people don't die if you introduce a software bug[1] (as if the only kinds of modern "engineers" have to do with immediately safety-critical things). I prefer the plain "programmer" myself, but I don't see the big deal unless "engineer" is a protected title wherever that person lives.

[1] Note that I said "most programming".

scott_s 4159 days ago

My position on "computer science is math": http://www.scott-a-s.com/cs-is-not-math/ HN discussion: https://news.ycombinator.com/item?id=3928276

lucozade 4159 days ago

I'd argue that it's really a distinction between an engineer, who should understand this stuff, and a technician, who doesn't need to in order to do a job.

Certain segments of our industry are currently engineer heavy, such as embedded, and some appear to be technician heavy. I don't see that as a problem per se but it clearly causes friction occasionally as we tend to conflate them.

TickleSteve 4159 days ago

Because you can't be good at either without having an element of the other.

Ideally, they overlap.

Swizec 4159 days ago

Ideally they do overlap. But we don't call people who design bridges physicists even though they know a shitload of physics.

Our field is maturing. These differences are only going to become more important.

kendallpark 4159 days ago

> You wouldn't be an expert computer scientist. You'd be a junior developer, probably.

+1

drinchev 4159 days ago

When I was in my early teenage years I was already doing some development in Pascal and Visual Basic, and later on I switched to Linux and started doing Perl CGI web apps.

Back when I had to choose whether to do a CS degree, I was already being paid as a web developer and doing what I would have been studying. I asked some friends who were studying CS, and they told me they had a very tight schedule with work in the following disciplines: FORTRAN, ASM, C, C++. I told them that was too low-level for me and that I probably wouldn't use it in my work, so I decided to go into a completely different field (I graduated in law).

Now, I'm not the best programmer who ever lived, but I certainly still do what I love, and I'm paid for web development as well as a senior person because of my experience, not a CS degree.

So... if you think you can benefit web development, why don't you start with a simple open source contribution to one of the existing projects and reduce Wirth's law [1] a bit with some low-level skills? But for something more complicated, please consider getting some experience in that specific area first.

[1] http://en.wikipedia.org/wiki/Wirth%27s_law

matthewmacleod 4159 days ago

Good, you deserve a slap for being self-satisfied about it. There are loads of fully competent and skilled web developers out there with great, in-depth CS knowledge, and publicly berating them achieves nothing.

Web development is interesting, because it tends to mix in people from a lot of different backgrounds — in particular, some of them come through the design side, and move down the stack. That's good, because it demonstrates the accessibility and flexibility of the stack; it's bad because it can result in suboptimal solutions to common problems.

thaumaturgy 4159 days ago

You probably wouldn't do that well in web development, because it's too different from embedded or systems programming.

You probably have the luxury of specializing in one particular language, and maybe a handful of processor architectures. You probably rely on one or a few libraries, and know them backwards and forwards. You probably have your own personal repository of code and tactics that you go back to often. In other words, your knowledge of programming is probably narrow and deep.

Web developers usually don't get that luxury. Top web developers today have to be fluent in a minimum of two programming languages (javascript and a backend language like Python, Ruby, PHP...), use several different fast prototyping approaches (css compilers and wacky templating systems), have a good working general knowledge of everything from the web browser to the web server, and cope with an environment in which at least one of those parts is changing on almost a daily basis. Web development is shallow and very, very broad.

I prefer systems programming but I've worked as a web developer off and on for several years now. I have a lot of respect for any web developer that's really good at it. They aren't lesser programmers at all, and I suspect a lot of them, if they decided to do it, could kick the pants off of most systems programmers.

manish_gill 4159 days ago

I'm a web dev who has a degree in CS. Do you think I'm a unicorn?

Of course not. Stop generalising.

raverbashing 4159 days ago

I was going to post the photoshopped joke, but this gets more to the gist of it

http://stackoverflow.com/questions/3811678/add-two-variables...

Edit: found it https://plus.google.com/u/0/+DougTyrrell/posts/br3kqg6Vet6

robmccoll 4159 days ago

That's a bit strongly worded, but I also see the pattern of people who came into computing from the Web direction rediscovering basic principles of computing and computer science. Similarly, the NoSQL community seems to slowly be discovering why traditional relational databases work the way they do, why supplying correct and durable replicated storage with high performance is difficult, and why some of the features that NoSQL threw out alongside SQL itself in the name of simplicity and performance are actually quite useful and occasionally important.

ICWiener 4159 days ago

Slap. Stop whining.

DiabloD3 4159 days ago

Not sure why parent is getting downvoted, there is a serious problem in the Ruby community that very few of them have read GoF, TAoCP, and/or K&R.

Ao7bei3s 4159 days ago

Stop jumping to conclusions, and stop throwing around buzzwords.

GoF is a (somewhat C++-centric) book about design patterns that's completely unrelated to the discussion at hand, may not be the best resource to learn about design patterns and has nothing to do with CS.

TAoCP is more like an encyclopedia; actually reading through even one chapter takes a significant amount of effort (if you want to get anything out of it). Try it. (Yes, I've worked with it.)

K&R is a 27-year-old, thoroughly outdated book about C. There are better options[1].

[1] Try the 16-year-old book "Expert C Programming" by Peter van der Linden, which is excellent, even though outdated too.

bcg1 4159 days ago

+10 if I could, re "Expert C Programming"; excellent read, even if you aren't a C programmer

matthewmacleod 4159 days ago

Because it's smug and adds nothing to the conversation.

I don't see any evidence that the Ruby community suffers more than any other development community from this sort of thing — that is, the ones where high performance is not the biggest concern, of course.

shiggerino 4159 days ago

Nobody show this to Bjarne Stroustrup https://www.youtube.com/watch?v=YQs6IC-vgmo

kaeluka 4159 days ago

AFAICT, this performance bug is not at all related to the linked-list vs. vector issue.

rakoo 4159 days ago

Yes it is: vectors are good for random access, linked lists are good for doing stuff at the front/back of the list. The performance bug we have here is solved by finding a way to insert stuff at the front/back (and also going through each item in the list); there is no need for random access.

kaeluka 4159 days ago

If I remember Bjarne's talk correctly, vectors (in C++) are even fast at inserting because they have densely packed representation which rhymes well with modern computer architecture. Inserting in a linked list is slow, as walking the list to find the element at which to insert will already incur O(N) cache misses, whereas in vectors it's only O(1) cache misses. Moving the elements in the vector one to the right is fast due to computer architecture dealing well with predictable patterns.

The allocations here (ruby) are reduced because the implementation of appending is horribly slow in the first place, using defensive cloning (I'm taking jph's word here).
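
To make the cost of defensive cloning concrete, here is a minimal Python sketch. It is not the RubyGems code; `append_with_copy`, `Cons`, `to_list` and the counters are hypothetical names made up for illustration. It compares appending by copying the whole list on every step against prepending to an immutable linked list whose tail is shared rather than copied.

```python
from dataclasses import dataclass
from typing import Optional


def append_with_copy(items: list, value) -> list:
    """Defensive append: copy the whole list, then add one element."""
    new_items = list(items)  # copies every existing element: O(len(items))
    new_items.append(value)
    return new_items


@dataclass(frozen=True)
class Cons:
    """Immutable linked-list cell; the tail is shared, never copied."""
    head: object
    tail: Optional["Cons"] = None


def to_list(cell: Optional[Cons]) -> list:
    """Walk the cells from newest to oldest and collect the values."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


n = 1000

# Copy-on-append: count how many element copies the n appends perform.
copied = 0
path = []
for i in range(n):
    copied += len(path)
    path = append_with_copy(path, i)

# Linked list: each step allocates exactly one new cell.
cells = 0
node = None
for i in range(n):
    node = Cons(i, node)
    cells += 1

print(copied, cells)
```

The copying version performs n(n-1)/2 element copies in total, while the linked version allocates n cells, which is the quadratic-versus-linear gap under discussion. Older cells stay valid after each step, so the "nobody can mutate my copy" guarantee is kept without cloning.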

herewego 4159 days ago

Yes, but FWIW most linked list implementations have a reference or pointer to the tail, making appends not O(n), but O(1). However, there is a threshold, depending on use case, where a small vector being resized multiple times larger than the original will be faster than many linked list appends. Point being, either can excel depending on use case.
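
A minimal sketch of the tail-reference idea in Python, assuming a plain singly linked list (the `Node` and `LinkedList` classes are hypothetical, written only for this illustration):

```python
class Node:
    __slots__ = ("value", "next")

    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedList:
    """Singly linked list that keeps a reference to its last node."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def append(self, value):
        """O(1): link the new node after the current tail, no traversal."""
        node = Node(value)
        if self.tail is None:
            self.head = node  # first element: it is both head and tail
        else:
            self.tail.next = node
        self.tail = node
        self.size += 1

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


lst = LinkedList()
for i in range(5):
    lst.append(i)
print(list(lst))  # [0, 1, 2, 3, 4]
```

Without the `tail` reference, `append` would have to walk from `head` to the end on every call, which is where the O(n) figure comes from.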

vbezhenar 4159 days ago

Vectors are good at inserting at the back and deques are good at inserting both at the front and at the back (of course, only if capacity grows exponentially, but that's how they should be implemented). Linked lists are almost always the wrong choice because they don't play well with memory caches. They might be better at inserting in the middle, and that might matter only with really huge lists (millions of items). At least that's how it works with languages close enough to the hardware.
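
The "capacity grows exponentially" condition can be sketched as follows. This is a toy Python model, not how any particular standard library implements its vector; `GrowableArray` and its `moves` counter are hypothetical names used to count how many element copies the resizes cost.

```python
class GrowableArray:
    """Toy vector that doubles its capacity whenever it runs out of room."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None] * self.capacity
        self.moves = 0  # elements copied during resizes

    def push_back(self, value):
        if self.size == self.capacity:
            self._grow()
        self.slots[self.size] = value
        self.size += 1

    def _grow(self):
        # Doubling keeps the total copying work proportional to the size.
        self.capacity *= 2
        new_slots = [None] * self.capacity
        for i in range(self.size):
            new_slots[i] = self.slots[i]
            self.moves += 1
        self.slots = new_slots


arr = GrowableArray()
n = 10_000
for i in range(n):
    arr.push_back(i)
print(arr.moves, arr.capacity)
```

With doubling, the total number of copies stays below 2n, so each `push_back` is O(1) amortized. Growing by a fixed increment instead would make the total copying quadratic.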

jlebrech 4159 days ago

Knock yourself out

http://ruby-doc.org/stdlib-1.9.3/libdoc/matrix/rdoc/Vector.h...

chatman 4159 days ago

Just goes to show how careless Ruby guys were while building this.

eddd 4159 days ago

1. Make it work 2. Optimize

rosspanda 4159 days ago

From working with Ruby guys, it's normally 1. Make it work 2. Spin up more AWS boxes

pothibo 4159 days ago

Nothing's wrong with spinning up more AWS boxes. If it costs 300$ annually to solve a problem that would cost 5k$ in development to fix, I believe it's a wise choice.

Yeah, down the line you will eventually have to do optimization, but you will prioritize.

rosspanda 4159 days ago

I agree somewhat, but I've seen 10-box systems that could run on a Raspberry Pi with good code

pothibo 4159 days ago

I wasn't advocating for badly written code to run on a whole datacenter. I was just pointing out the alternative, with the assumption that the code was somewhat healthy and that adding one new instance to cover the sub-optimization wasn't a big deal.

Of course, if you have 10k users and it runs on 3 machines, you got a problem which no amount of boxes can solve.

lmm 4159 days ago

I've seen them. I've worked on them. But again, where's the cost/benefit?

pointernil 4159 days ago

In a certain, quite limited model of economics that could actually be named as "wise".

Once a more holistic view is taken, wide spread total costs and benefits are taken into consideration, once costs are not only defined as money flowing out of my own pocket, once not only "Gesinnungsethik" but also and more importantly "Verantwortungsethik" gets applied, well,

in such a world we would probably wish, that Amazon would change its pricing policy to:

- get the first 2-5 AWS instances almost for free

- and pay exorbitantly more for the next few hundred.

We all would benefit from the cultural, technological and social changes this would help to spark, I think.

But who am I to question current culturally entrenched "economic" thinking...

icebraining 4159 days ago

What social changes would you expect from the pricing policy changes of marginal EC2 instances?

raverbashing 4159 days ago

"AWS? What is this? I just use Heroku..."

johnlinvc 4159 days ago

"The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." — Michael A. Jackson

maddening 4159 days ago

I prefer full quote from Knuth:

    "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

My problem with many happy-coders is that they only remember the part about not optimizing early and often forget the part where you should measure performance, find bottlenecks and get rid of them.

eddd 4159 days ago

ok, now i have to rephrase my statement: 1. Make it work 2. Measure 3. Optimize

What people often miss is the measuring part.
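
A minimal sketch of what the measuring step can look like, here in Python with the standard `timeit` module. The two builder functions are hypothetical stand-ins for "the slow candidate" and "the proposed fix"; the actual numbers depend on the machine, so none are claimed here.

```python
import timeit
from collections import deque


def build_with_list(n):
    """Insert at the front of a Python list: each insert shifts all elements."""
    items = []
    for i in range(n):
        items.insert(0, i)
    return items


def build_with_deque(n):
    """Insert at the front of a deque: constant time per insert."""
    items = deque()
    for i in range(n):
        items.appendleft(i)
    return items


n = 5_000
# Run each candidate a few times and compare total wall-clock time.
list_time = timeit.timeit(lambda: build_with_list(n), number=5)
deque_time = timeit.timeit(lambda: build_with_deque(n), number=5)
print(f"list.insert(0, x): {list_time:.4f}s")
print(f"deque.appendleft:  {deque_time:.4f}s")
```

Both functions produce the same sequence, so the comparison is about cost only. Measuring first also tells you whether the difference matters at the sizes you actually have.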

Dylan16807 4159 days ago

It's important not to skip step 2, though. Give everything a nice once-over for writing documentation and considering changes to API and algorithm.

tinco 4159 days ago

Careless comes across as a bit negative, try carefree.

jokoon 4159 days ago

I still can't see the real usefulness of linked lists, the idea of having a data container that doesn't have a transparent indexing algorithm sounds ill-advised.

Linked lists should be named "linked graphs" instead.

There is so much more relevant science to learn about CPU caches than there is about using a container which is based on nested pointer indirections.

jdmichal 4159 days ago

If you're going to change the name, "unary trees" makes a lot more sense. "Linked graph" does not imply the 1-child-per-node linearity requirement of a linked list.

agentultra 4159 days ago

The only thing I can think of is when your algorithm is building the sequence of items whose length cannot be precalculated. For lists of a certain size you might be able to save over the amortized cost of calling realloc on an array.

Just have to follow the data and watch how it's used and build your program to provide the simplest flow.

noselasd 4159 days ago

Note that the links in a linked list can be in-line with the contained data (aka. intrusive pointers). Albeit quite uncommon, they don't need to be pointers/references at all, but indexes.
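
A minimal sketch of the index-based variant in Python, assuming the values and the links live in two parallel arrays (`IndexLinkedList`, `NIL` and the field names are hypothetical, chosen for this illustration):

```python
class IndexLinkedList:
    """Singly linked list stored in two parallel arrays; links are indexes."""

    NIL = -1  # sentinel meaning "no next element"

    def __init__(self):
        self.values = []      # payloads, in allocation order
        self.next_index = []  # next_index[i] is the slot that follows slot i
        self.head = self.NIL
        self.tail = self.NIL

    def append(self, value):
        slot = len(self.values)
        self.values.append(value)
        self.next_index.append(self.NIL)
        if self.tail == self.NIL:
            self.head = slot
        else:
            self.next_index[self.tail] = slot
        self.tail = slot

    def prepend(self, value):
        slot = len(self.values)
        self.values.append(value)
        self.next_index.append(self.head)
        self.head = slot
        if self.tail == self.NIL:
            self.tail = slot

    def __iter__(self):
        slot = self.head
        while slot != self.NIL:
            yield self.values[slot]
            slot = self.next_index[slot]


lst = IndexLinkedList()
lst.append("b")
lst.append("c")
lst.prepend("a")
print(list(lst))  # ['a', 'b', 'c']
```

The storage order (`b`, `c`, `a`) differs from the logical order, yet all nodes sit in contiguous arrays, which is one way to get list-style insertion while staying friendlier to caches in languages closer to the hardware.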

voidhorse 4159 days ago

Good job, tenderlove, that's a nice performance boost.

Personally, I really dislike Ruby's syntax, though I haven't spent a huge amount of time with it (because I dislike the syntax). The use of braces and other lexical markers makes code a lot clearer and faster to decipher, imo, than a bunch of def and ends. (I know that () are optional in Ruby, can you also use {} if you desire? Again, not 100% familiar with the language features. Just know some standard rails implementations of the language).

Maybe it just hasn't 'clicked' with me yet, but bleh. The dynamic typing doesn't help its case in my book either. That's my personal preference, and why I try to avoid using ruby, even for the web backed by rails despite its popularity. Then again, if you need to get a web project up and running quickly rails is never a bad choice (in my experience).

swah 4159 days ago

Related: https://twitter.com/tenderlove/status/576389996019462144