by ledauphin 2350 days ago
I just don't buy this. I cut my teeth as an HPC programmer working with C and writing no-lock algorithms. There will always be a need for that, but realistically the vast majority of software being developed is simply not performance-critical. It's designed to work at human speed.

Advances in language, compiler, and runtime implementations will continue to keep up with any growth in the need for performant applications for the foreseeable future, despite the looming collapse of Moore's Law.

7 comments

> It's designed to work at human speed.

It would be great if most applications worked at human speed. Instead we have web applications taking 5 seconds to load what is basically 3 records from a small database.

...or "instant"(!) messaging applications taking gigabytes of memory and a full CPU core, and yet still can barely keep up with how fast a human can type.

I've often complained out loud with coworkers, while waiting for some horrible webapp to do its thing: "This computer can execute over a billion instructions every second. How many instructions does it take to render some formatted text!?!?"

Related: https://news.ycombinator.com/item?id=16001407

For the likes of it, 15e9.

While throughput is reasonably easy to optimize for, for latency you will have to fight against each abstraction layer in your code. And that includes layers bolted onto your OS and hardware.

Most of these applications spend a few hundred milliseconds loading the database records, and an additional 4.5 seconds loading the 20 different trackers and advertisements on their website.
Depends what you mean by human speed. What's faster, a lower tech "human" operation like looking up a word in a physical dictionary (assuming one's handy), or looking it up on dictionary.com, assuming dictionary.com takes 5s to load?
I reckon humans can actually go MUCH faster than what their software allows today. I often feel frustrated by software on my various devices that makes me wait around for non-network operations, even though I often pay a premium for top of the line devices. Those little micro-frictions really mess with my mood.
That is human speed though (better than human speed, even). For most human tasks that need to be done, 5ms versus 5s doesn't really matter.

Consider also that spending an hour at the DMV for them to update a database entry or two is also human speed.

What? 100ms is "human speed". When doing anything interactive the difference between 25ms and 5s is monumental. Even just for pressing confirmation buttons 5s is slow enough that you need some substantially faster reacting visual confirmation (loading animation or whatever) to satisfy humans.
5ms versus 5s is the difference between being tempted to check or think about something else or not. Multiply that option over and over with a repetitive task and anyone but your most disciplined monk is going to find themselves getting side tracked regularly.
No, it really really matters. Small things add up, when you're forced to do them multiple times every day.
> taking 5 seconds to load

I want to live in your alternate reality, because in ours anything under 45 seconds is a miracle.

5 seconds is probably fast enough to deliver the content to 80% of the people who need it. If that application has an acceptable bounce rate even with a load time of 5 seconds, then that might be the minimum acceptable human time.
What about battery life?

If you prefer, call it carbon footprint. Python has a huge carbon footprint. We should get rid of slow languages for environmental reasons.

But then we'd have to pour in more manpower, which involves more commuting, more upkeep (AC, lunch), etc.
This will happen at a certain point anyway. The current fashion- and inertia-driven nature of IT is not sustainable in the long term. Tons of money are poured on very questionable projects, all the time.

Plus we have a lot of pretty awesome languages that are mature enough and are serving very different niches (so their union can cover everything in IT) like Rust, Erlang/Elixir, Zig, OCaml (which can be transpiled to two JS variants, BuckleScript and ReasonML), TypeScript, and probably 20+ others.

Not to derail the thread but the dependency on very slow and hard-to-debug dynamic languages like Ruby and Python is getting out of hand.

Statements like "But it's easier to find devs for Python and Ruby than it is for Rust and Elixir" might be statistically correct now but that means nothing. People change technologies as market demands change so I am absolutely not worried about displaced programmers. There's almost no such thing as displaced programmers either, 99% of all my acquaintances just learned the new tech their employer wanted from them and moved on to the next stable paycheck.

Only if you are convinced there is a fundamental need for more manpower for code written in faster languages. For me personally, Crystal was the language that convinced me that great dev UX and productivity is possible in compiled languages. As far as I'm concerned, it even beats Ruby in both. YMMV.
Or just stop using python going forward
So... are you saying people are going to have more kids because we stop using Python?

For equal numbers of humans, all those energy/environmental costs you mention are going to be there regardless of which programming language is used...

This has to happen if we use a lot of manpower inefficient tools, and still want to keep our current pace of R&D.
remote
Unless you live in a cold climate and use electric heat? Endlessly gzipping /dev/random would simply cause your electric heater to run less frequently.
Even if the software only needs to respond at a certain speed, scale will quickly make you either pay through the nose for better hardware or optimize the software so that it can respond in a small fraction of the original speed.

The trick, as always, is finding balance between paying for hardware and paying developers.

But that is the case now too and in my experience it swung to paying through the nose for hardware in general; as more or less a sidetrack I take on projects where I optimise (mostly online) systems. Example; a few weeks ago a startup asked me to check out their setup as they were spending almost 30k$/mo on AWS. I spent a few days optimising and now they are down to less than 10k$. With some more work it will be a few 1000$; there is still so much wrong. But that is less low hanging fruit so it will be a lot more expensive. Still well worth it imho.

People really bought into the ‘people are more expensive than hardware’ as an excuse to get screwed like this. For $5k in human cost, these guys (and their investors) now save 200k/year in hosting. And this is not an isolated story; I am working on another one at this very moment. Programmers have become so incredibly sloppy with the ‘autoscaling’ and ‘serverless’ cloud ‘revolution’.

I don’t know if you feel this way, but my complaint about the more hyped cloud services is that not only can they be expensive (fine) but the promised time-savings and simplicity of operating the system often doesn’t really materialize either, except in restricted circumstances that you don’t appreciate in advance and only find out later after you’ve already committed.

If it really did save time and were simpler, some companies would (quite reasonably) be willing to pay a premium for that - time is money and all that. In reality it seems like people often end up with the worst of both worlds - it’s expensive, complicated, still needs a huge staff to maintain, and doesn’t even work that well.

Well it is better than making even trivial architectures with actual hardware (I have some pictures of me hauling servers over xmas a long time ago; that was cheap (in monthlies and hardware, not in hours!) but I would and do pay a premium for that). Otherwise I do agree somewhat; most overbearing systems can be done much simpler but we are all preparing (and thus paying) for eventualities that most likely will never occur or will not influence the bottom line.

Tech like AWS Lambda (of which I like the theoretical idea) is meant to remedy the issues with complexity for a premium. But that premium makes, personally, my eyes water. I cannot see any high volume operation justifying going live with it. Are there big examples of those? And how is it justified vs the alternatives (which are, besides some programmer+admin time and scalability, far more efficient)?

There are some significant high volume cases. We work with companies doing billions of Lambda invocations per month and realising large cost saving benefits. Lambda itself is usually the smallest part of the bill as one of the advantages of building serverless applications is you shift the responsibility of certain execution to specially designed managed services as opposed to code consuming CPU cycles; for example API Gateway takes over routing, S3 takes over file system calls, etc. A large portion of savings organisations see though is in time to production, as well as the overhead of managing servers and container clusters which is a lot more costly than you might think. Especially in the environment we are in now where qualified DevOps talent is hard to come by and at a premium. Sure, a developer can take some time to try and learn how to put together some infrastructure, but that's time taken away from adding direct value to business needs, not to mention the fallout when things go pear shaped later because it turns out a few hours Googling doesn't turn someone into a DevOps expert.
You definitely know what you are doing then; I see mostly the negative cases... The abuses of things for which they are not made etc. Thanks for the insight!

> as well as the overhead of managing servers and container clusters which is a lot more costly than you might think

A lot of people underestimate that in my experience; I see a lot of people who find it cool setting them up (also, a large amount are not doing this scripted but via the web interface). My current case has a myriad of VPC, container clusters, load balancers, clusters, auto scaling etc and it looks really impressive but it's very costly and their dev (who was also devops) disappeared as he buckled under the stress. Also, none of that is needed in this case (not saying there are not many cases it is needed!).

Anyway I will experiment more with Lambda; I think I'm tainted by the very costly abuse cases I had to move to normal linux environments to make affordable for the startup.

Thanks for sharing. I am aware there is no small amount of cases where the cloud offerings do save money in total.

But to be fair, for most projects the complexity that Amazon's services carry with them is absolutely not justified. Sure I can learn to work with 10-20 Amazon services but even me as a senior guy who knows his way around pretty much anything you throw at him, that's precious time spent not helping the direct business needs but basically making sure the house won't collapse.

And a lot of smaller companies like to merge the "programmer" and "DevOps" titles into one person because of course, that means one paycheck and not two. And as you said, they get angry that you can't become a pro sysadmin in an afternoon.

I suppose I am just trying to say yet again that many companies reach for BigCorp tools when they really ought to be fine with 2-3 DigitalOcean droplets and 1 dedicated DB droplet, plus 1 extra for backups.

But it does save an enormous amount of time. We have numerous customers using tools like the Serverless Framework to help put together sophisticated systems in days that would have traditionally taken months. I've experienced it myself personally and worked with multiple customers who see the same thing.

It's also not just the initial time saving. After implementation, infrastructure maintenance is almost non-existent because the services are all managed for you and you can focus on providing direct value and not worrying about whether your infrastructure can meet your needs.

> paying through the nose for hardware in general

You also have to consider that there are limits to how parallel an application can be - Amdahl's Law - at some point even throwing hardware at scaling issues has its limits.

Of course, there's also a truism that the team who implemented the first pass won't have to support (financially or as a developer) the software when it no longer scales.

No, Amdahl's law is (roughly speaking) a limit to how parallel an algorithm can be. Applications (in the sense of web apps) generally have the potential to scale via Gustafson's law, but we are (IMO) largely held back by frameworks and old ways of programming. https://en.wikipedia.org/wiki/Gustafson's_law
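For anyone who wants to play with the two laws being contrasted here, a minimal sketch in Python. The function names and the 95% parallel fraction are illustrative assumptions, not figures from this thread.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup for a fixed-size problem (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def gustafson_speedup(parallel_fraction: float, workers: int) -> float:
    """Scaled speedup when the workload grows with the workers (Gustafson's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * workers


if __name__ == "__main__":
    # Assumed example: 95% of the work can run in parallel.
    for n in (1, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.95, n), 2), round(gustafson_speedup(0.95, n), 2))
```

With 5% serial work the Amdahl figure flattens out below 20x no matter how many workers are added, while the Gustafson figure keeps growing because the amount of work is assumed to grow with the worker count (more users, more requests).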
So long as an application needs to share state between worker processes (database, Redis cluster, etc.), then Amdahl’s law still applies. There are very few modern applications that can truly scale linearly.
You're confusing something. Fully consistent databases are primarily limited by the speed of communication because they need to replicate writes and queries to all nodes and wait for a response even if a node is on the other side of the planet. Unless your CPU is extremely slow (clock frequencies of a few kilohertz) the speed of light is a significantly more important limit. This is actually a use case where modern CPUs are more than fast enough and we don't need a significant improvement in processing speed. Faster storage and networks are welcome though.
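A back-of-the-envelope check of the speed-of-light point, as a Python sketch. The fiber speed factor and the 3 GHz clock are assumptions chosen for illustration; real routes are longer than the ideal path and add switching delay on top.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact, by definition
EARTH_CIRCUMFERENCE_M = 40_075_000     # approximate equatorial value
FIBER_SPEED_FRACTION = 2 / 3           # assumption: light in fiber travels at about 2/3 c
CPU_HZ = 3_000_000_000                 # assumption: one 3 GHz core


def round_trip_seconds(distance_m: float, speed_fraction: float = 1.0) -> float:
    """Time for a signal to cover distance_m and come back."""
    return 2 * distance_m / (SPEED_OF_LIGHT_M_PER_S * speed_fraction)


# A node "on the other side of the planet" is half a circumference away.
antipode_m = EARTH_CIRCUMFERENCE_M / 2
rtt_vacuum = round_trip_seconds(antipode_m)
rtt_fiber = round_trip_seconds(antipode_m, FIBER_SPEED_FRACTION)

print(f"round trip in vacuum: {rtt_vacuum * 1000:.0f} ms")
print(f"round trip in fiber:  {rtt_fiber * 1000:.0f} ms")
print(f"cycles one core could run while waiting (fiber): {rtt_fiber * CPU_HZ:.1e}")
```

Under these assumptions a single acknowledged write to the far side of the planet costs roughly 130 ms in vacuum and 200 ms in fiber, which is hundreds of millions of CPU cycles spent waiting.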
Share consistent state. Eventually consistent models (most web apps) are often generally okay.
(NOTE: I don't disagree with you, I am more like paraphrasing you and adding my take.)

In practice most software is light years away from this theoretical limit of "can't be parallelised any more". And I fully agree that throwing hardware at a problem indeed has limits, although they are financial and not technical IMO.

As mentioned in another comment down this tree of comments, my 10-core Xeon workstation almost never has its cores saturated yet I have to sit through 5 seconds to 2 minutes of scripted tasks that can relatively easily be parallelised -- yet they aren't.
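A minimal sketch of what "relatively easily parallelised" can look like for independent scripted steps, using Python's standard `concurrent.futures`. The `run_task` function is a hypothetical stand-in (it only sleeps); the real tasks from the comment are not known.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_task(name: str) -> str:
    """Hypothetical stand-in for one independent scripted step."""
    time.sleep(0.1)  # pretend to wait on a subprocess, disk, or network
    return f"{name}: done"


def run_sequential(names):
    """Run the steps one after another."""
    return [run_task(n) for n in names]


def run_parallel(names, max_workers=8):
    """Run the steps concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, names))


if __name__ == "__main__":
    tasks = [f"task-{i}" for i in range(8)]
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        runner(tasks)
        print(runner.__name__, f"{time.perf_counter() - start:.2f}s")
```

Threads are enough when each step mostly waits on a subprocess or on I/O. For CPU-bound pure-Python work, `ProcessPoolExecutor` from the same module is the usual swap, since it spreads the work across cores.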

And let's not even mention how my NVMe SSD's lifetime saturation was 50% of its read/write limit...

There's a lot that can be improved still before we have to concern ourselves with how much more we can parallelise stuff. That's like worrying when will the Star Trek reality come to happen.

You're quite right that there's plenty to optimize. It's not that there isn't money in optimizing. It's that there's often not _enough_ money in optimizing to rise to the level of the top N priorities for a business.
Agreed, until you raise it at the right level at the right time. People do not find me for nothing... Usually after the initial launch euphoria dies down and someone looks at the books and asks why such a large % of the expenditure goes there. People start looking around online and see things like ‘our application serves 200k requests/day with one 50$/mo server’ and compare that with their 30k/mo setup barely serving 50k/day and start poking around. It is usually apples and pears, but more often than not there are massive issues. Most of them I would consider beginner issues but they are not made by beginners; many senior programmers I meet simply do not know about normal forms, proper types (all are stringy), proper indexes, O(n^2) etc; they trust cloud scaling to solve it all. And it does! But it costs...
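To make the O(n^2) point concrete, a small Python sketch of a pattern that often hides in application code: matching two lists of IDs. The function names and data are made up for illustration.

```python
import timeit


def common_ids_quadratic(left, right):
    """O(len(left) * len(right)): `in` on a list scans the list every time."""
    return [x for x in left if x in right]


def common_ids_linear(left, right):
    """O(len(left) + len(right)): build a set once, then do hashed lookups."""
    right_set = set(right)
    return [x for x in left if x in right_set]


if __name__ == "__main__":
    left = list(range(0, 10_000, 2))
    right = list(range(0, 10_000, 3))
    for fn in (common_ids_quadratic, common_ids_linear):
        seconds = timeit.timeit(lambda: fn(left, right), number=1)
        print(fn.__name__, f"{seconds:.4f}s")
```

Both functions return the same result; only the lookup structure differs. On small inputs nobody notices, and once the lists grow, the quadratic version is the kind of thing that ends up being paid for in extra instances.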
And of course, there is a limit to what you want to spend even if it might make some profit long term. You need to be able to find programmers to maintain things etc as well. If I needed something handling massive traffic while handling real business logic but not allowed to cost more than a few bucks in hosting, I would use something like [0]. But that would be silly for maintenance reasons alone. Does anyone know a modern (well maintained I mean really) equivalent though? I played around with this a long time ago and it is incredibly efficient.

[0] http://datadraw.sourceforge.net/ (github; https://github.com/waywardgeek/datadraw as sourceforge seems down)

Edit; maybe I answered that last question by finding a github version: seems waywardgeek does maintain at least to keep it running.

> Does anyone know a modern (well maintained I mean really) equivalent though? I played around with this a long time ago and it is incredibly efficient.

https://diesel.rs ? Maybe https://tql.antoyo.xyz/ if you care more about ease of use.

Datadraw is not an ORM; it is more comparable to a statically compiled Redis. So it is far less flexible, but it is very efficient/fast.

One of the purposes of Datadraw is for instance to build SQL databases on top of.

> almost 30k$/mo

That's like, a couple full-time developers, AIUI? Maybe even less than that. Perhaps the people who say "people are more expensive than hardware" have a point - at least in the Bay Area. Or you can move to the Rust Belt if you'd like a change.

Sure, but my point was that they cut that bill by 20k PER month by giving me 5k one off... They gave me 10k runway to poke around but 5k was enough to fix it; it was simply that bad to start with. The low hanging fruit in most systems I see is really trivial to fix; they just have no one to do it... I bet other people here have seen that before when thrown into an existing project (and I read Spolsky at an impressionable point in my career so I am usually the one against rewriting the whole thing outright).
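The arithmetic behind that, as a short Python sketch. The figures are the rounded ones quoted in this thread ("almost 30k$/mo" down to "less than 10k$", for a 5k one-off fee); the 30-day month is an assumption.

```python
bill_before_per_month = 30_000  # "almost 30k$/mo", rounded
bill_after_per_month = 10_000   # "less than 10k$", rounded
one_off_fee = 5_000             # the one-off optimisation cost

monthly_saving = bill_before_per_month - bill_after_per_month
yearly_saving = 12 * monthly_saving
payback_days = one_off_fee / monthly_saving * 30  # assuming a 30-day month

print(f"saving per month: ${monthly_saving:,}")
print(f"saving per year:  ${yearly_saving:,}")
print(f"fee paid back in about {payback_days:.1f} days")
```

With these rounded inputs the fee pays for itself in about a week, and the yearly saving lands in the same ballpark as the "200k/year" mentioned earlier.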
What you’re saying is that there were a handful of bottlenecks that you caught immediately or were found with some simple profiling, right? Not that they made the mistake of writing their app in Python instead of assembly, as the article seems to imply is now necessary.
> there were a handful of bottlenecks that you caught immediately

Exactly. I was responding mostly to the point that most CTOs/management believe that you should just let hardware handle it while programmers should just deliver as fast as they can. He says it is always a balance; you cannot pay for optimized assembly when writing a CRUD application, but I claim we completely swung to the other side of the spectrum. For instance, a financial company I did work for had no database indices besides the primary key and left AWS to scale that for them. And then we are not even talking about Mongo (this was MySQL); Mongo is completely abused as it is famous for 'scaling' and 'no effort setup', so a lot of people don't think about performance or structure at all in any way; people just dump data in it and query it in diabolical ways. They just trust the software/hardware to fix that for them. I recently tried to migrate a large one to MySQL, but it is pure hell because of its dynamic nature; the complete structure changed over time while the data from all the past is still in there; fields appeared, changed content type etc and nothing is structured or documented. With 100s of GBs of that and no certainty that things were actually imported correctly, I gave up. They are still paying through the nose; I fixed some indexing in their setup (I am by no means a Mongo expert but some things are universal when you think about data structures, performance and data management) which made some difference, but MySQL or PostgreSQL would've saved them a lot of money in my opinion. Ah well; at least the development of the system was cheap...
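A minimal sketch of the "no indices besides the primary key" problem. The case above was MySQL; SQLite is used here only because it ships with Python, and the `orders` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(50_000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Only the primary key exists, so filtering by customer_id reads the whole table.
plan_without_index = query_plan(conn, QUERY, (42,))

# One secondary index lets the same query jump straight to the matching rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, QUERY, (42,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The exact plan wording varies between SQLite versions, but the first plan should report a full scan of `orders` and the second a search using `idx_orders_customer`. MySQL and PostgreSQL expose the same information through their own `EXPLAIN` statements.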

But if they hired you at the beginning you wouldn't have been able to save this much money that would actually justify your salary. I think they made the right decision depending on the amount of time they were burning the cash.
seems like you deserve more of a cut than that.
Well, the premise going in after a quick (very quick) review of the system was: 'I will check what I can do in 5 days at $10k; I believe I can help, but if I cannot, you lose $10k. If I can help you in less time, you only pay that time.'. I do not think I can move that to some other deal with that premise. Maybe if I say; 'I will do this for 50% of the money you save in 12 months after I am done' that would work, but this is a side thing which I do because I like optimizing things; if I sell it in another way, it's not bound to time which will make it a timesink and risk. It is a choice.
What sort of waste do you tend to see more, if you do this regularly? Is it the case that people are aware of the cost and “don’t care”, or is it surprising/hidden cost?
There are 2 types: 1) they know the costs and thought it would scale infinitely with money but it doesn't (crashes, hangs, etc) 2) they knew it would cost more to scale but they did not expect it going up quite that fast as it does with more traffic (not linear).
That is how people reason about it but we could argue it is the balance between who pays for the hardware and who pays for the software.

I.O.W., if you have a program with millions of users you make something that performs well enough for people to pay for it. While each CPU cycle wasted then becomes millions of cycles, you never get billed for those, so it doesn't matter to you.

I wonder if an eco system is possible where software providers have to pay for ALL resources consumed. It sounds ridiculous but having any transaction going on would make monetizing software a lot easier.

It would for the most part boil down to billing the end user for the data stored, the cpu cycles and the bandwidth consumed. A perfectly competitive vehicle. Want to invest in growing your user base? Pay part of their fees and undercut the competition.

It would make it more logical if they didn't own the device. The hardware can scale with usage. You just replace the desktop, console or phone with one better fit for their consumption.

> the vast majority of software being developed is simply not performance-critical

Programmers keep saying this, and users keep complaining about slow software.

> the vast majority of software being developed is simply not performance-critical. It's designed to work at human speed

But what does that even mean? A 3GHz quad-core can do 12 billion things per second yet I still regularly experience lags keeping up with typing or mouse movements, scrolling webpages, redrawing windows... the actual interactive experience has gotten much worse since the 90s.
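To put the 12 billion figure next to interactive time scales, a quick Python sketch. Treating one clock cycle as one "thing" is a simplification, and the 60 Hz display is an assumption; the 100 ms budget is the "human speed" figure mentioned earlier in the thread.

```python
CLOCK_HZ = 3_000_000_000  # 3 GHz, from the comment
CORES = 4                 # quad-core, from the comment
FRAME_RATE_HZ = 60        # assumption: a 60 Hz display

cycles_per_second = CLOCK_HZ * CORES
cycles_per_frame = cycles_per_second / FRAME_RATE_HZ
cycles_per_100_ms = cycles_per_second * 0.1

print(f"cycles per second:         {cycles_per_second:.1e}")
print(f"cycles per 60 Hz frame:    {cycles_per_frame:.1e}")
print(f"cycles in a 100 ms budget: {cycles_per_100_ms:.1e}")
```

Under these assumptions that is about 200 million cycles available between two screen refreshes, and over a billion inside a 100 ms response budget.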

>but realistically the vast majority of software being developed is simply not performance-critical. It's designed to work at human speed.

I learned this by greatly improving a scheduling system algorithm that could schedule 10-12 related (to each other) medical procedures while accounting for 47 dynamic rules (existing appointments, outage blocks, usage blocks, zero patient waiting, procedure split times, etc) to sub second, improving the current algorithm's 13 seconds. You know what? It didn't matter. That was our speed test scenario (most realistically complex one a customer had).

The customer was fine with 13 seconds because it was so much faster than doing it by hand and these customers were paying hundreds of thousands of dollars for the licenses. Because of this, the improved algorithm was never implemented. It was a neat algorithm though.

Absolute maximum performance has its place, it's just not every place.

I have a few PCs running Windows 10 that are older than 5 years. As long as they have SSDs and you're not gaming, they're still plenty fast, even for modern websites.
I have a macbook that's 8 years old with an ssd and 16gb of RAM. Only struggles with gaming on the integrated graphics, and the battery life has always been abysmal with the thirsty 35w i5 cpu.