

	by mstump 4042 days ago
	To be snide, maybe they should. I am an expert in this topic; this is what I do all day, every day. They're doing development without understanding how computers work, where the bottlenecks are, or what the maximum theoretical throughput for the use case is. They ended up with something slightly better than the horrible situation they were in, and are celebrating an inefficient solution as a technical triumph.

2 comments

fixxer 4042 days ago

> inefficient solution as a technical triumph

They were able to solve their problem in a single process balanced over 4 boxes without ever having to hire someone like you, despite your expertise.

Could they have increased throughput? Absolutely. It would have involved a different architecture with more complexity and time, and it also would have relied on skills beyond what was immediately available. I'm guessing their line count is around 200 for the core functionality.

Can you share some actual technical points where they made an error? I would really like to see you demonstrate expertise beyond these uninspiring generalities.


Denzel 4042 days ago

This type of simple I/O bound pass-through problem lends itself extremely well to evented I/O. Conceptually, their first solution was closest to mimicking the benefits of evented I/O, given how Go's runtime works. When a goroutine submits a blocking I/O request, it will yield to another goroutine and wake up later when it can work with the data. So what happened with their first solution?

Well, Go's runtime allocates 8 KB (last I recall) of growable stack space per goroutine. Assuming that their first solution was deployed on the same instance type as their final solution, c4.large (3.75 GB), they could handle at most ~470,000 outstanding goroutines, and that assumes all RAM is used only for goroutines, which is not realistic of course. So their server fell over once it exhausted memory.
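
As a rough check of that ceiling, here is a minimal sketch of the arithmetic. It takes the 8 KB per-goroutine stack figure quoted above at face value and assumes nothing else uses memory; the helper name `maxGoroutines` is made up for illustration. Reading 3.75 GB as decimal gigabytes or as GiB gives bounds on either side of the ~470,000 estimate (which matches 3.75e9 / 8,000).

```go
package main

import "fmt"

// maxGoroutines returns an upper bound on how many goroutines fit in
// ramBytes if each holds stackBytes of stack and nothing else uses memory.
// (Hypothetical helper, not part of any library.)
func maxGoroutines(ramBytes, stackBytes uint64) uint64 {
	return ramBytes / stackBytes
}

func main() {
	const stackBytes = 8 * 1024          // 8 KB initial stack, the figure quoted above
	const ramDecimal = 3750000000        // 3.75 GB read as decimal gigabytes
	const ramBinary = 3840 * 1024 * 1024 // 3.75 GiB

	fmt.Println(maxGoroutines(ramDecimal, stackBytes)) // 457763
	fmt.Println(maxGoroutines(ramBinary, stackBytes))  // 491520
}
```

Either way the order of magnitude is the same: a few hundred thousand blocked goroutines would be enough to exhaust the box under these assumptions.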

This type of memory exhaustion isn't a problem with evented I/O. You have a single thread that responds to async events related to the I/O you're performing.

So, due to the limitations of Go's runtime, they settled upon a worker pool that allows at most MAX_WORKERS outstanding requests to S3. Not the most efficient solution for this problem. But it works for their use case, for now, and that's what truly matters.
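
For readers who haven't seen the pattern, a bounded worker pool can be sketched roughly as below. This is not their code: `upload` is a hypothetical stand-in for the blocking S3 request, and the `maxWorkers` value is arbitrary. The point is only that the number of goroutines, and so the number of in-flight requests, is fixed no matter how many payloads arrive.

```go
package main

import (
	"fmt"
	"sync"
)

// maxWorkers stands in for MAX_WORKERS; the value here is arbitrary.
const maxWorkers = 4

// upload is a hypothetical placeholder for the blocking S3 request.
func upload(payload string) string {
	return "stored:" + payload
}

// processAll pushes every payload through a fixed pool of maxWorkers
// goroutines, so at most maxWorkers uploads are in flight at once.
func processAll(payloads []string) []string {
	jobs := make(chan int) // indexes into payloads
	results := make([]string, len(payloads))

	var wg sync.WaitGroup
	for w := 0; w < maxWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is received by exactly one worker, so the
			// writes to results never touch the same element twice.
			for i := range jobs {
				results[i] = upload(payloads[i])
			}
		}()
	}

	// The unbuffered channel blocks the producer whenever all workers
	// are busy, which is what bounds memory use.
	for i := range payloads {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	fmt.Println(processAll([]string{"a", "b", "c"})) // [stored:a stored:b stored:c]
}
```

The trade-off is the one described above: memory stays bounded, but throughput is capped by the pool size rather than by what the network and S3 could actually sustain.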


fixxer 4042 days ago

Excellent, logic-driven critique.


ignoramous 4042 days ago

Genuine question, as I was wondering about what an alternative architecture would look like from a systems perspective.

Would something like Node.js+clusters (or any evented I/O framework) be a better fit (considering the clusters are stateless and don't have to talk to each other)?

If we're talking concurrency/parallelism, would you prefer JVM+Threads/Erlang+Actors over Go? Thanks.
