| HN Mirror



	by scott_s 2934 days ago
	When the author removed parallelism the first time, I don't think this is the case. Running things in parallel has a cost. That cost often comes in the form of memory allocations and data copies so the unit of work can be stored and shared with another thread, plus the synchronization costs of scheduling threads. If that aggregate cost is greater than the computational cost of what you're computing, you'll never win. For the point at which the author removed parallelism, and the sequential code was faster, I think this was the case. The computation was too fine-grained. The author successfully took advantage of parallelism by applying it at a coarser granularity; each thread did more work. At this point, the author also does tune the solution for the execution environment, as he uses a fixed set of goroutines to process a bunch of messages rather than one goroutine per message.
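To make the granularity point concrete, here is a minimal Go sketch contrasting one goroutine per message with a fixed set of workers that each handle a contiguous chunk. It is not the article's code: `process`, `processPerMessage`, and `processWithWorkers` are hypothetical names, the per-message work is a trivial stand-in, and sizing the pool with `runtime.NumCPU()` is just one plausible choice.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// process is a hypothetical stand-in for the per-message computation.
func process(msg string) int { return len(msg) }

// processPerMessage starts one goroutine per message (fine-grained).
// With cheap work, scheduling and synchronization can dominate.
func processPerMessage(msgs []string) []int {
	out := make([]int, len(msgs))
	var wg sync.WaitGroup
	for i, m := range msgs {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			out[i] = process(m)
		}(i, m)
	}
	wg.Wait()
	return out
}

// processWithWorkers starts at most `workers` goroutines (coarse-grained).
// Each goroutine handles one contiguous chunk, so it does more work
// per unit of scheduling overhead.
func processWithWorkers(msgs []string, workers int) []int {
	out := make([]int, len(msgs))
	if workers < 1 {
		workers = 1
	}
	chunk := (len(msgs) + workers - 1) / workers // ceiling division
	var wg sync.WaitGroup
	for start := 0; start < len(msgs); start += chunk {
		end := start + chunk
		if end > len(msgs) {
			end = len(msgs)
		}
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			// Each goroutine writes only to its own index range.
			for i := start; i < end; i++ {
				out[i] = process(msgs[i])
			}
		}(start, end)
	}
	wg.Wait()
	return out
}

func main() {
	msgs := []string{"a", "bb", "ccc", "dddd"}
	fmt.Println(processPerMessage(msgs))
	fmt.Println(processWithWorkers(msgs, runtime.NumCPU()))
}
```

Both functions return the same results; which one is faster depends on how expensive `process` is relative to the cost of starting and synchronizing goroutines, so it is worth benchmarking on your own workload and hardware.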

1 comment

val_deleplace 2933 days ago

scott_s, you're totally right on both points.

FWIW I really mean the "take the numbers with a grain of salt" advice, i.e. "Your mileage may vary". What I'm sharing in this article is not a bunch of hard, strong, exact numbers; it's a journey and an invitation to apply a similar reasoning process to your own use case and hardware.


scott_s 2933 days ago

For the record, I enjoyed your post. It's a great example of what clear-headed performance optimization looks like.
