by nathants 980 days ago
yes! non-standard data wrangling, even if just for fun, is a great way to gain a better understanding of your workload and hardware.

tl;dr: [de]serialization is your bottleneck, after that it’s general data processing. both waste insane amounts of cpu cycles. network and disk, when accessed linearly, are effectively free.
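
a rough way to see the [de]serialization cost for yourself, as a minimal sketch rather than a real benchmark: build a newline-delimited json payload in memory, then compare a linear pass over the bytes against parsing every line. the record shape and the helper names (`make_jsonl`, `scan_only`, `parse_all`, `timed`) are made up for illustration, and because everything stays in memory this says nothing about disk or network, only about how much cpu the parse step adds on top of touching the bytes.

```python
import json
import time


def make_jsonl(rows: int) -> bytes:
    """Build an in-memory newline-delimited JSON payload (hypothetical sample data)."""
    return b"".join(
        json.dumps({"id": i, "name": f"user{i}", "score": i * 0.5}).encode() + b"\n"
        for i in range(rows)
    )


def scan_only(payload: bytes) -> int:
    """Touch every byte linearly without parsing: count the newlines."""
    return payload.count(b"\n")


def parse_all(payload: bytes) -> int:
    """Deserialize every line, then do trivial processing (sum the ids)."""
    total = 0
    for line in payload.splitlines():
        total += json.loads(line)["id"]
    return total


def timed(fn, arg):
    """Return (result, elapsed seconds) for a single call of fn(arg)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    payload = make_jsonl(200_000)
    lines, scan_s = timed(scan_only, payload)
    total, parse_s = timed(parse_all, payload)
    print(f"scan:  {lines} lines in {scan_s:.4f}s")
    print(f"parse: sum={total} in {parse_s:.4f}s")
```

the exact numbers depend on your machine and python version, so none are quoted here. swapping `json` for another format or parser in `parse_all` is the obvious next experiment.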

i first looked into this when ec2 i3 came out, and even more so since. lambda for burst cpu capacity when you can’t wait 30s for ec2 spot is interesting too.

https://nathants.com/posts/performant-batch-processing-with-...