by colanderman 1463 days ago
Learn how networks work: http://intronetworks.cs.luc.edu/ Most SWEs I've met don't understand networks well despite writing software that interacts with them.

Learn basic statistics. Apply it to performance testing, system modeling.

Build off your knowledge of logic programming -- learn a formal verification language like SMT-LIB v2 or TLA+. Apply it to find bugs in code, or to clearly write out algorithm or system designs.

Learn C++ or Rust, and use it to write performant systems code -- the kind of stuff Python can't do. Learn how to convince the CPU, memory, disk, and network to give you all the performance they're capable of. Note, the goal here is not to learn another language. The goal is to learn how to think of programming in terms of interfacing hardware resources together -- as opposed to a pile of functions that compute something.

3 comments

Thank you for the suggestion about networks. I will have to bend that a little to see if I can make it immediately useful by learning something like the HTTP spec. I have done a lot of backend web development but I do not know HTTP deeply. Also, I wish I had done a CS degree - then I would have HAD to study networks! I am largely self-taught.

> Learn how to convince the CPU, memory, disk, and network to give you all the performance they're capable of.

How and where do you learn this? Can you link some resources?

Unfortunately I don't have any -- I'm self-taught on the job. But as an example -- pick some project that involves high-bandwidth or low-latency I/O to multiple devices. Say, line-rate network packet capture to disk. Hold yourself to the standard of not dropping any packet. Try 100 Mbps capture first, then 1 Gbps, then 10 Gbps. Try capture to SSD first, then to HDD (higher latency). Progressively restrict the number of CPU cores and amount of memory you allow yourself to use -- try to get down to 2 or even 1 core, and under a GiB of memory. Add a requirement to minimize the latency to disk -- 1 second, 100 ms, 10 ms, 1 ms. Spec the problem to push disk and/or memory bandwidth requirements to their limits for your system -- if your code can't perform at the level the hardware should allow it to -- figure out why and address it.
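To get a feel for what "line rate" demands at each step, here is a rough back-of-envelope sketch. It assumes standard Ethernet framing (20 extra bytes on the wire per frame for preamble, start delimiter, and inter-frame gap) and assumes you write whole frames to disk with no per-packet record header; the function name `capture_budget` is made up for illustration.

```python
# Back-of-envelope budget for line-rate Ethernet capture.
# Assumption: each frame costs 20 extra bytes on the wire
# (7-byte preamble + 1-byte start delimiter + 12-byte inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def capture_budget(link_bps: float, frame_bytes: int) -> dict:
    """Return packets/s, time budget per packet, and bytes/s written to disk."""
    wire_bits = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    pps = link_bps / wire_bits
    return {
        "packets_per_sec": pps,
        "ns_per_packet": 1e9 / pps,
        # Assumption: the whole frame is stored, with no capture-format overhead.
        "disk_bytes_per_sec": pps * frame_bytes,
    }


if __name__ == "__main__":
    links = [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]
    for name, bps in links:
        for frame in (64, 1518):  # minimum and maximum standard frame sizes
            b = capture_budget(bps, frame)
            print(f"{name:>8} {frame:>5}B frames: "
                  f"{b['packets_per_sec'] / 1e6:8.3f} Mpps, "
                  f"{b['ns_per_packet']:10.1f} ns/packet, "
                  f"{b['disk_bytes_per_sec'] / 1e6:8.1f} MB/s to disk")
```

Small frames stress the per-packet CPU budget while large frames stress disk bandwidth, so it is worth testing both ends.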

You'll be forced to learn how to multiplex I/O streams. How to minimize memory copies. How to efficiently allocate and reuse memory. How to minimize system call overhead. What the more esoteric I/O system calls are. How to minimize pipeline stalls, cache and TLB misses. Which tools to use to measure all this stuff. If you're not finding these techniques necessary -- make the problem harder by raising the specs, limiting the resources more, or making the problem more involved (say -- build a flow index online for efficient seeking within the recorded data).
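As a small illustration of the "reuse memory, minimize copies" part only, here is a sketch of copying one stream to another through a single preallocated buffer. It is in Python purely to show the pattern (a real line-rate capture would be written in something like C++ or Rust, as suggested above), it assumes blocking streams, and `copy_stream` is a hypothetical helper name.

```python
import io


def copy_stream(src, dst, buf_size: int = 1 << 20) -> int:
    """Copy src to dst through one reusable buffer; return bytes copied."""
    buf = bytearray(buf_size)   # allocated once, reused for every read
    view = memoryview(buf)      # slicing a memoryview does not copy the data
    total = 0
    while True:
        n = src.readinto(view)  # fills our buffer instead of allocating new bytes
        if not n:               # 0 means end of stream (blocking streams assumed)
            break
        written = 0
        while written < n:      # raw streams may accept only part of the data
            written += dst.write(view[written:n])
        total += n
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"packet" * 1000)
    sink = io.BytesIO()
    print(copy_stream(source, sink, buf_size=4096), "bytes copied")
```

Compare this against a naive `dst.write(src.read(n))` loop under a profiler to see how much allocation and copying the buffer reuse avoids.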

(This was basically my first major software project that forced me to learn performance optimization up and down the stack.)

> Learn basic statistics. Apply it to performance testing, system modeling.

Any resources for applying stats knowledge to these topics?

The main tools I use are histograms, confidence and prediction intervals, and Welch's t-test. You should be able to find these in any statistics textbook, or Wikipedia if you're adventurous.
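For the Welch's t-test part, a minimal standard-library sketch looks something like the following. It computes the t statistic and the Welch-Satterthwaite degrees of freedom for two sets of benchmark timings; the function name `welch_t` and the sample numbers are made up for illustration, not measurements from a real system.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Sample variances (n - 1 denominator), divided by n: squared standard errors.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical request latencies in ms, before and after a change.
    before = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4]
    after = [9.5, 9.7, 9.4, 9.8, 9.6, 9.9]
    t, df = welch_t(before, after)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

Turning `t` and `df` into a p-value needs the t-distribution CDF; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same test and reports the p-value directly.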