by kragen 743 days ago
because 10 gigabit per second networking is about 205 times slower than the hbm2 memory-to-cpu interface, which is 2048 gigabits per second. that means that some computations over the whole dataset will be about 205 times faster, running in a few hundred milliseconds instead of a minute or more. your question implies that no such computations exist, or at least that none could be of interest, but that's self-evidently false

that's assuming the data is in ram, but even a single nvme flash drive can reach 60 gigabits per second
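
a rough sketch of that arithmetic, using the three bandwidth figures above. the 100 gigabyte dataset size is an assumed number picked only for illustration, and the model ignores latency and protocol overhead:

```python
# Back-of-the-envelope scan times for one pass over the whole dataset.
# Bandwidths are the figures quoted in the comment, in gigabits per second.
LINKS_GBIT_PER_S = {
    "10GbE network": 10,
    "NVMe flash drive": 60,
    "HBM2 memory": 2048,
}

def scan_seconds(dataset_gigabytes, gigabits_per_second):
    """Time to stream the dataset once over a link (no latency or overhead)."""
    return dataset_gigabytes * 8 / gigabits_per_second

DATASET_GB = 100  # hypothetical dataset size, not from the thread

times = {name: scan_seconds(DATASET_GB, rate)
         for name, rate in LINKS_GBIT_PER_S.items()}
speedup = LINKS_GBIT_PER_S["HBM2 memory"] / LINKS_GBIT_PER_S["10GbE network"]

for name, seconds in times.items():
    print(f"{name}: {seconds:.2f} s")
print(f"HBM2 vs 10GbE: {speedup:.1f}x")
```

under that assumption the network pass takes over a minute and the in-memory pass takes well under a second.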

(disclaimer, i've never used kdb, just numpy, pandas, glsl, etc.)

1 comment

> your question implies that no such computations exist, or at least could be of interest, but that's self-evidently false

My question is about the specific use case being discussed here. Backtesting is mostly about doing a lot of computations over the same data with different parameters, so you can prefetch the data once and then iterate over it multiple times - the network penalty is paid only once.
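
A minimal cost model of that amortization argument. The function name and all the numbers (80 s transfer, 0.4 s per pass, 1000 parameter sets) are assumptions made up for illustration, not measurements:

```python
def total_seconds(runs, transfer_s, compute_s, prefetch):
    """Total wall time for `runs` backtests over the same data.

    With prefetch the network transfer is paid once; without it,
    every run pays the transfer again.
    """
    transfers = 1 if prefetch else runs
    return transfers * transfer_s + runs * compute_s

RUNS = 1000        # hypothetical number of parameter sets
TRANSFER_S = 80.0  # hypothetical time to pull the dataset over the network
COMPUTE_S = 0.4    # hypothetical time for one in-memory pass

with_prefetch = total_seconds(RUNS, TRANSFER_S, COMPUTE_S, prefetch=True)
without_prefetch = total_seconds(RUNS, TRANSFER_S, COMPUTE_S, prefetch=False)

print(f"prefetch once: {with_prefetch:.0f} s")
print(f"refetch every run: {without_prefetch:.0f} s")
```

With these assumed numbers the one-time transfer is a small fraction of the total once the data is reused across many runs.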

my experience is that you can often compute a conservative approximation to the signals you're looking for that's valid over a range of parameters, vastly decreasing the data you have to ship across the wire
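
a toy sketch of what i mean, assuming the simplest possible signal: a value crossing a threshold, with the threshold as the parameter being swept. the function names and the data are invented for illustration. filtering at the loosest threshold near the data keeps a superset of the rows any stricter threshold would select, so only that superset has to cross the wire:

```python
def conservative_prefilter(values, loosest_threshold):
    """Run near the data: keep every row that could fire for any
    threshold >= loosest_threshold."""
    return [v for v in values if v > loosest_threshold]

def signal(values, threshold):
    """The exact signal for one parameter value."""
    return [v for v in values if v > threshold]

values = [0.1, 0.5, 2.3, 0.2, 3.1, 1.9, 0.05, 2.8]  # made-up data
thresholds = [2.0, 2.5, 3.0]                        # parameter sweep

# Only the prefiltered rows are shipped to the machine doing the sweep.
shipped = conservative_prefilter(values, min(thresholds))

# Every parameter in the range gives the same answer on the shipped
# subset as it would on the full dataset.
for t in thresholds:
    assert signal(shipped, t) == signal(values, t)

print(f"shipped {len(shipped)} of {len(values)} rows")
```

real signals are rarely this simple, but the same idea applies whenever you can bound the signal over the parameter range.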