by reisse
743 days ago
> you do not want a solution (in any language/tech) that involves pulling an entire day of market data off disk, across the wire and over to your process for analysis.

Honest question - why? An entire day of market data for busy option series will be in the low hundreds of gigabytes with a proper wire format; with some compression it'd maybe be tens of gigabytes. Even with 10 Gbit/s networking (which is kinda slow - I believe you can get at least 40 Gbit/s for Amazon EC2<->EBS) the whole day of data will be transferred in a few minutes, which means your bottleneck will be compute, not IO/network. And compute can be parallelized pretty easily.
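Back-of-the-envelope version of that estimate, as a rough Python sketch. The 30 GB and 300 GB sizes are just stand-ins I picked for "tens" and "low hundreds" of gigabytes, and `transfer_seconds` is a made-up helper that ignores protocol overhead and assumes the link is fully saturated:

```python
def transfer_seconds(size_gb: float, link_gbit_per_s: float) -> float:
    """Seconds to move size_gb gigabytes over a link of link_gbit_per_s
    gigabits per second (8 bits per byte, no protocol overhead)."""
    return size_gb * 8 / link_gbit_per_s


# Illustrative sizes only (assumptions, not measured data).
for size_gb in (30, 300):
    for link in (10, 40):
        secs = transfer_seconds(size_gb, link)
        print(f"{size_gb} GB over {link} Gbit/s: {secs:.0f} s ({secs / 60:.1f} min)")
```

Under those assumptions even the uncompressed 300 GB case over 10 Gbit/s comes out to about four minutes, which is where "a few minutes" comes from.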
|
that's assuming the data is in ram, but even a single nvme flash drive can reach 60 gigabits per second
(disclaimer, i've never used kdb, just numpy, pandas, glsl, etc.)