by vchak1
2359 days ago
I'd need to understand your domain better, but in many cases a 250GB CSV can be compressed quite effectively using a columnar representation. The columns can then (potentially) be processed with SIMD- or GPU-based approaches, to the point where a single server would outrun a cluster. Food for thought.
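To make the columnar idea concrete, here is a minimal sketch using only the Python standard library. The sample data, the column names (`country`, `amount`) and the helper functions are all hypothetical, made up for illustration; nothing here is taken from the original poster's dataset. It pivots CSV rows into columns, dictionary-encodes a low-cardinality string column into one-byte codes, and packs a numeric column into a contiguous typed array, which is the kind of layout that vectorized (SIMD/GPU) code wants to scan.

```python
import csv
import io
from array import array


def csv_to_columns(text):
    """Parse CSV text into a dict mapping column name -> list of string values."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(value)
    return columns


def dictionary_encode(values):
    """Replace repeated strings with small integer codes plus a lookup table.

    Assumption: at most 256 distinct values, so each code fits in one byte.
    """
    lookup = {}
    codes = array("B")
    for v in values:
        # A new value gets the next unused code; a seen value reuses its code.
        codes.append(lookup.setdefault(v, len(lookup)))
    return codes, list(lookup)


# Hypothetical sample: 1000 rows with a low-cardinality string column
# and an integer column.
rows = ["country,amount"]
rows += [f"{c},{i}" for i, c in enumerate(["US", "DE", "US", "FR"] * 250)]
text = "\n".join(rows) + "\n"

cols = csv_to_columns(text)
codes, table = dictionary_encode(cols["country"])

# Pack the numeric column as contiguous 64-bit integers instead of text.
amounts = array("q", map(int, cols["amount"]))

# Aggregations now scan one compact column and never touch the others.
total = sum(amounts)
decoded = [table[c] for c in codes]
```

In this toy case the `country` column shrinks from three bytes of text per row (two letters plus a delimiter) to one byte per row plus a three-entry lookup table. Real columnar formats layer further encodings and general-purpose compression on top, so how much a given 250GB file shrinks depends heavily on its data.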